Citation: GUAN Qi, SHENG Zihao, and XUE Shibei, “HRPose: Real-Time High-Resolution 6D Pose Estimation Network Using Knowledge Distillation,” Chinese Journal of Electronics, vol.32, no.1, doi: 10.23919/cje.2021.00.211.
[1] D. Xu, D. Anguelov, and A. Jain, “PointFusion: Deep sensor fusion for 3D bounding box estimation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, pp.244–253, 2018.
[2] Z. Sheng, S. Xue, Y. Xu, et al., “Real-time queue length estimation with trajectory reconstruction using surveillance data,” in Proceedings of 2020 16th International Conference on Control, Automation, Robotics and Vision (ICARCV), Shenzhen, China, pp.124–129, 2020.
[3] Z. Sheng, L. Liu, S. Xue, et al., “A cooperation-aware lane change method for autonomous vehicles,” arXiv preprint, arXiv:2201.10746, 2022.
[4] Z. Sheng, Y. Xu, S. Xue, et al., “Graph-based spatial-temporal convolutional network for vehicle trajectory prediction in autonomous driving,” IEEE Transactions on Intelligent Transportation Systems, early access, doi: 10.1109/TITS.2022.3155749, 2022.
[5] E. Marchand, H. Uchiyama, and F. Spindler, “Pose estimation for augmented reality: A hands-on survey,” IEEE Transactions on Visualization and Computer Graphics, vol.22, no.12, pp.2633–2651, 2016. doi: 10.1109/TVCG.2015.2513408
[6] Y. Xiang, T. Schmidt, V. Narayanan, et al., “PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes,” in Proceedings of 2018 Robotics: Science and Systems Conference, Pittsburgh, PA, USA, arXiv:1711.00199, 2018.
[7] N. Correll, K. E. Bekris, D. Berenson, et al., “Analysis and observations from the first Amazon Picking Challenge,” IEEE Transactions on Automation Science and Engineering, vol.15, no.1, pp.172–188, 2018. doi: 10.1109/TASE.2016.2600527
[8] C. Wang, D. Xu, Y. Zhu, et al., “DenseFusion: 6D object pose estimation by iterative dense fusion,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, pp.3343–3352, 2019.
[9] Y. He, W. Sun, H. Huang, et al., “PVN3D: A deep point-wise 3D keypoints voting network for 6DoF pose estimation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, pp.11632–11641, 2020.
[10] D. G. Lowe, “Object recognition from local scale-invariant features,” in Proceedings of the IEEE International Conference on Computer Vision, Kerkyra, Greece, vol.2, pp.1150–1157, 1999.
[11] S. Hinterstoisser, C. Cagniart, S. Ilic, et al., “Gradient response maps for real-time detection of textureless objects,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, no.5, pp.876–888, 2012. doi: 10.1109/TPAMI.2011.206
[12] M. Rad and V. Lepetit, “BB8: A scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth,” in Proceedings of 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, pp.3848–3856, 2017.
[13] B. Tekin, S. N. Sinha, and P. Fua, “Real-time seamless single shot 6D object pose prediction,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, pp.292–301, 2018.
[14] M. Oberweger, M. Rad, and V. Lepetit, “Making deep heatmaps robust to partial occlusions for 3D object pose estimation,” in Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, pp.119–134, 2018.
[15] S. Peng, Y. Liu, Q. Huang, et al., “PVNet: Pixel-wise voting network for 6DoF pose estimation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, pp.4556–4565, 2019.
[16] V. Lepetit, F. Moreno-Noguer, and P. Fua, “EPnP: An accurate O(n) solution to the PnP problem,” International Journal of Computer Vision, vol.81, no.2, pp.155–166, 2009. doi: 10.1007/s11263-008-0152-6
[17] J. Redmon and A. Farhadi, “YOLO9000: Better, faster, stronger,” in Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, pp.6517–6525, 2017.
[18] J. Tremblay, T. To, B. Sundaralingam, et al., “Deep object pose estimation for semantic robotic grasping of household objects,” in Proceedings of the 2nd Conference on Robot Learning, Zurich, Switzerland, pp.306–316, 2018.
[19] G. Du, K. Wang, S. Lian, et al., “Vision-based robotic grasping from object localization, object pose estimation to grasp estimation for parallel grippers: A review,” Artificial Intelligence Review, vol.54, no.3, pp.1677–1734, 2021. doi: 10.1007/s10462-020-09888-5
[20] G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” arXiv preprint, arXiv:1503.02531, 2015.
[21] H. Felix, W. M. Rodrigues, D. Macêdo, et al., “Squeezed deep 6DoF object detection using knowledge distillation,” in Proceedings of the International Joint Conference on Neural Networks, Glasgow, UK, pp.1–7, 2020.
[22] J. Wang, K. Sun, T. Cheng, et al., “Deep high-resolution representation learning for visual recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.43, no.10, pp.3349–3364, 2021. doi: 10.1109/TPAMI.2020.2983686
[23] E. Rublee, V. Rabaud, K. Konolige, et al., “ORB: An efficient alternative to SIFT or SURF,” in Proceedings of 2011 International Conference on Computer Vision, Barcelona, Spain, pp.2564–2571, 2011.
[24] W. Kehl, F. Manhardt, F. Tombari, et al., “SSD-6D: Making RGB-based 3D detection and 6D pose estimation great again,” in Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, pp.1530–1538, 2017.
[25] S. Zakharov, I. Shugurov, and S. Ilic, “DPOD: 6D pose object detector and refiner,” in Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea (South), pp.1941–1950, 2019.
[26] C. Song, J. Song, and Q. Huang, “HybridPose: 6D object pose estimation under hybrid representations,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, pp.428–437, 2020.
[27] M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Communications of the ACM, vol.24, no.6, pp.381–395, 1981. doi: 10.1145/358669.358692
[28] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proceedings of International Conference on Learning Representations, San Diego, CA, USA, pp.1–14, 2015.
[29] K. He, X. Zhang, S. Ren, et al., “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, pp.770–778, 2016.
[30] Y. Zhang, D. Li, B. Jin, et al., “Monocular 3D reconstruction of human body,” in Proceedings of 2019 Chinese Control Conference (CCC), Guangzhou, China, pp.7889–7894, 2019.
[31] S. Jia, Z. Gan, Y. Xi, et al., “A deep reinforcement learning bidding algorithm on electricity market,” Journal of Thermal Science, vol.29, no.5, pp.1125–1134, 2020. doi: 10.1007/s11630-020-1308-0
[32] Y. Guan, D. Li, S. Xue, et al., “Feature-fusion-kernel-based Gaussian process model for probabilistic long-term load forecasting,” Neurocomputing, vol.426, pp.174–184, 2021. doi: 10.1016/j.neucom.2020.10.043
[33] L. J. Ba and R. Caruana, “Do deep nets really need to be deep?” in Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, Cambridge, MA, USA, pp.2654–2662, 2014.
[34] A. Romero, N. Ballas, S. E. Kahou, et al., “FitNets: Hints for thin deep nets,” in Proceedings of International Conference on Learning Representations, San Diego, CA, USA, pp.1–12, 2015.
[35] S. Zagoruyko and N. Komodakis, “Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer,” in Proceedings of International Conference on Learning Representations, Toulon, France, pp.1–13, 2017.
[36] J. Yim, D. Joo, J. Bae, et al., “A gift from knowledge distillation: Fast optimization, network minimization and transfer learning,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, pp.7130–7138, 2017.
[37] Y. Liu, C. Shu, J. Wang, et al., “Structured knowledge distillation for dense prediction,” IEEE Transactions on Pattern Analysis and Machine Intelligence, early access, doi: 10.1109/TPAMI.2020.3001940, 2020.
[38] S. Hinterstoisser, V. Lepetit, S. Ilic, et al., “Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes,” in Proceedings of 11th Asian Conference on Computer Vision, Daejeon, Korea, pp.548–562, 2012.
[39] J. Xiao, J. Hays, K. A. Ehinger, et al., “SUN database: Large-scale scene recognition from abbey to zoo,” in Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, pp.3485–3492, 2010.
[40] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proceedings of International Conference on Learning Representations, San Diego, CA, USA, pp.1–15, 2015.
[41] E. Brachmann, F. Michel, A. Krull, et al., “Uncertainty-driven 6D pose estimation of objects and scenes from a single RGB image,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, pp.3364–3372, 2016.
[42] Z. Li, G. Wang, and X. Ji, “CDPN: Coordinates-based disentangled pose network for real-time RGB-based 6-DoF object pose estimation,” in Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea (South), pp.7677–7686, 2019.
[43] G. Wang, F. Manhardt, F. Tombari, et al., “GDR-Net: Geometry-guided direct regression network for monocular 6D object pose estimation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Los Alamitos, CA, USA, pp.16611–16621, 2021.