Volume 29 Issue 6
Dec.  2020
XU Qianyi, QIN Guihe, SUN Minghui, YAN Jie, JIANG Huiming, ZHANG Zhonghan. Feature Fusion Based Hand Gesture Recognition Method for Automotive Interfaces[J]. Chinese Journal of Electronics, 2020, 29(6): 1153-1164. doi: 10.1049/cje.2020.06.008

Feature Fusion Based Hand Gesture Recognition Method for Automotive Interfaces

doi: 10.1049/cje.2020.06.008
Funds: This work is supported by the National Natural Science Foundation of China under Grant No. 61872164 and the Program of Science and Technology Development Plan of Jilin Province of China under Grant No. 20190302032GX.
  • Corresponding author: SUN Minghui received the Ph.D. degree in computer science from Kochi University of Technology, Japan, in 2011. He is currently an assistant professor in the College of Computer Science and Technology at Jilin University, China. He is interested in using HCI methods to solve challenging real-world computing problems in many areas, including multimodal interfaces (tactile modality), pen-based interfaces and tangible interfaces. (Email: smh@jlu.edu.cn)
  • Received Date: 2019-10-30
  • Publish Date: 2020-12-25
  • Hand gesture recognition on depth videos is a promising approach for automotive interfaces because it is less sensitive to lighting variation and more accurate than traditional methods. However, recognizing gestures in video remains challenging, since factors unrelated to the gesture introduce considerable interference. On the premise that more relevant representations yield more accurate results, ResNext, a compact and efficient neural network, is first used as the feature extractor; an improved weighted frame unification method is then adopted to obtain key frame samples; finally, Discriminant correlation analysis (DCA) is employed to fuse the features of static and dynamic data after a Feature embedding branch (FEB) is applied to the static data. The public Depth based gesture recognition database (DGRD) is used in this paper. The dataset is small and its class distribution is heavily imbalanced, and we find that the performance of ResNext, although excellent with sufficient training data, degrades badly under class imbalance. To overcome the limitations of the small dataset, a loss function scheme combining the softmax loss and the dice loss is proposed. Comparison with other state-of-the-art methods indicates that the proposed method is more practical for gesture recognition and may be widely adopted in automotive interfaces.
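The abstract proposes a loss function scheme that combines the softmax loss and the dice loss to cope with a small, imbalanced dataset, but it does not state the formula. The following is a minimal Python sketch of one way such a combination could look, not the authors' implementation. Everything specific in it is an assumption: the weighted-sum form, the hypothetical mixing weight `alpha`, the smoothing constant `eps`, the choice to average the soft Dice coefficient over classes within a batch, and all function names.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_loss(probs, labels):
    """Mean cross-entropy of the true class over the batch."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def dice_loss(probs, labels, num_classes, eps=1e-6):
    """One minus the soft Dice coefficient, averaged over classes.

    Each class contributes equally regardless of how many samples it has,
    which is why Dice-style terms are often used under class imbalance.
    A class absent from the batch scores close to zero here (assumption:
    no special handling of absent classes).
    """
    dice_sum = 0.0
    for c in range(num_classes):
        overlap = sum(p[c] for p, y in zip(probs, labels) if y == c)
        pred_mass = sum(p[c] for p in probs)
        true_mass = sum(1 for y in labels if y == c)
        dice_sum += (2.0 * overlap + eps) / (pred_mass + true_mass + eps)
    return 1.0 - dice_sum / num_classes

def combined_loss(batch_logits, labels, alpha=0.5):
    """Weighted sum of the two losses; alpha is a hypothetical weight."""
    num_classes = len(batch_logits[0])
    probs = [softmax(z) for z in batch_logits]
    return (alpha * softmax_loss(probs, labels)
            + (1.0 - alpha) * dice_loss(probs, labels, num_classes))

# Usage: two samples, two gesture classes (illustrative values only).
example = combined_loss([[2.0, 0.5], [0.1, 1.5]], [0, 1])
```

In this sketch the cross-entropy term is dominated by the frequent classes, while the class-averaged Dice term gives rare classes the same weight as common ones; `alpha` trades the two off and would need to be tuned on validation data.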
