Some New Trends of Deep Learning Research
MENG Deyu, SUN Lina
Deep learning has attracted increasing attention throughout science and engineering over the past decade because of its wide range of successful applications. In real problems, however, most stages of applying deep learning still require manual intervention, which makes it hard for general users with less expertise to use and also departs from how human intelligence works. Raising the level of automation across all elements of the deep learning framework, such as input amelioration, model design and learning, and output adjustment, is therefore a challenging yet critical issue. This paper lists several representative issues in this research topic and briefly describes recent progress and related work along this line. Some specific challenging problems are also presented.
Face Liveness Detection Based on the Improved CNN with Context and Texture Information
GAO Chenqiang, LI Xindou, ZHOU Fengshun, MU Song
Face liveness detection, a key module of real-world face recognition systems, aims to distinguish a fake face from a real one. In this paper, we propose an improved convolutional neural network (CNN) architecture with two bypass connections that simultaneously uses low-level detail information and high-level semantic information. Considering the importance of texture information for describing face images, texture features are also adopted under the conventional recognition framework of the support vector machine (SVM). The improved CNN and the texture-feature-based SVM are then fused. Context information, which is usually neglected by existing methods, is also exploited in this paper. The proposed method is tested on two widely used datasets. Extensive experiments show that our method outperforms the state-of-the-art methods.
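The abstract does not say how the CNN and the SVM are fused. As a minimal sketch of one common option, weighted score-level fusion, the snippet below combines two "real face" probabilities. The function name `fuse_scores`, the `weight` and `threshold` values, and the assumption that both classifiers output probabilities are illustrative and not taken from the paper.

```python
def fuse_scores(cnn_prob, svm_prob, weight=0.5, threshold=0.5):
    """Weighted score-level fusion of two liveness classifiers (hypothetical).

    cnn_prob, svm_prob: each model's probability that the face is real.
    weight: share given to the CNN score; the SVM gets the remainder.
    Returns the fused probability and the resulting real/fake decision.
    """
    fused = weight * cnn_prob + (1.0 - weight) * svm_prob
    return fused, fused >= threshold


# Example: the CNN is confident the face is real, the SVM is doubtful.
fused, is_real = fuse_scores(cnn_prob=0.9, svm_prob=0.3)
```

In practice the weight would be tuned on a validation set, and other fusion schemes (for example, concatenating features before a single classifier) are equally plausible readings of the abstract.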
Generating Basic Unit Movements with Conditional Generative Adversarial Networks
LUO Dingsheng, NIE Mengxi, WU Xihong
Arm motion control is fundamental for a robot to accomplish complicated manipulation tasks. Different movements can be organized by configuring a series of motion units. Our work aims to equip the robot with the ability to carry out basic unit movements (BUMs), which are used to compose various motion sequences so that the robot can drive its hand to a desired position. Given the definition of BUMs, we explore a learning approach that lets the robot develop this ability by leveraging deep learning techniques. To generate a BUM with respect to the current arm state, an internal inverse model is developed. We propose to use conditional generative adversarial networks (CGANs) to establish the inverse model that generates the BUMs. Experimental results on the humanoid robot PKU-HR6.0II show that CGANs can successfully generate multiple solutions for a given BUM, and that these BUMs can be composed effectively into further reaching movements.
Multi-vision Attention Networks for on-Line Red Jujube Grading
SUN Xiaoye, MA Liyan, LI Gongyan
To solve the red jujube classification problem, this paper designs a convolutional neural network model with low computational cost and high classification accuracy. The architecture of the model is inspired by the multi-vision mechanism of organisms and by DenseNet. To further improve the model, we add the attention mechanism of SE-Net. We also construct a dataset of 23,735 red jujube images captured by a jujube grading system. According to the appearance of the jujubes and the characteristics of the grading system, the dataset is divided into four classes: invalid, rotten, wizened and normal. Numerical experiments show that the classification accuracy of our model reaches 91.89%, which is comparable to DenseNet-121, InceptionV3, InceptionV4, and Inception-ResNet v2. Our model also runs in real time.
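The SE-Net attention mentioned above reweights feature channels in three steps: squeeze (global average pooling), excitation (two small fully connected layers ending in a sigmoid), and scale. The snippet below is a minimal pure-Python sketch of that general mechanism, not the paper's network. The function name `se_block`, the nested-list tensor layout, and the weight matrices `w1` and `w2` are assumptions for illustration, and biases are omitted.

```python
import math


def se_block(feature_map, w1, w2):
    """Squeeze-and-excitation channel attention on a C x H x W nested list.

    w1: (C // r) x C weights of the reduction layer (hypothetical values).
    w2: C x (C // r) weights of the expansion layer (hypothetical values).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
         for ch in feature_map]
    # Excitation, step 1: fully connected reduction followed by ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: fully connected expansion followed by sigmoid.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value of a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for g, ch in zip(gates, feature_map)]
    return scaled, gates
```

Because each gate lies in (0, 1), the block can only attenuate channels relative to one another; in a trained network `w1` and `w2` are learned together with the convolutional weights.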
Multi-label Image Classification via Coarse-to-Fine Attention
LYU Fan, LI Linyan, Victor S. Sheng, FU Qiming, HU Fuyuan
Great efforts have been made to recognize multi-label images with deep neural networks. Since multi-label image classification is very complicated, many studies use the attention mechanism as a kind of guidance. Conventional attention-based methods analyze images directly and aggressively, which makes it difficult to understand complicated scenes well. We propose a global/local attention method that recognizes a multi-label image from coarse to fine by mimicking how humans observe images. Our global/local attention method first concentrates on the whole image, and then focuses on its specific local objects. We also propose a joint max-margin objective function, which enforces that the minimum score of positive labels should be larger than the maximum score of negative labels, both horizontally and vertically. This function further improves our multi-label image classification method. We evaluate the effectiveness of our method on two popular multi-label image datasets (i.e., Pascal VOC and MS-COCO). Our experimental results show that our method outperforms state-of-the-art methods.
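The max-margin constraint described above can be sketched as a hinge penalty that vanishes once the lowest positive score exceeds the highest negative score by a margin. The snippet below is an illustrative reading, not the paper's exact objective. It assumes that "horizontally" means across the labels of one image (rows of a score matrix) and "vertically" means across the images for one label (columns), and the names `max_margin_hinge` and `joint_max_margin` and the margin of 1.0 are hypothetical.

```python
def max_margin_hinge(scores, labels, margin=1.0):
    """Zero once min(positive scores) exceeds max(negative scores) by `margin`."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return 0.0  # the constraint is undefined without both groups
    return max(0.0, margin - (min(pos) - max(neg)))


def joint_max_margin(score_matrix, label_matrix, margin=1.0):
    """Sum the hinge over rows (per image) and over columns (per label)."""
    rows = sum(max_margin_hinge(s, y, margin)
               for s, y in zip(score_matrix, label_matrix))
    cols = sum(max_margin_hinge(s, y, margin)
               for s, y in zip(zip(*score_matrix), zip(*label_matrix)))
    return rows + cols


# Two images, three labels: rows are images, columns are labels.
scores = [[2.0, -1.0, 0.5],
          [0.2, 1.5, -0.3]]
labels = [[1, 0, 1],
          [0, 1, 0]]
loss = joint_max_margin(scores, labels)
```

In this example only the third label violates the margin: its positive image scores 0.5 and its negative image scores -0.3, a gap of 0.8, so the total penalty is about 0.2.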