Citation: Yanshan LI, Jiarong WANG, Kunhua ZHANG, et al., “Lightweight Object Detection Networks for UAV Aerial Images Based on YOLO,” Chinese Journal of Electronics, vol. x, no. x, pp. 1–13, xxxx, doi: 10.23919/cje.2022.00.300

Lightweight Object Detection Networks for UAV Aerial Images Based on YOLO

doi: 10.23919/cje.2022.00.300
More Information
  • Author Bio:

    Yanshan LI is an associate professor with the ATR National Key Laboratory of Defense Technology, Shenzhen University, Shenzhen, China. He received the M.Sc. degree from the Zhejiang University of Technology, Hangzhou, China, in 2005, and the Ph.D. degree from the South China University of Technology, Guangzhou, China, in 2015. His research interests cover computer vision, machine learning and image analysis. (Email: lys@szu.edu.cn)

    Jiarong WANG received the M.S. degree from the College of Electronic and Information Engineering, Shenzhen University, Shenzhen, China, in 2022. His research interests include computer vision, deep learning, and image processing. (Email: 2015130177@email.szu.edu.cn)

    Kunhua ZHANG is an associate professor with the College of Electronics and Information Engineering, Shenzhen University, China. She received the Ph.D. degree from the Chinese Academy of Sciences in 2003. Her research interests include computer vision and image analysis. (Email: zhang_kh@szu.edu.cn)

    Jiawei YI is a graduate student with the College of Electronics and Information Engineering, Shenzhen University, China. Her research interests include computer vision, deep learning, and image processing. (Email: 15007962908@163.com)

    Miaomiao WEI is a graduate student with the College of Electronics and Information Engineering, Shenzhen University, China. Her research interests include computer vision, deep learning, and image processing. (Email: 2210434094@email.szu.edu.cn)

    Lirong ZHENG received the B.E. degree from the College of Electronic and Information Engineering, Shenzhen University, Shenzhen, China, in 2019. She is currently pursuing the Ph.D. degree at Shenzhen University, China, and is a member of the ATR National Key Laboratory of Defense Technology, Shenzhen University. Her research interests include intelligent information processing, video processing, and pattern recognition. (Email: zhenglirong2021@email.szu.edu.cn)

    Weixin XIE received his degree from Xidian University, Xi’an, China, and became a faculty member with Xidian University in 1965. From 1981 to 1983, he was a Visiting Scholar at the University of Pennsylvania, USA, and in 1989 he was a Visiting Professor with the University of Pennsylvania. He is currently with the School of Information Engineering, Shenzhen University, China. His research interests include intelligent information processing, fuzzy information processing, image processing, and pattern recognition. (Email: wxxie@szu.edu.cn)

  • Corresponding author: Email: lys@szu.edu.cn
  • Received Date: 2022-09-02
  • Accepted Date: 2023-08-07
  • Available Online: 2023-11-28
  • Abstract: Existing high-precision object detection algorithms for UAV aerial images often have a large number of parameters and a heavy weight, which makes them difficult to deploy on mobile devices. We propose three YOLO-based lightweight object detection networks for UAVs, named YOLO-L, YOLO-S, and YOLO-M. In YOLO-L, we adopt a deconvolution approach to learn suitable upsampling rules during training and thereby improve detection accuracy. In addition, the Convolution-Batch Normalization-SiLU activation function (CBS) structure is replaced with Ghost CBS to reduce the number of parameters and the weight, and a Maxpool (maximum pooling) operation is proposed to replace the CBS structure so that no extra parameters or weight are introduced. YOLO-S greatly reduces the weight of the network by directly introducing CSPGhostNeck residual structures, decreasing the parameters and weight by about 15% each at the expense of 2.4% mAP. YOLO-M adopts the CSPGhostNeck residual structure together with deconvolution, reducing the parameters by 5.6% and the weight by 5.7%, while mAP drops by only 1.8%. The results show that the three lightweight detection networks proposed in this paper perform well on the UAV aerial image object detection task.
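
The abstract contrasts the standard CBS block (Conv + BatchNorm + SiLU) used in YOLOv5 with a Ghost-style replacement that generates part of its feature maps through cheap operations. The PyTorch sketch below is only an illustration of that idea under GhostNet-style assumptions; it is not the authors' implementation, and the channel split, kernel sizes, and module names are assumptions made for illustration.

# Minimal sketch: standard CBS vs. a Ghost-style CBS (illustrative, not the paper's code).
import torch
import torch.nn as nn

class CBS(nn.Module):
    """Standard Conv-BatchNorm-SiLU block as used throughout YOLOv5."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class GhostCBS(nn.Module):
    """Ghost-style CBS: a small primary convolution produces half of the output
    channels; the remaining "ghost" channels come from a cheap depthwise
    convolution on the primary output, roughly halving the parameters."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        c_half = c_out // 2
        self.primary = CBS(c_in, c_half, k, s)
        self.cheap = nn.Sequential(  # depthwise "ghost" branch, few parameters
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half),
            nn.SiLU(),
        )

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)
    cbs, ghost = CBS(64, 128), GhostCBS(64, 128)
    assert cbs(x).shape == ghost(x).shape  # identical output shape
    count = lambda m: sum(p.numel() for p in m.parameters())
    print(f"CBS params: {count(cbs)}, GhostCBS params: {count(ghost)}")

In the same spirit, the parameter-free Maxpool replacement mentioned in the abstract would correspond to swapping such a block for an nn.MaxPool2d layer, which introduces no learnable parameters at all.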
  • [1]
    B. Rocke, A. Ruffell, and L. Donnelly, “Drone aerial imagery for the simulation of a neonate burial based on the geoforensic search strategy (GSS),” Journal of Forensic Sciences, vol. 66, no. 4, pp. 1506–1519, 2021. doi: 10.1111/1556-4029.14690
    [2]
    I. K. Hung, D. Unger, D. Kulhavy, et al., “Positional precision analysis of orthomosaics derived from drone captured aerial imagery,” Drones, vol. 3, no. 2, article no. 46, 2019. doi: 10.3390/drones3020046
    [3]
    U. Andriolo, G. Gonçalves, N. Rangel-Buitrago, et al., “Drones for litter mapping: An inter-operator concordance test in marking beached items on aerial images,” Marine Pollution Bulletin, vol. 169, article no. 112542, 2021. doi: 10.1016/j.marpolbul.2021.112542
    [4]
    H. Gupta and O. P. Verma, “Monitoring and surveillance of urban road traffic using low altitude drone images: A deep learning approach,” Multimedia Tools and Applications, vol. 81, no. 14, pp. 19683–19703, 2022. doi: 10.1007/s11042-021-11146-x
    [5]
    Y. S. Li, S. F. Chen, W. H. Luo, et al., “Hyperspectral image super-resolution based on spatial-spectral feature extraction network,” Chinese Journal of Electronics, vol. 32, no. 3, pp. 415–428, 2023. doi: 10.23919/cje.2021.00.081
    [6]
    A. Jain, R. Ramaprasad, P. Narang, et al., “Ai-enabled object detection in UAVs: Challenges, design choices, and research directions,” IEEE Network, vol. 35, no. 4, pp. 129–135, 2021. doi: 10.1109/MNET.011.2000643
    [7]
    P. Mittal, R. Singh, and A. Sharma, “Deep learning-based object detection in low-altitude UAV datasets: A survey,” Image and Vision Computing, vol. 104, article no. 104046, 2020. doi: 10.1016/j.imavis.2020.104046
    [8]
    G. Y. Tian, J. R. Liu, H. Zhao, et al., “Small object detection via dual inspection mechanism for UAV visual images,” Applied Intelligence, vol. 52, no. 4, pp. pp,4244–4257, 2022. doi: 10.1007/s10489-021-02512-1
    [9]
    R. Walambe, A. Marathe, and K. Kotecha, “Multiscale object detection from drone imagery using ensemble transfer learning,” Drones, vol. 5, no. 3, article no. 66, 2021. doi: 10.3390/drones5030066
    [10]
    Z. K. Li, X. L. Liu, Y. Zhao, et al., “A lightweight multi-scale aggregated model for detecting aerial images captured by UAVs,” Journal of Visual Communication and Image Representation, vol. 77 article no. 103058, 2021. doi: 10.1016/j.jvcir.2021.103058
    [11]
    Y. Wang, “Survey on deep multi-modal data analytics: Collaboration, rivalry, and fusion,” ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 17, no. 1s, article no. 10, 2021. doi: 10.1145/3408317
    [12]
    M. Sharma, M. Dhanaraj, S. Karnam, et al., “YOLOrs: Object detection in multimodal remote sensing imagery,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 14 pp. 1497–1508, 2021. doi: 10.1109/JSTARS.2020.3041316
    [13]
    Y. S. Li, H. J. Tang, W. X. Xie, et al., “Multidimensional local binary pattern for hyperspectral image classification,” IEEE Transactions on Geoscience and Remote Sensing, vol. 60 pp. 1–13, 2022. doi: 10.1109/TGRS.2021.3069505
    [14]
    A. G. Howard, M. L. Zhu, B. Chen, et al., “Mobilenets: Efficient convolutional neural networks for mobile vision applications,” arXiv preprint, arXiv: 1704.04861, 2017.
    [15]
    M. Sandler, A. Howard, M. L. Zhu, et al., “MobileNetV2: Inverted residuals and linear bottlenecks,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, pp. 4510–4520, 2018.
    [16]
    A. Howard, M. Sandler, B. Chen, et al., “Searching for mobileNetV3,” in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South), pp. 1314–1324, 2019.
    [17]
    X. Y. Zhang, X. Y. Zhou, M. X. Lin, et al., “ShuffleNet: An extremely efficient convolutional neural network for mobile devices,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, pp. 6848–6856, 2018.
    [18]
    N. N. Ma, X. Y. Zhang, H. T. Zheng, et al., “ShuffleNet V2: Practical guidelines for efficient CNN architecture design,” in 15th European Conference on Computer Vision, Munich, Germany, pp. 122–138, 2018.
    [19]
    Y. S. Li, L. D. Fan, and W. X. Xie, “TGSIFT: Robust SIFT descriptor based on tensor gradient for hyperspectral images,” Chinese Journal of Electronics, vol. 29, no. 5, pp. 916–925, 2020. doi: 10.1049/cje.2020.08.007
    [20]
    K. Han, Y. H. Wang, Q. Tian, et al., “GhostNet: More features from cheap operations,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, pp. 1577–1586, 2020.
    [21]
    G. Jocher, K. Nishimura, T. Mineeva, et al. “Yolov5,” Available at: https: //github. com/ultralytics/yolov5, 2020.
    [22]
    Y. S. Li, T. Y. Guo, X. Liu, et al., “Action status based novel relative feature representations for interaction recognition,” Chinese Journal of Electronics, vol. 31, no. 1, pp. 168–180, 2022. doi: 10.1049/cje.2020.00.088