Citation: HUANG Fei, LI Guangxia, WANG Haichao, et al., “Navigation for UAV Pair-Supported Relaying in Unknown IoT Systems with Deep Reinforcement Learning,” Chinese Journal of Electronics, vol. 31, no. 3, pp. 416-429, 2022, doi: 10.1049/cje.2021.00.305
Unmanned aerial vehicles (UAVs) have recently been regarded as a promising technology for the Internet of things (IoT). UAVs functioning as intermediate relay nodes can establish uninterrupted, high-quality communication links between remotely deployed IoT devices and the destination. Because of their limited onboard energy, multiple UAVs must be deployed. We study UAV pair-supported relaying in unknown IoT systems, where the pair consists of a transmitter and a receiver. The goal is for the transmitter to gather data from each device and transfer the information to the receiver, which finally transmits it to the destination, subject to the constraint that the amount of information received from each device reaches a certain threshold. This is an optimization problem with highly coupled variables, such as the trajectories of the transmitter and the receiver. Because there is no prior knowledge of the environment, a dueling double deep Q network (dueling DDQN) algorithm is proposed to solve the problem. Extensive simulations under different scenarios demonstrate the effectiveness and superiority of the proposed algorithm over several baseline schemes, both in the phase where the transmitter receives information and in the phase where it forwards information to the receiver.
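The abstract names the algorithm but does not spell out its update rule. For orientation, the two ingredients that "dueling DDQN" conventionally combines can be sketched as follows. This is a minimal illustration of the standard formulations from the reinforcement learning literature (dueling aggregation of a state value with mean-centred advantages, and the double-DQN target in which the online network selects the action and the target network evaluates it); it is not the authors' implementation. The function names `dueling_q` and `double_dqn_target` and all numeric inputs are hypothetical.

```python
from typing import List, Sequence


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    done: bool,
) -> float:
    """Double-DQN target for one transition.

    The online network's Q-values pick the greedy next action; the
    target network's Q-values supply the value of that action.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    best_action = max(range(len(q_online_next)), key=lambda i: q_online_next[i])
    return reward + gamma * q_target_next[best_action]


# Hypothetical numbers, for illustration only.
q_values = dueling_q(1.0, [0.0, 2.0, 4.0])
target = double_dqn_target(1.0, 0.5, [0.2, 0.9, 0.1], [4.0, 2.0, 8.0], False)
```

In this example the online network prefers action 1, so the target uses the target network's estimate for action 1 (2.0) rather than its own maximum (8.0), which is the mechanism double Q-learning uses to reduce overestimation.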