Achieving Safe Deep Reinforcement Learning via Environment Comprehension Mechanism

PENG Pai; ZHU Fei; LIU Quan; ZHAO Peiyao; WU Wen

doi:10.1049/cje.2021.07.025

PENG Pai, ZHU Fei, LIU Quan, ZHAO Peiyao, WU Wen. Achieving Safe Deep Reinforcement Learning via Environment Comprehension Mechanism[J]. Chinese Journal of Electronics, 2021, 30(6): 1049-1058. DOI: 10.1049/cje.2021.07.025

Abstract

Deep reinforcement learning (DRL), which combines deep learning with reinforcement learning, has achieved great success in recent years. During learning, however, an agent may reach states that are both worthless and dangerous, in which the task fails. To address this problem, we propose an algorithm, referred to as the Environment comprehension mechanism (ECM), that enables deep reinforcement learning to make safer decisions. ECM perceives hidden dangerous situations by analyzing objects and comprehending the environment, so that the agent systematically bypasses inappropriate actions through constraints that are set up dynamically according to the state. Specifically, ECM calculates the gradient of the states in the Markov tuple, sets up boundary conditions, and generates a rule that controls the direction of the agent so that it skips unsafe states. ECM can be applied to basic deep reinforcement learning algorithms to guide action selection. Experimental results show that the algorithm improves the safety and stability of the control tasks.
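
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea (state gradient, boundary condition, action rule), not the authors' ECM. It assumes a low-dimensional state such as an object position, approximates the state gradient by a finite difference between two consecutive states, and assumes each action shifts the state by a known amount. The names `state_gradient`, `safe_actions`, `select_action`, `action_effects`, `lower`, and `upper` are all hypothetical.

```python
from typing import List, Sequence


def state_gradient(prev_state: Sequence[float], state: Sequence[float]) -> List[float]:
    """Finite-difference change of each state component between two steps."""
    return [s - p for p, s in zip(prev_state, state)]


def safe_actions(
    prev_state: Sequence[float],
    state: Sequence[float],
    action_effects: Sequence[Sequence[float]],
    lower: Sequence[float],
    upper: Sequence[float],
) -> List[int]:
    """Return indices of actions whose predicted next state stays within bounds.

    Assumption of this sketch: next state = state + gradient + action effect.
    """
    grad = state_gradient(prev_state, state)
    allowed = []
    for idx, effect in enumerate(action_effects):
        predicted = [s + g + e for s, g, e in zip(state, grad, effect)]
        # Boundary condition: every component must remain inside [lower, upper].
        if all(lo <= x <= hi for x, lo, hi in zip(predicted, lower, upper)):
            allowed.append(idx)
    return allowed


def select_action(q_values: Sequence[float], allowed: Sequence[int]) -> int:
    """Greedy choice among allowed actions; fall back to all actions if none is safe."""
    candidates = list(allowed) if allowed else list(range(len(q_values)))
    return max(candidates, key=lambda a: q_values[a])


# Usage: a 1-D position moving toward the upper boundary at 9.
allowed = safe_actions([6], [8], [[-2], [0], [2]], lower=[0], upper=[9])
action = select_action([0.1, 0.5, 0.9], allowed)
```

In this usage example the object is drifting by +2 per step, so only the first action keeps the predicted position inside the bounds, and it is selected even though the unconstrained greedy choice would be the third action. In a DRL setting the `q_values` would come from the underlying value network, with the mask applied at action-selection time.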