General multi-agent reinforcement learning integrating adaptive manoeuvre strategy for real-time multi-aircraft conflict resolution

Authors: Chen, Yutong; Hu, Minghua; Yang, Lei; Xu, Yan; Xie, Hua

Dates: 2023-06-21; 2023-06-21; 2023-04-12

Citation: Chen Y, Hu M, Yang L, et al., (2023) General multi-agent reinforcement learning integrating adaptive manoeuvre strategy for real-time multi-aircraft conflict resolution, Transportation Research Part C: Emerging Technologies, Volume 151, June 2023, Article Number 104125

ISSN: 0968-090X; 1879-2359

DOI: https://doi.org/10.1016/j.trc.2023.104125

Repository handle: https://dspace.lib.cranfield.ac.uk/handle/1826/19827

Abstract

Reinforcement learning (RL) techniques are under investigation for resolving conflict in air traffic management (ATM), exploiting their computational capabilities and ability to cope with flight uncertainty. However, the limitations of generalisation make it difficult for existing RL-based conflict resolution (CR) methods to be effective in practice. This paper proposes a general multi-agent reinforcement learning (MARL) method that integrates an adaptive manoeuvre strategy to enhance both the solution’s efficiency and the model’s generalisation in multi-aircraft conflict resolution (MACR). A partial observation approach based on the imminent threat detection sectors is used to gather critical environmental information, enabling the model to be applied in arbitrary scenarios. Agents are trained to provide the correct flight intention (such as increasing speed and yawing to the left), while an adaptive manoeuvre strategy generates the specific manoeuvre (speed and heading parameters) based on the flight intention. To address flight uncertainty and performance challenges caused by the intrinsic non-stationarity in MARL, a warning area for each aircraft is introduced. We employ a state-of-the-art Deep Q-learning Network (DQN) method, Rainbow DQN, to improve the efficiency of the RL algorithm. The multi-agent system is trained and deployed in a distributed manner to adapt to real-world scenarios. A sensitivity analysis of uncertainty levels and warning area sizes is conducted to explore their impact on the proposed method. Simulation experiments confirm the effectiveness of the training and generalisation of the proposed method.

Language: en

Licence: Attribution-NonCommercial-NoDerivatives 4.0 International (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Keywords: Air traffic management; Multi-aircraft conflict resolution; Multi-agent reinforcement learning; Deep q-learning network; Generalisation; Uncertainty

Type: Article