Swarm intelligence in cooperative environments: n-step dynamic tree search algorithm overview

Authors: Espinós Longa, Marc; Tsourdos, Antonios; Inalhan, Gokhan
Dates: 2023-06-15; 2023-06-15; 2023-05-03
Citation: Espinós Longa M, Tsourdos A, Inalhan G. (2023) Swarm intelligence in cooperative environments: n-step dynamic tree search algorithm overview. Journal of Aerospace Information Systems, Volume 20, Issue 7, July 2023, pp. 418-425
ISSN: 2327-3097
DOI: https://doi.org/10.2514/1.I011086
URI: https://dspace.lib.cranfield.ac.uk/handle/1826/19785
Language: en
License: Attribution 4.0 International, http://creativecommons.org/licenses/by/4.0/
Type: Article

Abstract: Reinforcement learning tree-based planning methods have been gaining popularity in the last few years due to their success in single-agent domains where a perfect simulator model is available, such as the strategic board games Go and chess. This paper aims to extend tree search algorithms to the multiagent setting in a decentralized structure, dealing with scalability issues and the exponential growth of computational resources. The n-step dynamic tree search combines forward planning and direct temporal-difference updates, markedly outperforming conventional tabular algorithms such as Q-learning and state-action-reward-state-action (SARSA). Future state transitions and rewards are predicted with a model built and learned from real interactions between agents and the environment. This paper analyzes the developed algorithm in the hunter–pursuit cooperative game against stochastic and intelligent evaders. The n-step dynamic tree search aims to adapt single-agent tree search learning methods to the multiagent setting and is shown to be a remarkable advance over conventional temporal-difference techniques.
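
For orientation, the two conventional tabular baselines named in the abstract, Q-learning and SARSA, can be sketched as one-step temporal-difference updates. This is a minimal illustration of the standard textbook rules only, not the paper's n-step dynamic tree search. The function names, the dictionary-based table `Q`, and the default values of `alpha` and `gamma` are assumptions made for this example.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the greedy action in s_next."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


# Hypothetical usage: a table that defaults to 0.0 for unseen pairs.
Q = defaultdict(float)
q_learning_update(Q, s=0, a=1, r=1.0, s_next=2, actions=[0, 1])
sarsa_update(Q, s=2, a=0, r=0.0, s_next=0, a_next=1)
```

The only difference between the two rules is the bootstrap target: Q-learning uses the maximum value over actions in the next state, while SARSA uses the value of the action the agent actually takes next.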