ROLE-ORIENTED MULTI-AGENT REINFORCEMENT LEARNING FOR AUTONOMOUS TASK ALLOCATION IN A HETEROGENEOUS UAV SWARM

Authors

DOI:

https://doi.org/10.35546/kntu2078-4481.2025.4.2.35

Keywords:

UAV swarm; heterogeneous multi-agent systems; reinforcement learning; role-based coordination; task allocation; cooperative autonomy.

Abstract

The study addresses the development of adaptive methods for coordinating heterogeneous unmanned aerial vehicle (UAV) swarms in cooperative missions under dynamic and partially observable conditions. Traditional task allocation approaches, such as static role assignment and heuristic rule-based systems, show significant limitations in scalability, adaptability, and robustness when applied to heterogeneous multi-agent environments. To overcome these limitations, the work proposes a role-oriented approach based on multi-agent reinforcement learning (MARL) with a two-layer policy structure that separates high-level role selection from low-level action generation. This decomposition lets agents specialize more effectively, reduces the likelihood of conflicting behaviors, and makes efficient use of the swarm's functional diversity. The aim of the research is to design and analytically evaluate a MARL-based method that supports autonomous, context-aware role allocation in heterogeneous UAV swarms. The proposed framework integrates role embeddings, centralized training with decentralized execution, and a reward function that accounts for mission efficiency, coverage, energy consumption, collision avoidance, and role-switching stability. Large-scale simulation experiments are still in preparation; the evaluation so far rests on analytical assessment and scenario-based behavioral modeling. These analyses indicate that the role-oriented policy is expected to provide more stable coordination, better adaptability to dynamic mission changes, and higher energy efficiency than static allocation, heuristic strategies, and single-layer MARL approaches. The performance indicators planned for the subsequent simulation experiments are mission completion time, coverage area, number of successfully completed subtasks, energy consumption, role-switch frequency, and robustness to agent failures. The theoretical modeling performed so far indicates that role decomposition can improve these metrics relative to baseline methods, which is consistent with recent research findings in the field.
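
For illustration only, the two-layer decomposition and the multi-term reward described in the abstract might be sketched as follows. This is not the authors' implementation: the role names, action names, weight values, battery threshold, and all function names are hypothetical, and the learned policy networks are replaced by a plain softmax over given role scores and a fixed per-role rule.

```python
import math
from dataclasses import dataclass

# Hypothetical role set; the abstract does not name concrete roles.
ROLES = ("scout", "relay", "carrier")

# Hypothetical default action per role, standing in for a learned low-level policy.
ROLE_ACTIONS = {"scout": "explore", "relay": "hold_position", "carrier": "deliver"}


@dataclass
class RewardWeights:
    """Assumed weights for the five reward terms listed in the abstract."""
    mission: float = 1.0
    coverage: float = 0.5
    energy: float = 0.2
    collision: float = 2.0
    role_switch: float = 0.1


def step_reward(w, mission_progress, coverage_gain, energy_used, collisions, switched_role):
    """Weighted sum: reward progress and coverage, penalize energy use,
    collisions, and role switches (to encourage role stability)."""
    return (w.mission * mission_progress
            + w.coverage * coverage_gain
            - w.energy * energy_used
            - w.collision * collisions
            - w.role_switch * float(switched_role))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_role(role_scores):
    """High-level layer: pick the most probable role (greedy execution)."""
    probs = softmax(role_scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ROLES[best]


def select_action(role, observation):
    """Low-level layer: choose an action conditioned on the selected role.
    The battery threshold of 0.2 is an assumption for this sketch."""
    if observation.get("battery", 1.0) < 0.2:
        return "return_to_base"
    return ROLE_ACTIONS[role]


def act(role_scores, observation):
    """Decentralized execution for one agent: role first, then action."""
    role = select_role(role_scores)
    return role, select_action(role, observation)


if __name__ == "__main__":
    print(act([0.1, 2.0, -1.0], {"battery": 0.9}))
    print(step_reward(RewardWeights(), 1.0, 0.4, 0.5, 0, True))
```

In a trained system the role scores would come from a learned high-level policy (for example, one conditioned on role embeddings), and the per-role rule would be replaced by a learned low-level policy; the role-switch penalty is one simple way to express the stability term mentioned in the abstract.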

References

Ekechi C. C., Elfouly T., Alouani A., Khattab T. A Survey on UAV Control with Multi-Agent Reinforcement Learning // Drones. 2025. Vol. 9, No. 7. Article 484. DOI: 10.3390/drones9070484.

Bettini M., Shankar A., Prorok A. Heterogeneous Multi-Robot Reinforcement Learning // Proceedings of the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS). 2023. P. 1–9.

Liu H., Shao Z., Zhou Q., Tu J., Zhu S. Task Allocation Algorithm for Heterogeneous UAV Swarm with Temporal Task Chains // Drones. 2025. Vol. 9, No. 8. Article 574. DOI: 10.3390/drones9080574.

Wang T., Zhang H. D., Yang J., Zheng W., Wang H., Zhang C. ROMA: Multi-Agent Reinforcement Learning with Emergent Roles // Proceedings of the 38th International Conference on Machine Learning (ICML). 2021. PMLR 139. P. 10893–10902.

Li X., Chen Y., Xu Y. Adaptive Task Allocation in Heterogeneous UAV Swarms via Deep Reinforcement Learning // Robotics and Autonomous Systems. 2023. Article 104482. DOI: 10.1016/j.robot.2023.104482.

Rahman M. M., Li Y., Mir I. A. Multi-Agent Reinforcement Learning: A Review of Challenges and Applications in UAV Systems // IEEE Access. 2022. Vol. 10. P. 78934–78958. DOI: 10.1109/ACCESS.2022.3191157.

Published

2025-12-31

Issue

Section

PUBLIC MANAGEMENT AND ADMINISTRATION