15
0

Q-ARDNS-Multi: A Multi-Agent Quantum Reinforcement Learning Framework with Meta-Cognitive Adaptation for Complex 3D Environments

Main:15 Pages
5 Figures
Bibliography:2 Pages
2 Tables
Abstract

This paper presents Q-ARDNS-Multi, an advanced multi-agent quantum reinforcement learning (QRL) framework that extends the ARDNS-FN-Quantum model, where Q-ARDNS-Multi stands for "Quantum Adaptive Reward-Driven Neural Simulator - Multi-Agent". It integrates quantum circuits with RY gates, meta-cognitive adaptation, and multi-agent coordination mechanisms for complex 3D environments. Q-ARDNS-Multi leverages a 2-qubit quantum circuit for action selection, a dual-memory system inspired by human cognition, a shared memory module for agent cooperation, and adaptive exploration strategies modulated by reward variance and intrinsic motivation. Evaluated in a 10×10×310 \times 10 \times 3 GridWorld environment with two agents over 5000 episodes, Q-ARDNS-Multi achieves success rates of 99.6\% and 99.5\% for Agents 0 and 1, respectively, outperforming Multi-Agent Deep Deterministic Policy Gradient (MADDPG) and Soft Actor-Critic (SAC) in terms of success rate, stability, navigation efficiency, and collision avoidance. The framework records mean rewards of 304.2891±756.4636-304.2891 \pm 756.4636 and 295.7622±752.7103-295.7622 \pm 752.7103, averaging 210 steps to goal, demonstrating its robustness in dynamic settings. Comprehensive analyses, including learning curves, reward distributions, statistical tests, and computational efficiency evaluations, highlight the contributions of quantum circuits and meta-cognitive adaptation. By bridging quantum computing, cognitive science, and multi-agent RL, Q-ARDNS-Multi offers a scalable, human-like approach for applications in robotics, autonomous navigation, and decision-making under uncertainty.

View on arXiv
@article{sousa2025_2506.03205,
  title={ Q-ARDNS-Multi: A Multi-Agent Quantum Reinforcement Learning Framework with Meta-Cognitive Adaptation for Complex 3D Environments },
  author={ Umberto Gonçalves de Sousa },
  journal={arXiv preprint arXiv:2506.03205},
  year={ 2025 }
}
Comments on this paper