Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning

Abstract
Designing efficient algorithms for multi-agent reinforcement learning (MARL) is fundamentally challenging because the size of the joint state and action spaces grows exponentially in the number of agents. These difficulties are exacerbated when balancing sequential global decision-making with local agent interactions. In this work, we propose a new algorithm SUBSAMPLE-MFQ (Subsample-Mean-Field-Q-learning) and a decentralized randomized policy for a system with $n$ agents. For any $k \le n$, our algorithm learns a policy for the system in time polynomial in $k$. We prove that this learned policy converges to the optimal policy on the order of $\tilde{O}(1/\sqrt{k})$ as the number of subsampled agents $k$ increases. In particular, this bound is independent of the number of agents $n$.
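The abstract gives no implementation details, but the core subsampling idea it describes can be sketched: instead of conditioning a policy on the full joint state of all $n$ agents, condition it on an agent's own local state plus the empirical distribution (mean-field statistic) of the local states of $k$ uniformly subsampled agents. The minimal Python sketch below illustrates only this statistic; the function and variable names are our own illustration, not the paper's API.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_mean_field(local_states, k, num_state_values):
    """Uniformly subsample k agents (without replacement) and return the
    empirical distribution of their local states -- the mean-field statistic
    a subsampled policy could condition on instead of the full joint state,
    making its input size independent of the total number of agents n."""
    sampled = rng.choice(local_states, size=k, replace=False)
    counts = np.bincount(sampled, minlength=num_state_values)
    return counts / k

# Toy usage (hypothetical): n = 1000 agents with binary local states, k = 30.
n, k, num_state_values = 1000, 30, 2
local_states = rng.integers(0, num_state_values, size=n)
mu_hat = empirical_mean_field(local_states, k, num_state_values)
print(mu_hat)  # length-2 distribution; sampling error shrinks like 1/sqrt(k)
```

The $1/\sqrt{k}$ decay of the subsampling error in this sketch mirrors the shape of the paper's $\tilde{O}(1/\sqrt{k})$ optimality gap, though the paper's guarantee concerns the learned policy rather than this statistic alone.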
@article{anand2025_2412.00661,
  title   = {Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning},
  author  = {Emile Anand and Ishani Karmarkar and Guannan Qu},
  journal = {arXiv preprint arXiv:2412.00661},
  year    = {2025}
}