Q-Learning Algorithm for Mean-Field Controls, with Convergence and Complexity Analysis

SIAM Journal on Mathematics of Data Science (SIMODS), 2020
Abstract

This paper studies multi-agent reinforcement learning (MARL) for collaborative games under a mean-field control (MFC) approximation framework. It develops a model-free, kernel-based Q-learning algorithm (MFC-K-Q) on a probability measure space and shows that both the convergence rate and the sample complexity of MFC-K-Q are independent of the number of agents N. Empirical studies on a network traffic congestion problem demonstrate that MFC-K-Q outperforms existing MARL algorithms (when N is large) as well as existing MFC algorithms.
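To make the idea concrete, here is a minimal toy sketch (not the paper's algorithm or benchmark) of kernel-based Q-learning over a discretized probability simplex: the learner's "state" is a population distribution over two agent states, summarized by its first coordinate, a finite grid serves as an epsilon-net, and a triangular kernel interpolates Q-values off the grid. The dynamics, reward, grid, bandwidth, and all names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative mean-field setup: the state is the population distribution
# mu over 2 agent states, represented by its first coordinate mu0 in [0, 1].
GRID = np.linspace(0.0, 1.0, 11)   # epsilon-net over the 1-simplex (assumed)
ACTIONS = [0, 1]                   # two population-level controls (assumed)

def step(mu0, a):
    """Hypothetical mean-field dynamics and reward (not from the paper)."""
    target = 0.8 if a == 1 else 0.2
    next_mu0 = np.clip(mu0 + 0.5 * (target - mu0), 0.0, 1.0)
    reward = -abs(next_mu0 - 0.8)   # prefer concentrating mass near 0.8
    return next_mu0, reward

def kernel_weights(mu0, bandwidth=0.15):
    """Triangular kernel over the grid; interpolates Q at off-grid states."""
    w = np.maximum(0.0, 1.0 - np.abs(GRID - mu0) / bandwidth)
    return w / w.sum()

Q = np.zeros((len(GRID), len(ACTIONS)))
gamma, alpha = 0.9, 0.1

mu0 = 0.5
for t in range(5000):
    # Epsilon-greedy over the kernel-interpolated Q-values.
    if rng.random() < 0.2:
        a = int(rng.integers(len(ACTIONS)))
    else:
        a = int(np.argmax(kernel_weights(mu0) @ Q))
    next_mu0, r = step(mu0, a)
    td_target = r + gamma * np.max(kernel_weights(next_mu0) @ Q)
    # Kernel-weighted update: spread the TD correction over nearby grid points.
    w = kernel_weights(mu0)
    Q[:, a] += alpha * w * (td_target - w @ Q[:, a])
    mu0 = next_mu0

greedy = int(np.argmax(kernel_weights(0.5) @ Q))
print(greedy)
```

In this toy problem action 1 steers the population mass toward the rewarded configuration, so the learned greedy action at mu0 = 0.5 is action 1. The key point the sketch illustrates is that the table size depends on the grid resolution over the measure space, not on the number of agents N.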
