
RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning

6 February 2021

Siyang Jiang


Abstract

In recent years, Multi-Agent Deep Reinforcement Learning (MADRL) has been successfully applied to various complex scenarios such as computer games and robot swarms. We investigate the impact of "implementation tricks" on state-of-the-art (SOTA) QMIX-based algorithms. First, we find that such tricks, described as auxiliary details of the core algorithm and seemingly of secondary importance, have a major impact. Our findings demonstrate that, after minimal tuning, QMIX attains extraordinarily high win rates and achieves SOTA results in the StarCraft Multi-Agent Challenge (SMAC). Furthermore, we find that QMIX's monotonicity condition improves sample efficiency in some cooperative tasks. To verify the importance of the monotonicity condition, we propose a new policy-based algorithm called RIIT, which also achieves SOTA results among policy-based algorithms. Finally, we prove theoretically that purely cooperative tasks can be represented by monotonic mixing networks. The code is open-sourced at https://github.com/hijkzzz/pymarl2.
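
To make the monotonicity condition concrete: in QMIX-style mixing, the joint value $Q_{tot}$ is built from the per-agent values $Q_a$ so that $\partial Q_{tot} / \partial Q_a \ge 0$ for every agent $a$, typically by forcing the mixing weights to be non-negative. The following is a minimal NumPy sketch of that idea, not the pymarl2 implementation. The names `monotonic_mix` and `init_params`, the single-linear-layer hypernetworks, and all dimensions are assumptions chosen for illustration.

```python
import numpy as np

def elu(x):
    # ELU activation; it is monotonically increasing, which the argument relies on.
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)

def init_params(n_agents, state_dim, embed, seed=0):
    # Hypothetical hypernetwork parameters: one linear map per mixing weight/bias.
    rng = np.random.default_rng(seed)
    return {
        "w1_w": rng.normal(size=(state_dim, n_agents * embed)),
        "b1_w": rng.normal(size=(state_dim, embed)),
        "w2_w": rng.normal(size=(state_dim, embed)),
        "b2_w": rng.normal(size=(state_dim,)),
    }

def monotonic_mix(agent_qs, state, params):
    """Mix per-agent Q-values (shape [n_agents]) into a scalar Q_tot."""
    n_agents = agent_qs.shape[0]
    embed = params["b1_w"].shape[1]
    # State-conditioned mixing weights; abs() keeps them non-negative,
    # which is what enforces monotonicity in each agent's Q-value.
    w1 = np.abs(state @ params["w1_w"]).reshape(n_agents, embed)
    b1 = state @ params["b1_w"]            # biases may take any sign
    w2 = np.abs(state @ params["w2_w"])
    b2 = float(state @ params["b2_w"])
    hidden = elu(agent_qs @ w1 + b1)
    return float(hidden @ w2 + b2)

# Usage: raising any single agent's Q-value never lowers Q_tot.
params = init_params(n_agents=3, state_dim=4, embed=8)
rng = np.random.default_rng(1)
state, qs = rng.normal(size=4), rng.normal(size=3)
q_tot = monotonic_mix(qs, state, params)
```

Because the weights are non-negative and ELU is increasing, each agent's greedy action with respect to its own $Q_a$ is also greedy with respect to $Q_{tot}$, which is the property the abstract refers to as the monotonicity condition.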
