RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning

Abstract

In recent years, Multi-Agent Deep Reinforcement Learning (MADRL) has been successfully applied to various complex scenarios such as computer games and robot swarms. We investigate the impact of "implementation tricks" in state-of-the-art (SOTA) cooperative QMIX-based algorithms. First, we find that such tricks, typically described as auxiliary details to the core algorithm and seemingly of secondary importance, in fact have a major impact on performance: after modest tuning, QMIX attains extraordinarily high win rates and achieves SOTA on the StarCraft Multi-Agent Challenge (SMAC). Furthermore, we find that QMIX's monotonicity condition is critical for cooperative tasks. Based on these findings, we propose a new algorithm, RIIT, which achieves SOTA among policy-based algorithms (which conveniently allow modeling of complex action spaces). We open-sourced the code at \url{https://github.com/hijkzzz/pymarl2}.
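For reference, the monotonicity condition the abstract refers to is the constraint from the original QMIX paper: the joint action-value must be monotone in each per-agent value, so that a greedy per-agent argmax is consistent with the joint argmax. In standard notation (here \(Q_{tot}\) is the mixed joint value and \(Q_i\) the utility of agent \(i\)):

```latex
\frac{\partial Q_{tot}(\boldsymbol{\tau}, \mathbf{u})}{\partial Q_i(\tau_i, u_i)} \ge 0, \quad \forall i \in \{1, \dots, n\}
```

QMIX enforces this by restricting the mixing network's weights to be non-negative.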
