RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning

Abstract

Recent years have seen revolutionary breakthroughs in the field of Multi-Agent Deep Reinforcement Learning (MADRL), with successful applications to complex scenarios such as computer games and robot swarms. We investigate the impact of "implementation tricks" in state-of-the-art (SOTA) QMIX-based algorithms. First, we find that tricks described as auxiliary details to the core algorithm, seemingly of secondary importance, in fact have an enormous impact on performance. Our findings demonstrate that, after minimal tuning, QMIX attains extraordinarily high win rates and achieves SOTA in the StarCraft Multi-Agent Challenge (SMAC). Furthermore, we find that QMIX's monotonicity constraint improves sample efficiency in certain cooperative tasks. To verify the importance of the monotonicity constraint, we propose a new policy-based algorithm, RIIT, which achieves SOTA among policy-based algorithms. Finally, we prove that purely cooperative tasks can be represented by monotonic mixing networks. We open-source the code at \url{https://github.com/hijkzzz/pymarl2}.
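The monotonicity constraint referenced above can be illustrated with a minimal sketch (an assumption for illustration, not the paper's actual implementation, which uses hypernetwork-generated mixing weights): mixing per-agent utilities with non-negative weights guarantees that the joint value never decreases when any single agent's utility increases, so the joint argmax decomposes into per-agent argmaxes.

```python
# Minimal sketch of QMIX-style monotonic mixing (illustrative only; the real
# mixing network conditions its weights on the global state via hypernetworks).

def monotonic_mix(agent_qs, weights, bias=0.0):
    """Mix per-agent Q-values with non-negative weights.

    Taking abs() of the raw weights enforces dQ_tot/dQ_i >= 0,
    i.e. the monotonicity constraint.
    """
    return sum(abs(w) * q for w, q in zip(weights, agent_qs)) + bias

qs = [1.0, -0.5, 2.0]
ws = [0.3, -0.7, 1.2]          # raw weights may be negative ...
q_tot = monotonic_mix(qs, ws)  # ... abs() makes the mixing monotonic

# Raising any single agent's utility can never lower the joint value:
qs_up = [1.0, 0.5, 2.0]
assert monotonic_mix(qs_up, ws) >= q_tot
```

This decomposition is what lets each agent greedily maximize its own utility while remaining consistent with the greedy joint action.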
