RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent
Reinforcement Learning
Recent years have seen revolutionary breakthroughs in Multi-Agent Deep Reinforcement Learning (MADRL), with successful applications to complex scenarios such as computer games and robot swarms. We investigate the impact of "implementation tricks" in state-of-the-art (SOTA) QMIX-based algorithms. First, we find that after minimal tuning, QMIX attains extraordinarily high win rates and achieves SOTA performance in the StarCraft Multi-Agent Challenge (SMAC). Second, we find that QMIX's monotonicity constraint improves sample efficiency in certain cooperative tasks. To verify the importance of the monotonicity constraint, we propose a new policy-based algorithm, RIIT, which achieves SOTA among policy-based algorithms. Finally, we prove that purely cooperative tasks can be represented by monotonic mixing networks. We open-source the code at \url{https://github.com/hijkzzz/pymarl2}.
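The monotonicity constraint discussed in the abstract can be illustrated with a QMIX-style mixing network: a hypernetwork maps the global state to mixing weights, and taking their absolute value guarantees that the joint value Q_tot is non-decreasing in every agent's individual Q-value. The following is a minimal NumPy sketch under assumed dimensions and parameter names; it is not the paper's implementation (which uses learned PyTorch hypernetworks and an ELU nonlinearity).

```python
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS, STATE_DIM, EMBED = 3, 5, 8

# Hypothetical hypernetwork parameters: linear maps from the global state
# to the mixing network's weights and biases (randomly initialized here).
hyper_w1 = rng.standard_normal((N_AGENTS * EMBED, STATE_DIM)) * 0.1
hyper_b1 = rng.standard_normal((EMBED, STATE_DIM)) * 0.1
hyper_w2 = rng.standard_normal((EMBED, STATE_DIM)) * 0.1
hyper_b2 = rng.standard_normal((1, STATE_DIM)) * 0.1

def q_tot(agent_qs, state):
    """Monotonic mixing: abs() on the hypernetwork outputs enforces
    dQ_tot/dQ_i >= 0 for every agent i (QMIX's monotonicity constraint)."""
    w1 = np.abs(hyper_w1 @ state).reshape(N_AGENTS, EMBED)  # non-negative weights
    b1 = hyper_b1 @ state
    w2 = np.abs(hyper_w2 @ state)                            # non-negative weights
    b2 = hyper_b2 @ state
    hidden = np.maximum(agent_qs @ w1 + b1, 0.0)  # monotone nonlinearity (ReLU here)
    return float(hidden @ w2 + b2[0])

state = rng.standard_normal(STATE_DIM)
qs = rng.standard_normal(N_AGENTS)
base = q_tot(qs, state)
# Raising any single agent's Q-value can never lower Q_tot.
for i in range(N_AGENTS):
    bumped = qs.copy()
    bumped[i] += 1.0
    assert q_tot(bumped, state) >= base
```

Because the biases are unconstrained, the mixing network can still represent arbitrary state-dependent offsets; only the dependence on the per-agent Q-values is forced to be monotone.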