Hierarchical Critics Assignment for Multi-agent Reinforcement Learning

In this study, we investigate the use of global information to speed up learning and increase cumulative rewards in multi-agent reinforcement learning (MARL) tasks. Within actor-critic MARL, we introduce multiple cooperative critics on two levels of a hierarchy and propose a hierarchical critic-based MARL algorithm. In our approach, each agent receives value information from both local and global critics in a competition task, so that it not only receives low-level details but also incorporates high-level coordination information to improve training performance. We define these multiple cooperative critics in a top-down hierarchy, called the Hierarchical Critic Assignment (HCA) framework. Three experiments on tennis and soccer competition tasks in the Unity environment were used to test the HCA framework against the benchmark algorithm, Proximal Policy Optimization (PPO). The results show that the HCA framework outperforms the benchmark algorithm on all three MARL tasks.
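To make the idea of combining local and global critics concrete, the sketch below blends the two critics' value estimates into a single baseline and computes one-step advantages for a policy-gradient update. The blending weight `beta`, the function name, and the simple linear combination are illustrative assumptions for exposition, not the exact formulation from the paper.

```python
import numpy as np

def hierarchical_advantage(rewards, local_values, global_values,
                           gamma=0.99, beta=0.5):
    """Blend local and global critic estimates into one value baseline,
    then compute one-step TD advantages.

    beta weighs the global critic against the local one; the linear
    blend is an assumed, simplified stand-in for HCA's assignment rule.
    """
    values = (1.0 - beta) * local_values + beta * global_values
    # Bootstrap with 0 after the final step (episode ends there).
    next_values = np.append(values[1:], 0.0)
    # One-step TD advantage: r_t + gamma * V(s_{t+1}) - V(s_t)
    return rewards + gamma * next_values - values

# Toy example: two timesteps for one agent.
adv = hierarchical_advantage(
    rewards=np.array([1.0, 1.0]),
    local_values=np.array([0.5, 0.5]),
    global_values=np.array([1.5, 0.5]),
)
```

In an actor-critic loop such as PPO, these blended advantages would replace the usual single-critic advantages when forming the policy loss.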