In this paper, we investigate the use of global information to speed up learning and increase cumulative rewards in multi-agent reinforcement learning (MARL) tasks. Within the actor-critic MARL setting, we introduce multiple cooperative critics at two levels of a hierarchy and propose a hierarchical critic-based MARL algorithm. In our approach, each agent in a competitive task receives information from both local and global critics: it not only receives low-level details but also incorporates high-level coordination signals that carry global information, improving operational performance. We organize these cooperative critics in a top-down hierarchy, called the Hierarchical Critic Assignment (HCA) framework. We tested the HCA multi-agent framework on a two-player tennis competition task in the Unity environment, using the Asynchronous Advantage Actor-Critic (A3C) algorithm with Proximal Policy Optimization (PPO). The results show that the HCA framework outperforms the non-hierarchical critic baseline on MARL tasks.
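To make the two-level critic idea concrete, below is a minimal PyTorch sketch of how an agent's advantage estimate might combine a local (low-level) critic with a shared global (high-level) critic. The network shapes, the blending weight `beta`, and the mixing rule are illustrative assumptions for this sketch; the paper's exact HCA formulation may differ.

```python
import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    """One agent: a policy head plus a local (low-level) critic head."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
        self.policy = nn.Linear(hidden, act_dim)  # actor: action logits
        self.value = nn.Linear(hidden, 1)         # local critic: V(s)

    def forward(self, obs):
        h = self.body(obs)
        return self.policy(h), self.value(h).squeeze(-1)

class GlobalCritic(nn.Module):
    """High-level critic that sees the joint observation of all agents."""
    def __init__(self, joint_obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(joint_obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1)
        )

    def forward(self, joint_obs):
        return self.net(joint_obs).squeeze(-1)

def hierarchical_advantage(reward, local_v, global_v, next_local_v,
                           next_global_v, beta=0.5, gamma=0.99):
    # Blend local and global value estimates (assumed mixing rule), then
    # form a one-step TD advantage from the blended value; this advantage
    # would feed the PPO clipped policy objective.
    v = (1 - beta) * local_v + beta * global_v
    next_v = (1 - beta) * next_local_v + beta * next_global_v
    return reward + gamma * next_v.detach() - v

if __name__ == "__main__":
    agent = ActorCritic(obs_dim=8, act_dim=3)
    global_critic = GlobalCritic(joint_obs_dim=16)  # two agents' obs concatenated
    obs, next_obs = torch.randn(8), torch.randn(8)
    joint, next_joint = torch.randn(16), torch.randn(16)
    _, v = agent(obs)
    _, next_v = agent(next_obs)
    adv = hierarchical_advantage(torch.tensor(1.0), v, global_critic(joint),
                                 next_v, global_critic(next_joint))
    print(adv)
```

With `beta = 0` this reduces to the non-hierarchical baseline (local critic only), which is one way to read the ablation the abstract reports.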