Communication plays a vital role in coordination in Multi-Agent Reinforcement Learning (MARL) systems. However, misaligned agents can exploit the trust and power that other agents delegate to the communication medium. In this paper, we propose power regularization as a method to limit the adverse effects of communication by misaligned agents, specifically communication that impairs the performance of cooperative agents. Power is a measure of the influence one agent's actions have over another agent's policy. By introducing power regularization, we aim to allow designers to control or reduce agents' dependency on communication when appropriate, and to make agents more resilient to performance deterioration caused by misuse of communication. We investigate several environments in which power regularization can help agents learn policies that reduce the effect of power dynamics between agents during communication.