Reward-Independent Messaging for Decentralized Multi-Agent Reinforcement Learning

28 May 2025
Naoto Yoshida
Tadahiro Taniguchi
Main: 12 pages, 9 figures; bibliography: 3 pages
Abstract

In multi-agent reinforcement learning (MARL), effective communication improves agent performance, particularly under partial observability. We propose MARL-CPC, a framework that enables communication among fully decentralized, independent agents without parameter sharing. MARL-CPC incorporates a message learning model based on collective predictive coding (CPC) from emergent communication research. Unlike conventional methods that treat messages as part of the action space and assume cooperation, MARL-CPC links messages to state inference, supporting communication in non-cooperative, reward-independent settings. We introduce two algorithms, Bandit-CPC and IPPO-CPC, and evaluate them in non-cooperative MARL tasks. Benchmarks show that both outperform standard message-as-action approaches, establishing effective communication even when messages offer no direct benefit to the sender. These results highlight MARL-CPC's potential for enabling coordination in complex, decentralized environments.
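
A minimal sketch of the core idea, assuming a simple two-agent setup (an illustrative reconstruction, not the authors' implementation; all module names and dimensions are hypothetical). Both agents observe different views of the same latent state; the speaker learns its message by auto-encoding its own observation, and the listener independently learns to infer its own observation from the received message, so no reward signal, shared parameters, or cross-agent gradients are needed:

import torch
import torch.nn as nn

STATE_DIM, OBS_DIM, MSG_DIM = 8, 16, 4

# Fixed, private "sensor" matrices give each agent a partial view of the state.
view_a = torch.randn(STATE_DIM, OBS_DIM)
view_b = torch.randn(STATE_DIM, OBS_DIM)

speaker_enc = nn.Linear(OBS_DIM, MSG_DIM)   # speaker: observation -> message
speaker_dec = nn.Linear(MSG_DIM, OBS_DIM)   # speaker: message -> own observation
listener_dec = nn.Linear(MSG_DIM, OBS_DIM)  # listener: message -> own observation

# Fully decentralized: each agent optimizes only its own parameters.
opt_speaker = torch.optim.Adam(
    [*speaker_enc.parameters(), *speaker_dec.parameters()], lr=1e-2)
opt_listener = torch.optim.Adam(listener_dec.parameters(), lr=1e-2)

for step in range(500):
    z = torch.randn(32, STATE_DIM)          # shared environment state
    obs_a, obs_b = z @ view_a, z @ view_b   # each agent's partial observation

    # Speaker update (reward-independent): the message must retain enough
    # information to reconstruct the speaker's own observation.
    msg = speaker_enc(obs_a)
    loss_a = nn.functional.mse_loss(speaker_dec(msg), obs_a)
    opt_speaker.zero_grad(); loss_a.backward(); opt_speaker.step()

    # Listener update: infer its own observation from the detached message,
    # so no gradients or parameters cross the agent boundary.
    loss_b = nn.functional.mse_loss(listener_dec(msg.detach()), obs_b)
    opt_listener.zero_grad(); loss_b.backward(); opt_listener.step()

In the paper's actual algorithms, Bandit-CPC and IPPO-CPC, the message model is trained with a collective-predictive-coding objective alongside bandit or IPPO policy learning; the sketch above captures only the reward-independent, state-inference role of messages.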

@article{yoshida2025_2505.21985,
  title={Reward-Independent Messaging for Decentralized Multi-Agent Reinforcement Learning},
  author={Naoto Yoshida and Tadahiro Taniguchi},
  journal={arXiv preprint arXiv:2505.21985},
  year={2025}
}