Efficient Generation of Diverse Cooperative Agents with World Models

9 June 2025
Yi Loo
Akshunn Trivedi
Malika Meghjani
Main: 9 Pages
12 Figures
Bibliography: 3 Pages
14 Tables
Appendix: 12 Pages
Abstract

A major bottleneck in training Zero-Shot Coordination (ZSC) agents is generating partner agents with diverse collaborative conventions. Current Cross-play Minimization (XPM) methods for population generation can be computationally expensive and sample-inefficient because the training objective requires sampling multiple types of trajectories. Each partner agent in the population is also trained from scratch, even though all partners learn policies for the same coordination task. In this work, we propose that simulated trajectories from a dynamics model of the environment can drastically speed up training for XPM methods. We introduce XPM-WM, a framework that generates simulated trajectories for XPM via a learned World Model (WM). We show that XPM with simulated trajectories removes the need to sample multiple trajectories. In addition, we show that our proposed method can effectively generate partners with diverse conventions that match previous methods both in self-play (SP) population training reward and as training partners for ZSC agents. Our method is thus significantly more sample-efficient and scalable to a larger number of partners.
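The abstract does not spell out the XPM training objective. As a rough illustration only, the sketch below uses a form common in the cross-play minimization literature: reward a candidate partner for its self-play return and penalize its best cross-play return against partners already in the population. The one-step matching game, the names `expected_return` and `xpm_objective`, and the `xp_weight` parameter are all assumptions made for this example; they are not taken from the paper, and the sketch does not cover the world-model component of XPM-WM.

```python
from typing import Sequence

# A policy here is just a probability distribution over actions in a
# hypothetical one-step game where both players score 1 if actions match.
Policy = Sequence[float]


def expected_return(p: Policy, q: Policy) -> float:
    """Expected reward when one player samples from p and the other from q."""
    return sum(pa * qa for pa, qa in zip(p, q))


def xpm_objective(candidate: Policy,
                  population: Sequence[Policy],
                  xp_weight: float = 1.0) -> float:
    """Self-play return minus a weighted cross-play penalty (illustrative form)."""
    # Self-play (SP): the candidate is paired with a copy of itself.
    sp = expected_return(candidate, candidate)
    # Cross-play (XP): the candidate is paired with each existing partner;
    # the most compatible existing partner sets the penalty.
    xp = max((expected_return(candidate, other) for other in population),
             default=0.0)
    return sp - xp_weight * xp


population = [[1.0, 0.0, 0.0]]   # existing partner always plays action 0
same = [1.0, 0.0, 0.0]           # candidate copying that convention
different = [0.0, 1.0, 0.0]      # candidate adopting a new convention

print(xpm_objective(same, population))       # 0.0
print(xpm_objective(different, population))  # 1.0
```

In this toy setting, a candidate that copies an existing convention scores zero, while one that coordinates well with itself but not with the existing partner scores highest. Evaluating the objective needs both self-play and cross-play pairings, which is the multiple-trajectory sampling cost the abstract refers to.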

@article{loo2025_2506.07450,
  title={Efficient Generation of Diverse Cooperative Agents with World Models},
  author={Yi Loo and Akshunn Trivedi and Malika Meghjani},
  journal={arXiv preprint arXiv:2506.07450},
  year={2025}
}