
Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems

Abstract

We propose Heterogeneous Swarms, an algorithm to design multi-LLM systems by jointly optimizing model roles and weights. We represent multi-LLM systems as directed acyclic graphs (DAGs) of LLMs with topological message passing for collaborative generation. Given a pool of LLM experts and a utility function, Heterogeneous Swarms iterates over two steps: a role-step and a weight-step. In the role-step, we interpret model roles as learning a DAG that specifies the flow of inputs and outputs between LLMs. Starting from a swarm of random continuous adjacency matrices, we decode them into discrete DAGs, call the LLMs in topological order, evaluate the resulting outputs with the utility function (e.g., accuracy on a task), and optimize the adjacency matrices with particle swarm optimization based on the utility score. In the weight-step, we assess the contribution of individual LLMs to the multi-LLM system and optimize model weights with swarm intelligence. Specifically, we propose the JFK-score to quantify the contribution of each LLM in the best-found DAG from the role-step, then optimize model weights with particle swarm optimization based on the JFK-score. Experiments demonstrate that Heterogeneous Swarms outperforms 15 role- and/or weight-based baselines by 18.5% on average across 12 tasks. Further analysis reveals that Heterogeneous Swarms discovers multi-LLM systems with heterogeneous model roles and substantial collaborative gains, and that it benefits from the diversity of the language models.
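To make the role-step concrete, the following is a minimal sketch of a particle swarm optimization (PSO) loop over continuous adjacency matrices that are decoded into DAGs and scored by a utility function. Everything here is illustrative rather than the paper's implementation: `decode_dag` uses a simple threshold on the upper triangle, `evaluate_utility` is a toy stand-in for calling the LLMs in topological order and scoring the output, and the swarm sizes and PSO coefficients are hypothetical.

```python
# Minimal PSO sketch for the role-step (illustrative; not the paper's exact algorithm).
import numpy as np

rng = np.random.default_rng(0)
N_LLMS = 4          # size of the LLM expert pool (hypothetical)
N_PARTICLES = 8     # swarm size (hypothetical)
STEPS = 20

def decode_dag(A: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Decode a continuous adjacency matrix into a binary DAG.

    Here we keep upper-triangular entries above a threshold, which guarantees
    acyclicity under a fixed node ordering; the paper's decoding may differ.
    """
    return (np.triu(A, k=1) > threshold).astype(int)

def evaluate_utility(dag: np.ndarray) -> float:
    """Placeholder utility: stands in for running the LLMs in topological
    order and scoring the final output (e.g., task accuracy)."""
    # Toy objective: prefer sparse DAGs that still connect node 0 to node N-1.
    reach = np.linalg.matrix_power(dag + np.eye(N_LLMS, dtype=int), N_LLMS)
    return float(reach[0, -1] > 0) - 0.05 * dag.sum()

# Swarm of continuous adjacency matrices plus velocities.
positions = rng.uniform(0, 1, size=(N_PARTICLES, N_LLMS, N_LLMS))
velocities = np.zeros_like(positions)
personal_best = positions.copy()
personal_best_score = np.array([evaluate_utility(decode_dag(p)) for p in positions])
g_idx = personal_best_score.argmax()
global_best = personal_best[g_idx].copy()
global_best_score = personal_best_score[g_idx]

w, c1, c2 = 0.7, 1.5, 1.5  # standard PSO coefficients (hypothetical values)
for _ in range(STEPS):
    r1 = rng.uniform(size=positions.shape)
    r2 = rng.uniform(size=positions.shape)
    velocities = (w * velocities
                  + c1 * r1 * (personal_best - positions)
                  + c2 * r2 * (global_best - positions))
    positions = np.clip(positions + velocities, 0.0, 1.0)
    scores = np.array([evaluate_utility(decode_dag(p)) for p in positions])
    improved = scores > personal_best_score
    personal_best[improved] = positions[improved]
    personal_best_score[improved] = scores[improved]
    if scores.max() > global_best_score:
        global_best_score = scores.max()
        global_best = positions[scores.argmax()].copy()

print("best DAG:\n", decode_dag(global_best), "\nutility:", global_best_score)
```

The weight-step would follow the same PSO pattern, with particles ranging over model weights instead of adjacency matrices and the per-LLM JFK-score replacing the DAG-level utility as the fitness signal.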

@article{feng2025_2502.04510,
  title={Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems},
  author={Shangbin Feng and Zifeng Wang and Palash Goyal and Yike Wang and Weijia Shi and Huang Xia and Hamid Palangi and Luke Zettlemoyer and Yulia Tsvetkov and Chen-Yu Lee and Tomas Pfister},
  journal={arXiv preprint arXiv:2502.04510},
  year={2025}
}