2506.02718
Heterogeneous Group-Based Reinforcement Learning for LLM-based Multi-Agent Systems
3 June 2025
Guanzhong Chen
Shaoxiong Yang
Chao Li
Wei Liu
Jian Luan
Zenglin Xu
Author Contacts:
muxichenz@outlook.com
yangshaoxiong@xiaomi.com
lichao75@xiaomi.com
liuwei40@xiaomi.com
luanjian@xiaomi.com
zenglinxu@fudan.edu.cn
Papers citing
"Heterogeneous Group-Based Reinforcement Learning for LLM-based Multi-Agent Systems"
16 / 16 papers shown
Title
MARFT: Multi-Agent Reinforcement Fine-Tuning
Junwei Liao
Muning Wen
Jun Wang
Weinan Zhang
OffRL
112
4
0
21 Apr 2025
Why Do Multi-Agent LLM Systems Fail?
Mert Cemri
Melissa Z. Pan
Shuyi Yang
Lakshya A Agrawal
Bhavya Chopra
...
Dan Klein
Kannan Ramchandran
Matei A. Zaharia
Joseph E. Gonzalez
Ion Stoica
LLMAG
Presented at ResearchTrend Connect | LLMAG on 23 Apr 2025
212
31
0
17 Mar 2025
ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA
Zhao Xinjie
Fan Gao
Rui Yang
Yingjian Chen
Yuyang Wang
Ying Zhu
Jiacheng Tang
Irene Li
Y. Matsuo
KELM
LRM
91
1
0
10 Mar 2025
Talk to Right Specialists: Routing and Planning in Multi-agent System for Question Answering
Feijie Wu
Zitao Li
Fei Wei
Yaliang Li
Bolin Ding
Jing Gao
55
4
0
14 Jan 2025
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Junxiao Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
138
1,119
0
05 Feb 2024
Reinforcement Learning for Optimizing RAG for Domain Chatbots
Mandar Kulkarni
Praveen Tangarajan
Kyung Kim
Anusua Trivedi
OffRL
RALM
SILM
56
30
0
10 Jan 2024
Retrieval-Augmented Generation for Large Language Models: A Survey
Yunfan Gao
Yun Xiong
Xinyu Gao
Kangxiang Jia
Jinliu Pan
Yuxi Bi
Yi Dai
Jiawei Sun
Meng Wang
Haofen Wang
3DV
RALM
177
1,776
1
18 Dec 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
302
11,894
0
18 Jul 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
1.5K
13,247
0
27 Feb 2023
Unsupervised Dense Information Retrieval with Contrastive Learning
Gautier Izacard
Mathilde Caron
Lucas Hosseini
Sebastian Riedel
Piotr Bojanowski
Armand Joulin
Edouard Grave
RALM
195
907
0
16 Dec 2021
MuSiQue: Multihop Questions via Single-hop Question Composition
H. Trivedi
Niranjan Balasubramanian
Tushar Khot
Ashish Sabharwal
LRM
110
278
0
02 Aug 2021
The Surprising Effectiveness of PPO in Cooperative, Multi-Agent Games
Chao Yu
Akash Velu
Eugene Vinitsky
Jiaxuan Gao
Yu Wang
Alexandre M. Bayen
Yi Wu
OffRL
137
1,252
0
02 Mar 2021
Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps
Xanh Ho
A. Nguyen
Saku Sugawara
Akiko Aizawa
RALM
LRM
78
451
0
02 Nov 2020
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Zhilin Yang
Peng Qi
Saizheng Zhang
Yoshua Bengio
William W. Cohen
Ruslan Salakhutdinov
Christopher D. Manning
RALM
174
2,655
0
25 Sep 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
517
19,065
0
20 Jul 2017
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
101
3,414
0
08 Jun 2015