V-Learning -- A Simple, Efficient, Decentralized Algorithm for Multiagent RL

27 October 2021
Chi Jin
Qinghua Liu
Yuanhao Wang
Tiancheng Yu
    OffRL

Papers citing "V-Learning -- A Simple, Efficient, Decentralized Algorithm for Multiagent RL"

41 / 41 papers shown
Improving LLM General Preference Alignment via Optimistic Online Mirror Descent
Yuheng Zhang
Dian Yu
Tao Ge
Linfeng Song
Zhichen Zeng
Haitao Mi
Nan Jiang
Dong Yu
95
4
0
24 Feb 2025
Incentivize without Bonus: Provably Efficient Model-based Online Multi-agent RL for Markov Games
Tong Yang
Bo Dai
Lin Xiao
Yuejie Chi
OffRL
90
2
0
13 Feb 2025
Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning
Emile Anand
Ishani Karmarkar
Guannan Qu
109
2
0
01 Dec 2024
The Bandit Whisperer: Communication Learning for Restless Bandits
Yunfan Zhao
Tonghan Wang
Dheeraj M. Nagaraj
Aparna Taneja
Milind Tambe
81
5
0
11 Aug 2024
Learning to Steer Markovian Agents under Model Uncertainty
Jiawei Huang
Vinzenz Thoma
Zebang Shen
H. Nax
Niao He
78
2
0
14 Jul 2024
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Yuheng Zhang
Dian Yu
Baolin Peng
Linfeng Song
Ye Tian
Mingyue Huo
Nan Jiang
Haitao Mi
Dong Yu
113
16
0
30 Jun 2024
Independent RL for Cooperative-Competitive Agents: A Mean-Field Perspective
Muhammad Aneeq uz Zaman
Alec Koppel
Mathieu Laurière
Tamer Basar
52
3
0
17 Mar 2024
Provably Efficient Reinforcement Learning in Decentralized General-Sum Markov Games
Weichao Mao
Tamer Basar
50
66
0
12 Oct 2021
When Can We Learn General-Sum Markov Games with a Large Number of Players Sample-Efficiently?
Ziang Song
Song Mei
Yu Bai
86
67
0
08 Oct 2021
The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
Chi Jin
Qinghua Liu
Tiancheng Yu
50
50
0
07 Jun 2021
Decentralized Q-Learning in Zero-sum Markov Games
M. O. Sayin
Kai Zhang
David S. Leslie
Tamer Basar
Asuman Ozdaglar
38
83
0
04 Jun 2021
Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games
Chen-Yu Wei
Chung-Wei Lee
Mengxiao Zhang
Haipeng Luo
40
82
0
08 Feb 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Chi Jin
Qinghua Liu
Sobhan Miryoosefi
OffRL
72
216
0
01 Feb 2021
Independent Policy Gradient Methods for Competitive Reinforcement Learning
C. Daskalakis
Dylan J. Foster
Noah Golowich
143
161
0
11 Jan 2021
A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
Qinghua Liu
Tiancheng Yu
Yu Bai
Chi Jin
58
122
0
04 Oct 2020
Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity
Kai Zhang
Sham Kakade
Tamer Basar
Lin F. Yang
84
122
0
15 Jul 2020
Near-Optimal Reinforcement Learning with Self-Play
Yu Bai
Chi Jin
Tiancheng Yu
119
131
0
22 Jun 2020
No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium
A. Celli
A. Marchesi
Gabriele Farina
N. Gatti
72
46
0
01 Apr 2020
Learning Near Optimal Policies with Low Inherent Bellman Error
Andrea Zanette
A. Lazaric
Mykel Kochenderfer
Emma Brunskill
OffRL
63
222
0
29 Feb 2020
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
Qiaomin Xie
Yudong Chen
Zhaoran Wang
Zhuoran Yang
105
125
0
17 Feb 2020
Provable Self-Play Algorithms for Competitive Reinforcement Learning
Yu Bai
Chi Jin
SSL
119
149
0
10 Feb 2020
Emergent Tool Use From Multi-Agent Autocurricula
Bowen Baker
I. Kanitscheider
Todor Markov
Yi Wu
Glenn Powell
Bob McGrew
Igor Mordatch
LRM
62
649
0
17 Sep 2019
Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity
Aaron Sidford
Mengdi Wang
Lin F. Yang
Yinyu Ye
45
70
0
29 Aug 2019
Provably Efficient Reinforcement Learning with Linear Function Approximation
Chi Jin
Zhuoran Yang
Zhaoran Wang
Michael I. Jordan
76
549
0
11 Jul 2019
Feature-Based Q-Learning for Two-Player Stochastic Games
Zeyu Jia
Lin F. Yang
Mengdi Wang
53
45
0
02 Jun 2019
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
Kyunghwan Son
Daewoo Kim
Wan Ju Kang
D. Hostallero
Yung Yi
OffRL
50
799
0
14 May 2019
Learning to Collaborate in Markov Decision Processes
Goran Radanović
R. Devidze
David C. Parkes
Adish Singla
60
33
0
23 Jan 2019
Actor-Attention-Critic for Multi-Agent Reinforcement Learning
Shariq Iqbal
Fei Sha
57
743
0
05 Oct 2018
Is Q-learning Provably Efficient?
Chi Jin
Zeyuan Allen-Zhu
Sébastien Bubeck
Michael I. Jordan
OffRL
52
801
0
10 Jul 2018
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
Tabish Rashid
Mikayel Samvelyan
Christian Schroeder de Witt
Gregory Farquhar
Jakob N. Foerster
Shimon Whiteson
118
1,662
0
30 Mar 2018
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
Kai Zhang
Zhuoran Yang
Han Liu
Tong Zhang
Tamer Basar
66
584
0
23 Feb 2018
Online Reinforcement Learning in Stochastic Games
Chen-Yu Wei
Yi-Te Hong
Chi-Jen Lu
OffRL
27
120
0
02 Dec 2017
Value-Decomposition Networks For Cooperative Multi-Agent Learning
P. Sunehag
Guy Lever
A. Gruslys
Wojciech M. Czarnecki
V. Zambaldi
...
Marc Lanctot
Nicolas Sonnerat
Joel Z Leibo
K. Tuyls
T. Graepel
64
997
0
16 Jun 2017
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
Ryan J. Lowe
Yi Wu
Aviv Tamar
J. Harb
Pieter Abbeel
Igor Mordatch
116
4,441
0
07 Jun 2017
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
Christoph Dann
Tor Lattimore
Emma Brunskill
60
307
0
22 Mar 2017
Minimax Regret Bounds for Reinforcement Learning
M. G. Azar
Ian Osband
Rémi Munos
65
771
0
16 Mar 2017
Contextual Decision Processes with Low Bellman Rank are PAC-Learnable
Nan Jiang
A. Krishnamurthy
Alekh Agarwal
John Langford
Robert Schapire
90
417
0
29 Oct 2016
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving
Shai Shalev-Shwartz
Shaked Shammah
Amnon Shashua
40
830
0
11 Oct 2016
On Lower Bounds for Regret in Reinforcement Learning
Ian Osband
Benjamin Van Roy
63
101
0
09 Aug 2016
Explore no more: Improved high-probability regret bounds for non-stochastic bandits
Gergely Neu
181
182
0
10 Jun 2015
Generalization and Exploration via Randomized Value Functions
Ian Osband
Benjamin Van Roy
Zheng Wen
67
314
0
04 Feb 2014