2303.02738
Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback
5 March 2023
Yang Cai
Haipeng Luo
Chen-Yu Wei
Weiqiang Zheng
Papers citing "Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback"
12 / 12 papers shown
Title
Sensor Scheduling in Intrusion Detection Games with Uncertain Payoffs
Jayanth Bhargav
Shreyas Sundaram
Mahsa Ghasemi
20
0
0
20 Apr 2025
Decentralized Online Learning in General-Sum Stackelberg Games
Yaolong Yu
Haipeng Chen
27
0
0
06 May 2024
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees
Toshinori Kitamura
Tadashi Kozuno
Masahiro Kato
Yuki Ichihara
Soichiro Nishimori
Akiyoshi Sannai
Sho Sonoda
Wataru Kumagai
Yutaka Matsuo
42
2
0
31 Jan 2024
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Gokul Swamy
Christoph Dann
Rahul Kidambi
Zhiwei Steven Wu
Alekh Agarwal
OffRL
35
94
0
08 Jan 2024
Multi-Player Zero-Sum Markov Games with Networked Separable Interactions
Chanwoo Park
Kaipeng Zhang
Asuman Ozdaglar
30
8
0
13 Jul 2023
Doubly Optimal No-Regret Learning in Monotone Games
Yang Cai
Weiqiang Zheng
38
11
0
30 Jan 2023
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games
Wei Xiong
Han Zhong
Chengshuai Shi
Cong Shen
Tong Zhang
63
18
0
04 Oct 2022
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
Shicong Cen
Yuejie Chi
S. Du
Lin Xiao
51
35
0
03 Oct 2022
$O(T^{-1})$ Convergence of Optimistic-Follow-the-Regularized-Leader in Two-Player Zero-Sum Markov Games
Yuepeng Yang
Cong Ma
37
14
0
26 Sep 2022
Uncoupled Bandit Learning towards Rationalizability: Benchmarks, Barriers, and Algorithms
Jibang Wu
Haifeng Xu
Fan Yao
22
1
0
10 Nov 2021
Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games
Yulai Zhao
Yuandong Tian
Jason D. Lee
S. Du
OffRL
41
18
0
17 Feb 2021
Independent Policy Gradient Methods for Competitive Reinforcement Learning
C. Daskalakis
Dylan J. Foster
Noah Golowich
62
159
0
11 Jan 2021