2210.01907
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games
4 October 2022
Wei Xiong
Han Zhong
Chengshuai Shi
Cong Shen
Tong Zhang
Papers citing
"A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games"
13 / 13 papers shown
Cat-and-Mouse Satellite Dynamics: Divergent Adversarial Reinforcement Learning for Contested Multi-Agent Space Operations
Cameron Mehlman
Joseph Abramov
Gregory Falco
AAML
30
0
0
26 Sep 2024
Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning
Qiaosheng Zhang
Chenjia Bai
Shuyue Hu
Zhen Wang
Xuelong Li
39
1
0
30 Apr 2024
Feel-Good Thompson Sampling for Contextual Dueling Bandits
Xuheng Li
Heyang Zhao
Quanquan Gu
42
9
0
09 Apr 2024
Refined Sample Complexity for Markov Games with Independent Linear Function Approximation
Yan Dai
Qiwen Cui
S. S. Du
44
1
0
11 Feb 2024
On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond
Thanh Nguyen-Tang
Raman Arora
OffRL
33
3
0
06 Jan 2024
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint
Wei Xiong
Hanze Dong
Chen Ye
Ziqi Wang
Han Zhong
Heng Ji
Nan Jiang
Tong Zhang
OffRL
38
157
0
18 Dec 2023
Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games
Songtao Feng
Ming Yin
Yu-Xiang Wang
J. Yang
Yitao Liang
36
0
0
17 Aug 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
Zhihan Liu
Miao Lu
Wei Xiong
Han Zhong
Haotian Hu
Shenao Zhang
Sirui Zheng
Zhuoran Yang
Zhaoran Wang
OffRL
32
22
0
29 May 2023
On the Statistical Efficiency of Mean Field Reinforcement Learning with General Function Approximation
Jiawei Huang
Batuhan Yardim
Niao He
39
10
0
18 May 2023
Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback
Yang Cai
Haipeng Luo
Chen-Yu Wei
Weiqiang Zheng
26
17
0
05 Mar 2023
Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation
Yuanhao Wang
Qinghua Liu
Yunru Bai
Chi Jin
24
28
0
13 Feb 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback
Yunchang Yang
Hangshi Zhong
Tianhao Wu
B. Liu
Liwei Wang
S. Du
OffRL
27
8
0
03 Feb 2023
Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration
Priyank Agrawal
Jinglin Chen
Nan Jiang
27
18
0
23 Oct 2020