Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1905.12849
Cited By
Provably Efficient Q-Learning with Low Switching Cost
30 May 2019
Yu Bai
Tengyang Xie
Nan Jiang
Yu Wang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Provably Efficient Q-Learning with Low Switching Cost"
29 / 29 papers shown
Title
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation
Runze Zhao
Yue Yu
Adams Yiyue Zhu
Chen Yang
Dongruo Zhou
14
0
0
20 May 2025
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
78
2
0
10 Oct 2024
To Switch or Not to Switch? Balanced Policy Switching in Offline Reinforcement Learning
Tao Ma
Xuzhi Yang
Zoltan Szabo
OffRL
73
0
0
01 Jul 2024
Batched Nonparametric Contextual Bandits
Rong Jiang
Cong Ma
OffRL
39
1
0
27 Feb 2024
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Jiin Woo
Laixi Shi
Gauri Joshi
Yuejie Chi
OffRL
39
3
0
08 Feb 2024
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
22
0
25 Jul 2023
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data
Ruiqi Zhang
Andrea Zanette
OffRL
OnRL
42
7
0
10 Jul 2023
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Xiang Ji
Gen Li
OffRL
37
7
0
24 May 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
46
10
0
31 Jan 2023
Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits
Nikolai Karpov
Qin Zhang
39
1
0
26 Jan 2023
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
Zihan Zhang
Yuhang Jiang
Yuanshuo Zhou
Xiangyang Ji
OffRL
26
9
0
15 Oct 2022
Byzantine-Robust Online and Offline Distributed Reinforcement Learning
Yiding Chen
Xuezhou Zhang
Kaipeng Zhang
Mengdi Wang
Xiaojin Zhu
OffRL
31
16
0
01 Jun 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
78
40
0
14 Mar 2022
Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets
Yifei Min
Tianhao Wang
Ruitu Xu
Zhaoran Wang
Michael I. Jordan
Zhuoran Yang
38
21
0
07 Mar 2022
Sample-Efficient Reinforcement Learning with loglog(T) Switching Cost
Dan Qiao
Ming Yin
Ming Min
Yu Wang
43
28
0
13 Feb 2022
Improved Regret for Differentially Private Exploration in Linear MDP
Dung Daniel Ngo
G. Vietri
Zhiwei Steven Wu
29
8
0
02 Feb 2022
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
49
51
0
09 Oct 2021
Understanding Domain Randomization for Sim-to-real Transfer
Xiaoyu Chen
Jiachen Hu
Chi Jin
Lihong Li
Liwei Wang
24
112
0
07 Oct 2021
Batched Thompson Sampling for Multi-Armed Bandits
Nikolai Karpov
Qin Zhang
27
4
0
15 Aug 2021
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
Tengyang Xie
Nan Jiang
Huan Wang
Caiming Xiong
Yu Bai
OffRL
OnRL
44
162
0
09 Jun 2021
Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing
Anshumali Shrivastava
Zhao Song
Zhaozhuo Xu
19
22
0
18 May 2021
Dealing with Non-Stationarity in MARL via Trust-Region Decomposition
Wenhao Li
Xiangfeng Wang
Bo Jin
Junjie Sheng
H. Zha
36
7
0
21 Feb 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Ee
Yuting Wei
Yuejie Chi
OffRL
55
75
0
12 Feb 2021
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints
Chi Jin
Zhuoran Yang
Zhaoran Wang
OffRL
122
167
0
06 Jan 2021
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
34
104
0
28 Sep 2020
Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design
Yufei Ruan
Jiaqi Yang
Yuanshuo Zhou
OffRL
102
51
0
04 Jul 2020
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
39
125
0
26 May 2020
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization
Dongsheng Ding
Xiaohan Wei
Zhuoran Yang
Zhaoran Wang
M. Jovanović
32
159
0
01 Mar 2020
Convergent Policy Optimization for Safe Reinforcement Learning
Ming Yu
Zhuoran Yang
Mladen Kolar
Zhaoran Wang
18
91
0
26 Oct 2019
1