ResearchTrend.AI
Provably Efficient Q-Learning with Low Switching Cost


30 May 2019
Yu Bai
Tengyang Xie
Nan Jiang
Yu Wang
arXiv:1905.12849

Papers citing "Provably Efficient Q-Learning with Low Switching Cost"

29 / 29 papers shown
Title
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation
Runze Zhao
Yue Yu
Adams Yiyue Zhu
Chen Yang
Dongruo Zhou
12
0
0
20 May 2025
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
78
2
0
10 Oct 2024
To Switch or Not to Switch? Balanced Policy Switching in Offline Reinforcement Learning
Tao Ma
Xuzhi Yang
Zoltan Szabo
OffRL
73
0
0
01 Jul 2024
Batched Nonparametric Contextual Bandits
Rong Jiang
Cong Ma
OffRL
39
1
0
27 Feb 2024
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Jiin Woo
Laixi Shi
Gauri Joshi
Yuejie Chi
OffRL
39
3
0
08 Feb 2024
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
22
0
25 Jul 2023
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data
Ruiqi Zhang
Andrea Zanette
OffRL
OnRL
42
7
0
10 Jul 2023
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Xiang Ji
Gen Li
OffRL
37
7
0
24 May 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
44
10
0
31 Jan 2023
Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits
Nikolai Karpov
Qin Zhang
39
1
0
26 Jan 2023
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
Zihan Zhang
Yuhang Jiang
Yuanshuo Zhou
Xiangyang Ji
OffRL
26
9
0
15 Oct 2022
Byzantine-Robust Online and Offline Distributed Reinforcement Learning
Yiding Chen
Xuezhou Zhang
Kaipeng Zhang
Mengdi Wang
Xiaojin Zhu
OffRL
29
16
0
01 Jun 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
78
40
0
14 Mar 2022
Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets
Yifei Min
Tianhao Wang
Ruitu Xu
Zhaoran Wang
Michael I. Jordan
Zhuoran Yang
38
21
0
07 Mar 2022
Sample-Efficient Reinforcement Learning with loglog(T) Switching Cost
Dan Qiao
Ming Yin
Ming Min
Yu Wang
43
28
0
13 Feb 2022
Improved Regret for Differentially Private Exploration in Linear MDP
Dung Daniel Ngo
G. Vietri
Zhiwei Steven Wu
29
8
0
02 Feb 2022
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
49
51
0
09 Oct 2021
Understanding Domain Randomization for Sim-to-real Transfer
Xiaoyu Chen
Jiachen Hu
Chi Jin
Lihong Li
Liwei Wang
24
112
0
07 Oct 2021
Batched Thompson Sampling for Multi-Armed Bandits
Nikolai Karpov
Qin Zhang
27
4
0
15 Aug 2021
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
Tengyang Xie
Nan Jiang
Huan Wang
Caiming Xiong
Yu Bai
OffRL
OnRL
44
162
0
09 Jun 2021
Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing
Anshumali Shrivastava
Zhao Song
Zhaozhuo Xu
19
22
0
18 May 2021
Dealing with Non-Stationarity in MARL via Trust-Region Decomposition
Wenhao Li
Xiangfeng Wang
Bo Jin
Junjie Sheng
H. Zha
36
7
0
21 Feb 2021
Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Gen Li
Changxiao Cai
Yuxin Chen
Yuting Wei
Yuejie Chi
OffRL
55
75
0
12 Feb 2021
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints
Chi Jin
Zhuoran Yang
Zhaoran Wang
OffRL
122
167
0
06 Jan 2021
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
34
104
0
28 Sep 2020
Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design
Yufei Ruan
Jiaqi Yang
Yuanshuo Zhou
OffRL
102
51
0
04 Jul 2020
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
39
125
0
26 May 2020
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization
Dongsheng Ding
Xiaohan Wei
Zhuoran Yang
Zhaoran Wang
M. Jovanović
29
159
0
01 Mar 2020
Convergent Policy Optimization for Safe Reinforcement Learning
Ming Yu
Zhuoran Yang
Mladen Kolar
Zhaoran Wang
18
91
0
26 Oct 2019