$Q$-learning with Logarithmic Regret


16 June 2020
Kunhe Yang
Lin F. Yang
S. Du

Papers citing "$Q$-learning with Logarithmic Regret"

24 / 24 papers shown
Automatic Reward Shaping from Confounded Offline Data
Mingxuan Li
Junzhe Zhang
Elias Bareinboim
OffRL
OnRL
33
1
0
16 May 2025
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
75
2
0
10 Oct 2024
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang
Honghao Wei
Lei Ying
OffRL
67
1
0
11 Jun 2024
Reinforcement Learning from Human Feedback with Active Queries
Kaixuan Ji
Jiafan He
Quanquan Gu
24
17
0
14 Feb 2024
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
21
0
25 Jul 2023
Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time
Xiang Ji
Gen Li
OffRL
32
7
0
24 May 2023
Provably Efficient Reinforcement Learning via Surprise Bound
Hanlin Zhu
Ruosong Wang
Jason D. Lee
OffRL
20
5
0
22 Feb 2023
Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization
Fang-yuan Kong
Xiangcheng Zhang
Baoxiang Wang
Shuai Li
26
12
0
14 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
44
10
0
31 Jan 2023
On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation
Thanh Nguyen-Tang
Ming Yin
Sunil R. Gupta
Svetha Venkatesh
R. Arora
OffRL
55
15
0
23 Nov 2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
38
5
0
01 Jun 2022
No-regret Learning in Repeated First-Price Auctions with Budget Constraints
Rui Ai
Chang Wang
Chenchen Li
Jinshan Zhang
Wenhan Huang
Xiaotie Deng
30
10
0
29 May 2022
Logarithmic regret bounds for continuous-time average-reward Markov decision processes
Xuefeng Gao
X. Zhou
33
8
0
23 May 2022
Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps
Jinglin Chen
Nan Jiang
OffRL
21
33
0
25 Mar 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang
Xiangyang Ji
S. Du
28
21
0
24 Mar 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
78
40
0
14 Mar 2022
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
Yuanzhi Li
Ruosong Wang
Lin F. Yang
19
20
0
01 Nov 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
45
50
0
09 Oct 2021
Provably Efficient Black-Box Action Poisoning Attacks Against Reinforcement Learning
Guanlin Liu
Lifeng Lai
AAML
32
34
0
09 Oct 2021
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
27
12
0
11 Aug 2021
The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
Tiancheng Jin
Longbo Huang
Haipeng Luo
19
40
0
08 Jun 2021
Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting
Gen Li
Yuxin Chen
Yuejie Chi
Yuantao Gu
Yuting Wei
OffRL
26
28
0
17 May 2021
An Exponential Lower Bound for Linearly-Realizable MDPs with Constant Suboptimality Gap
Yuanhao Wang
Ruosong Wang
Sham Kakade
OffRL
39
43
0
23 Mar 2021
Fast Rates for the Regret of Offline Reinforcement Learning
Yichun Hu
Nathan Kallus
Masatoshi Uehara
OffRL
11
29
0
31 Jan 2021