ResearchTrend.AI
arXiv: 1807.03765
Is Q-learning Provably Efficient?

10 July 2018
Chi Jin
Zeyuan Allen-Zhu
Sébastien Bubeck
Michael I. Jordan
    OffRL

Papers citing "Is Q-learning Provably Efficient?"

50 / 225 papers shown
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
Tiancheng Jin
Tal Lancewicki
Haipeng Luo
Yishay Mansour
Aviv A. Rosenberg
74
21
0
31 Jan 2022
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee
Tianhao Wu
Yunchang Yang
Han Zhong
Liwei Wang
S. Du
Jiantao Jiao
60
14
0
21 Dec 2021
Provable Hierarchical Lifelong Learning with a Sketch-based Modular Architecture
Zihao Deng
Zee Fryer
Brendan Juba
Rina Panigrahy
Xin Wang
27
2
0
21 Dec 2021
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
34
168
0
08 Dec 2021
Differentially Private Exploration in Reinforcement Learning with Linear Representation
Paul Luyo
Evrard Garcelon
A. Lazaric
Matteo Pirotta
60
11
0
02 Dec 2021
Interesting Object, Curious Agent: Learning Task-Agnostic Exploration
Simone Parisi
Victoria Dean
Deepak Pathak
Abhinav Gupta
LM&Ro
44
50
0
25 Nov 2021
A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning
Tongzheng Ren
Tianjun Zhang
Csaba Szepesvári
Bo Dai
39
19
0
22 Nov 2021
Uncoupled Bandit Learning towards Rationalizability: Benchmarks, Barriers, and Algorithms
Jibang Wu
Haifeng Xu
Fan Yao
42
1
0
10 Nov 2021
Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning
Yingjie Fei
Zhuoran Yang
Yudong Chen
Zhaoran Wang
55
47
0
06 Nov 2021
Perturbational Complexity by Distribution Mismatch: A Systematic Analysis of Reinforcement Learning in Reproducing Kernel Hilbert Space
Jihao Long
Jiequn Han
34
6
0
05 Nov 2021
Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure
Hsu Kao
Chen-Yu Wei
V. Subramanian
36
12
0
01 Nov 2021
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
Yuanzhi Li
Ruosong Wang
Lin F. Yang
32
20
0
01 Nov 2021
Adaptive Discretization in Online Reinforcement Learning
Sean R. Sinclair
Siddhartha Banerjee
Chao Yu
OffRL
47
15
0
29 Oct 2021
Learning Stochastic Shortest Path with Linear Function Approximation
Yifei Min
Jiafan He
Tianhao Wang
Quanquan Gu
53
30
0
25 Oct 2021
Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs
Han Zhong
Zhuoran Yang
Zhaoran Wang
Csaba Szepesvári
49
21
0
18 Oct 2021
On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning
Weichao Mao
Lin F. Yang
Kai Zhang
Tamer Başar
46
57
0
12 Oct 2021
Provably Efficient Reinforcement Learning in Decentralized General-Sum Markov Games
Weichao Mao
Tamer Başar
39
66
0
12 Oct 2021
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
49
51
0
09 Oct 2021
Provably Efficient Black-Box Action Poisoning Attacks Against Reinforcement Learning
Guanlin Liu
Lifeng Lai
AAML
32
34
0
09 Oct 2021
When Can We Learn General-Sum Markov Games with a Large Number of Players Sample-Efficiently?
Ziang Song
Song Mei
Yu Bai
76
67
0
08 Oct 2021
Understanding Domain Randomization for Sim-to-real Transfer
Xiaoyu Chen
Jiachen Hu
Chi Jin
Lihong Li
Liwei Wang
24
112
0
07 Oct 2021
Explaining Off-Policy Actor-Critic From A Bias-Variance Perspective
Ting-Han Fan
Peter J. Ramadge
CML
FAtt
OffRL
21
2
0
06 Oct 2021
Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain
Jianye Hao
Tianpei Yang
Hongyao Tang
Chenjia Bai
Jinyi Liu
Zhaopeng Meng
Peng Liu
Zhen Wang
OffRL
41
94
0
14 Sep 2021
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Primal-Dual Approach
Qinbo Bai
Amrit Singh Bedi
Mridul Agarwal
Alec Koppel
Vaneet Aggarwal
107
56
0
13 Sep 2021
Concave Utility Reinforcement Learning with Zero-Constraint Violations
Mridul Agarwal
Qinbo Bai
Vaneet Aggarwal
38
12
0
12 Sep 2021
A Bayesian Learning Algorithm for Unknown Zero-sum Stochastic Games with an Arbitrary Opponent
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
41
5
0
08 Sep 2021
On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games
Xiaotie Deng
Ningyuan Li
D. Mguni
Jun Wang
Yaodong Yang
31
46
0
04 Sep 2021
A Survey of Exploration Methods in Reinforcement Learning
Susan Amin
Maziar Gomrokchi
Harsh Satija
H. V. Hoof
Doina Precup
OffRL
39
80
0
01 Sep 2021
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
35
12
0
11 Aug 2021
Towards General Function Approximation in Zero-Sum Markov Games
Baihe Huang
Jason D. Lee
Zhaoran Wang
Zhuoran Yang
38
47
0
30 Jul 2021
Strategically Efficient Exploration in Competitive Multi-agent Reinforcement Learning
R. Loftin
Aadirupa Saha
Sam Devlin
Katja Hofmann
30
5
0
30 Jul 2021
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
Haipeng Luo
Chen-Yu Wei
Chung-Wei Lee
43
44
0
18 Jul 2021
Sublinear Regret for Learning POMDPs
Yi Xiong
Ningyuan Chen
Xuefeng Gao
Xiang Zhou
33
25
0
08 Jul 2021
Concentration of Contractive Stochastic Approximation and Reinforcement Learning
Siddharth Chandak
Vivek Borkar
Parth Dodhia
48
17
0
27 Jun 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
43
15
0
15 Jun 2021
Randomized Exploration for Reinforcement Learning with General Value Function Approximation
Haque Ishfaq
Qiwen Cui
V. Nguyen
Alex Ayoub
Zhuoran Yang
Zhaoran Wang
Doina Precup
Lin F. Yang
40
43
0
15 Jun 2021
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
Tengyang Xie
Nan Jiang
Huan Wang
Caiming Xiong
Yu Bai
OffRL
OnRL
44
162
0
09 Jun 2021
The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
Tiancheng Jin
Longbo Huang
Haipeng Luo
27
40
0
08 Jun 2021
The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
Chi Jin
Qinghua Liu
Tiancheng Yu
31
50
0
07 Jun 2021
Heuristic-Guided Reinforcement Learning
Ching-An Cheng
Andrey Kolobov
Adith Swaminathan
OffRL
40
61
0
05 Jun 2021
A Provably-Efficient Model-Free Algorithm for Constrained Markov Decision Processes
Honghao Wei
Xin Liu
Lei Ying
32
21
0
03 Jun 2021
Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing
Anshumali Shrivastava
Zhao Song
Zhaozhuo Xu
24
22
0
18 May 2021
Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting
Gen Li
Yuxin Chen
Yuejie Chi
Yuantao Gu
Yuting Wei
OffRL
37
28
0
17 May 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu Wang
OffRL
39
19
0
13 May 2021
Principled Exploration via Optimistic Bootstrapping and Backward Induction
Chenjia Bai
Lingxiao Wang
Lei Han
Jianye Hao
Animesh Garg
Peng Liu
Zhaoran Wang
OffRL
31
38
0
13 May 2021
Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts
Weinan Zhang
Xihuai Wang
Jian Shen
Ming Zhou
30
35
0
07 May 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Jean Tarbouriech
Runlong Zhou
S. Du
Matteo Pirotta
M. Valko
A. Lazaric
70
35
0
22 Apr 2021
Nearly Horizon-Free Offline Reinforcement Learning
Tongzheng Ren
Jialian Li
Bo Dai
S. Du
Sujay Sanghavi
OffRL
32
49
0
25 Mar 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
37
53
0
24 Mar 2021
Reinforcement Learning, Bit by Bit
Xiuyuan Lu
Benjamin Van Roy
Vikranth Dwaracherla
M. Ibrahimi
Ian Osband
Zheng Wen
30
70
0
06 Mar 2021