Near-Optimal Randomized Exploration for Tabular Markov Decision Processes

19 February 2021
Zhihan Xiong
Ruoqi Shen
Qiwen Cui
Maryam Fazel
S. Du

Papers citing "Near-Optimal Randomized Exploration for Tabular Markov Decision Processes"

11 / 11 papers shown
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
133
22
0
25 Jul 2023
Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning
Gen Li
Laixi Shi
Yuxin Chen
Yuejie Chi
OffRL
56
51
0
09 Oct 2021
Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration
Priyank Agrawal
Jinglin Chen
Nan Jiang
55
19
0
23 Oct 2020
$Q$-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
57
59
0
16 Jun 2020
Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP
Kefan Dong
Yuanhao Wang
Xiaoyu Chen
Liwei Wang
OffRL
42
95
0
27 Jan 2019
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Andrea Zanette
Emma Brunskill
OffRL
93
274
0
01 Jan 2019
Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes
Ronan Fruit
Matteo Pirotta
A. Lazaric
36
61
0
06 Jul 2018
Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
Christoph Dann
Tor Lattimore
Emma Brunskill
67
307
0
22 Mar 2017
Deep Exploration via Randomized Value Functions
Ian Osband
Benjamin Van Roy
Daniel Russo
Zheng Wen
79
302
0
22 Mar 2017
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
Ian Osband
Benjamin Van Roy
BDL
76
257
0
01 Jul 2016
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis
E. Kaufmann
N. Korda
Rémi Munos
119
585
0
18 May 2012