Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function

12 June 2019
Zihan Zhang
Xiangyang Ji

Papers citing "Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function"

27 / 27 papers shown
Optimistic Q-learning for average reward and episodic reinforcement learning
Priyank Agrawal
Shipra Agrawal
56
4
0
18 Jul 2024
Reinforcement Learning and Regret Bounds for Admission Control
Lucas Weber
A. Busic
Jiamin Zhu
35
0
0
07 Jun 2024
Dealing with unbounded gradients in stochastic saddle-point optimization
Gergely Neu
Nneka Okolo
39
3
0
21 Feb 2024
Restarted Bayesian Online Change-point Detection for Non-Stationary Markov Decision Processes
Réda Alami
Mohammed Mahfoud
Eric Moulines
24
2
0
01 Apr 2023
Reinforcement Learning in a Birth and Death Process: Breaking the Dependence on the State Space
Jonatha Anselmi
B. Gaujal
Louis-Sébastien Rebuffi
32
2
0
21 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
46
10
0
31 Jan 2023
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Jiafan He
Heyang Zhao
Dongruo Zhou
Quanquan Gu
OffRL
56
55
0
12 Dec 2022
Near Sample-Optimal Reduction-based Policy Learning for Average Reward MDP
Jinghan Wang
Mengdi Wang
Lin F. Yang
39
16
0
01 Dec 2022
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
Zihan Zhang
Yuhang Jiang
Yuanshuo Zhou
Xiangyang Ji
OffRL
26
9
0
15 Oct 2022
Provably Efficient Kernelized Q-Learning
Shuang Liu
H. Su
MLT
34
4
0
21 Apr 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang
Xiangyang Ji
S. Du
32
21
0
24 Mar 2022
Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints
Liyu Chen
R. Jain
Haipeng Luo
64
25
0
31 Jan 2022
Dueling RL: Reinforcement Learning with Trajectory Preferences
Aldo Pacchiano
Aadirupa Saha
Jonathan Lee
38
82
0
08 Nov 2021
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
Yuanzhi Li
Ruosong Wang
Lin F. Yang
29
20
0
01 Nov 2021
Learning Stochastic Shortest Path with Linear Function Approximation
Yifei Min
Jiafan He
Tianhao Wang
Quanquan Gu
44
30
0
25 Oct 2021
Understanding Domain Randomization for Sim-to-real Transfer
Xiaoyu Chen
Jiachen Hu
Chi Jin
Lihong Li
Liwei Wang
24
112
0
07 Oct 2021
A Bayesian Learning Algorithm for Unknown Zero-sum Stochastic Games with an Arbitrary Opponent
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
41
5
0
08 Sep 2021
Sublinear Regret for Learning POMDPs
Yi Xiong
Ningyuan Chen
Xuefeng Gao
Xiang Zhou
29
25
0
08 Jul 2021
Online Learning for Unknown Partially Observable MDPs
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
36
20
0
25 Feb 2021
Causal Markov Decision Processes: Learning Good Interventions Efficiently
Yangyi Lu
A. Meisami
Ambuj Tewari
23
10
0
15 Feb 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
71
38
0
29 Jan 2021
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
36
104
0
28 Sep 2020
A Provably Efficient Sample Collection Strategy for Reinforcement Learning
Jean Tarbouriech
Matteo Pirotta
Michal Valko
A. Lazaric
OffRL
27
16
0
13 Jul 2020
Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
OffRL
22
93
0
24 Jun 2020
Tightening Exploration in Upper Confidence Reinforcement Learning
Hippolyte Bourel
Odalric-Ambrym Maillard
M. S. Talebi
30
31
0
20 Apr 2020
Learning Near Optimal Policies with Low Inherent Bellman Error
Andrea Zanette
A. Lazaric
Mykel Kochenderfer
Emma Brunskill
OffRL
27
221
0
29 Feb 2020
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Hiteshi Sharma
R. Jain
107
100
0
15 Oct 2019