arXiv:1906.05110
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
12 June 2019
Zihan Zhang
Xiangyang Ji
Papers citing "Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function"
27 / 27 papers shown
Title
Optimistic Q-learning for average reward and episodic reinforcement learning
Priyank Agrawal
Shipra Agrawal
56
4
0
18 Jul 2024
Reinforcement Learning and Regret Bounds for Admission Control
Lucas Weber
A. Busic
Jiamin Zhu
35
0
0
07 Jun 2024
Dealing with unbounded gradients in stochastic saddle-point optimization
Gergely Neu
Nneka Okolo
39
3
0
21 Feb 2024
Restarted Bayesian Online Change-point Detection for Non-Stationary Markov Decision Processes
Réda Alami
Mohammed Mahfoud
Eric Moulines
24
2
0
01 Apr 2023
Reinforcement Learning in a Birth and Death Process: Breaking the Dependence on the State Space
Jonatha Anselmi
B. Gaujal
Louis-Sébastien Rebuffi
29
2
0
21 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
44
10
0
31 Jan 2023
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Jiafan He
Heyang Zhao
Dongruo Zhou
Quanquan Gu
OffRL
53
55
0
12 Dec 2022
Near Sample-Optimal Reduction-based Policy Learning for Average Reward MDP
Jinghan Wang
Mengdi Wang
Lin F. Yang
37
16
0
01 Dec 2022
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
Zihan Zhang
Yuhang Jiang
Yuanshuo Zhou
Xiangyang Ji
OffRL
26
9
0
15 Oct 2022
Provably Efficient Kernelized Q-Learning
Shuang Liu
H. Su
MLT
29
4
0
21 Apr 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang
Xiangyang Ji
S. Du
30
21
0
24 Mar 2022
Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints
Liyu Chen
R. Jain
Haipeng Luo
64
25
0
31 Jan 2022
Dueling RL: Reinforcement Learning with Trajectory Preferences
Aldo Pacchiano
Aadirupa Saha
Jonathan Lee
38
82
0
08 Nov 2021
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
Yuanzhi Li
Ruosong Wang
Lin F. Yang
27
20
0
01 Nov 2021
Learning Stochastic Shortest Path with Linear Function Approximation
Yifei Min
Jiafan He
Tianhao Wang
Quanquan Gu
44
30
0
25 Oct 2021
Understanding Domain Randomization for Sim-to-real Transfer
Xiaoyu Chen
Jiachen Hu
Chi Jin
Lihong Li
Liwei Wang
24
112
0
07 Oct 2021
A Bayesian Learning Algorithm for Unknown Zero-sum Stochastic Games with an Arbitrary Opponent
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
41
5
0
08 Sep 2021
Sublinear Regret for Learning POMDPs
Yi Xiong
Ningyuan Chen
Xuefeng Gao
Xiang Zhou
29
25
0
08 Jul 2021
Online Learning for Unknown Partially Observable MDPs
Mehdi Jafarnia-Jahromi
Rahul Jain
A. Nayyar
34
20
0
25 Feb 2021
Causal Markov Decision Processes: Learning Good Interventions Efficiently
Yangyi Lu
A. Meisami
Ambuj Tewari
23
10
0
15 Feb 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
71
38
0
29 Jan 2021
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon
Zihan Zhang
Xiangyang Ji
S. Du
OffRL
34
104
0
28 Sep 2020
A Provably Efficient Sample Collection Strategy for Reinforcement Learning
Jean Tarbouriech
Matteo Pirotta
Michal Valko
A. Lazaric
OffRL
25
16
0
13 Jul 2020
Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism
Wang Chi Cheung
D. Simchi-Levi
Ruihao Zhu
OffRL
22
93
0
24 Jun 2020
Tightening Exploration in Upper Confidence Reinforcement Learning
Hippolyte Bourel
Odalric-Ambrym Maillard
M. S. Talebi
27
31
0
20 Apr 2020
Learning Near Optimal Policies with Low Inherent Bellman Error
Andrea Zanette
A. Lazaric
Mykel Kochenderfer
Emma Brunskill
OffRL
27
221
0
29 Feb 2020
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Chen-Yu Wei
Mehdi Jafarnia-Jahromi
Haipeng Luo
Hiteshi Sharma
R. Jain
107
100
0
15 Oct 2019