2302.00284
v4 (latest)
Selective Uncertainty Propagation in Offline RL
1 February 2023
Sanath Kumar Krishnamurthy
Shrey Modi
Tanmay Gangwani
S. Katariya
Branislav Kveton
A. Rangi
OffRL
Papers citing "Selective Uncertainty Propagation in Offline RL"
26 / 26 papers shown
Foundations of Reinforcement Learning and Interactive Decision Making
Dylan J. Foster
Alexander Rakhlin
OffRL
45
14
0
27 Dec 2023
A Bayesian Approach to Learning Bandit Structure in Markov Decision Processes
Kelly W. Zhang
Omer Gottesman
Finale Doshi-Velez
OffRL
36
1
0
30 Jul 2022
Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation
Dylan J. Foster
A. Krishnamurthy
D. Simchi-Levi
Yunzong Xu
OffRL
164
63
0
21 Nov 2021
Towards Instance-Optimal Offline Reinforcement Learning with Pessimism
Ming Yin
Yu Wang
OffRL
154
82
0
17 Oct 2021
Epistemic Neural Networks
Ian Osband
Zheng Wen
M. Asghari
Vikranth Dwaracherla
M. Ibrahimi
Xiyuan Lu
Benjamin Van Roy
UQCV
BDL
138
109
0
19 Jul 2021
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
193
360
0
30 Dec 2020
The Importance of Pessimism in Fixed-Dataset Policy Optimization
Jacob Buckman
Carles Gelada
Marc G. Bellemare
OffRL
125
139
0
15 Sep 2020
Model-based Reinforcement Learning: A Survey
Thomas M. Moerland
Joost Broekens
Aske Plaat
Catholijn M. Jonker
OffRL
132
49
0
30 Jun 2020
MOReL : Model-Based Offline Reinforcement Learning
Rahul Kidambi
Aravind Rajeswaran
Praneeth Netrapalli
Thorsten Joachims
OffRL
130
679
0
12 May 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
586
2,052
0
04 May 2020
Optimism in Reinforcement Learning with Generalized Linear Function Approximation
Yining Wang
Ruosong Wang
S. Du
A. Krishnamurthy
191
137
0
09 Dec 2019
Orthogonal Statistical Learning
Dylan J. Foster
Vasilis Syrgkanis
177
174
0
25 Jan 2019
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Andrea Zanette
Emma Brunskill
OffRL
145
276
0
01 Jan 2019
Adversarial Domain Adaptation for Stable Brain-Machine Interfaces
A. Farshchian
J. A. Gallego
Joseph Paul Cohen
Yoshua Bengio
L. Miller
S. Solla
OOD
86
76
0
28 Sep 2018
Semiparametric Contextual Bandits
A. Krishnamurthy
Zhiwei Steven Wu
Vasilis Syrgkanis
150
45
0
12 Mar 2018
Quasi-Oracle Estimation of Heterogeneous Treatment Effects
Xinkun Nie
Stefan Wager
CML
233
658
0
13 Dec 2017
Count-Based Exploration in Feature Space for Reinforcement Learning
Jarryd Martin
S. N. Sasikumar
Tom Everitt
Marcus Hutter
76
124
0
25 Jun 2017
Meta-learners for Estimating Heterogeneous Treatment Effects using Machine Learning
Sören R. Künzel
Jasjeet Sekhon
Peter J. Bickel
Bin Yu
CML
263
935
0
12 Jun 2017
Deep Exploration via Randomized Value Functions
Ian Osband
Benjamin Van Roy
Daniel Russo
Zheng Wen
142
307
0
22 Mar 2017
Unifying Count-Based Exploration and Intrinsic Motivation
Marc G. Bellemare
S. Srinivasan
Georg Ostrovski
Tom Schaul
D. Saxton
Rémi Munos
195
1,487
0
06 Jun 2016
Doubly Robust Policy Evaluation and Optimization
Miroslav Dudík
D. Erhan
John Langford
Lihong Li
OffRL
224
290
0
10 Mar 2015
Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
Alekh Agarwal
Daniel J. Hsu
Satyen Kale
John Langford
Lihong Li
Robert Schapire
OffRL
474
510
0
04 Feb 2014
Thompson Sampling for Contextual Bandits with Linear Payoffs
Shipra Agrawal
Navin Goyal
249
1,007
0
15 Sep 2012
Agnostic System Identification for Model-Based Reinforcement Learning
Stéphane Ross
Drew Bagnell
94
146
0
05 Mar 2012
The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond
Aurélien Garivier
Olivier Cappé
252
616
0
12 Feb 2011
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
495
2,963
0
28 Feb 2010