Selective Uncertainty Propagation in Offline RL

1 February 2023

Sanath Kumar Krishnamurthy

Papers citing "Selective Uncertainty Propagation in Offline RL"

Title | Authors | Tags | Counts | Date
Foundations of Reinforcement Learning and Interactive Decision Making | Dylan J. Foster, Alexander Rakhlin | OffRL | 45 14 0 | 27 Dec 2023
A Bayesian Approach to Learning Bandit Structure in Markov Decision Processes | Kelly W. Zhang, Omer Gottesman, Finale Doshi-Velez | OffRL | 36 1 0 | 30 Jul 2022
Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation | Dylan J. Foster, A. Krishnamurthy, D. Simchi-Levi, Yunzong Xu | OffRL | 164 63 0 | 21 Nov 2021
Towards Instance-Optimal Offline Reinforcement Learning with Pessimism | Ming Yin, Yu Wang | OffRL | 154 82 0 | 17 Oct 2021
Epistemic Neural Networks | Ian Osband, Zheng Wen, M. Asghari, Vikranth Dwaracherla, M. Ibrahimi, Xiyuan Lu, Benjamin Van Roy | UQCV, BDL | 138 109 0 | 19 Jul 2021
Is Pessimism Provably Efficient for Offline RL? | Ying Jin, Zhuoran Yang, Zhaoran Wang | OffRL | 193 360 0 | 30 Dec 2020
The Importance of Pessimism in Fixed-Dataset Policy Optimization | Jacob Buckman, Carles Gelada, Marc G. Bellemare | OffRL | 125 139 0 | 15 Sep 2020
Model-based Reinforcement Learning: A Survey | Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker | OffRL | 132 49 0 | 30 Jun 2020
MOReL: Model-Based Offline Reinforcement Learning | Rahul Kidambi, Aravind Rajeswaran, Praneeth Netrapalli, Thorsten Joachims | OffRL | 130 679 0 | 12 May 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems | Sergey Levine, Aviral Kumar, George Tucker, Justin Fu | OffRL, GP | 586 2,052 0 | 04 May 2020
Optimism in Reinforcement Learning with Generalized Linear Function Approximation | Yining Wang, Ruosong Wang, S. Du, A. Krishnamurthy | | 191 137 0 | 09 Dec 2019
Orthogonal Statistical Learning | Dylan J. Foster, Vasilis Syrgkanis | | 177 174 0 | 25 Jan 2019
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds | Andrea Zanette, Emma Brunskill | OffRL | 145 276 0 | 01 Jan 2019
Adversarial Domain Adaptation for Stable Brain-Machine Interfaces | A. Farshchian, J. A. Gallego, Joseph Paul Cohen, Yoshua Bengio, L. Miller, S. Solla | OOD | 86 76 0 | 28 Sep 2018
Semiparametric Contextual Bandits | A. Krishnamurthy, Zhiwei Steven Wu, Vasilis Syrgkanis | | 150 45 0 | 12 Mar 2018
Quasi-Oracle Estimation of Heterogeneous Treatment Effects | Xinkun Nie, Stefan Wager | CML | 233 658 0 | 13 Dec 2017
Count-Based Exploration in Feature Space for Reinforcement Learning | Jarryd Martin, S. N. Sasikumar, Tom Everitt, Marcus Hutter | | 76 124 0 | 25 Jun 2017
Meta-learners for Estimating Heterogeneous Treatment Effects using Machine Learning | Sören R. Künzel, Jasjeet Sekhon, Peter J. Bickel, Bin Yu | CML | 263 935 0 | 12 Jun 2017
Deep Exploration via Randomized Value Functions | Ian Osband, Benjamin Van Roy, Daniel Russo, Zheng Wen | | 142 307 0 | 22 Mar 2017
Unifying Count-Based Exploration and Intrinsic Motivation | Marc G. Bellemare, S. Srinivasan, Georg Ostrovski, Tom Schaul, D. Saxton, Rémi Munos | | 195 1,487 0 | 06 Jun 2016
Doubly Robust Policy Evaluation and Optimization | Miroslav Dudík, D. Erhan, John Langford, Lihong Li | OffRL | 224 290 0 | 10 Mar 2015
Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits | Alekh Agarwal, Daniel J. Hsu, Satyen Kale, John Langford, Lihong Li, Robert Schapire | OffRL | 474 510 0 | 04 Feb 2014
Thompson Sampling for Contextual Bandits with Linear Payoffs | Shipra Agrawal, Navin Goyal | | 249 1,007 0 | 15 Sep 2012
Agnostic System Identification for Model-Based Reinforcement Learning | Stéphane Ross, Drew Bagnell | | 94 146 0 | 05 Mar 2012
The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond | Aurélien Garivier, Olivier Cappé | | 252 616 0 | 12 Feb 2011
A Contextual-Bandit Approach to Personalized News Article Recommendation | Lihong Li, Wei Chu, John Langford, Robert Schapire | | 495 2,963 0 | 28 Feb 2010