2104.02293
Cited By
On the Optimality of Batch Policy Optimization Algorithms
6 April 2021
Chenjun Xiao
Yifan Wu
Tor Lattimore
Bo Dai
Jincheng Mei
Lihong Li
Csaba Szepesvári
Dale Schuurmans
OffRL
Papers citing
"On the Optimality of Batch Policy Optimization Algorithms"
23 / 23 papers shown
Title
NeuroSep-CP-LCB: A Deep Learning-based Contextual Multi-armed Bandit Algorithm with Uncertainty Quantification for Early Sepsis Prediction
Anni Zhou
Raheem Beyah
Rishikesan Kamaleswaran
46
0
0
20 Mar 2025
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Shicong Cen
Jincheng Mei
Katayoon Goshvadi
Hanjun Dai
Tong Yang
Sherry Yang
Dale Schuurmans
Yuejie Chi
Bo Dai
OffRL
67
24
0
20 Feb 2025
Importance-Weighted Offline Learning Done Right
Germano Gabbianelli
Gergely Neu
Matteo Papini
OffRL
21
5
0
27 Sep 2023
Fast and Regret Optimal Best Arm Identification: Fundamental Limits and Low-Complexity Algorithms
Qining Zhang
Lei Ying
72
4
0
01 Sep 2023
Supervised Pretraining Can Learn In-Context Reinforcement Learning
Jonathan Lee
Annie Xie
Aldo Pacchiano
Yash Chandak
Chelsea Finn
Ofir Nachum
Emma Brunskill
OffRL
35
76
0
26 Jun 2023
Optimal Best-Arm Identification in Bandits with Access to Offline Data
Shubhada Agrawal
Sandeep Juneja
Karthikeyan Shanmugam
A. Suggala
24
5
0
15 Jun 2023
Bayesian Regret Minimization in Offline Bandits
Marek Petrik
Guy Tennenholtz
Mohammad Ghavamzadeh
OffRL
36
0
0
02 Jun 2023
Offline Primal-Dual Reinforcement Learning for Linear MDPs
Germano Gabbianelli
Gergely Neu
Nneka Okolo
Matteo Papini
OffRL
29
7
0
22 May 2023
The In-Sample Softmax for Offline Reinforcement Learning
Chenjun Xiao
Han Wang
Yangchen Pan
Adam White
Martha White
OffRL
29
26
0
28 Feb 2023
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation
Thanh Nguyen-Tang
R. Arora
OffRL
48
5
0
24 Feb 2023
Adversarial Model for Offline Reinforcement Learning
M. Bhardwaj
Tengyang Xie
Byron Boots
Nan Jiang
Ching-An Cheng
AAML
OffRL
48
26
0
21 Feb 2023
Model-based Offline Reinforcement Learning with Local Misspecification
Kefan Dong
Yannis Flet-Berliac
Allen Nie
Emma Brunskill
OffRL
18
4
0
26 Jan 2023
ARMOR: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data
Tengyang Xie
M. Bhardwaj
Nan Jiang
Ching-An Cheng
OffRL
28
9
0
08 Nov 2022
Online Learning with Off-Policy Feedback
Germano Gabbianelli
Matteo Papini
Gergely Neu
OffRL
30
4
0
18 Jul 2022
Pessimism for Offline Linear Contextual Bandits using $\ell_p$ Confidence Sets
Gen Li
Cong Ma
Nathan Srebro
OffRL
36
12
0
21 May 2022
Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism
Ming Yin
Yaqi Duan
Mengdi Wang
Yu Wang
OffRL
34
66
0
11 Mar 2022
Model Selection in Batch Policy Optimization
Jonathan Lee
George Tucker
Ofir Nachum
Bo Dai
OffRL
27
12
0
23 Dec 2021
Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization
Thanh Nguyen-Tang
Sunil R. Gupta
A. Nguyen
Svetha Venkatesh
OffRL
34
29
0
27 Nov 2021
Towards Instance-Optimal Offline Reinforcement Learning with Pessimism
Ming Yin
Yu Wang
OffRL
29
82
0
17 Oct 2021
The Curse of Passive Data Collection in Batch Reinforcement Learning
Chenjun Xiao
Ilbin Lee
Bo Dai
Dale Schuurmans
Csaba Szepesvári
OffRL
33
1
0
18 Jun 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu Wang
OffRL
32
19
0
13 May 2021
Towards Theoretical Understandings of Robust Markov Decision Processes: Sample Complexity and Asymptotics
Wenhao Yang
Liangyu Zhang
Zhihua Zhang
28
33
0
09 May 2021
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
343
1,968
0
04 May 2020