1903.08738
Batch Policy Learning under Constraints
20 March 2019
Hoang Minh Le
Cameron Voloshin
Yisong Yue
OffRL
Papers citing "Batch Policy Learning under Constraints"
40 / 90 papers shown
Title
Offline Reinforcement Learning with Soft Behavior Regularization
Haoran Xu
Xianyuan Zhan
Jianxiong Li
Honglei Yin
OffRL
31
31
0
14 Oct 2021
Offline RL With Resource Constrained Online Deployment
Jayanth Reddy Regatti
A. Deshmukh
Frank Cheng
Young Hun Jung
Abhishek Gupta
Ürün Dogan
OffRL
23
2
0
07 Oct 2021
Explaining Off-Policy Actor-Critic From A Bias-Variance Perspective
Ting-Han Fan
Peter J. Ramadge
CML
FAtt
OffRL
21
2
0
06 Oct 2021
Concave Utility Reinforcement Learning with Zero-Constraint Violations
Mridul Agarwal
Qinbo Bai
Vaneet Aggarwal
38
12
0
12 Sep 2021
How to Certify Machine Learning Based Safety-critical Systems? A Systematic Literature Review
Florian Tambon
Gabriel Laberge
Le An
Amin Nikanjam
Paulina Stevia Nouwou Mindom
Y. Pequignot
Foutse Khomh
G. Antoniol
E. Merlo
François Laviolette
42
66
0
26 Jul 2021
Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings
Shengpu Tang
Jenna Wiens
OffRL
26
78
0
23 Jul 2021
Constraints Penalized Q-learning for Safe Offline Reinforcement Learning
Haoran Xu
Xianyuan Zhan
Xiangyu Zhu
OffRL
16
86
0
19 Jul 2021
A Simple Reward-free Approach to Constrained Reinforcement Learning
Sobhan Miryoosefi
Chi Jin
18
29
0
12 Jul 2021
Supervised Off-Policy Ranking
Yue Jin
Yue Zhang
Tao Qin
Xudong Zhang
Jian Yuan
Houqiang Li
Tie-Yan Liu
OffRL
37
5
0
03 Jul 2021
Safe Reinforcement Learning Using Advantage-Based Intervention
Nolan Wagener
Byron Boots
Ching-An Cheng
39
52
0
16 Jun 2021
On Instrumental Variable Regression for Deep Offline Policy Evaluation
Yutian Chen
Liyuan Xu
Çağlar Gülçehre
T. Paine
Arthur Gretton
Nando de Freitas
Arnaud Doucet
OffRL
56
18
0
21 May 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu Wang
OffRL
39
19
0
13 May 2021
Deeply-Debiased Off-Policy Interval Estimation
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
30
36
0
10 May 2021
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization
Michael Ruogu Zhang
T. Paine
Ofir Nachum
Cosmin Paduraru
George Tucker
Ziyun Wang
Mohammad Norouzi
OffRL
30
45
0
28 Apr 2021
Benchmarks for Deep Off-Policy Evaluation
Justin Fu
Mohammad Norouzi
Ofir Nachum
George Tucker
Ziyun Wang
...
Yutian Chen
Aviral Kumar
Cosmin Paduraru
Sergey Levine
T. Paine
ELM
OffRL
35
100
0
30 Mar 2021
Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning
Yaqi Duan
Chi Jin
Zhiyuan Li
OffRL
36
48
0
25 Mar 2021
Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks
Thanh Nguyen-Tang
Sunil R. Gupta
Hung The Tran
Svetha Venkatesh
OffRL
70
7
0
11 Mar 2021
NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning
Rongjun Qin
Songyi Gao
Xingyuan Zhang
Zhen Xu
Shengkai Huang
Zewen Li
Weinan Zhang
Yang Yu
OffRL
142
0
0
01 Feb 2021
Inverse Constrained Reinforcement Learning
Usman Anwar
Shehryar Malik
Alireza Aghasi
Ali Ahmed
18
58
0
19 Nov 2020
Hyperparameter Selection for Offline Reinforcement Learning
T. Paine
Cosmin Paduraru
Andrea Michi
Çağlar Gülçehre
Konrad Zolna
Alexander Novikov
Ziyun Wang
Nando de Freitas
GP
OffRL
49
146
0
17 Jul 2020
Provably Good Batch Reinforcement Learning Without Great Exploration
Yao Liu
Adith Swaminathan
Alekh Agarwal
Emma Brunskill
OffRL
27
105
0
16 Jul 2020
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning
Ming Yin
Yu Bai
Yu Wang
OffRL
44
31
0
07 Jul 2020
Safe Reinforcement Learning via Curriculum Induction
M. Turchetta
Andrey Kolobov
S. Shah
Andreas Krause
Alekh Agarwal
23
91
0
22 Jun 2020
Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies
Nathan Kallus
Masatoshi Uehara
OffRL
19
15
0
06 Jun 2020
A Distributional View on Multi-Objective Policy Optimization
A. Abdolmaleki
Sandy H. Huang
Leonard Hasenclever
Michael Neunert
H. F. Song
Martina Zambelli
M. Martins
N. Heess
R. Hadsell
Martin Riedmiller
26
74
0
15 May 2020
Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding
Hongseok Namkoong
Ramtin Keramati
Steve Yadlowsky
Emma Brunskill
OffRL
24
63
0
12 Mar 2020
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization
Dongsheng Ding
Xiaohan Wei
Zhuoran Yang
Zhaoran Wang
M. Jovanović
35
159
0
01 Mar 2020
Provable Representation Learning for Imitation Learning via Bi-level Optimization
Sanjeev Arora
S. Du
Sham Kakade
Yuping Luo
Nikunj Saunshi
23
60
0
24 Feb 2020
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Yaqi Duan
Mengdi Wang
OffRL
32
149
0
21 Feb 2020
Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions
Omer Gottesman
Joseph D. Futoma
Yao Liu
Sonali Parbhoo
Leo Anthony Celi
Emma Brunskill
Finale Doshi-Velez
OffRL
147
56
0
10 Feb 2020
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
C. Shi
Runzhe Wan
R. Song
Wenbin Lu
Ling Leng
28
37
0
05 Feb 2020
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning
Cameron Voloshin
Hoang Minh Le
Nan Jiang
Yisong Yue
OffRL
35
152
0
15 Nov 2019
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
31
184
0
28 Oct 2019
IPO: Interior-point Policy Optimization under Constraints
Yongshuai Liu
J. Ding
Xin Liu
24
176
0
21 Oct 2019
Regret Bounds for Batched Bandits
Hossein Esfandiari
Amin Karbasi
Abbas Mehrabian
Vahab Mirrokni
41
61
0
11 Oct 2019
Off-Policy Evaluation in Partially Observable Environments
Guy Tennenholtz
Shie Mannor
Uri Shalit
OffRL
22
85
0
09 Sep 2019
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
Nathan Kallus
Masatoshi Uehara
OffRL
52
183
0
22 Aug 2019
Reinforcement Learning with Convex Constraints
Sobhan Miryoosefi
Kianté Brantley
Hal Daumé
Miroslav Dudík
Robert Schapire
25
90
0
21 Jun 2019
Learning When-to-Treat Policies
Xinkun Nie
Emma Brunskill
Stefan Wager
CML
OffRL
29
89
0
23 May 2019
Control Regularization for Reduced Variance Reinforcement Learning
Richard Cheng
Abhinav Verma
G. Orosz
Swarat Chaudhuri
Yisong Yue
J. W. Burdick
OffRL
28
77
0
14 May 2019