2104.08977
Off-Policy Risk Assessment in Contextual Bandits
18 April 2021
Audrey Huang
Liu Leqi
Zachary Chase Lipton
Kamyar Azizzadenesheli
OffRL
Papers citing
"Off-Policy Risk Assessment in Contextual Bandits"
26 / 26 papers shown
Title
Universal Off-Policy Evaluation
Yash Chandak
S. Niekum
Bruno C. da Silva
Erik Learned-Miller
Emma Brunskill
Philip S. Thomas
OffRL
ELM
77
53
0
26 Apr 2021
High-Confidence Off-Policy (or Counterfactual) Variance Estimation
Yash Chandak
Shiv Shankar
Philip S. Thomas
OffRL
31
8
0
25 Jan 2021
Risk-Constrained Thompson Sampling for CVaR Bandits
Joel Q. L. Chang
Qiuyu Zhu
Vincent Y. F. Tan
54
13
0
16 Nov 2020
Learning Bounds for Risk-sensitive Learning
Jaeho Lee
Sejun Park
Jinwoo Shin
62
47
0
15 Jun 2020
Statistical Learning with Conditional Value at Risk
Tasuku Soma
Yuichi Yoshida
82
38
0
14 Feb 2020
Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy
Ramtin Keramati
Christoph Dann
Alex Tamkin
Emma Brunskill
80
75
0
05 Nov 2019
Adaptive Sampling for Stochastic Risk-Averse Learning
Sebastian Curi
Kfir Y. Levy
Stefanie Jegelka
Andreas Krause
118
54
0
28 Oct 2019
Distribution oblivious, risk-aware algorithms for multi-armed bandits with unbounded rewards
Anmol Kagrecha
Jayakrishnan Nair
Krishna Jagannathan
58
47
0
03 Jun 2019
X-Armed Bandits: Optimizing Quantiles, CVaR and Other Risks
Léonard Torossian
Aurélien Garivier
Victor Picheny
37
18
0
17 Apr 2019
Learning Models with Uniform Performance via Distributionally Robust Optimization
John C. Duchi
Hongseok Namkoong
OOD
72
424
0
20 Oct 2018
Implicit Quantile Networks for Distributional Reinforcement Learning
Will Dabney
Georg Ostrovski
David Silver
Rémi Munos
OffRL
139
532
0
14 Jun 2018
Best Arm Identification for Contaminated Bandits
Jason M. Altschuler
Victor-Emmanuel Brunel
Alan Malek
41
45
0
26 Feb 2018
Distributional Reinforcement Learning with Quantile Regression
Will Dabney
Mark Rowland
Marc G. Bellemare
Rémi Munos
94
764
0
27 Oct 2017
A Distributional Perspective on Reinforcement Learning
Marc G. Bellemare
Will Dabney
Rémi Munos
OffRL
101
1,506
0
21 Jul 2017
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Yu-Xiang Wang
Alekh Agarwal
Miroslav Dudík
OffRL
123
222
0
04 Dec 2016
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
432
577
0
04 Apr 2016
Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
Yinlam Chow
Mohammad Ghavamzadeh
Lucas Janson
Marco Pavone
91
517
0
05 Dec 2015
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
Nan Jiang
Lihong Li
OffRL
217
624
0
11 Nov 2015
Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control
Prashanth L. A.
Cheng Jie
Michael Fu
Steve Marcus
Csaba Szepesvári
83
91
0
08 Jun 2015
Doubly Robust Policy Evaluation and Optimization
Miroslav Dudík
D. Erhan
John Langford
Lihong Li
OffRL
192
290
0
10 Mar 2015
Policy Gradient for Coherent Risk Measures
Aviv Tamar
Yinlam Chow
Mohammad Ghavamzadeh
Shie Mannor
62
120
0
13 Feb 2015
Generalized Risk-Aversion in Stochastic Multi-Armed Bandits
Alexander Zimin
Rasmus Ibsen-Jensen
K. Chatterjee
54
38
0
05 May 2014
Mean-Variance Optimization in Markov Decision Processes
Shie Mannor
J. Tsitsiklis
101
126
0
29 Apr 2011
Doubly Robust Policy Evaluation and Learning
Miroslav Dudík
John Langford
Lihong Li
OffRL
347
698
0
23 Mar 2011
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
473
2,957
0
28 Feb 2010
Importance Weighted Active Learning
A. Beygelzimer
S. Dasgupta
John Langford
95
364
0
29 Dec 2008