1910.09066
Cited By
From Importance Sampling to Doubly Robust Policy Gradient
20 October 2019
Jiawei Huang
Nan Jiang
OffRL
Papers citing
"From Importance Sampling to Doubly Robust Policy Gradient"
9 / 9 papers shown
Title
Multi-Fidelity Policy Gradient Algorithms
Xinjie Liu
Cyrus Neary
Kushagra Gupta
Christian Ellis
Ufuk Topcu
David Fridovich-Keil
OffRL
188
0
0
07 Mar 2025
Improving Reward-Conditioned Policies for Multi-Armed Bandits using Normalized Weight Functions
Kai Xu
Farid Tajaddodianfar
Ben Allison
21
0
0
16 Jun 2024
Recent Advances in the Foundations and Applications of Unbiased Learning to Rank
Shashank Gupta
Philipp Hager
Jin Huang
Ali Vardasbi
Harrie Oosterhuis
OffRL
35
5
0
04 May 2023
Decision-Focused Evaluation: Analyzing Performance of Deployed Restless Multi-Arm Bandits
Paritosh Verma
Shresth Verma
Aditya Mate
Aparna Taneja
Milind Tambe
16
0
0
19 Jan 2023
Offline Policy Optimization with Eligible Actions
Yao Liu
Yannis Flet-Berliac
Emma Brunskill
OffRL
25
5
0
01 Jul 2022
Dealing with the Unknown: Pessimistic Offline Reinforcement Learning
Jinning Li
Chen Tang
Masayoshi Tomizuka
Wei Zhan
OffRL
16
21
0
09 Nov 2021
Adaptive Importance Sampling meets Mirror Descent: a Bias-variance tradeoff
Anna Korba
François Portier
28
12
0
29 Oct 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
Tengyu Xu
Zhuoran Yang
Zhaoran Wang
Yingbin Liang
OffRL
44
24
0
23 Feb 2021
Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
21
87
0
12 Sep 2019