1906.04328
Importance Resampling for Off-policy Prediction
11 June 2019
M. Schlegel
Wesley Chung
Daniel Graves
Jian Qian
Martha White
OffRL
Papers citing
"Importance Resampling for Off-policy Prediction"
28 / 28 papers shown
Title
Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
Shuguang Yu
Shuxing Fang
Ruixin Peng
Zhengling Qi
Fan Zhou
C. Shi
CML
OffRL
82
1
0
08 Dec 2024
Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies
Haanvid Lee
Tri Wahyu Guntara
Jongmin Lee
Yung-Kyun Noh
Kee-Eung Kim
OffRL
29
1
0
29 May 2024
Saturn: Sample-efficient Generative Molecular Design using Memory Manipulation
Jeff Guo
Philippe Schwaller
Mamba
58
7
0
27 May 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
OffRL
36
5
0
22 Feb 2024
Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale
Wei Wen
Kuang-Hung Liu
Igor Fedorov
Xin Zhang
Hang Yin
...
Fangqiu Han
Jiyan Yang
Yuchen Hao
Liang Xiong
Wen-Yen Chen
44
2
0
14 Nov 2023
AlberDICE: Addressing Out-Of-Distribution Joint Actions in Offline Multi-Agent RL via Alternating Stationary Distribution Correction Estimation
Daiki E. Matsunaga
Jongmin Lee
Jaeseok Yoon
Stefanos Leonardos
Pieter Abbeel
Kee-Eung Kim
OODD
OffRL
30
3
0
03 Nov 2023
K-Nearest-Neighbor Resampling for Off-Policy Evaluation in Stochastic Control
Michael Giegrich
Roel Oomen
C. Reisinger
OffRL
27
2
0
07 Jun 2023
An Instrumental Variable Approach to Confounded Off-Policy Evaluation
Yang Xu
Jin Zhu
C. Shi
Shuang Luo
R. Song
OffRL
21
14
0
29 Dec 2022
Actor Prioritized Experience Replay
Baturay Saglam
Furkan B. Mutlu
Dogan C. Cicek
Suleyman Serdar Kozat
25
23
0
01 Sep 2022
Conformal Off-policy Prediction
Yingying Zhang
C. Shi
Shuang Luo
OffRL
36
10
0
14 Jun 2022
Variance Reduction based Partial Trajectory Reuse to Accelerate Policy Gradient Optimization
Hua Zheng
Wei Xie
22
3
0
06 May 2022
SOPE: Spectrum of Off-Policy Estimators
C. J. Yuan
Yash Chandak
S. Giguere
Philip S. Thomas
S. Niekum
OffRL
50
5
0
06 Nov 2021
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
39
2
0
17 Oct 2021
Variational Actor-Critic Algorithms
Yuhua Zhu
Lexing Ying
OffRL
15
0
0
03 Aug 2021
Scalable Safety-Critical Policy Evaluation with Accelerated Rare Event Sampling
Mengdi Xu
Peide Huang
Fengpei Li
Jiacheng Zhu
Xuewei Qi
K. Oguchi
Zhiyuan Huang
H. Lam
Ding Zhao
11
4
0
19 Jun 2021
Statistical Testing under Distributional Shifts
Nikolaj Thams
Sorawit Saengkyongam
Niklas Pfister
J. Peters
OOD
64
9
0
22 May 2021
Learning robust driving policies without online exploration
D. Graves
Nhat M. Nguyen
Kimia Hassanzadeh
Jun Jin
Jun Luo
OffRL
11
2
0
15 Mar 2021
Revisiting Prioritized Experience Replay: A Value Perspective
Ang Li
Zongqing Lu
Chenglin Miao
22
9
0
05 Feb 2021
Offline Learning of Counterfactual Predictions for Real-World Robotic Reinforcement Learning
Jun Jin
D. Graves
Cameron Haigh
Jun Luo
Martin Jägersand
SSL
OffRL
14
6
0
11 Nov 2020
Affordance as general value function: A computational model
D. Graves
Johannes Günther
Jun Luo
AI4CE
21
6
0
27 Oct 2020
Why resampling outperforms reweighting for correcting sampling bias with stochastic gradients
Jing An
Lexing Ying
Yuhua Zhu
46
38
0
28 Sep 2020
Revisiting Fundamentals of Experience Replay
W. Fedus
Prajit Ramachandran
Rishabh Agarwal
Yoshua Bengio
Hugo Larochelle
Mark Rowland
Will Dabney
KELM
OffRL
30
233
0
13 Jul 2020
An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay
Scott Fujimoto
D. Meger
Doina Precup
21
56
0
12 Jul 2020
Learning predictive representations in autonomous driving to improve deep reinforcement learning
D. Graves
Nhat M. Nguyen
Kimia Hassanzadeh
Jun Jin
SSL
24
12
0
26 Jun 2020
Off-Policy Deep Reinforcement Learning with Analogous Disentangled Exploration
Guy Van den Broeck
Yitao Liang
Mathias Niepert
OffRL
14
3
0
25 Feb 2020
Adaptive Experience Selection for Policy Gradient
S. Mohamad
Giovanni Montana
36
0
0
17 Feb 2020
Merging Deterministic Policy Gradient Estimations with Varied Bias-Variance Tradeoff for Effective Deep Reinforcement Learning
Gang Chen
25
4
0
24 Nov 2019
Context-Dependent Upper-Confidence Bounds for Directed Exploration
Raksha Kumaraswamy
M. Schlegel
Adam White
Martha White
OffRL
12
12
0
15 Nov 2018