Importance Resampling for Off-policy Prediction

11 June 2019
M. Schlegel
Wesley Chung
Daniel Graves
Jian Qian
Martha White
    OffRL
arXiv: 1906.04328

Papers citing "Importance Resampling for Off-policy Prediction"

28 / 28 papers shown
Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
Shuguang Yu
Shuxing Fang
Ruixin Peng
Zhengling Qi
Fan Zhou
C. Shi
CML
OffRL
82
1
0
08 Dec 2024
Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies
Haanvid Lee
Tri Wahyu Guntara
Jongmin Lee
Yung-Kyun Noh
Kee-Eung Kim
OffRL
29
1
0
29 May 2024
Saturn: Sample-efficient Generative Molecular Design using Memory Manipulation
Jeff Guo
Philippe Schwaller
Mamba
58
7
0
27 May 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
OffRL
36
5
0
22 Feb 2024
Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale
Wei Wen
Kuang-Hung Liu
Igor Fedorov
Xin Zhang
Hang Yin
...
Fangqiu Han
Jiyan Yang
Yuchen Hao
Liang Xiong
Wen-Yen Chen
44
2
0
14 Nov 2023
AlberDICE: Addressing Out-Of-Distribution Joint Actions in Offline Multi-Agent RL via Alternating Stationary Distribution Correction Estimation
Daiki E. Matsunaga
Jongmin Lee
Jaeseok Yoon
Stefanos Leonardos
Pieter Abbeel
Kee-Eung Kim
OODD
OffRL
30
3
0
03 Nov 2023
$K$-Nearest-Neighbor Resampling for Off-Policy Evaluation in Stochastic Control
Michael Giegrich
Roel Oomen
C. Reisinger
OffRL
27
2
0
07 Jun 2023
An Instrumental Variable Approach to Confounded Off-Policy Evaluation
Yang Xu
Jin Zhu
C. Shi
Shuang Luo
R. Song
OffRL
21
14
0
29 Dec 2022
Actor Prioritized Experience Replay
Baturay Saglam
Furkan B. Mutlu
Dogan C. Cicek
Suleyman Serdar Kozat
25
23
0
01 Sep 2022
Conformal Off-policy Prediction
Yingying Zhang
C. Shi
Shuang Luo
OffRL
36
10
0
14 Jun 2022
Variance Reduction based Partial Trajectory Reuse to Accelerate Policy Gradient Optimization
Hua Zheng
Wei Xie
22
3
0
06 May 2022
SOPE: Spectrum of Off-Policy Estimators
C. J. Yuan
Yash Chandak
S. Giguere
Philip S. Thomas
S. Niekum
OffRL
50
5
0
06 Nov 2021
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
39
2
0
17 Oct 2021
Variational Actor-Critic Algorithms
Yuhua Zhu
Lexing Ying
OffRL
15
0
0
03 Aug 2021
Scalable Safety-Critical Policy Evaluation with Accelerated Rare Event Sampling
Mengdi Xu
Peide Huang
Fengpei Li
Jiacheng Zhu
Xuewei Qi
K. Oguchi
Zhiyuan Huang
H. Lam
Ding Zhao
11
4
0
19 Jun 2021
Statistical Testing under Distributional Shifts
Nikolaj Thams
Sorawit Saengkyongam
Niklas Pfister
J. Peters
OOD
64
9
0
22 May 2021
Learning robust driving policies without online exploration
D. Graves
Nhat M. Nguyen
Kimia Hassanzadeh
Jun Jin
Jun Luo
OffRL
11
2
0
15 Mar 2021
Revisiting Prioritized Experience Replay: A Value Perspective
Ang Li
Zongqing Lu
Chenglin Miao
22
9
0
05 Feb 2021
Offline Learning of Counterfactual Predictions for Real-World Robotic Reinforcement Learning
Jun Jin
D. Graves
Cameron Haigh
Jun Luo
Martin Jägersand
SSL
OffRL
14
6
0
11 Nov 2020
Affordance as general value function: A computational model
D. Graves
Johannes Günther
Jun Luo
AI4CE
21
6
0
27 Oct 2020
Why resampling outperforms reweighting for correcting sampling bias with stochastic gradients
Jing An
Lexing Ying
Yuhua Zhu
46
38
0
28 Sep 2020
Revisiting Fundamentals of Experience Replay
W. Fedus
Prajit Ramachandran
Rishabh Agarwal
Yoshua Bengio
Hugo Larochelle
Mark Rowland
Will Dabney
KELM
OffRL
30
233
0
13 Jul 2020
An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay
Scott Fujimoto
D. Meger
Doina Precup
21
56
0
12 Jul 2020
Learning predictive representations in autonomous driving to improve deep reinforcement learning
D. Graves
Nhat M. Nguyen
Kimia Hassanzadeh
Jun Jin
SSL
24
12
0
26 Jun 2020
Off-Policy Deep Reinforcement Learning with Analogous Disentangled Exploration
Guy Van den Broeck
Yitao Liang
Mathias Niepert
OffRL
14
3
0
25 Feb 2020
Adaptive Experience Selection for Policy Gradient
S. Mohamad
Giovanni Montana
36
0
0
17 Feb 2020
Merging Deterministic Policy Gradient Estimations with Varied Bias-Variance Tradeoff for Effective Deep Reinforcement Learning
Gang Chen
25
4
0
24 Nov 2019
Context-Dependent Upper-Confidence Bounds for Directed Exploration
Raksha Kumaraswamy
M. Schlegel
Adam White
Martha White
OffRL
12
12
0
15 Nov 2018