ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1904.08473
  4. Cited By
Off-Policy Policy Gradient with State Distribution Correction

Off-Policy Policy Gradient with State Distribution Correction

17 April 2019
Yao Liu
Adith Swaminathan
Alekh Agarwal
Emma Brunskill
    OffRL
ArXivPDFHTML

Papers citing "Off-Policy Policy Gradient with State Distribution Correction"

31 / 31 papers shown
Title
On The Statistical Complexity of Offline Decision-Making
On The Statistical Complexity of Offline Decision-Making
Thanh Nguyen-Tang
R. Arora
OffRL
48
1
0
10 Jan 2025
On Sample-Efficient Offline Reinforcement Learning: Data Diversity,
  Posterior Sampling, and Beyond
On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond
Thanh Nguyen-Tang
Raman Arora
OffRL
35
3
0
06 Jan 2024
A General Offline Reinforcement Learning Framework for Interactive
  Recommendation
A General Offline Reinforcement Learning Framework for Interactive Recommendation
Teng Xiao
Donglin Wang
OffRL
34
73
0
01 Oct 2023
Reinforcement Learning Tutor Better Supported Lower Performers in a Math
  Task
Reinforcement Learning Tutor Better Supported Lower Performers in a Math Task
S. Ruan
Allen Nie
William Steenbergen
Jiayu He
JQ Zhang
...
Kyle Dang Nguyen
Catherine Y Wang
Rui Ying
James A. Landay
Emma Brunskill
28
18
0
11 Apr 2023
Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees
Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees
Hsin-En Su
Yen-Ju Chen
Ping-Chun Hsieh
Xi Liu
OffRL
26
0
0
10 Dec 2022
Behavior Estimation from Multi-Source Data for Offline Reinforcement
  Learning
Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning
Guoxi Zhang
H. Kashima
OffRL
29
2
0
29 Nov 2022
On Instance-Dependent Bounds for Offline Reinforcement Learning with
  Linear Function Approximation
On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation
Thanh Nguyen-Tang
Ming Yin
Sunil R. Gupta
Svetha Venkatesh
R. Arora
OffRL
58
16
0
23 Nov 2022
Offline Policy Optimization with Eligible Actions
Offline Policy Optimization with Eligible Actions
Yao Liu
Yannis Flet-Berliac
Emma Brunskill
OffRL
25
5
0
01 Jul 2022
Offline Stochastic Shortest Path: Learning, Evaluation and Towards
  Optimality
Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality
Ming Yin
Wenjing Chen
Mengdi Wang
Yu Wang
OffRL
30
4
0
10 Jun 2022
When Should We Prefer Offline Reinforcement Learning Over Behavioral
  Cloning?
When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?
Aviral Kumar
Joey Hong
Anika Singh
Sergey Levine
OffRL
45
77
0
12 Apr 2022
Model-Based Offline Meta-Reinforcement Learning with Regularization
Model-Based Offline Meta-Reinforcement Learning with Regularization
Sen Lin
Jialin Wan
Tengyu Xu
Yingbin Liang
Junshan Zhang
OffRL
33
17
0
07 Feb 2022
A Temporal-Difference Approach to Policy Gradient Estimation
A Temporal-Difference Approach to Policy Gradient Estimation
Samuele Tosatto
Andrew Patterson
Martha White
A. R. Mahmood
OffRL
27
2
0
04 Feb 2022
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
32
10
0
04 Nov 2021
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm
Raghuram Bharadwaj Diddigi
Prateek Jain
P. J
S. Bhatnagar
CML
OffRL
19
3
0
19 Oct 2021
Dr Jekyll and Mr Hyde: the Strange Case of Off-Policy Policy Updates
Dr Jekyll and Mr Hyde: the Strange Case of Off-Policy Policy Updates
Romain Laroche
Rémi Tachet des Combes
46
8
0
29 Sep 2021
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network
  Approach
Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach
Haotian Gu
Xin Guo
Xiaoli Wei
Renyuan Xu
OOD
42
36
0
05 Aug 2021
Characterizing the Gap Between Actor-Critic and Policy Gradient
Characterizing the Gap Between Actor-Critic and Policy Gradient
Junfeng Wen
Saurabh Kumar
Ramki Gummadi
Dale Schuurmans
34
15
0
13 Jun 2021
On the Convergence Rate of Off-Policy Policy Optimization Methods with
  Density-Ratio Correction
On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
Jiawei Huang
Nan Jiang
19
5
0
02 Jun 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear
  Function Approximation
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
65
29
0
26 May 2021
Nearly Horizon-Free Offline Reinforcement Learning
Nearly Horizon-Free Offline Reinforcement Learning
Tongzheng Ren
Jialian Li
Bo Dai
S. Du
Sujay Sanghavi
OffRL
32
49
0
25 Mar 2021
Harnessing Distribution Ratio Estimators for Learning Agents with
  Quality and Diversity
Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity
Tanmay Gangwani
Jian Peng
Yuanshuo Zhou
29
10
0
05 Nov 2020
Batch Policy Learning in Average Reward Markov Decision Processes
Batch Policy Learning in Average Reward Markov Decision Processes
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
S. Murphy
OffRL
34
81
0
23 Jul 2020
Off-policy Bandits with Deficient Support
Off-policy Bandits with Deficient Support
Noveen Sachdeva
Yi-Hsun Su
Thorsten Joachims
OffRL
38
75
0
16 Jun 2020
A Survey of Deep Learning for Scientific Discovery
A Survey of Deep Learning for Scientific Discovery
M. Raghu
Erica Schmidt
OOD
AI4CE
42
120
0
26 Mar 2020
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement
  Learning
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
Ali Mousavi
Lihong Li
Qiang Liu
Denny Zhou
OffRL
27
32
0
24 Mar 2020
Off-policy Policy Evaluation For Sequential Decisions Under Unobserved
  Confounding
Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding
Hongseok Namkoong
Ramtin Keramati
Steve Yadlowsky
Emma Brunskill
OffRL
24
63
0
12 Mar 2020
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
Nan Jiang
Jiawei Huang
OffRL
41
17
0
06 Feb 2020
All-Action Policy Gradient Methods: A Numerical Integration Approach
All-Action Policy Gradient Methods: A Numerical Integration Approach
Benjamin Petit
Loren Amdahl-Culleton
Yao Liu
Jimmy T.H. Smith
Pierre-Luc Bacon
24
9
0
21 Oct 2019
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
OffRL
30
67
0
16 Oct 2019
Empirical Likelihood for Contextual Bandits
Empirical Likelihood for Contextual Bandits
Nikos Karampatziakis
John Langford
Paul Mineiro
OffRL
23
9
0
07 Jun 2019
Learning When-to-Treat Policies
Learning When-to-Treat Policies
Xinkun Nie
Emma Brunskill
Stefan Wager
CML
OffRL
11
89
0
23 May 2019
1