ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2504.13368
  4. Cited By
An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning

An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning

17 April 2025
Haoran Xu
Shuozhe Li
Harshit S. Sikchi
S. Niekum
Amy Zhang
    OffRL
ArXiv (abs)PDFHTML

Papers citing "An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning"

35 / 35 papers shown
Title
Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement
  Learning
Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning
Liyuan Mao
Haoran Xu
Weinan Zhang
Xianyuan Zhan
Amy Zhang
OffRL
104
18
0
29 Jul 2024
SMORE: Score Models for Offline Goal-Conditioned Reinforcement Learning
SMORE: Score Models for Offline Goal-Conditioned Reinforcement Learning
Harshit S. Sikchi
Rohan Chitnis
Ahmed Touati
A. Geramifard
Amy Zhang
S. Niekum
OffRL
95
9
0
03 Nov 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value
  Regularization
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Haoran Xu
Li Jiang
Jianxiong Li
Zhuoran Yang
Zhaoran Wang
Victor Chan
Xianyuan Zhan
OffRL
86
84
0
28 Mar 2023
A Review of Off-Policy Evaluation in Reinforcement Learning
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
94
76
0
13 Dec 2022
When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online
  Reinforcement Learning
When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning
Haoyi Niu
Shubham Sharma
Yiwen Qiu
Ming Li
Guyue Zhou
Jianming Hu
Xianyuan Zhan
OffRLOnRL
95
52
0
27 Jun 2022
How Far I'll Go: Offline Goal-Conditioned Reinforcement Learning via
  $f$-Advantage Regression
How Far I'll Go: Offline Goal-Conditioned Reinforcement Learning via fff-Advantage Regression
Yecheng Jason Ma
Jason Yan
Dinesh Jayaraman
Osbert Bastani
OffRL
80
58
0
07 Jun 2022
When Should We Prefer Offline Reinforcement Learning Over Behavioral
  Cloning?
When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?
Aviral Kumar
Joey Hong
Anika Singh
Sergey Levine
OffRL
110
82
0
12 Apr 2022
Dealing with the Unknown: Pessimistic Offline Reinforcement Learning
Dealing with the Unknown: Pessimistic Offline Reinforcement Learning
Jinning Li
Chen Tang
Masayoshi Tomizuka
Wei Zhan
OffRL
73
22
0
09 Nov 2021
Offline Reinforcement Learning with Implicit Q-Learning
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
301
924
0
12 Oct 2021
Uncertainty-Based Offline Reinforcement Learning with Diversified
  Q-Ensemble
Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
Gaon An
Seungyong Moon
Jang-Hyun Kim
Hyun Oh Song
OffRL
169
283
0
04 Oct 2021
Constraints Penalized Q-learning for Safe Offline Reinforcement Learning
Constraints Penalized Q-learning for Safe Offline Reinforcement Learning
Haoran Xu
Xianyuan Zhan
Xiangyu Zhu
OffRL
65
91
0
19 Jul 2021
OptiDICE: Offline Policy Optimization via Stationary Distribution
  Correction Estimation
OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
Jongmin Lee
Wonseok Jeon
Byung-Jun Lee
J. Pineau
Kee-Eung Kim
OffRL
170
100
0
21 Jun 2021
A Minimalist Approach to Offline Reinforcement Learning
A Minimalist Approach to Offline Reinforcement Learning
Scott Fujimoto
S. Gu
OffRL
132
828
0
12 Jun 2021
MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale
MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale
Dmitry Kalashnikov
Jacob Varley
Yevgen Chebotar
Benjamin Swanson
Rico Jonschkowski
Chelsea Finn
Sergey Levine
Karol Hausman
OffRL
112
280
0
16 Apr 2021
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale
  of Pessimism
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
Paria Rashidinejad
Banghua Zhu
Cong Ma
Jiantao Jiao
Stuart J. Russell
OffRL
225
290
0
22 Mar 2021
Off-Policy Imitation Learning from Observations
Off-Policy Imitation Learning from Observations
Zhuangdi Zhu
Kaixiang Lin
Bo Dai
Jiayu Zhou
OffRL
53
86
0
25 Feb 2021
Score-Based Generative Modeling through Stochastic Differential
  Equations
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffMSyDa
353
6,566
0
26 Nov 2020
Off-Policy Evaluation via the Regularized Lagrangian
Off-Policy Evaluation via the Regularized Lagrangian
Mengjiao Yang
Ofir Nachum
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
41
118
0
07 Jul 2020
Critic Regularized Regression
Critic Regularized Regression
Ziyun Wang
Alexander Novikov
Konrad Zolna
Jost Tobias Springenberg
Scott E. Reed
...
Noah Y. Siegel
J. Merel
Çağlar Gülçehre
N. Heess
Nando de Freitas
OffRL
158
330
0
26 Jun 2020
Denoising Diffusion Probabilistic Models
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
689
18,310
0
19 Jun 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRLOnRL
111
615
0
16 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRLOnRL
143
1,831
0
08 Jun 2020
MOPO: Model-based Offline Policy Optimization
MOPO: Model-based Offline Policy Optimization
Tianhe Yu
G. Thomas
Lantao Yu
Stefano Ermon
James Zou
Sergey Levine
Chelsea Finn
Tengyu Ma
OffRL
78
773
0
27 May 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GPOffRL
229
1,381
0
15 Apr 2020
GradientDICE: Rethinking Generalized Offline Estimation of Stationary
  Values
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values
Shangtong Zhang
Bo Liu
Shimon Whiteson
OffRL
62
103
0
29 Jan 2020
Imitation Learning via Off-Policy Distribution Matching
Imitation Learning via Off-Policy Distribution Matching
Ilya Kostrikov
Ofir Nachum
Jonathan Tompson
OODOffRL
158
205
0
10 Dec 2019
AlgaeDICE: Policy Gradient from Arbitrary Experience
AlgaeDICE: Policy Gradient from Arbitrary Experience
Ofir Nachum
Bo Dai
Ilya Kostrikov
Yinlam Chow
Lihong Li
Dale Schuurmans
OffRL
161
244
0
04 Dec 2019
Behavior Regularized Offline Reinforcement Learning
Behavior Regularized Offline Reinforcement Learning
Yifan Wu
George Tucker
Ofir Nachum
OffRL
97
689
0
26 Nov 2019
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement
  Learning
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning
Xinyue Chen
Zijian Zhou
Ziyi Wang
Che Wang
Yanqiu Wu
George Andriopoulos
OffRL
90
124
0
27 Oct 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary
  Distribution Corrections
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
151
338
0
10 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRLOnRL
134
1,066
0
03 Jun 2019
Deep Residual Reinforcement Learning
Deep Residual Reinforcement Learning
Shangtong Zhang
Wendelin Bohmer
Shimon Whiteson
58
32
0
03 May 2019
Off-Policy Deep Reinforcement Learning without Exploration
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRLBDL
243
1,624
0
07 Dec 2018
Addressing Function Approximation Error in Actor-Critic Methods
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
189
5,212
0
26 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement
  Learning with a Stochastic Actor
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
317
8,406
0
04 Jan 2018
1