State Advantage Weighting for Offline RL

9 October 2022
Jiafei Lyu
Aicheng Gong
Le Wan
Zongqing Lu
Xiu Li
    OffRL

Papers citing "State Advantage Weighting for Offline RL"

44 / 44 papers shown
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning
Zhendong Wang
Jonathan J. Hunt
Mingyuan Zhou
OffRL
100
383
0
12 Aug 2022
Mildly Conservative Q-Learning for Offline Reinforcement Learning
Jiafei Lyu
Xiaoteng Ma
Xiu Li
Zongqing Lu
OffRL
75
110
0
09 Jun 2022
IL-flOw: Imitation Learning from Observation using Normalizing Flows
Wei-Di Chang
J. A. G. Higuera
Scott Fujimoto
David Meger
Gregory Dudek
63
9
0
19 May 2022
Learning Value Functions from Undirected State-only Experience
Matthew Chang
Arjun Gupta
Saurabh Gupta
OffRL
51
8
0
26 Apr 2022
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning
Chenjia Bai
Lingxiao Wang
Zhuoran Yang
Zhihong Deng
Animesh Garg
Peng Liu
Zhaoran Wang
OffRL
95
139
0
23 Feb 2022
UMBRELLA: Uncertainty-Aware Model-Based Offline Reinforcement Learning Leveraging Planning
Christopher P. Diehl
Timo Sievernich
Martin Krüger
F. Hoffmann
Torsten Bertram
OffRL
92
27
0
22 Nov 2021
Offline Reinforcement Learning with Value-based Episodic Memory
Xiaoteng Ma
Yiqin Yang
Haotian Hu
Qihan Liu
Jun Yang
Chongjie Zhang
Qianchuan Zhao
Bin Liang
OffRL
63
43
0
19 Oct 2021
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
296
922
0
12 Oct 2021
Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
Gaon An
Seungyong Moon
Jang-Hyun Kim
Hyun Oh Song
OffRL
163
283
0
04 Oct 2021
Offline RL Without Off-Policy Evaluation
David Brandfonbrener
William F. Whitney
Rajesh Ranganath
Joan Bruna
OffRL
88
169
0
16 Jun 2021
A Minimalist Approach to Offline Reinforcement Learning
Scott Fujimoto
S. Gu
OffRL
130
827
0
12 Jun 2021
Offline Reinforcement Learning as One Big Sequence Modeling Problem
Michael Janner
Qiyang Li
Sergey Levine
OffRL
156
684
0
03 Jun 2021
Decision Transformer: Reinforcement Learning via Sequence Modeling
Lili Chen
Kevin Lu
Aravind Rajeswaran
Kimin Lee
Aditya Grover
Michael Laskin
Pieter Abbeel
A. Srinivas
Igor Mordatch
OffRL
136
1,656
0
02 Jun 2021
Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning
Yue Wu
Shuangfei Zhai
Nitish Srivastava
J. Susskind
Jian Zhang
Ruslan Salakhutdinov
Hanlin Goh
EDL, OffRL, OnRL
64
189
0
17 May 2021
Offline Reinforcement Learning with Fisher Divergence Critic Regularization
Ilya Kostrikov
Jonathan Tompson
Rob Fergus
Ofir Nachum
OffRL
134
305
0
14 Mar 2021
Continuous Doubly Constrained Batch Reinforcement Learning
Rasool Fakoor
Jonas W. Mueller
Kavosh Asadi
Pratik Chaudhari
Alex Smola
OffRL
256
27
0
18 Feb 2021
COMBO: Conservative Offline Model-Based Policy Optimization
Tianhe Yu
Aviral Kumar
Rafael Rafailov
Aravind Rajeswaran
Sergey Levine
Chelsea Finn
OffRL
270
433
0
16 Feb 2021
PLAS: Latent Action Space for Offline Reinforcement Learning
Wenxuan Zhou
Sujay Bajracharya
David Held
OffRL
84
160
0
14 Nov 2020
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour
Dale Schuurmans
S. Gu
OffRL
280
122
0
21 Jul 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL, OnRL
107
612
0
16 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL, OnRL
140
1,824
0
08 Jun 2020
State Action Separable Reinforcement Learning
Ziyao Zhang
Liang Ma
K. Leung
Konstantinos Poularakis
Mudhakar Srivatsa
56
2
0
05 Jun 2020
MOPO: Model-based Offline Policy Optimization
Tianhe Yu
G. Thomas
Lantao Yu
Stefano Ermon
James Zou
Sergey Levine
Chelsea Finn
Tengyu Ma
OffRL
76
772
0
27 May 2020
MOReL : Model-Based Offline Reinforcement Learning
Rahul Kidambi
Aravind Rajeswaran
Praneeth Netrapalli
Thorsten Joachims
OffRL
96
673
0
12 May 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL, GP
561
2,040
0
04 May 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GP, OffRL
226
1,377
0
15 Apr 2020
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients
Ashley D. Edwards
Himanshu Sahni
Rosanne Liu
Jane Hung
Ankit Jain
Rui Wang
Adrien Ecoffet
Thomas Miconi
Charles Isbell
J. Yosinski
OffRL
37
18
0
21 Feb 2020
Behavior Regularized Offline Reinforcement Learning
Yifan Wu
George Tucker
Ofir Nachum
OffRL
92
689
0
26 Nov 2019
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning
Xinyue Chen
Zijian Zhou
Ziyi Wang
Che Wang
Yanqiu Wu
George Andriopoulos
OffRL
90
124
0
27 Oct 2019
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
Xue Bin Peng
Aviral Kumar
Grace Zhang
Sergey Levine
OffRL
147
569
0
01 Oct 2019
Sample-efficient Adversarial Imitation Learning from Observation
F. Torabi
S. Geiger
Garrett A. Warnell
Peter Stone
53
13
0
18 Jun 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
151
337
0
10 Jun 2019
Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
Yaniv Ovadia
Emily Fertig
Jie Jessie Ren
Zachary Nado
D. Sculley
Sebastian Nowozin
Joshua V. Dillon
Balaji Lakshminarayanan
Jasper Snoek
UQCV
175
1,704
0
06 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL, OnRL
132
1,066
0
03 Jun 2019
Recent Advances in Imitation Learning from Observation
F. Torabi
Garrett A. Warnell
Peter Stone
72
164
0
30 May 2019
Provably Efficient Imitation Learning from Observation Alone
Wen Sun
Anirudh Vemula
Byron Boots
J. Andrew Bagnell
163
107
0
27 May 2019
Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift
Carles Gelada
Marc G. Bellemare
OffRL
64
99
0
27 Jan 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL, BDL
234
1,624
0
07 Dec 2018
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
180
5,204
0
26 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
314
8,396
0
04 Jan 2018
A Distributional Perspective on Reinforcement Learning
Marc G. Bellemare
Will Dabney
Rémi Munos
OffRL
98
1,506
0
21 Jul 2017
OpenAI Gym
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL, ODL
223
5,085
0
05 Jun 2016
Dueling Network Architectures for Deep Reinforcement Learning
Ziyu Wang
Tom Schaul
Matteo Hessel
H. V. Hasselt
Marc Lanctot
Nando de Freitas
OffRL
91
3,764
0
20 Nov 2015
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
91
272
0
14 Mar 2015