ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2008.05533
  4. Cited By
Overcoming Model Bias for Robust Offline Deep Reinforcement Learning
v1v2v3v4 (latest)

Overcoming Model Bias for Robust Offline Deep Reinforcement Learning

12 August 2020
Phillip Swazinna
Steffen Udluft
Thomas Runkler
    OffRL
ArXiv (abs)PDFHTML

Papers citing "Overcoming Model Bias for Robust Offline Deep Reinforcement Learning"

35 / 35 papers shown
Title
Model-Based Offline Reinforcement Learning with Reliability-Guaranteed Sequence Modeling
Model-Based Offline Reinforcement Learning with Reliability-Guaranteed Sequence Modeling
Shenghong He
OffRL
479
0
0
10 Feb 2025
MOPO: Model-based Offline Policy Optimization
MOPO: Model-based Offline Policy Optimization
Tianhe Yu
G. Thomas
Lantao Yu
Stefano Ermon
James Zou
Sergey Levine
Chelsea Finn
Tengyu Ma
OffRL
78
773
0
27 May 2020
MOReL : Model-Based Offline Reinforcement Learning
MOReL : Model-Based Offline Reinforcement Learning
Rahul Kidambi
Aravind Rajeswaran
Praneeth Netrapalli
Thorsten Joachims
OffRL
101
676
0
12 May 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GPOffRL
229
1,381
0
15 Apr 2020
Keep Doing What Worked: Behavioral Modelling Priors for Offline
  Reinforcement Learning
Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning
Noah Y. Siegel
Jost Tobias Springenberg
Felix Berkenkamp
A. Abdolmaleki
Michael Neunert
Thomas Lampe
Roland Hafner
Nicolas Heess
Martin Riedmiller
OffRL
62
283
0
19 Feb 2020
Benchmarking Batch Deep Reinforcement Learning Algorithms
Benchmarking Batch Deep Reinforcement Learning Algorithms
Shih-Han Chou
Wen-Yen Chang
W. Hsu
Jianlong Fu
OffRL
63
185
0
03 Oct 2019
An Optimistic Perspective on Offline Reinforcement Learning
An Optimistic Perspective on Offline Reinforcement Learning
Rishabh Agarwal
Dale Schuurmans
Mohammad Norouzi
OffRLOnRL
76
70
0
10 Jul 2019
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human
  Preferences in Dialog
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
130
343
0
30 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRLOnRL
137
1,066
0
03 Jun 2019
Exploring the Limitations of Behavior Cloning for Autonomous Driving
Exploring the Limitations of Behavior Cloning for Autonomous Driving
Felipe Codevilla
Eder Santana
Antonio M. López
Adrien Gaidon
62
546
0
18 Apr 2019
Off-Policy Deep Reinforcement Learning without Exploration
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRLBDL
251
1,625
0
07 Dec 2018
Implicit Quantile Networks for Distributional Reinforcement Learning
Implicit Quantile Networks for Distributional Reinforcement Learning
Will Dabney
Georg Ostrovski
David Silver
Rémi Munos
OffRL
139
532
0
14 Jun 2018
Model-Ensemble Trust-Region Policy Optimization
Model-Ensemble Trust-Region Policy Optimization
Thanard Kurutach
I. Clavera
Yan Duan
Aviv Tamar
Pieter Abbeel
84
452
0
28 Feb 2018
Temporal Difference Models: Model-Free Deep RL for Model-Based Control
Temporal Difference Models: Model-Free Deep RL for Model-Based Control
Vitchyr H. Pong
S. Gu
Murtaza Dalal
Sergey Levine
OffRL
116
240
0
25 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement
  Learning with a Stochastic Actor
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
317
8,420
0
04 Jan 2018
Safe Policy Improvement with Baseline Bootstrapping
Safe Policy Improvement with Baseline Bootstrapping
Romain Laroche
P. Trichelair
Rémi Tachet des Combes
OffRL
66
201
0
19 Dec 2017
Interpretable Policies for Reinforcement Learning by Genetic Programming
Interpretable Policies for Reinforcement Learning by Genetic Programming
D. Hein
Steffen Udluft
Thomas Runkler
OffRL
59
134
0
12 Dec 2017
Overcoming Exploration in Reinforcement Learning with Demonstrations
Overcoming Exploration in Reinforcement Learning with Demonstrations
Ashvin Nair
Bob McGrew
Marcin Andrychowicz
Wojciech Zaremba
Pieter Abbeel
OffRL
102
789
0
28 Sep 2017
A Benchmark Environment Motivated by Industrial Control Problems
A Benchmark Environment Motivated by Industrial Control Problems
D. Hein
Stefan Depeweg
Michel Tokic
Steffen Udluft
A. Hentschel
Thomas Runkler
V. Sterzing
OffRL
104
59
0
27 Sep 2017
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with
  Model-Free Fine-Tuning
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning
Anusha Nagabandi
G. Kahn
R. Fearing
Sergey Levine
104
975
0
08 Aug 2017
Proximal Policy Optimization Algorithms
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
541
19,296
0
20 Jul 2017
Hindsight Experience Replay
Hindsight Experience Replay
Marcin Andrychowicz
Dwight Crow
Alex Ray
Jonas Schneider
Rachel Fong
Peter Welinder
Bob McGrew
Joshua Tobin
Pieter Abbeel
Wojciech Zaremba
OffRL
278
2,339
0
05 Jul 2017
DART: Noise Injection for Robust Imitation Learning
DART: Noise Injection for Robust Imitation Learning
Michael Laskey
Jonathan Lee
Roy Fox
Anca Dragan
Ken Goldberg
193
249
0
27 Mar 2017
Deep Exploration via Randomized Value Functions
Deep Exploration via Randomized Value Functions
Ian Osband
Benjamin Van Roy
Daniel Russo
Zheng Wen
102
307
0
22 Mar 2017
Sample Efficient Actor-Critic with Experience Replay
Sample Efficient Actor-Critic with Experience Replay
Ziyun Wang
V. Bapst
N. Heess
Volodymyr Mnih
Rémi Munos
Koray Kavukcuoglu
Nando de Freitas
106
762
0
03 Nov 2016
Generative Adversarial Imitation Learning
Generative Adversarial Imitation Learning
Jonathan Ho
Stefano Ermon
GAN
159
3,119
0
10 Jun 2016
Unifying Count-Based Exploration and Intrinsic Motivation
Unifying Count-Based Exploration and Intrinsic Motivation
Marc G. Bellemare
S. Srinivasan
Georg Ostrovski
Tom Schaul
D. Saxton
Rémi Munos
179
1,484
0
06 Jun 2016
OpenAI Gym
OpenAI Gym
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRLODL
223
5,086
0
05 Jun 2016
Learning and Policy Search in Stochastic Dynamical Systems with Bayesian
  Neural Networks
Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks
Stefan Depeweg
José Miguel Hernández-Lobato
Finale Doshi-Velez
Steffen Udluft
BDL
70
160
0
23 May 2016
Weight Normalization: A Simple Reparameterization to Accelerate Training
  of Deep Neural Networks
Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
Tim Salimans
Diederik P. Kingma
ODL
196
1,945
0
25 Feb 2016
Continuous control with deep reinforcement learning
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
327
13,289
0
09 Sep 2015
Trust Region Policy Optimization
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
279
6,801
0
19 Feb 2015
Adam: A Method for Stochastic Optimization
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.1K
150,364
0
22 Dec 2014
Auto-Encoding Variational Bayes
Auto-Encoding Variational Bayes
Diederik P. Kingma
Max Welling
BDL
455
16,923
0
20 Dec 2013
Playing Atari with Deep Reinforcement Learning
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
129
12,269
0
19 Dec 2013
1