2106.02039
Offline Reinforcement Learning as One Big Sequence Modeling Problem
3 June 2021
Michael Janner
Qiyang Li
Sergey Levine
OffRL
Papers citing "Offline Reinforcement Learning as One Big Sequence Modeling Problem"
39 / 89 papers shown
Title
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL
OnRL
99
611
0
16 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
137
1,812
0
08 Jun 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
755
41,932
0
28 May 2020
MOPO: Model-based Offline Policy Optimization
Tianhe Yu
G. Thomas
Lantao Yu
Stefano Ermon
James Zou
Sergey Levine
Chelsea Finn
Tengyu Ma
OffRL
76
768
0
27 May 2020
MOReL: Model-Based Offline Reinforcement Learning
Rahul Kidambi
Aravind Rajeswaran
Praneeth Netrapalli
Thorsten Joachims
OffRL
93
671
0
12 May 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GP
OffRL
221
1,368
0
15 Apr 2020
Adaptive Transformers in RL
Shakti Kumar
Jerrod Parker
Panteha Naderian
OffRL
AI4CE
21
13
0
08 Apr 2020
Reward-Conditioned Policies
Aviral Kumar
Xue Bin Peng
Sergey Levine
60
96
0
31 Dec 2019
Training Agents using Upside-Down Reinforcement Learning
R. Srivastava
Pranav Shyam
Filipe Wall Mutz
Wojciech Jaśkowski
Jürgen Schmidhuber
OffRL
65
126
0
05 Dec 2019
Reinforcement Learning Upside Down: Don't Predict Rewards -- Just Map Them to Actions
J. Schmidhuber
53
131
0
05 Dec 2019
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
493
42,407
0
03 Dec 2019
Behavior Regularized Offline Reinforcement Learning
Yifan Wu
George Tucker
Ofir Nachum
OffRL
89
687
0
26 Nov 2019
Stabilizing Transformers for Reinforcement Learning
Emilio Parisotto
H. F. Song
Jack W. Rae
Razvan Pascanu
Çağlar Gülçehre
...
Aidan Clark
Seb Noury
M. Botvinick
N. Heess
R. Hadsell
OffRL
81
364
0
13 Oct 2019
Deep Dynamics Models for Learning Dexterous Manipulation
Anusha Nagabandi
K. Konolige
Sergey Levine
Vikash Kumar
216
414
0
25 Sep 2019
Exploring Model-based Planning with Policy Networks
Tingwu Wang
Jimmy Ba
77
149
0
20 Jun 2019
Calibrated Model-Based Deep Reinforcement Learning
Ali Malik
Volodymyr Kuleshov
Jiaming Song
Danny Nemer
Harlan Seymour
Stefano Ermon
145
55
0
19 Jun 2019
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
95
951
0
19 Jun 2019
Language as an Abstraction for Hierarchical Deep Reinforcement Learning
Yiding Jiang
S. Gu
Kevin Patrick Murphy
Chelsea Finn
OffRL
53
225
0
18 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
121
1,056
0
03 Jun 2019
Insertion-based Decoding with Automatically Inferred Generation Order
Jiatao Gu
Qi Liu
Kyunghyun Cho
52
109
0
04 Feb 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
226
1,608
0
07 Dec 2018
Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion
Jacob Buckman
Danijar Hafner
George Tucker
E. Brevdo
Honglak Lee
91
332
0
04 Jul 2018
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings
John D. Co-Reyes
YuXuan Liu
Abhishek Gupta
Benjamin Eysenbach
Pieter Abbeel
Sergey Levine
SSL
BDL
AIFin
57
145
0
07 Jun 2018
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
Kurtland Chua
Roberto Calandra
R. McAllister
Sergey Levine
BDL
224
1,277
0
30 May 2018
Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review
Sergey Levine
AI4CE
BDL
78
672
0
02 May 2018
Model-Ensemble Trust-Region Policy Optimization
Thanard Kurutach
I. Clavera
Yan Duan
Aviv Tamar
Pieter Abbeel
84
452
0
28 Feb 2018
Hindsight policy gradients
Paulo E. Rauber
Avinash Ummadisingu
Filipe Wall Mutz
J. Schmidhuber
56
68
0
16 Nov 2017
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning
Anusha Nagabandi
G. Kahn
R. Fearing
Sergey Levine
91
973
0
08 Aug 2017
Hindsight Experience Replay
Marcin Andrychowicz
Dwight Crow
Alex Ray
Jonas Schneider
Rachel Fong
Peter Welinder
Bob McGrew
Joshua Tobin
Pieter Abbeel
Wojciech Zaremba
OffRL
248
2,326
0
05 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
698
131,526
0
12 Jun 2017
Recurrent Environment Simulators
Silvia Chiappa
S. Racanière
Daan Wierstra
S. Mohamed
65
208
0
07 Apr 2017
Predicting Personal Traits from Facial Images using Convolutional Neural Networks Augmented with Facial Landmark Information
Yoad Lewenberg
Valliappa Chockalingam
Satinder Singh
Honglak Lee
CVBM
62
304
0
29 May 2016
Memory-based control with recurrent neural networks
N. Heess
Jonathan J. Hunt
Timothy Lillicrap
David Silver
84
302
0
14 Dec 2015
Learning Continuous Control Policies by Stochastic Value Gradients
N. Heess
Greg Wayne
David Silver
Timothy Lillicrap
Yuval Tassa
Tom Erez
97
560
0
30 Oct 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.8K
150,039
0
22 Dec 2014
Explaining and Harnessing Adversarial Examples
Ian Goodfellow
Jonathon Shlens
Christian Szegedy
AAML
GAN
277
19,049
0
20 Dec 2014
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever
Oriol Vinyals
Quoc V. Le
AIMat
434
20,553
0
10 Sep 2014
A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
Stéphane Ross
Geoffrey J. Gordon
J. Andrew Bagnell
OffRL
216
3,216
0
02 Nov 2010
Reinforcement Learning by Value Gradients
Michael Fairbank
SSL
108
28
0
25 Mar 2008