Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2102.11448
Cited By
MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning
23 February 2021
DiJia Su
Jason D. Lee
John M. Mulvey
H. Vincent Poor
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning"
27 / 27 papers shown
Title
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour
Dale Schuurmans
S. Gu
OffRL
269
120
0
21 Jul 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL
OnRL
88
608
0
16 Jun 2020
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
T. Matsushima
Hiroki Furuta
Y. Matsuo
Ofir Nachum
S. Gu
OffRL
50
149
0
05 Jun 2020
MOPO: Model-based Offline Policy Optimization
Tianhe Yu
G. Thomas
Lantao Yu
Stefano Ermon
James Zou
Sergey Levine
Chelsea Finn
Tengyu Ma
OffRL
74
767
0
27 May 2020
MOReL : Model-Based Offline Reinforcement Learning
Rahul Kidambi
Aravind Rajeswaran
Praneeth Netrapalli
Thorsten Joachims
OffRL
91
669
0
12 May 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
546
2,022
0
04 May 2020
GenDICE: Generalized Offline Estimation of Stationary Values
Ruiyi Zhang
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
185
174
0
21 Feb 2020
Dota 2 with Large Scale Deep Reinforcement Learning
OpenAI OpenAI
:
Christopher Berner
Greg Brockman
Brooke Chan
...
Szymon Sidor
Ilya Sutskever
Jie Tang
Filip Wolski
Susan Zhang
GNN
VLM
CLL
AI4CE
LRM
144
1,822
0
13 Dec 2019
AlgaeDICE: Policy Gradient from Arbitrary Experience
Ofir Nachum
Bo Dai
Ilya Kostrikov
Yinlam Chow
Lihong Li
Dale Schuurmans
OffRL
153
242
0
04 Dec 2019
Behavior Regularized Offline Reinforcement Learning
Yifan Wu
George Tucker
Ofir Nachum
OffRL
89
685
0
26 Nov 2019
Entity Abstraction in Visual Model-Based Reinforcement Learning
Rishi Veerapaneni
John D. Co-Reyes
Michael Chang
Michael Janner
Chelsea Finn
Jiajun Wu
J. Tenenbaum
Sergey Levine
OCL
OffRL
69
189
0
28 Oct 2019
Solving Rubik's Cube with a Robot Hand
OpenAI
Ilge Akkaya
Marcin Andrychowicz
Maciek Chociej
Ma-teusz Litwin
...
Peter Welinder
Lilian Weng
Qiming Yuan
Wojciech Zaremba
Lei Zhang
ODL
113
1,226
0
16 Oct 2019
Benchmarking Model-Based Reinforcement Learning
Tingwu Wang
Xuchan Bao
I. Clavera
Jerrick Hoang
Yeming Wen
Eric D. Langlois
Matthew Shunshi Zhang
Guodong Zhang
Pieter Abbeel
Jimmy Ba
OffRL
59
363
0
03 Jul 2019
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
92
950
0
19 Jun 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
141
336
0
10 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
115
1,055
0
03 Jun 2019
Provably Efficient Q-Learning with Low Switching Cost
Yu Bai
Tengyang Xie
Nan Jiang
Yu Wang
63
93
0
30 May 2019
Model-Based Reinforcement Learning for Atari
Lukasz Kaiser
Mohammad Babaeizadeh
Piotr Milos
B. Osinski
R. Campbell
...
Sergey Levine
Afroz Mohiuddin
Ryan Sepassi
George Tucker
Henryk Michalewski
OffRL
117
860
0
01 Mar 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
219
1,607
0
07 Dec 2018
Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees
Yuping Luo
Huazhe Xu
Yuanzhi Li
Yuandong Tian
Trevor Darrell
Tengyu Ma
OffRL
98
226
0
10 Jul 2018
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
Kurtland Chua
Roberto Calandra
R. McAllister
Sergey Levine
BDL
221
1,277
0
30 May 2018
Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning
Vladimir Feinberg
Alvin Wan
Ion Stoica
Michael I. Jordan
Joseph E. Gonzalez
Sergey Levine
OffRL
56
317
0
28 Feb 2018
Model-Ensemble Trust-Region Policy Optimization
Thanard Kurutach
I. Clavera
Yan Duan
Aviv Tamar
Pieter Abbeel
78
451
0
28 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
292
8,329
0
04 Jan 2018
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
David Silver
Thomas Hubert
Julian Schrittwieser
Ioannis Antonoglou
Matthew Lai
...
D. Kumaran
T. Graepel
Timothy Lillicrap
Karen Simonyan
Demis Hassabis
139
1,769
0
05 Dec 2017
Prediction and Control with Temporal Segment Models
Nikhil Mishra
Pieter Abbeel
Igor Mordatch
BDL
51
64
0
12 Mar 2017
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
277
6,764
0
19 Feb 2015
1