ResearchTrend.AI
MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning

23 February 2021
DiJia Su
Jason D. Lee
John M. Mulvey
H. Vincent Poor
    OffRL

Papers citing "MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning"

27 / 27 papers shown
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour
Dale Schuurmans
S. Gu
OffRL
269
120
0
21 Jul 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL
OnRL
88
608
0
16 Jun 2020
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
T. Matsushima
Hiroki Furuta
Y. Matsuo
Ofir Nachum
S. Gu
OffRL
50
149
0
05 Jun 2020
MOPO: Model-based Offline Policy Optimization
Tianhe Yu
G. Thomas
Lantao Yu
Stefano Ermon
James Zou
Sergey Levine
Chelsea Finn
Tengyu Ma
OffRL
74
767
0
27 May 2020
MOReL : Model-Based Offline Reinforcement Learning
Rahul Kidambi
Aravind Rajeswaran
Praneeth Netrapalli
Thorsten Joachims
OffRL
91
669
0
12 May 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
546
2,022
0
04 May 2020
GenDICE: Generalized Offline Estimation of Stationary Values
Ruiyi Zhang
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
185
174
0
21 Feb 2020
Dota 2 with Large Scale Deep Reinforcement Learning
OpenAI
Christopher Berner
Greg Brockman
Brooke Chan
...
Szymon Sidor
Ilya Sutskever
Jie Tang
Filip Wolski
Susan Zhang
GNN
VLM
CLL
AI4CE
LRM
144
1,822
0
13 Dec 2019
AlgaeDICE: Policy Gradient from Arbitrary Experience
Ofir Nachum
Bo Dai
Ilya Kostrikov
Yinlam Chow
Lihong Li
Dale Schuurmans
OffRL
153
242
0
04 Dec 2019
Behavior Regularized Offline Reinforcement Learning
Yifan Wu
George Tucker
Ofir Nachum
OffRL
89
685
0
26 Nov 2019
Entity Abstraction in Visual Model-Based Reinforcement Learning
Rishi Veerapaneni
John D. Co-Reyes
Michael Chang
Michael Janner
Chelsea Finn
Jiajun Wu
J. Tenenbaum
Sergey Levine
OCL
OffRL
69
189
0
28 Oct 2019
Solving Rubik's Cube with a Robot Hand
OpenAI
Ilge Akkaya
Marcin Andrychowicz
Maciek Chociej
Mateusz Litwin
...
Peter Welinder
Lilian Weng
Qiming Yuan
Wojciech Zaremba
Lei Zhang
ODL
113
1,226
0
16 Oct 2019
Benchmarking Model-Based Reinforcement Learning
Tingwu Wang
Xuchan Bao
I. Clavera
Jerrick Hoang
Yeming Wen
Eric D. Langlois
Matthew Shunshi Zhang
Guodong Zhang
Pieter Abbeel
Jimmy Ba
OffRL
59
363
0
03 Jul 2019
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
92
950
0
19 Jun 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
141
336
0
10 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
115
1,055
0
03 Jun 2019
Provably Efficient Q-Learning with Low Switching Cost
Yu Bai
Tengyang Xie
Nan Jiang
Yu Wang
63
93
0
30 May 2019
Model-Based Reinforcement Learning for Atari
Lukasz Kaiser
Mohammad Babaeizadeh
Piotr Milos
B. Osinski
R. Campbell
...
Sergey Levine
Afroz Mohiuddin
Ryan Sepassi
George Tucker
Henryk Michalewski
OffRL
117
860
0
01 Mar 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
219
1,607
0
07 Dec 2018
Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees
Yuping Luo
Huazhe Xu
Yuanzhi Li
Yuandong Tian
Trevor Darrell
Tengyu Ma
OffRL
98
226
0
10 Jul 2018
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
Kurtland Chua
Roberto Calandra
R. McAllister
Sergey Levine
BDL
221
1,277
0
30 May 2018
Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning
Vladimir Feinberg
Alvin Wan
Ion Stoica
Michael I. Jordan
Joseph E. Gonzalez
Sergey Levine
OffRL
56
317
0
28 Feb 2018
Model-Ensemble Trust-Region Policy Optimization
Thanard Kurutach
I. Clavera
Yan Duan
Aviv Tamar
Pieter Abbeel
78
451
0
28 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
292
8,329
0
04 Jan 2018
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
David Silver
Thomas Hubert
Julian Schrittwieser
Ioannis Antonoglou
Matthew Lai
...
D. Kumaran
T. Graepel
Timothy Lillicrap
Karen Simonyan
Demis Hassabis
139
1,769
0
05 Dec 2017
Prediction and Control with Temporal Segment Models
Nikhil Mishra
Pieter Abbeel
Igor Mordatch
BDL
51
64
0
12 Mar 2017
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
277
6,764
0
19 Feb 2015