2210.07432
Cited By
Monte Carlo Augmented Actor-Critic for Sparse Reward Deep Reinforcement Learning from Suboptimal Demonstrations
14 October 2022
Albert Wilcox
Ashwin Balakrishna
Jules Dedieu
Wyame Benslimane
Daniel S. Brown
Ken Goldberg
OffRL
Papers citing "Monte Carlo Augmented Actor-Critic for Sparse Reward Deep Reinforcement Learning from Suboptimal Demonstrations"
25 / 25 papers shown
Title
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
Guanlin Liu
Kaixuan Ji
Ning Dai
Zheng Wu
Chen Dun
Quanquan Gu
Lin Yan
OffRL
LRM
92
11
0
11 Oct 2024
LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Sparse Reward Iterative Tasks
Albert Wilcox
Ashwin Balakrishna
Brijen Thananjeyan
Joseph E. Gonzalez
Ken Goldberg
47
12
0
10 Jul 2021
Learning Dense Rewards for Contact-Rich Manipulation Tasks
Zheng Wu
Wenzhao Lian
Vaibhav Unhelkar
Masayoshi Tomizuka
S. Schaal
103
37
0
17 Nov 2020
Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones
Brijen Thananjeyan
Ashwin Balakrishna
Suraj Nair
Michael Luo
K. Srinivasan
M. Hwang
Joseph E. Gonzalez
Julian Ibarz
Chelsea Finn
Ken Goldberg
OffRL
36
221
0
29 Oct 2020
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
Yuke Zhu
J. Wong
Ajay Mandlekar
Roberto Martín-Martín
Abhishek Joshi
Soroush Nasiriany
Yifeng Zhu
150
438
0
25 Sep 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL
OnRL
77
601
0
16 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
104
1,780
0
08 Jun 2020
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
Arsenii Kuznetsov
Pavel Shvechikov
Alexander Grishin
Dmitry Vetrov
201
191
0
08 May 2020
Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance
Mingxuan Jing
Xiaojian Ma
Wenbing Huang
F. Sun
Chao Yang
Bin Fang
Huaping Liu
35
60
0
16 Nov 2019
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
Xue Bin Peng
Aviral Kumar
Grace Zhang
Sergey Levine
OffRL
113
548
0
01 Oct 2019
Deep Reinforcement Learning and the Deadly Triad
H. V. Hasselt
Yotam Doron
Florian Strub
Matteo Hessel
Nicolas Sonnerat
Joseph Modayil
OffRL
63
226
0
06 Dec 2018
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
157
5,121
0
26 Feb 2018
Reinforcement Learning from Imperfect Demonstrations
Yang Gao
Huazhe Xu
Ji Lin
Feng Yu
Sergey Levine
Trevor Darrell
60
201
0
14 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
227
8,236
0
04 Jan 2018
Overcoming Exploration in Reinforcement Learning with Demonstrations
Ashvin Nair
Bob McGrew
Marcin Andrychowicz
Wojciech Zaremba
Pieter Abbeel
OffRL
77
777
0
28 Sep 2017
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
Aravind Rajeswaran
Vikash Kumar
Abhishek Gupta
Giulia Vezzani
John Schulman
E. Todorov
Sergey Levine
109
1,079
0
28 Sep 2017
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
Matej Vecerík
Todd Hester
Jonathan Scholz
Fumin Wang
Olivier Pietquin
Bilal Piot
N. Heess
Thomas Rothörl
Thomas Lampe
Martin Riedmiller
OffRL
57
661
0
27 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
241
18,685
0
20 Jul 2017
Learning to Mix n-Step Returns: Generalizing lambda-Returns for Deep Reinforcement Learning
Sahil Sharma
Girish Raguvir J
S. Ramesh
Balaraman Ravindran
23
6
0
21 May 2017
Count-Based Exploration with Neural Density Models
Georg Ostrovski
Marc G. Bellemare
Aaron van den Oord
Rémi Munos
74
616
0
03 Mar 2017
Unifying Count-Based Exploration and Intrinsic Motivation
Marc G. Bellemare
S. Srinivasan
Georg Ostrovski
Tom Schaul
D. Saxton
Rémi Munos
162
1,465
0
06 Jun 2016
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
134
7,590
0
22 Sep 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
207
13,174
0
09 Sep 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
45
3,368
0
08 Jun 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
889
149,474
0
22 Dec 2014