2002.05522
BRPO: Batch Residual Policy Optimization
8 February 2020
Kentaro Kanamori
Yinlam Chow
Takuya Takagi
Hiroki Arimura
Honglak Lee
Ken Kobayashi
Craig Boutilier
OffRL
Papers citing "BRPO: Batch Residual Policy Optimization"
20 / 20 papers shown
Title
Benchmarking Batch Deep Reinforcement Learning Algorithms
Shih-Han Chou
Wen-Yen Chang
W. Hsu
Jianlong Fu
OffRL
44
182
0
03 Oct 2019
Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real
Ofir Nachum
Michael Ahn
Hugo Ponte
S. Gu
Vikash Kumar
48
90
0
13 Aug 2019
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
72
338
0
30 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
76
1,044
0
03 Jun 2019
Large-Scale Markov Decision Problems via the Linear Programming Dual
Yasin Abbasi-Yadkori
Peter L. Bartlett
Xi Chen
Alan Malek
23
13
0
06 Jan 2019
Residual Policy Learning
Tom Silver
Kelsey R. Allen
J. Tenenbaum
L. Kaelbling
OffRL
45
174
0
15 Dec 2018
Residual Reinforcement Learning for Robot Control
T. Johannink
Shikhar Bahl
Ashvin Nair
Jianlan Luo
Avinash Kumar
M. Loskyll
J. A. Ojea
Eugen Solowjow
Sergey Levine
OffRL
55
412
0
07 Dec 2018
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
148
1,586
0
07 Dec 2018
Horizon: Facebook's Open Source Applied Reinforcement Learning Platform
J. Gauci
Edoardo Conti
Yitao Liang
Kittipat Virochsiri
Yuchen He
Zachary Kaden
Vivek Narayanan
Xiaohui Ye
Zhengxing Chen
Scott Fujimoto
42
139
0
01 Nov 2018
Natural Gradient Deep Q-learning
Ethan Knight
Osher Lerner
28
10
0
20 Mar 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
194
8,236
0
04 Jan 2018
Safe Policy Improvement with Baseline Bootstrapping
Romain Laroche
P. Trichelair
Rémi Tachet des Combes
OffRL
48
198
0
19 Dec 2017
OptNet: Differentiable Optimization as a Layer in Neural Networks
Brandon Amos
J. Zico Kolter
130
952
0
01 Mar 2017
Combining policy gradient and Q-learning
Brendan O'Donoghue
Rémi Munos
Koray Kavukcuoglu
Volodymyr Mnih
OffRL
OnRL
53
139
0
05 Nov 2016
Safe Policy Improvement by Minimizing Robust Baseline Regret
Marek Petrik
Yinlam Chow
Mohammad Ghavamzadeh
OffRL
32
134
0
13 Jul 2016
Policy Distillation
Andrei A. Rusu
Sergio Gomez Colmenarejo
Çağlar Gülçehre
Guillaume Desjardins
J. Kirkpatrick
Razvan Pascanu
Volodymyr Mnih
Koray Kavukcuoglu
R. Hadsell
45
685
0
19 Nov 2015
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
125
7,590
0
22 Sep 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
176
13,174
0
09 Sep 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
237
6,722
0
19 Feb 2015
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
89
12,163
0
19 Dec 2013