ResearchTrend.AI
BRPO: Batch Residual Policy Optimization

8 February 2020
Kentaro Kanamori
Yinlam Chow
Takuya Takagi
Hiroki Arimura
Honglak Lee
Ken Kobayashi
Craig Boutilier
    OffRL

Papers citing "BRPO: Batch Residual Policy Optimization"

20 / 20 papers shown
Benchmarking Batch Deep Reinforcement Learning Algorithms
Shih-Han Chou
Wen-Yen Chang
W. Hsu
Jianlong Fu
OffRL
44
182
0
03 Oct 2019
Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real
Ofir Nachum
Michael Ahn
Hugo Ponte
S. Gu
Vikash Kumar
48
90
0
13 Aug 2019
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
72
338
0
30 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
76
1,044
0
03 Jun 2019
Large-Scale Markov Decision Problems via the Linear Programming Dual
Yasin Abbasi-Yadkori
Peter L. Bartlett
Xi Chen
Alan Malek
23
13
0
06 Jan 2019
Residual Policy Learning
Tom Silver
Kelsey R. Allen
J. Tenenbaum
L. Kaelbling
OffRL
45
174
0
15 Dec 2018
Residual Reinforcement Learning for Robot Control
T. Johannink
Shikhar Bahl
Ashvin Nair
Jianlan Luo
Avinash Kumar
M. Loskyll
J. A. Ojea
Eugen Solowjow
Sergey Levine
OffRL
55
412
0
07 Dec 2018
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
148
1,586
0
07 Dec 2018
Horizon: Facebook's Open Source Applied Reinforcement Learning Platform
J. Gauci
Edoardo Conti
Yitao Liang
Kittipat Virochsiri
Yuchen He
Zachary Kaden
Vivek Narayanan
Xiaohui Ye
Zhengxing Chen
Scott Fujimoto
42
139
0
01 Nov 2018
Natural Gradient Deep Q-learning
Ethan Knight
Osher Lerner
28
10
0
20 Mar 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
194
8,236
0
04 Jan 2018
Safe Policy Improvement with Baseline Bootstrapping
Romain Laroche
P. Trichelair
Rémi Tachet des Combes
OffRL
48
198
0
19 Dec 2017
OptNet: Differentiable Optimization as a Layer in Neural Networks
Brandon Amos
J. Zico Kolter
130
952
0
01 Mar 2017
Combining policy gradient and Q-learning
Brendan O'Donoghue
Rémi Munos
Koray Kavukcuoglu
Volodymyr Mnih
OffRL
OnRL
53
139
0
05 Nov 2016
Safe Policy Improvement by Minimizing Robust Baseline Regret
Marek Petrik
Yinlam Chow
Mohammad Ghavamzadeh
OffRL
32
134
0
13 Jul 2016
Policy Distillation
Andrei A. Rusu
Sergio Gomez Colmenarejo
Çağlar Gülçehre
Guillaume Desjardins
J. Kirkpatrick
Razvan Pascanu
Volodymyr Mnih
Koray Kavukcuoglu
R. Hadsell
45
685
0
19 Nov 2015
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
125
7,590
0
22 Sep 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
176
13,174
0
09 Sep 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
237
6,722
0
19 Feb 2015
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
89
12,163
0
19 Dec 2013