1805.03359
Reward Estimation for Variance Reduction in Deep Reinforcement Learning
9 May 2018
Joshua Romoff
Peter Henderson
Alexandre Piché
Vincent François-Lavet
Joelle Pineau
Papers citing
"Reward Estimation for Variance Reduction in Deep Reinforcement Learning"
27 / 27 papers shown
Title
On the Convergence of Adam and Beyond
Sashank J. Reddi
Satyen Kale
Sanjiv Kumar
52
2,482
0
19 Apr 2019
Combined Reinforcement Learning via Abstract Representations
Vincent François-Lavet
Yoshua Bengio
Doina Precup
Joelle Pineau
OffRL
47
89
0
12 Sep 2018
Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition
Justin Fu
Avi Singh
Dibya Ghosh
Larry Yang
Sergey Levine
BDL
35
125
0
29 May 2018
Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning
Vladimir Feinberg
Alvin Wan
Ion Stoica
Michael I. Jordan
Joseph E. Gonzalez
Sergey Levine
OffRL
50
317
0
28 Feb 2018
Learning the Reward Function for a Misspecified Model
Erik Talvitie
43
10
0
29 Jan 2018
Bayesian Policy Gradients via Alpha Divergence Dropout Inference
Peter Henderson
T. Doan
Riashat Islam
David Meger
BDL
35
13
0
06 Dec 2017
Inverse Reward Design
Dylan Hadfield-Menell
S. Milli
Pieter Abbeel
Stuart J. Russell
Anca Dragan
64
394
0
08 Nov 2017
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Will Grathwohl
Dami Choi
Yuhuai Wu
Geoffrey Roeder
David Duvenaud
84
300
0
31 Oct 2017
OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
Peter Henderson
Wei-Di Chang
Pierre-Luc Bacon
David Meger
Joelle Pineau
Doina Precup
GAN
44
73
0
20 Sep 2017
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
David Meger
OffRL
103
1,940
0
19 Sep 2017
Mean Actor Critic
Cameron Allen
Kavosh Asadi
Melrose Roderick
Abdel-rahman Mohamed
George Konidaris
Michael Littman
46
44
0
01 Sep 2017
A Distributional Perspective on Reinforcement Learning
Marc G. Bellemare
Will Dabney
Rémi Munos
OffRL
69
1,497
0
21 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
234
18,685
0
20 Jul 2017
Imagination-Augmented Agents for Deep Reinforcement Learning
T. Weber
S. Racanière
David P. Reichert
Lars Buesing
A. Guez
...
Razvan Pascanu
Peter W. Battaglia
Demis Hassabis
David Silver
Daan Wierstra
LM&Ro
67
552
0
19 Jul 2017
Expected Policy Gradients
K. Ciosek
Shimon Whiteson
42
57
0
15 Jun 2017
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
91
3,243
0
12 Jun 2017
Reinforcement Learning with a Corrupted Reward Channel
Tom Everitt
Victoria Krakovna
Laurent Orseau
Marcus Hutter
Shane Legg
70
100
0
23 May 2017
Model-Based Planning with Discrete and Continuous Actions
Mikael Henaff
William F. Whitney
Yann LeCun
53
16
0
19 May 2017
The Predictron: End-To-End Learning and Planning
David Silver
H. V. Hasselt
Matteo Hessel
Tom Schaul
A. Guez
...
Gabriel Dulac-Arnold
David P. Reichert
Neil C. Rabinowitz
André Barreto
T. Degris
47
289
0
28 Dec 2016
Unsupervised Perceptual Rewards for Imitation Learning
P. Sermanet
Kelvin Xu
Sergey Levine
SSL
58
158
0
20 Dec 2016
Reinforcement Learning with Unsupervised Auxiliary Tasks
Max Jaderberg
Volodymyr Mnih
Wojciech M. Czarnecki
Tom Schaul
Joel Z Leibo
David Silver
Koray Kavukcuoglu
SSL
43
1,225
0
16 Nov 2016
OpenAI Gym
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL
ODL
174
5,056
0
05 Jun 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
166
8,805
0
04 Feb 2016
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
40
3,368
0
08 Jun 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
815
149,474
0
22 Dec 2014
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare
Yavar Naddaf
J. Veness
Michael Bowling
78
2,992
0
19 Jul 2012
Variance-Based Rewards for Approximate Bayesian Reinforcement Learning
Jonathan Sorg
Satinder Singh
Richard L. Lewis
OffRL
68
69
0
15 Mar 2012