Reward Estimation for Variance Reduction in Deep Reinforcement Learning

9 May 2018
Joshua Romoff
Peter Henderson
Alexandre Piché
Vincent François-Lavet
Joelle Pineau

Papers citing "Reward Estimation for Variance Reduction in Deep Reinforcement Learning"

27 / 27 papers shown
On the Convergence of Adam and Beyond
Sashank J. Reddi
Satyen Kale
Sanjiv Kumar
52
2,482
0
19 Apr 2019
Combined Reinforcement Learning via Abstract Representations
Vincent François-Lavet
Yoshua Bengio
Doina Precup
Joelle Pineau
OffRL
47
89
0
12 Sep 2018
Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition
Justin Fu
Avi Singh
Dibya Ghosh
Larry Yang
Sergey Levine
BDL
35
125
0
29 May 2018
Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning
Vladimir Feinberg
Alvin Wan
Ion Stoica
Michael I. Jordan
Joseph E. Gonzalez
Sergey Levine
OffRL
50
317
0
28 Feb 2018
Learning the Reward Function for a Misspecified Model
Erik Talvitie
43
10
0
29 Jan 2018
Bayesian Policy Gradients via Alpha Divergence Dropout Inference
Peter Henderson
T. Doan
Riashat Islam
David Meger
BDL
35
13
0
06 Dec 2017
Inverse Reward Design
Dylan Hadfield-Menell
S. Milli
Pieter Abbeel
Stuart J. Russell
Anca Dragan
64
394
0
08 Nov 2017
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Will Grathwohl
Dami Choi
Yuhuai Wu
Geoffrey Roeder
David Duvenaud
84
300
0
31 Oct 2017
OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
Peter Henderson
Wei-Di Chang
Pierre-Luc Bacon
David Meger
Joelle Pineau
Doina Precup
GAN
44
73
0
20 Sep 2017
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
David Meger
OffRL
103
1,940
0
19 Sep 2017
Mean Actor Critic
Cameron Allen
Kavosh Asadi
Melrose Roderick
Abdel-rahman Mohamed
George Konidaris
Michael Littman
46
44
0
01 Sep 2017
A Distributional Perspective on Reinforcement Learning
Marc G. Bellemare
Will Dabney
Rémi Munos
OffRL
69
1,497
0
21 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
234
18,685
0
20 Jul 2017
Imagination-Augmented Agents for Deep Reinforcement Learning
T. Weber
S. Racanière
David P. Reichert
Lars Buesing
A. Guez
...
Razvan Pascanu
Peter W. Battaglia
Demis Hassabis
David Silver
Daan Wierstra
LM&Ro
67
552
0
19 Jul 2017
Expected Policy Gradients
K. Ciosek
Shimon Whiteson
42
57
0
15 Jun 2017
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
91
3,243
0
12 Jun 2017
Reinforcement Learning with a Corrupted Reward Channel
Tom Everitt
Victoria Krakovna
Laurent Orseau
Marcus Hutter
Shane Legg
70
100
0
23 May 2017
Model-Based Planning with Discrete and Continuous Actions
Mikael Henaff
William F. Whitney
Yann LeCun
53
16
0
19 May 2017
The Predictron: End-To-End Learning and Planning
David Silver
H. V. Hasselt
Matteo Hessel
Tom Schaul
A. Guez
...
Gabriel Dulac-Arnold
David P. Reichert
Neil C. Rabinowitz
André Barreto
T. Degris
47
289
0
28 Dec 2016
Unsupervised Perceptual Rewards for Imitation Learning
P. Sermanet
Kelvin Xu
Sergey Levine
SSL
58
158
0
20 Dec 2016
Reinforcement Learning with Unsupervised Auxiliary Tasks
Max Jaderberg
Volodymyr Mnih
Wojciech M. Czarnecki
Tom Schaul
Joel Z Leibo
David Silver
Koray Kavukcuoglu
SSL
43
1,225
0
16 Nov 2016
OpenAI Gym
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL
ODL
174
5,056
0
05 Jun 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
166
8,805
0
04 Feb 2016
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
40
3,368
0
08 Jun 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
815
149,474
0
22 Dec 2014
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare
Yavar Naddaf
J. Veness
Michael Bowling
78
2,992
0
19 Jul 2012
Variance-Based Rewards for Approximate Bayesian Reinforcement Learning
Jonathan Sorg
Satinder Singh
Richard L. Lewis
OffRL
68
69
0
15 Mar 2012