1905.11591
Cited By
Beyond Exponentially Discounted Sum: Automatic Learning of Return Function
28 May 2019
Yufei Wang
Qiwei Ye
Tie-Yan Liu
OffRL
Papers citing "Beyond Exponentially Discounted Sum: Automatic Learning of Return Function"
28 / 28 papers shown
Title
Discovering Reinforcement Learning Algorithms
Junhyuk Oh
Matteo Hessel
Wojciech M. Czarnecki
Zhongwen Xu
H. V. Hasselt
Satinder Singh
David Silver
59
129
0
17 Jul 2020
Meta-Gradient Reinforcement Learning with an Objective Discovered Online
Zhongwen Xu
H. V. Hasselt
Matteo Hessel
Junhyuk Oh
Satinder Singh
David Silver
68
77
0
16 Jul 2020
A Self-Tuning Actor-Critic Algorithm
Tom Zahavy
Zhongwen Xu
Vivek Veeriah
Matteo Hessel
Junhyuk Oh
H. V. Hasselt
David Silver
Satinder Singh
69
13
0
28 Feb 2020
Discovery of Useful Questions as Auxiliary Tasks
Vivek Veeriah
Matteo Hessel
Zhongwen Xu
Richard L. Lewis
Janarthanan Rajendran
Junhyuk Oh
H. V. Hasselt
David Silver
Satinder Singh
LLMAG
61
85
0
10 Sep 2019
Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach
Silviu Pitis
OffRL
55
50
0
08 Feb 2019
Separating value functions across time-scales
Joshua Romoff
Peter Henderson
Ahmed Touati
Emma Brunskill
Joelle Pineau
Yann Ollivier
52
25
0
05 Feb 2019
RUDDER: Return Decomposition for Delayed Rewards
Jose A. Arjona-Medina
Michael Gillhofer
Michael Widrich
Thomas Unterthiner
Johannes Brandstetter
Sepp Hochreiter
60
217
0
20 Jun 2018
Relational Deep Reinforcement Learning
V. Zambaldi
David Raposo
Adam Santoro
V. Bapst
Yujia Li
...
Victoria Langston
Razvan Pascanu
M. Botvinick
Oriol Vinyals
Peter W. Battaglia
OffRL
130
220
0
05 Jun 2018
Meta-Gradient Reinforcement Learning
Zhongwen Xu
H. V. Hasselt
David Silver
104
324
0
24 May 2018
On Learning Intrinsic Rewards for Policy Gradient Methods
Zeyu Zheng
Junhyuk Oh
Satinder Singh
57
205
0
17 Apr 2018
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
Marlos C. Machado
Marc G. Bellemare
Erik Talvitie
J. Veness
Matthew J. Hausknecht
Michael Bowling
71
552
0
18 Sep 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
444
18,931
0
20 Jul 2017
Count-Based Exploration in Feature Space for Reinforcement Learning
Jarryd Martin
S. N. Sasikumar
Tom Everitt
Marcus Hutter
51
123
0
25 Jun 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
622
130,942
0
12 Jun 2017
Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU
Mohammad Babaeizadeh
I. Frosio
Stephen Tyree
Jason Clemons
Jan Kautz
OffRL
54
259
0
18 Nov 2016
Reinforcement Learning with Unsupervised Auxiliary Tasks
Max Jaderberg
Volodymyr Mnih
Wojciech M. Czarnecki
Tom Schaul
Joel Z Leibo
David Silver
Koray Kavukcuoglu
SSL
74
1,228
0
16 Nov 2016
#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
Haoran Tang
Rein Houthooft
Davis Foote
Adam Stooke
Xi Chen
Yan Duan
John Schulman
F. Turck
Pieter Abbeel
OffRL
89
767
0
15 Nov 2016
Learning to Navigate in Complex Environments
Piotr Wojciech Mirowski
Razvan Pascanu
Fabio Viola
Hubert Soyer
Andy Ballard
...
Ross Goroshin
Laurent Sifre
Koray Kavukcuoglu
D. Kumaran
R. Hadsell
91
880
0
11 Nov 2016
Unifying Count-Based Exploration and Intrinsic Motivation
Marc G. Bellemare
S. Srinivasan
Georg Ostrovski
Tom Schaul
D. Saxton
Rémi Munos
167
1,473
0
06 Jun 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
189
8,833
0
04 Feb 2016
How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies
Vincent François-Lavet
R. Fonteneau
D. Ernst
49
111
0
07 Dec 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
294
13,214
0
09 Sep 2015
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models
Bradly C. Stadie
Sergey Levine
Pieter Abbeel
86
504
0
03 Jul 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.4K
149,842
0
22 Dec 2014
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
AIMat
499
27,263
0
01 Sep 2014
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
114
12,201
0
19 Dec 2013
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare
Yavar Naddaf
J. Veness
Michael Bowling
106
3,002
0
19 Jul 2012
Time Consistent Discounting
Tor Lattimore
Marcus Hutter
66
17
0
27 Jul 2011