Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1708.05144
Cited By
v1
v2 (latest)
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
17 August 2017
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger C. Grosse
Jimmy Ba
OffRL
Re-assign community
ArXiv (abs)
PDF
HTML
Github (16299★)
Papers citing
"Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation"
20 / 20 papers shown
Title
Learning to Reason under Off-Policy Guidance
Jianhao Yan
Yafu Li
Zican Hu
Zhi Wang
Ganqu Cui
Xiaoye Qu
Yu Cheng
Yue Zhang
OffRL
LRM
134
17
0
21 Apr 2025
Deliberate Planning of 3D Bin Packing on Packing Configuration Trees
Hang Zhao
Juzhan Xu
Kexiong Yu
Ruizhen Hu
Chenyang Zhu
K. Xu
139
2
0
06 Apr 2025
Explore and Exploit with Heterotic Line Bundle Models
Magdalena Larfors
Robin Schneider
68
38
0
10 Mar 2020
Diversity-Driven Exploration Strategy for Deep Reinforcement Learning
Zhang-Wei Hong
Tzu-Yun Shann
Shih-Yang Su
Yi-Hsiang Chang
Chun-Yi Lee
84
124
0
13 Feb 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
571
19,315
0
20 Jul 2017
Emergence of Locomotion Behaviours in Rich Environments
N. Heess
TB Dhruva
S. Sriram
Jay Lemmon
J. Merel
...
Tom Erez
Ziyun Wang
S. M. Ali Eslami
Martin Riedmiller
David Silver
212
938
0
07 Jul 2017
Reinforcement Learning with Unsupervised Auxiliary Tasks
Max Jaderberg
Volodymyr Mnih
Wojciech M. Czarnecki
Tom Schaul
Joel Z Leibo
David Silver
Koray Kavukcuoglu
SSL
111
1,229
0
16 Nov 2016
Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Sergey Levine
OffRL
BDL
100
345
0
07 Nov 2016
Sample Efficient Actor-Critic with Experience Replay
Ziyun Wang
V. Bapst
N. Heess
Volodymyr Mnih
Rémi Munos
Koray Kavukcuoglu
Nando de Freitas
109
762
0
03 Nov 2016
OpenAI Gym
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL
ODL
225
5,087
0
05 Jun 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
210
8,882
0
04 Feb 2016
A Kronecker-factored approximate Fisher matrix for convolution layers
Roger C. Grosse
James Martens
ODL
108
264
0
03 Feb 2016
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
Djork-Arné Clevert
Thomas Unterthiner
Sepp Hochreiter
307
5,539
0
23 Nov 2015
Dueling Network Architectures for Deep Reinforcement Learning
Ziyun Wang
Tom Schaul
Matteo Hessel
H. V. Hasselt
Marc Lanctot
Nando de Freitas
OffRL
93
3,771
0
20 Nov 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
330
13,295
0
09 Sep 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
135
3,442
0
08 Jun 2015
Optimizing Neural Networks with Kronecker-factored Approximate Curvature
James Martens
Roger C. Grosse
ODL
109
1,024
0
19 Mar 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
283
6,807
0
19 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.1K
150,433
0
22 Dec 2014
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare
Yavar Naddaf
J. Veness
Michael Bowling
120
3,021
0
19 Jul 2012
1