1502.05477
Cited By
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing
"Trust Region Policy Optimization"
50 / 3,098 papers shown
Deep Reinforcement Learning for Cyber Security
Thanh Thi Nguyen
Vijay Janapa Reddi
OffRL
AI4CE
10
314
0
13 Jun 2019
Conditioning of Reinforcement Learning Agents and its Policy Regularization Application
Arip Asadulaev
Igor Kuznetsov
Gideon Stein
Andrey Filchenkov
9
0
0
13 Jun 2019
Efficient Exploration via State Marginal Matching
Lisa Lee
Benjamin Eysenbach
Emilio Parisotto
Eric Xing
Sergey Levine
Ruslan Salakhutdinov
35
242
0
12 Jun 2019
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning
Benjamin Eysenbach
Ruslan Salakhutdinov
Sergey Levine
OffRL
32
286
0
12 Jun 2019
Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer
Edward Choi
Zhen Xu
Yujia Li
Michael W. Dusenberry
Gerardo Flores
Yuan Xue
Andrew M. Dai
MedIm
24
238
0
11 Jun 2019
Learning Powerful Policies by Using Consistent Dynamics Model
Shagun Sodhani
Anirudh Goyal
T. Deleu
Yoshua Bengio
Sergey Levine
Jian Tang
OffRL
19
5
0
11 Jun 2019
Learning to Score Behaviors for Guided Policy Optimization
Aldo Pacchiano
Jack Parker-Holder
Yunhao Tang
A. Choromańska
K. Choromanski
Michael I. Jordan
29
38
0
11 Jun 2019
Exploration via Hindsight Goal Generation
Zhizhou Ren
Kefan Dong
Yuanshuo Zhou
Qiang Liu
Jian-wei Peng
35
85
0
10 Jun 2019
Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains
Matthieu Zimmer
Paul Weng
24
7
0
10 Jun 2019
Reducing the variance in online optimization by transporting past gradients
Sébastien M. R. Arnold
Pierre-Antoine Manzagol
Reza Babanezhad
Ioannis Mitliagkas
Nicolas Le Roux
29
28
0
08 Jun 2019
Empirical Likelihood for Contextual Bandits
Nikos Karampatziakis
John Langford
Paul Mineiro
OffRL
28
9
0
07 Jun 2019
Clustered Reinforcement Learning
Xiao Ma
Shen-Yi Zhao
Wu-Jun Li
OffRL
24
6
0
06 Jun 2019
How to Initialize your Network? Robust Initialization for WeightNorm & ResNets
Devansh Arpit
Victor Campos
Yoshua Bengio
21
56
0
05 Jun 2019
Continuous Control for Automated Lane Change Behavior Based on Deep Deterministic Policy Gradient Algorithm
Pin Wang
Hanhan Li
Ching-yao Chan
19
51
0
05 Jun 2019
Machine Learning and System Identification for Estimation in Physical Systems
Fredrik Bagge Carlson
OOD
16
5
0
05 Jun 2019
Global Optimality Guarantees For Policy Gradient Methods
Jalaj Bhandari
Daniel Russo
41
186
0
05 Jun 2019
BayesSim: adaptive domain randomization via probabilistic inference for robotics simulators
F. Ramos
Rafael Possas
Dieter Fox
22
156
0
04 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
32
1,035
0
03 Jun 2019
Deep Reinforcement Learning Architecture for Continuous Power Allocation in High Throughput Satellites
J. Luis
Markus Guerster
Iñigo Del Portillo
E. Crawley
B. Cameron
9
18
0
03 Jun 2019
Harnessing Reinforcement Learning for Neural Motion Planning
Tom Jurgenson
Aviv Tamar
OOD
25
65
0
01 Jun 2019
Neural Replicator Dynamics
Daniel Hennes
Dustin Morrill
Shayegan Omidshafiei
Rémi Munos
Julien Perolat
...
A. Gruslys
Jean-Baptiste Lespiau
Paavo Parmas
Edgar A. Duénez-Guzmán
K. Tuyls
24
16
0
01 Jun 2019
Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games
Kai Zhang
Zhuoran Yang
Tamer Basar
32
125
0
31 May 2019
Reinforcement Learning Experience Reuse with Policy Residual Representation
Wen-Ji Zhou
Yang Yu
Yingfeng Chen
Kai Guan
Tangjie Lv
Changjie Fan
Zhi-Hua Zhou
OffRL
9
2
0
31 May 2019
Safety Augmented Value Estimation from Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks
Brijen Thananjeyan
Ashwin Balakrishna
Ugo Rosolia
Felix Li
R. McAllister
Joseph E. Gonzalez
Sergey Levine
Francesco Borrelli
Ken Goldberg
OffRL
22
4
0
31 May 2019
Advantage Amplification in Slowly Evolving Latent-State Environments
Martin Mladenov
Ofer Meshi
Jayden Ooi
Dale Schuurmans
Craig Boutilier
OffRL
26
9
0
29 May 2019
An Improved Convergence Analysis of Stochastic Variance-Reduced Policy Gradient
Pan Xu
F. Gao
Quanquan Gu
24
93
0
29 May 2019
Adversarial Imitation Learning from Incomplete Demonstrations
Mingfei Sun
Xiaojuan Ma
24
29
0
29 May 2019
Snooping Attacks on Deep Reinforcement Learning
Matthew J. Inkawhich
Yiran Chen
Hai Helen Li
AAML
22
25
0
28 May 2019
Learning Efficient and Effective Exploration Policies with Counterfactual Meta Policy
Ruihan Yang
Qiwei Ye
Tie-Yan Liu
30
0
0
28 May 2019
Disentangling Dynamics and Returns: Value Function Decomposition with Future Prediction
Hongyao Tang
Jianye Hao
Guangyong Chen
Pengfei Chen
Zhaopeng Meng
Yaodong Yang
Li Wang
23
2
0
27 May 2019
Learning to Discretize: Solving 1D Scalar Conservation Laws via Deep Reinforcement Learning
Yufei Wang
Ziju Shen
Zichao Long
Bin Dong
AI4CE
PINN
19
40
0
27 May 2019
Learning latent state representation for speeding up exploration
Giulia Vezzani
Abhishek Gupta
Lorenzo Natale
Pieter Abbeel
14
27
0
27 May 2019
Policy Search by Target Distribution Learning for Continuous Control
Chuheng Zhang
Yuanqi Li
Jian Li
26
6
0
27 May 2019
AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence
Jeff Clune
21
118
0
27 May 2019
Provably Efficient Imitation Learning from Observation Alone
Wen Sun
Anirudh Vemula
Byron Boots
J. Andrew Bagnell
30
105
0
27 May 2019
Composing Task-Agnostic Policies with Deep Reinforcement Learning
A. H. Qureshi
Jacob J. Johnson
Yuzhe Qin
Taylor Henderson
Byron Boots
Michael C. Yip
OffRL
22
30
0
25 May 2019
Transferable Cost-Aware Security Policy Implementation for Malware Detection Using Deep Reinforcement Learning
Yoni Birman
Shaked Hindi
Gilad Katz
A. Shabtai
AAML
OffRL
19
2
0
25 May 2019
Adaptive Symmetric Reward Noising for Reinforcement Learning
R. Vivanti
Talya D. Sohlberg-Baris
Shlomo Cohen
Orna Cohen
AAML
21
1
0
24 May 2019
Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima
Qi Cai
Zhuoran Yang
Jason D. Lee
Zhaoran Wang
42
29
0
24 May 2019
Distributional Policy Optimization: An Alternative Approach for Continuous Control
Chen Tessler
Guy Tennenholtz
Shie Mannor
OffRL
18
44
0
23 May 2019
From semantics to execution: Integrating action planning with reinforcement learning for robotic causal problem-solving
Manfred Eppe
Phuong D. H. Nguyen
S. Wermter
25
41
0
23 May 2019
Combine PPO with NES to Improve Exploration
Lianjiang Li
Yunrong Yang
Bingna Li
11
1
0
23 May 2019
Imitation Learning from Video by Leveraging Proprioception
F. Torabi
Garrett A. Warnell
Peter Stone
16
35
0
22 May 2019
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
Rui Zhao
Xudong Sun
Volker Tresp
31
80
0
21 May 2019
Combining Experience Replay with Exploration by Random Network Distillation
Francesco Sovrano
24
15
0
18 May 2019
MaMiC: Macro and Micro Curriculum for Robotic Reinforcement Learning
Manan Tomar
Akhil Sathuluri
Balaraman Ravindran
31
4
0
17 May 2019
Leveraging exploration in off-policy algorithms via normalizing flows
Bogdan Mazoure
T. Doan
A. Durand
R. Devon Hjelm
Joelle Pineau
OnRL
25
59
0
16 May 2019
Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation
Ruohan Wang
C. Ciliberto
P. Amadori
Y. Demiris
24
62
0
16 May 2019
Successor Options: An Option Discovery Framework for Reinforcement Learning
Rahul Ramesh
Manan Tomar
Balaraman Ravindran
13
33
0
14 May 2019
Trajectory-Based Off-Policy Deep Reinforcement Learning
Andreas Doerr
Michael Volpp
Marc Toussaint
Sebastian Trimpe
Christian Daniel
OffRL
34
2
0
14 May 2019