1502.05477
Cited By
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing
"Trust Region Policy Optimization"
50 / 2,012 papers shown
Title
Snowflake: Scaling GNNs to High-Dimensional Continuous Control via Parameter Freezing
Charlie Blake
Vitaly Kurin
Maximilian Igl
Shimon Whiteson
AI4CE
100
13
0
01 Mar 2021
Decision Making in Monopoly using a Hybrid Deep Reinforcement Learning Approach
Trevor Bonjour
Marina Haliem
A. Alsalem
Shilpa Thomas
Hongyu Li
Vaneet Aggarwal
Mayank Kejriwal
Bharat K. Bhargava
108
15
0
01 Mar 2021
A Probabilistic Interpretation of Self-Paced Learning with Applications to Reinforcement Learning
Pascal Klink
Hany Abdulsamad
Boris Belousov
Carlo D'Eramo
Jan Peters
Joni Pajarinen
112
23
0
25 Feb 2021
Improved Regret Bound and Experience Replay in Regularized Policy Iteration
N. Lazić
Dong Yin
Yasin Abbasi-Yadkori
Csaba Szepesvári
OffRL
60
18
0
25 Feb 2021
Memory-based Deep Reinforcement Learning for POMDPs
Lingheng Meng
R. Gorbet
Dana Kulic
124
100
0
24 Feb 2021
Differentiable Logic Machines
Matthieu Zimmer
Xuening Feng
Claire Glanois
Zhaohui Jiang
Jianyi Zhang
Paul Weng
Li Dong
Hao Jianye
Liu Wulong
AI4CE
95
23
0
23 Feb 2021
Mixed Policy Gradient: off-policy reinforcement learning driven jointly by data and model
Yang Guan
Jingliang Duan
Shengbo Eben Li
Jie Li
Jianyu Chen
B. Cheng
OffRL
77
12
0
23 Feb 2021
MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning
DiJia Su
Jason D. Lee
John M. Mulvey
H. Vincent Poor
OffRL
62
6
0
23 Feb 2021
Softmax Policy Gradient Methods Can Take Exponential Time to Converge
Gen Li
Yuting Wei
Yuejie Chi
Yuxin Chen
127
53
0
22 Feb 2021
Escaping from Zero Gradient: Revisiting Action-Constrained Reinforcement Learning via Frank-Wolfe Policy Optimization
Jyun-Li Lin
Wei-Ting Hung
Shangtong Yang
Ping-Chun Hsieh
Xi Liu
120
14
0
22 Feb 2021
MobILE: Model-Based Imitation Learning From Observation Alone
Rahul Kidambi
Jonathan D. Chang
Wen Sun
77
40
0
22 Feb 2021
Communication Efficient Parallel Reinforcement Learning
Mridul Agarwal
Bhargav Ganguly
Vaneet Aggarwal
77
11
0
22 Feb 2021
Dealing with Non-Stationarity in MARL via Trust-Region Decomposition
Wenhao Li
Xiangfeng Wang
Bo Jin
Junjie Sheng
H. Zha
138
9
0
21 Feb 2021
Decaying Clipping Range in Proximal Policy Optimization
Mónika Farsang
Luca Szegletes
OffRL
66
4
0
20 Feb 2021
Decoupling Value and Policy for Generalization in Reinforcement Learning
Roberta Raileanu
Rob Fergus
DRL
OffRL
114
99
0
20 Feb 2021
Towards Accurate and Compact Architectures via Neural Architecture Transformer
Yong Guo
Yin Zheng
Mingkui Tan
Qi Chen
Zhipeng Li
Jian Chen
P. Zhao
Junzhou Huang
ViT
MQ
57
38
0
20 Feb 2021
On Proximal Policy Optimization's Heavy-tailed Gradients
Saurabh Garg
Joshua Zhanson
Emilio Parisotto
Adarsh Prasad
J. Zico Kolter
Zachary Chase Lipton
Sivaraman Balakrishnan
Ruslan Salakhutdinov
Pradeep Ravikumar
100
13
0
20 Feb 2021
Deluca -- A Differentiable Control Library: Environments, Methods, and Benchmarking
Paula Gradu
John Hallman
Daniel Suo
Alex Yu
Naman Agarwal
Udaya Ghai
Karan Singh
Cyril Zhang
Anirudha Majumdar
Elad Hazan
62
15
0
19 Feb 2021
Continuous Doubly Constrained Batch Reinforcement Learning
Rasool Fakoor
Jonas W. Mueller
Kavosh Asadi
Pratik Chaudhari
Alex Smola
OffRL
286
27
0
18 Feb 2021
Learning Memory-Dependent Continuous Control from Demonstrations
Siqing Hou
Dongqi Han
Jun Tani
30
0
0
18 Feb 2021
On the Sample Complexity of Stability Constrained Imitation Learning
Stephen Tu
Alexander Robey
Tingnan Zhang
Nikolai Matni
98
40
0
18 Feb 2021
Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
139
24
0
17 Feb 2021
Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games
Yulai Zhao
Yuandong Tian
Jason D. Lee
S. Du
OffRL
76
18
0
17 Feb 2021
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
Junyu Zhang
Chengzhuo Ni
Zheng Yu
Csaba Szepesvári
Mengdi Wang
130
69
0
17 Feb 2021
Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments
Amin Rakhsha
Xuezhou Zhang
Xiaojin Zhu
Adish Singla
AAML
OffRL
88
37
0
16 Feb 2021
Distributionally-Constrained Policy Optimization via Unbalanced Optimal Transport
A. Givchi
Pei Wang
Junqi Wang
Patrick Shafto
OT
OffRL
61
0
0
15 Feb 2021
Resilient Machine Learning for Networked Cyber Physical Systems: A Survey for Machine Learning Security to Securing Machine Learning for CPS
Felix O. Olowononi
D. Rawat
Chunmei Liu
95
138
0
14 Feb 2021
Online Apprenticeship Learning
Lior Shani
Tom Zahavy
Shie Mannor
OffRL
92
27
0
13 Feb 2021
Learning Variable Impedance Control via Inverse Reinforcement Learning for Force-Related Tasks
Xiang Zhang
Liting Sun
Zhian Kuang
Masayoshi Tomizuka
67
83
0
13 Feb 2021
Optimization Issues in KL-Constrained Approximate Policy Iteration
N. Lazić
Botao Hao
Yasin Abbasi-Yadkori
Dale Schuurmans
Csaba Szepesvári
57
11
0
11 Feb 2021
Sufficiently Accurate Model Learning for Planning
Clark Zhang
Santiago Paternain
Alejandro Ribeiro
29
0
0
11 Feb 2021
Robust Policy Gradient against Strong Data Corruption
Xuezhou Zhang
Yiding Chen
Xiaojin Zhu
Wen Sun
AAML
110
39
0
11 Feb 2021
Defense Against Reward Poisoning Attacks in Reinforcement Learning
Kiarash Banihashem
Adish Singla
Goran Radanović
AAML
92
27
0
10 Feb 2021
Derivative-Free Reinforcement Learning: A Review
Hong Qian
Yang Yu
OffRL
142
42
0
10 Feb 2021
Measuring Progress in Deep Reinforcement Learning Sample Efficiency
Florian E. Dorner
55
13
0
09 Feb 2021
Continuous-Time Model-Based Reinforcement Learning
Çağatay Yıldız
Markus Heinonen
Harri Lähdesmäki
OffRL
74
58
0
09 Feb 2021
Adversarially Guided Actor-Critic
Yannis Flet-Berliac
Johan Ferret
Olivier Pietquin
Philippe Preux
Matthieu Geist
77
73
0
08 Feb 2021
A Hybrid Approach for Reinforcement Learning Using Virtual Policy Gradient for Balancing an Inverted Pendulum
Dylan Bates
21
4
0
06 Feb 2021
How to Train Your Robot with Deep Reinforcement Learning; Lessons We've Learned
Julian Ibarz
Jie Tan
Chelsea Finn
Mrinal Kalakrishnan
P. Pastor
Sergey Levine
OffRL
162
538
0
04 Feb 2021
Proactive and AoI-aware Failure Recovery for Stateful NFV-enabled Zero-Touch 6G Networks: Model-Free DRL Approach
Amirhossein Shaghaghi
Abolfazl Zakeri
Nader Mokari
M. Javan
M. Behdadfar
Eduard Axel Jorswieck
36
20
0
02 Feb 2021
NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning
Rongjun Qin
Songyi Gao
Xingyuan Zhang
Zhen Xu
Shengkai Huang
Zewen Li
Weinan Zhang
Yang Yu
OffRL
199
83
0
01 Feb 2021
Scalable Voltage Control using Structure-Driven Hierarchical Deep Reinforcement Learning
Sayak Mukherjee
Renke Huang
Qiuhua Huang
T. Vu
Tianzhixi Yin
36
7
0
29 Jan 2021
Meta-Reinforcement Learning for Reliable Communication in THz/VLC Wireless VR Networks
Yining Wang
Mingzhe Chen
Zhaohui Yang
Walid Saad
T. Luo
Shuguang Cui
H. Vincent Poor
OffRL
77
32
0
29 Jan 2021
Weakly Supervised Neuro-Symbolic Module Networks for Numerical Reasoning
Amrita Saha
Shafiq Joty
Guosheng Lin
NAI
AIMat
LRM
59
20
0
28 Jan 2021
Reinforcement Learning for Selective Key Applications in Power Systems: Recent Advances and Future Challenges
Xin Chen
Guannan Qu
Yujie Tang
S. Low
Na Li
86
241
0
27 Jan 2021
Finite Sample Analysis of Two-Time-Scale Natural Actor-Critic Algorithm
S. Khodadadian
Thinh T. Doan
Justin Romberg
S. T. Maguluri
99
43
0
26 Jan 2021
Advances and Challenges in Conversational Recommender Systems: A Survey
Chongming Gao
Wenqiang Lei
Xiangnan He
Maarten de Rijke
Tat-Seng Chua
255
284
0
23 Jan 2021
Differentiable Trust Region Layers for Deep Reinforcement Learning
Fabian Otto
P. Becker
Ngo Anh Vien
Hanna Ziesche
Gerhard Neumann
OffRL
76
19
0
22 Jan 2021
Robust Reinforcement Learning on State Observations with Learned Optimal Adversary
Huan Zhang
Hongge Chen
Duane S. Boning
Cho-Jui Hsieh
128
169
0
21 Jan 2021
HAMMER: Multi-Level Coordination of Reinforcement Learning Agents via Learned Messaging
Nikunj Gupta
G. Srinivasaraghavan
S. Mohalik
Nishant Kumar
Matthew E. Taylor
55
15
0
18 Jan 2021