1502.05477
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing "Trust Region Policy Optimization"
50 / 2,009 papers shown
Title
Model-based trajectory stitching for improved behavioural cloning and its applications
Charles A. Hepburn
Giovanni Montana
OffRL
85
7
0
08 Dec 2022
Design and Planning of Flexible Mobile Micro-Grids Using Deep Reinforcement Learning
Cesare Caputo
Michel-Alexandre Cardin
Pudong Ge
Fei Teng
A. Korre
Ehecatl Antonio del Rio Chanona
52
18
0
08 Dec 2022
Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble
Chong Li
OffRL
69
1
0
07 Dec 2022
Reinforcement learning with Demonstrations from Mismatched Task under Sparse Reward
Yanjiang Guo
Jingyue Gao
Zheng Wu
Chengming Shi
Jianyu Chen
OffRL
90
5
0
03 Dec 2022
KRLS: Improving End-to-End Response Generation in Task Oriented Dialog with Reinforced Keywords Learning
Xiao Yu
Qingyang Wu
Kun Qian
Zhou Yu
OffRL
72
12
0
30 Nov 2022
Quantile Constrained Reinforcement Learning: A Reinforcement Learning Framework Constraining Outage Probability
Whiyoung Jung
Myungsik Cho
Jongeui Park
Young-Jin Sung
85
4
0
28 Nov 2022
A Critical Review of Traffic Signal Control and A Novel Unified View of Reinforcement Learning and Model Predictive Control Approaches for Adaptive Traffic Signal Control
Xiaoyu Wang
Scott Sanner
Baher Abdulhai
69
5
0
26 Nov 2022
Representation Learning for Continuous Action Spaces is Beneficial for Efficient Policy Learning
Tingting Zhao
Ying Wang
Weidong Sun
Yarui Chen
Gang Niu
Masashi Sugiyama
68
1
0
23 Nov 2022
Model-based Trajectory Stitching for Improved Offline Reinforcement Learning
Charles A. Hepburn
Giovanni Montana
OffRL
88
14
0
21 Nov 2022
An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods
Yanli Liu
Kaiqing Zhang
Tamer Basar
W. Yin
117
110
0
15 Nov 2022
Redeeming Intrinsic Rewards via Constrained Optimization
Eric Chen
Zhang-Wei Hong
Joni Pajarinen
Pulkit Agrawal
OnRL
111
27
0
14 Nov 2022
Out-of-Dynamics Imitation Learning from Multimodal Demonstrations
Yiwen Qiu
Jialong Wu
Zhangjie Cao
Mingsheng Long
63
4
0
13 Nov 2022
A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges
Yunpeng Qing
Shunyu Liu
Jie Song
Huiqiong Wang
Mingli Song
XAI
93
1
0
12 Nov 2022
Job Scheduling in Datacenters using Constraint Controlled RL
V. Venkataswamy
43
1
0
10 Nov 2022
Max-Min Off-Policy Actor-Critic Method Focusing on Worst-Case Robustness to Model Misspecification
Takumi Tanabe
Reimi Sato
Kazuto Fukuchi
Jun Sakuma
Youhei Akimoto
OffRL
75
11
0
07 Nov 2022
Mixline: A Hybrid Reinforcement Learning Framework for Long-horizon Bimanual Coffee Stirring Task
Zheng Sun
Zhiqi Wang
Junjia Liu
Miao Li
Fei Chen
60
4
0
04 Nov 2022
A Survey on Reinforcement Learning in Aviation Applications
Pouria Razzaghi
Amin Tabrizian
Wei Guo
Shulu Chen
Abenezer Taye
Ellis E. Thompson
Alexis Bregeon
Ali Baheri
Peng Wei
OffRL
57
56
0
03 Nov 2022
Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints
Anika Singh
Aviral Kumar
Q. Vuong
Yevgen Chebotar
Sergey Levine
OffRL
62
14
0
02 Nov 2022
Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning
Riashat Islam
Hongyu Zang
Anirudh Goyal
Alex Lamb
Kenji Kawaguchi
Xin-hui Li
Romain Laroche
Yoshua Bengio
Rémi Tachet des Combes
OffRL
AI4CE
108
11
0
01 Nov 2022
Imitating Opponent to Win: Adversarial Policy Imitation Learning in Two-player Competitive Games
Viet The Bui
Tien Mai
T. Nguyen
AAML
115
5
0
30 Oct 2022
Goal Exploration Augmentation via Pre-trained Skills for Sparse-Reward Long-Horizon Goal-Conditioned Reinforcement Learning
Lisheng Wu
Ke Chen
81
4
0
28 Oct 2022
Learning on the Job: Self-Rewarding Offline-to-Online Finetuning for Industrial Insertion of Novel Connectors from Vision
Ashvin Nair
Brian Zhu
Gokul Narayanan
Eugen Solowjow
Sergey Levine
OffRL
OnRL
136
16
0
27 Oct 2022
Characterising the Robustness of Reinforcement Learning for Continuous Control using Disturbance Injection
Catherine R. Glossop
Jacopo Panerati
A. Krishnan
Zhaocong Yuan
Angela P. Schoellig
81
6
0
27 Oct 2022
Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook
Baihan Lin
OffRL
AI4TS
135
27
0
24 Oct 2022
Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence
S. Pattathil
Kaiqing Zhang
Asuman Ozdaglar
99
14
0
23 Oct 2022
Solving Continuous Control via Q-learning
Tim Seyde
Peter Werner
Wilko Schwarting
Igor Gilitschenski
Martin Riedmiller
Daniela Rus
Markus Wulfmeier
OffRL
LRM
92
23
0
22 Oct 2022
Probing Transfer in Deep Reinforcement Learning without Task Engineering
Andrei A. Rusu
Sebastian Flennerhag
Dushyant Rao
Razvan Pascanu
R. Hadsell
74
6
0
22 Oct 2022
Deep Reinforcement Learning for Stabilization of Large-scale Probabilistic Boolean Networks
S. Moschoyiannis
Evangelos Chatzaroulas
Vytenis Sliogeris
Yuhu Wu
BDL
OffRL
AI4CE
62
8
0
21 Oct 2022
RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control
Yanfei Xiang
Xin Wang
Shu Hu
Bin Zhu
Xiaomeng Huang
Xi Wu
Siwei Lyu
SSL
102
5
0
20 Oct 2022
Safe Policy Improvement in Constrained Markov Decision Processes
Luigi Berducci
Radu Grosu
OffRL
111
2
0
20 Oct 2022
Task Phasing: Automated Curriculum Learning from Demonstrations
Vaibhav Bajaj
Guni Sharon
Peter Stone
77
8
0
20 Oct 2022
Topology Optimization via Machine Learning and Deep Learning: A Review
S. Shin
Dongju Shin
Namwoo Kang
AI4CE
88
69
0
19 Oct 2022
Proximal Learning With Opponent-Learning Awareness
S. Zhao
Chris Xiaoxuan Lu
Roger C. Grosse
Jakob N. Foerster
83
21
0
18 Oct 2022
Entropy Regularized Reinforcement Learning with Cascading Networks
R. D. Vecchia
Alena Shilova
Philippe Preux
R. Akrour
34
2
0
16 Oct 2022
When to Update Your Model: Constrained Model-based Reinforcement Learning
Tianying Ji
Yu-Juan Luo
Gang Hua
Mingxuan Jing
Fengxiang He
Wen-bing Huang
84
19
0
15 Oct 2022
A Policy-Guided Imitation Approach for Offline Reinforcement Learning
Haoran Xu
Li Jiang
Jianxiong Li
Xianyuan Zhan
OffRL
157
64
0
15 Oct 2022
DyFEn: Agent-Based Fee Setting in Payment Channel Networks
Kian Asgari
Aida Mohammadian
M. Tefagh
31
7
0
15 Oct 2022
Just Round: Quantized Observation Spaces Enable Memory Efficient Learning of Dynamic Locomotion
Lev Grossman
Brian Plancher
MQ
69
4
0
14 Oct 2022
A Scalable Finite Difference Method for Deep Reinforcement Learning
Matthew Allen
John C. Raisbeck
Hakho Lee
61
0
0
14 Oct 2022
Mutual Information Regularized Offline Reinforcement Learning
Xiao Ma
Bingyi Kang
Zhongwen Xu
Min Lin
Shuicheng Yan
OffRL
101
8
0
14 Oct 2022
A Concise Introduction to Reinforcement Learning in Robotics
Akash Nagaraj
Mukund Sood
B. Patil
40
22
0
13 Oct 2022
Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations
N. Vadori
Leo Ardon
Sumitra Ganesh
Thomas Spooner
Selim Amrouni
Jared Vann
Mengda Xu
Zeyu Zheng
T. Balch
Manuela Veloso
75
17
0
13 Oct 2022
Observed Adversaries in Deep Reinforcement Learning
Eugene Lim
Harold Soh
AAML
26
0
0
13 Oct 2022
Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
Yuda Song
Yi Zhou
Ayush Sekhari
J. Andrew Bagnell
A. Krishnamurthy
Wen Sun
OffRL
OnRL
97
105
0
13 Oct 2022
Real World Offline Reinforcement Learning with Realistic Data Source
G. Zhou
Liyiming Ke
S. Srinivasa
Abhi Gupta
Aravind Rajeswaran
Vikash Kumar
OffRL
92
23
0
12 Oct 2022
Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning
Yongyuan Liang
Yanchao Sun
Ruijie Zheng
Furong Huang
OOD
AAML
OffRL
48
51
0
12 Oct 2022
Discovered Policy Optimisation
Chris Xiaoxuan Lu
J. Kuba
Alistair Letcher
Luke Metz
Christian Schroeder de Witt
Jakob N. Foerster
OffRL
111
79
0
11 Oct 2022
DHRL: A Graph-Based Approach for Long-Horizon and Sparse Hierarchical Reinforcement Learning
Seungjae Lee
Jigang Kim
Inkyu Jang
H. J. Kim
OffRL
105
13
0
11 Oct 2022
Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies
Bin Hu
Kaiqing Zhang
Na Li
M. Mesbahi
Maryam Fazel
Tamer Basar
167
27
0
10 Oct 2022
Traffic-Aware Autonomous Driving with Differentiable Traffic Simulation
L. Zheng
Sanghyun Son
Ming-Chyuan Lin
129
3
0
07 Oct 2022