1502.05477
Cited By
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing
"Trust Region Policy Optimization"
50 / 3,098 papers shown
Title
Tuning Synaptic Connections instead of Weights by Genetic Algorithm in Spiking Policy Network
Duzhen Zhang
Tielin Zhang
Shuncheng Jia
Qingyu Wang
Bo Xu
OffRL
180
5
0
29 Dec 2022
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Qiyang Li
Yuexiang Zhai
Yi Ma
Sergey Levine
37
14
0
24 Dec 2022
An Adaptive Deep RL Method for Non-Stationary Environments with Piecewise Stable Context
Xiaoyu Chen
Xiangming Zhu
Yufeng Zheng
Pushi Zhang
Li Zhao
...
Peng Cheng
Y. Xiong
Tao Qin
Jianyu Chen
Tie-Yan Liu
OffRL
16
11
0
24 Dec 2022
NARS vs. Reinforcement learning: ONA vs. Q-Learning
Ali Beikmohammadi
21
0
0
23 Dec 2022
Local Policy Improvement for Recommender Systems
Dawen Liang
N. Vlassis
OffRL
21
3
0
22 Dec 2022
Lifelong Reinforcement Learning with Modulating Masks
Eseoghene Ben-Iwhiwhu
Saptarshi Nath
Praveen K. Pilly
Soheil Kolouri
Andrea Soltoggio
CLL
OffRL
37
20
0
21 Dec 2022
Policy Gradient in Robust MDPs with Global Convergence Guarantee
Qiuhao Wang
C. Ho
Marek Petrik
27
24
0
20 Dec 2022
Risk-Sensitive Reinforcement Learning with Exponential Criteria
Erfaun Noorani
Christos N. Mavridis
John S. Baras
30
8
0
18 Dec 2022
Cognitive Level-k Meta-Learning for Safe and Pedestrian-Aware Autonomous Driving
Haozhe Lei
Quanyan Zhu
28
0
0
17 Dec 2022
Latent Variable Representation for Reinforcement Learning
Tongzheng Ren
Chenjun Xiao
Tianjun Zhang
Na Li
Zhaoran Wang
Sujay Sanghavi
Dale Schuurmans
Bo Dai
OffRL
33
10
0
17 Dec 2022
An Energy-aware and Fault-tolerant Deep Reinforcement Learning based approach for Multi-agent Patrolling Problems
C. Tong
Aaron Harwood
Maria A. Rodriguez
Richard Sinnott
24
1
0
16 Dec 2022
Driver Assistance Eco-driving and Transmission Control with Deep Reinforcement Learning
Lindsey Kerbel
B. Ayalew
Andrej Ivanco
K. Loiselle
OffRL
24
8
0
15 Dec 2022
Robust Policy Optimization in Deep Reinforcement Learning
Md Masudur Rahman
Yexiang Xue
25
8
0
14 Dec 2022
Efficient Exploration in Resource-Restricted Reinforcement Learning
Zhihai Wang
Taoxing Pan
Qi Zhou
Jie Wang
OffRL
20
10
0
14 Dec 2022
Proximal Policy Optimization Based Reinforcement Learning for Joint Bidding in Energy and Frequency Regulation Markets
M. Anwar
Changlong Wang
F. D. Nijs
Hao Wang
21
12
0
13 Dec 2022
PPO-UE: Proximal Policy Optimization via Uncertainty-Aware Exploration
Qisheng Zhang
Zhen Guo
A. Jøsang
Lance M. Kaplan
F. Chen
Dong-Ho Jeong
Jin-Hee Cho
25
0
0
13 Dec 2022
Variance-Reduced Conservative Policy Iteration
Naman Agarwal
Brian Bullins
Karan Singh
32
3
0
12 Dec 2022
Molecular Graph Generation by Decomposition and Reassembling
Masatsugu Yamada
M. Sugiyama
27
4
0
11 Dec 2022
Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning
Chen Chen
Yuchen Hu
Qiang Zhang
Heqing Zou
Beier Zhu
Eng Siong Chng
33
26
0
10 Dec 2022
Model-based trajectory stitching for improved behavioural cloning and its applications
Charles A. Hepburn
Giovanni Montana
OffRL
34
5
0
08 Dec 2022
Design and Planning of Flexible Mobile Micro-Grids Using Deep Reinforcement Learning
Cesare Caputo
Michel-Alexandre Cardin
Pudong Ge
Fei Teng
A. Korre
Ehecatl Antonio del Rio Chanona
19
18
0
08 Dec 2022
Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble
Chong Li
OffRL
32
0
0
07 Dec 2022
Reinforcement learning with Demonstrations from Mismatched Task under Sparse Reward
Yanjiang Guo
Jingyue Gao
Zheng Wu
Chengming Shi
Jianyu Chen
OffRL
26
4
0
03 Dec 2022
Launchpad: Learning to Schedule Using Offline and Online RL Methods
V. Venkataswamy
J. E. Grigsby
A. Grimshaw
Yanjun Qi
OffRL
OnRL
24
1
0
01 Dec 2022
KRLS: Improving End-to-End Response Generation in Task Oriented Dialog with Reinforced Keywords Learning
Xiao Yu
Qingyang Wu
Kun Qian
Zhou Yu
OffRL
21
11
0
30 Nov 2022
Relative Sparsity for Medical Decision Problems
Samuel J. Weisenthal
Sally W. Thurston
Ashkan Ertefaie
30
2
0
29 Nov 2022
PAC-Bayes Bounds for Bandit Problems: A Survey and Experimental Comparison
H. Flynn
David Reeb
M. Kandemir
Jan Peters
OffRL
19
7
0
29 Nov 2022
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
Jiachen Li
Edwin Zhang
Ming Yin
Qinxun Bai
Yu Wang
William Yang Wang
OffRL
39
15
0
29 Nov 2022
Quantile Constrained Reinforcement Learning: A Reinforcement Learning Framework Constraining Outage Probability
Whiyoung Jung
Myungsik Cho
Jongeui Park
Young-Jin Sung
38
4
0
28 Nov 2022
A Critical Review of Traffic Signal Control and A Novel Unified View of Reinforcement Learning and Model Predictive Control Approaches for Adaptive Traffic Signal Control
Xiaoyu Wang
Scott Sanner
Baher Abdulhai
22
5
0
26 Nov 2022
Explainable and Safe Reinforcement Learning for Autonomous Air Mobility
Lei Wang
Hongyu Yang
Yi Lin
S. Yin
Yuankai Wu
6
5
0
24 Nov 2022
Representation Learning for Continuous Action Spaces is Beneficial for Efficient Policy Learning
Tingting Zhao
Ying Wang
Weidong Sun
Yarui Chen
Gang Niu
Masashi Sugiyama
19
1
0
23 Nov 2022
Model-based Trajectory Stitching for Improved Offline Reinforcement Learning
Charles A. Hepburn
Giovanni Montana
OffRL
34
13
0
21 Nov 2022
Automating Rigid Origami Design
Jeremia Geiger
Karolis Martinkus
Oliver Richter
Roger Wattenhofer
19
1
0
20 Nov 2022
An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods
Yanli Liu
Kaipeng Zhang
Tamer Basar
W. Yin
48
102
0
15 Nov 2022
Redeeming Intrinsic Rewards via Constrained Optimization
Eric Chen
Zhang-Wei Hong
Joni Pajarinen
Pulkit Agrawal
OnRL
36
24
0
14 Nov 2022
Out-of-Dynamics Imitation Learning from Multimodal Demonstrations
Yiwen Qiu
Jialong Wu
Zhangjie Cao
Mingsheng Long
31
3
0
13 Nov 2022
A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges
Yunpeng Qing
Shunyu Liu
Jie Song
Huiqiong Wang
Mingli Song
XAI
33
1
0
12 Nov 2022
The Expertise Problem: Learning from Specialized Feedback
Oliver Daniels-Koch
Rachel Freedman
OffRL
41
17
0
12 Nov 2022
Job Scheduling in Datacenters using Constraint Controlled RL
V. Venkataswamy
11
1
0
10 Nov 2022
Max-Min Off-Policy Actor-Critic Method Focusing on Worst-Case Robustness to Model Misspecification
Takumi Tanabe
Reimi Sato
Kazuto Fukuchi
Jun Sakuma
Youhei Akimoto
OffRL
27
8
0
07 Nov 2022
Decentralized Policy Optimization
Kefan Su
Zongqing Lu
21
8
0
06 Nov 2022
Mixline: A Hybrid Reinforcement Learning Framework for Long-horizon Bimanual Coffee Stirring Task
Zheng Sun
Zhiqi Wang
Junjia Liu
Miao Li
Fei Chen
34
4
0
04 Nov 2022
A Survey on Reinforcement Learning in Aviation Applications
Pouria Razzaghi
Amin Tabrizian
Wei Guo
Shulu Chen
Abenezer Taye
Ellis E. Thompson
Alexis Bregeon
Ali Baheri
Peng Wei
OffRL
23
52
0
03 Nov 2022
Geometry and convergence of natural policy gradient methods
Johannes Muller
Guido Montúfar
29
11
0
03 Nov 2022
Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints
Anika Singh
Aviral Kumar
Q. Vuong
Yevgen Chebotar
Sergey Levine
OffRL
32
14
0
02 Nov 2022
Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems
Michael Giegrich
Christoph Reisinger
Yufei Zhang
37
11
0
01 Nov 2022
Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning
Riashat Islam
Hongyu Zang
Anirudh Goyal
Alex Lamb
Kenji Kawaguchi
Xin-hui Li
Romain Laroche
Yoshua Bengio
Rémi Tachet des Combes
OffRL
AI4CE
25
9
0
01 Nov 2022
Teacher-student curriculum learning for reinforcement learning
Yanick Schraner
OffRL
37
2
0
31 Oct 2022
Imitating Opponent to Win: Adversarial Policy Imitation Learning in Two-player Competitive Games
Viet The Bui
Tien Mai
T. Nguyen
AAML
33
5
0
30 Oct 2022