1502.05477
Cited By
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing "Trust Region Policy Optimization"
50 / 2,012 papers shown
Title
Implicit Two-Tower Policies
Yunfan Zhao
Qingkai Pan
K. Choromanski
Deepali Jain
Vikas Sindhwani
OffRL
131
3
0
02 Aug 2022
Unified Automatic Control of Vehicular Systems with Reinforcement Learning
Zhongxia Yan
Abdul Rahman Kreidieh
Eugene Vinitsky
Alexandre M. Bayen
Cathy Wu
AI4CE
89
44
0
30 Jul 2022
Improved Policy Optimization for Online Imitation Learning
J. Lavington
Sharan Vaswani
Mark Schmidt
OffRL
94
6
0
29 Jul 2022
JDRec: Practical Actor-Critic Framework for Online Combinatorial Recommender System
Xin Zhao
Zhiwei Fang
Yuchen Guo
Jie He
Wenlong Chen
Changping Peng
33
0
0
27 Jul 2022
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
Shuang Qiu
Xiaohan Wei
Jieping Ye
Zhaoran Wang
Zhuoran Yang
OffRL
70
12
0
25 Jul 2022
Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)
Bojun Huang
63
1
0
22 Jul 2022
Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction
Chia-Chi Chuang
Donglin Yang
Chuan Wen
Yang Gao
SSL
128
12
0
20 Jul 2022
Minimum Description Length Control
Theodore H. Moskovitz
Ta-Chu Kao
M. Sahani
M. Botvinick
80
1
0
17 Jul 2022
Asset Allocation: From Markowitz to Deep Reinforcement Learning
Ricard Durall
47
6
0
14 Jul 2022
Learning robust marking policies for adaptive mesh refinement
A. Gillette
B. Keith
S. Petrides
70
11
0
13 Jul 2022
Brick Tic-Tac-Toe: Exploring the Generalizability of AlphaZero to Novel Test Environments
John Tan Chong Min
Mehul Motani
63
1
0
13 Jul 2022
HTRON: Efficient Outdoor Navigation with Sparse Rewards via Heavy Tailed Adaptive Reinforce Algorithm
K. Weerakoon
Souradip Chakraborty
N. Karapetyan
A. Sathyamoorthy
Amrit Singh Bedi
Tianyi Zhou
86
14
0
08 Jul 2022
Deep Learning Approaches to Grasp Synthesis: A Review
Rhys Newbury
Morris Gu
Lachlan Chumbley
Arsalan Mousavian
Clemens Eppner
...
A. Morales
Tamim Asfour
Danica Kragic
Dieter Fox
Akansel Cosgun
152
172
0
06 Jul 2022
Learning fast and agile quadrupedal locomotion over complex terrain
Xu Chang
Zhitong Zhang
Honglei An
Hongxu Ma
Qing Wei
60
0
0
02 Jul 2022
Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings
Jorge Armando Mendez Mendez
Alborz Geramifard
Mohammad Ghavamzadeh
Bing-Quan Liu
OffRL
62
6
0
01 Jul 2022
Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse
James Queeney
I. Paschalidis
Christos G. Cassandras
OffRL
76
2
0
28 Jun 2022
Auto-Encoding Adversarial Imitation Learning
Kaifeng Zhang
Rui Zhao
Ziming Zhang
Yang Gao
100
1
0
22 Jun 2022
Imitate then Transcend: Multi-Agent Optimal Execution with Dual-Window Denoise PPO
Jin Fang
Jiacheng Weng
Yi Xiang
Xinwen Zhang
OffRL
89
2
0
21 Jun 2022
Model-Based Imitation Learning Using Entropy Regularization of Model and Policy
E. Uchibe
53
4
0
21 Jun 2022
DNA: Proximal Policy Optimization with a Dual Network Architecture
Mathew H. Aitchison
Penny Sweetser
OffRL
69
4
0
20 Jun 2022
Constrained Reinforcement Learning for Robotics via Scenario-Based Programming
Davide Corsi
Raz Yerushalmi
Guy Amir
Alessandro Farinelli
D. Harel
Guy Katz
81
20
0
20 Jun 2022
A Survey on Model-based Reinforcement Learning
Fan Luo
Tian Xu
Hang Lai
Xiong-Hui Chen
Weinan Zhang
Yang Yu
OffRL
LRM
125
111
0
19 Jun 2022
Robust Imitation Learning against Variations in Environment Dynamics
Jongseong Chae
Seungyul Han
Whiyoung Jung
Myungsik Cho
Sungho Choi
Young-Jin Sung
OOD
72
21
0
19 Jun 2022
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Yuanpei Chen
Tianhao Wu
Shengjie Wang
Xidong Feng
Jiechuan Jiang
...
Yiran Geng
Hao Dong
Zongqing Lu
Song-Chun Zhu
Yaodong Yang
OffRL
140
117
0
17 Jun 2022
A Search-Based Testing Approach for Deep Reinforcement Learning Agents
Amirhossein Zolfagharian
Manel Abdellatif
Lionel C. Briand
M. Bagherzadeh
Ramesh S
117
27
0
15 Jun 2022
Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning
Xiaoteng Ma
Shuai Ma
Li Xia
Qianchuan Zhao
92
3
0
15 Jun 2022
Robust Reinforcement Learning with Distributional Risk-averse formulation
Pierre Clavier
S. Allassonnière
E. L. Pennec
OOD
82
7
0
14 Jun 2022
Relative Policy-Transition Optimization for Fast Policy Transfer
Jiawei Xu
Cheng Zhou
Yizheng Zhang
Zhengyou Zhang
Lei Han
52
0
0
13 Jun 2022
Rare event failure test case generation in Learning-Enabled-Controllers
H. Vardhan
J. Sztipanovits
68
20
0
11 Jun 2022
Large-Scale Retrieval for Reinforcement Learning
Peter C. Humphreys
A. Guez
O. Tieleman
Laurent Sifre
T. Weber
Timothy Lillicrap
RALM
OffRL
84
27
0
10 Jun 2022
Multifidelity Reinforcement Learning with Control Variates
Sami Khairy
Prasanna Balaprakash
OffRL
71
5
0
10 Jun 2022
Adversarial Counterfactual Environment Model Learning
Xiong-Hui Chen
Yang Yu
Zhenghong Zhu
Zhihua Yu
Zhen-Yu Chen
...
Yinan Wu
Hongqiu Wu
Rongjun Qin
Rui Ding
Fangsheng Huang
CML
OffRL
91
12
0
10 Jun 2022
Towards Safe Reinforcement Learning via Constraining Conditional Value-at-Risk
Chengyang Ying
Xinning Zhou
Hang Su
Dong Yan
Ning Chen
Jun Zhu
79
44
0
09 Jun 2022
Simplifying Polylogarithms with Machine Learning
Aurélien Dersy
M. Schwartz
Xiao-Yan Zhang
AI4CE
218
16
0
08 Jun 2022
Constrained Imitation Learning for a Flapping Wing Unmanned Aerial Vehicle
T. K C
Taeyoung Lee
64
2
0
08 Jun 2022
Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance
Jakob J. Hollenstein
Sayantan Auddy
Matteo Saveriano
Erwan Renaudo
J. Piater
88
21
0
08 Jun 2022
Real2Sim or Sim2Real: Robotics Visual Insertion using Deep Reinforcement Learning and Real2Sim Policy Adaptation
Yiwen Chen
Xue-Yong Li
Sheng Guo
Xiang Yao Ng
Marcelo H. Ang Jr
52
5
0
06 Jun 2022
Policy Optimization for Markov Games: Unified Framework and Faster Convergence
Runyu Zhang
Qinghua Liu
Haiquan Wang
Caiming Xiong
Na Li
Yu Bai
87
26
0
06 Jun 2022
Markovian Interference in Experiments
Vivek F. Farias
Andrew A. Li
Tianyi Peng
Andrew Zheng
OffRL
88
33
0
06 Jun 2022
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs
Dongsheng Ding
Jianchao Tan
Jiali Duan
Tamer Başar
Mihailo R. Jovanović
79
21
0
06 Jun 2022
Learning Dynamics and Generalization in Reinforcement Learning
Clare Lyle
Mark Rowland
Will Dabney
Marta Z. Kwiatkowska
Y. Gal
OOD
OffRL
81
14
0
05 Jun 2022
Algorithm for Constrained Markov Decision Process with Linear Convergence
E. Gladin
Maksim Lavrik-Karmazin
K. Zainullina
Varvara Rudenko
Alexander V. Gasnikov
Martin Takáč
84
7
0
03 Jun 2022
HEX: Human-in-the-loop Explainability via Deep Reinforcement Learning
Michael T. Lash
93
0
0
02 Jun 2022
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
Tomasz Korbak
Hady ElSahar
Germán Kruszewski
Marc Dymetman
CLL
105
57
0
01 Jun 2022
Know Your Boundaries: The Necessity of Explicit Behavioral Cloning in Offline RL
Wonjoon Goo
S. Niekum
OffRL
96
20
0
01 Jun 2022
Control of Two-way Coupled Fluid Systems with Differentiable Solvers
B. Ramos
Felix Trost
Nils Thuerey
AI4CE
70
6
0
01 Jun 2022
RLSS: A Deep Reinforcement Learning Algorithm for Sequential Scene Generation
Azimkhon Ostonov
Peter Wonka
D. L. Michels
59
4
0
01 Jun 2022
Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL
Siyi Hu
Chuanlong Xie
Xiaodan Liang
Xiaojun Chang
58
22
0
01 Jun 2022
Learning to Use Chopsticks in Diverse Gripping Styles
Zeshi Yang
KangKang Yin
Libin Liu
150
30
0
28 May 2022
Regret-Aware Black-Box Optimization with Natural Gradients, Trust-Regions and Entropy Control
Maximilian Hüttenrauch
Gerhard Neumann
67
1
0
24 May 2022