1502.05477
v5 (latest)
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing "Trust Region Policy Optimization"
50 / 2,008 papers shown
Title
Boundary-aware Supervoxel-level Iteratively Refined Interactive 3D Image Segmentation with Multi-agent Reinforcement Learning
Chaofan Ma
Qisen Xu
Xiangfeng Wang
Bo Jin
Xiaoyun Zhang
Yanfeng Wang
Ya Zhang
87
22
0
19 Mar 2023
On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits
Weitong Zhang
Jiafan He
Zhiyuan Fan
Q. Gu
145
5
0
16 Mar 2023
Sample-efficient Adversarial Imitation Learning
Dahuin Jung
Hyungyu Lee
Sung-Hoon Yoon
SSL
74
2
0
14 Mar 2023
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Han Zheng
Xufang Luo
Pengfei Wei
Xuan Song
Dongsheng Li
Jing Jiang
OffRL
OnRL
74
24
0
14 Mar 2023
Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization
E. Derman
Yevgeniy Men
Matthieu Geist
Shie Mannor
66
2
0
12 Mar 2023
Policy Mirror Descent Inherently Explores Action Space
Yan Li
Guanghui Lan
OffRL
129
8
0
08 Mar 2023
Model-based Constrained MDP for Budget Allocation in Sequential Incentive Marketing
Shuai Xiao
Le Guo
Zaifan Jiang
Lei Lv
Yuanbo Chen
Jun Zhu
Shuang Yang
72
21
0
02 Mar 2023
DSL-Assembly: A Robust and Safe Assembly Strategy
Yi Liu
OffRL
43
0
0
21 Feb 2023
Understanding the effect of varying amounts of replay per step
A. Paul
Videh Raj Nema
57
0
0
20 Feb 2023
Reinforcement Learning with Function Approximation: From Linear to Nonlinear
Jihao Long
Jiequn Han
72
6
0
20 Feb 2023
Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning
Yunke Wang
Bo Du
Chang Xu
80
9
0
13 Feb 2023
Graph Learning Based Decision Support for Multi-Aircraft Take-Off and Landing at Urban Air Mobility Vertiports
Prajit K. Kumar
Jhoel Witter
Steve Paul
Karthik Dantu
Souma Chowdhury
50
3
0
12 Feb 2023
Target-based Surrogates for Stochastic Optimization
J. Lavington
Sharan Vaswani
Reza Babanezhad
Mark Schmidt
Nicolas Le Roux
104
6
0
06 Feb 2023
Sample Dropout: A Simple yet Effective Variance Reduction Technique in Deep Policy Optimization
Zichuan Lin
Xiapeng Wu
Mingfei Sun
Deheng Ye
Qiang Fu
Wei Yang
Wei Liu
108
3
0
05 Feb 2023
Online Reinforcement Learning in Non-Stationary Context-Driven Environments
Pouya Hamadanian
Arash Nasr-Esfahany
Malte Schwarzkopf
Siddartha Sen
MohammadIman Alizadeh
CLL
OffRL
159
0
0
04 Feb 2023
User-centric Heterogeneous-action Deep Reinforcement Learning for Virtual Reality in the Metaverse over Wireless Networks
Wen-li Yu
Terence Jie Chua
Junfeng Zhao
EgoV
118
18
0
03 Feb 2023
A general Markov decision process formalism for action-state entropy-regularized reward maximization
D. Grytskyy
Jorge Ramírez-Ruiz
R. Moreno-Bote
88
3
0
02 Feb 2023
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
Akhil Agnihotri
R. Jain
Haipeng Luo
112
2
0
02 Feb 2023
Distillation Policy Optimization
Jianfei Ma
OffRL
101
1
0
01 Feb 2023
Optimizing DDPM Sampling with Shortcut Fine-Tuning
Ying Fan
Kangwook Lee
116
60
0
31 Jan 2023
PAC-Bayesian Soft Actor-Critic Learning
Bahareh Tasdighi
Abdullah Akgul
Manuel Haussmann
Kenny Kazimirzak Brink
M. Kandemir
118
4
0
30 Jan 2023
Learning the Kalman Filter with Fine-Grained Sample Complexity
Xiangyuan Zhang
Bin Hu
Tamer Başar
93
16
0
30 Jan 2023
Multi-Agent Interplay in a Competitive Survival Environment
Andrea Fanti
69
0
0
19 Jan 2023
DIRECT: Learning from Sparse and Shifting Rewards using Discriminative Reward Co-Training
Philipp Altmann
Thomy Phan
Fabian Ritz
Thomas Gabor
Claudia Linnhoff-Popien
OffRL
63
1
0
18 Jan 2023
DRL-VO: Learning to Navigate Through Crowded Dynamic Scenes Using Velocity Obstacles
Zhanteng Xie
P. Dames
113
68
0
16 Jan 2023
Deep Reinforcement Learning for Autonomous Ground Vehicle Exploration Without A-Priori Maps
Shathushan Sivashangaran
A. Eskandarian
62
4
0
10 Jan 2023
Actor-Director-Critic: A Novel Deep Reinforcement Learning Framework
Zongwei Liu
Yonghong Song
Yuanlin Zhang
OffRL
84
3
0
10 Jan 2023
Transformers as Policies for Variable Action Environments
Niklas Zwingenberger
37
2
0
09 Jan 2023
Extreme Q-Learning: MaxEnt RL without Entropy
Divyansh Garg
Joey Hejna
Matthieu Geist
Stefano Ermon
OffRL
90
80
0
05 Jan 2023
Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization
C. Shi
Zhengling Qi
Jianing Wang
Fan Zhou
OffRL
66
6
0
05 Jan 2023
Scalable Communication for Multi-Agent Reinforcement Learning via Transformer-Based Email Mechanism
Xudong Guo
Daming Shi
Wenhui Fan
63
6
0
05 Jan 2023
Learning a Generic Value-Selection Heuristic Inside a Constraint Programming Solver
Tom Marty
Tristan François
Pierre Tessier
Louis Gautier
Louis-Martin Rousseau
Quentin Cappart
106
7
0
05 Jan 2023
Deep Spectral Q-learning with Application to Mobile Health
Yuhe Gao
C. Shi
R. Song
75
0
0
03 Jan 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
198
36
0
01 Jan 2023
Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search
Wenqing Zheng
S. Sharan
Zhiwen Fan
Kevin Wang
Yihan Xi
Zhangyang Wang
102
10
0
30 Dec 2022
Transformer in Transformer as Backbone for Deep Reinforcement Learning
Hangyu Mao
Rui Zhao
Hao Chen
Jianye Hao
Yiqun Chen
Dong Li
Junge Zhang
Zhen Xiao
OffRL
95
8
0
30 Dec 2022
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Qiyang Li
Yuexiang Zhai
Yi-An Ma
Sergey Levine
120
16
0
24 Dec 2022
An Adaptive Deep RL Method for Non-Stationary Environments with Piecewise Stable Context
Xiaoyu Chen
Geng Chen
Yufeng Zheng
Pushi Zhang
Li Zhao
...
Peng Cheng
Y. Xiong
Tao Qin
Jianyu Chen
Tie-Yan Liu
OffRL
108
13
0
24 Dec 2022
NARS vs. Reinforcement learning: ONA vs. Q-Learning
Ali Beikmohammadi
113
0
0
23 Dec 2022
Local Policy Improvement for Recommender Systems
Dawen Liang
N. Vlassis
OffRL
54
5
0
22 Dec 2022
Lifelong Reinforcement Learning with Modulating Masks
Eseoghene Ben-Iwhiwhu
Saptarshi Nath
Praveen K. Pilly
Soheil Kolouri
Andrea Soltoggio
CLL
OffRL
100
23
0
21 Dec 2022
Risk-Sensitive Reinforcement Learning with Exponential Criteria
Erfaun Noorani
Christos N. Mavridis
John S. Baras
101
9
0
18 Dec 2022
Latent Variable Representation for Reinforcement Learning
Zhaolin Ren
Chenjun Xiao
Tianjun Zhang
Na Li
Zhaoran Wang
Sujay Sanghavi
Dale Schuurmans
Bo Dai
OffRL
106
10
0
17 Dec 2022
Driver Assistance Eco-driving and Transmission Control with Deep Reinforcement Learning
Lindsey Kerbel
B. Ayalew
Andrej Ivanco
K. Loiselle
OffRL
48
8
0
15 Dec 2022
Robust Policy Optimization in Deep Reinforcement Learning
Md Masudur Rahman
Yexiang Xue
53
12
0
14 Dec 2022
Efficient Exploration in Resource-Restricted Reinforcement Learning
Zhihai Wang
Taoxing Pan
Qi Zhou
Jie Wang
OffRL
54
12
0
14 Dec 2022
Proximal Policy Optimization Based Reinforcement Learning for Joint Bidding in Energy and Frequency Regulation Markets
M. Anwar
Changlong Wang
F. D. Nijs
Hao Wang
23
14
0
13 Dec 2022
PPO-UE: Proximal Policy Optimization via Uncertainty-Aware Exploration
Qisheng Zhang
Zhen Guo
A. Jøsang
Lance M. Kaplan
F. Chen
Dong-Ho Jeong
Jin-Hee Cho
54
0
0
13 Dec 2022
Variance-Reduced Conservative Policy Iteration
Naman Agarwal
Brian Bullins
Karan Singh
64
3
0
12 Dec 2022
Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning
Chen Chen
Yuchen Hu
Qiang Zhang
Heqing Zou
Beier Zhu
Eng Siong Chng
90
28
0
10 Dec 2022