1502.05477
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing "Trust Region Policy Optimization"
50 / 3,098 papers shown
Title
Deep Bayesian Reinforcement Learning for Spacecraft Proximity Maneuvers and Docking
Desong Du
Naiming Qi
Yanfang Liu
Wei Pan
14
0
0
07 Nov 2023
Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization
Kun Lei
Zhengmao He
Chenhao Lu
Kaizhe Hu
Yang Gao
Huazhe Xu
OffRL
OnRL
67
13
0
06 Nov 2023
Active Reasoning in an Open-World Environment
Manjie Xu
Guangyuan Jiang
Weihan Liang
Chi Zhang
Yixin Zhu
LLMAG
LRM
21
10
0
03 Nov 2023
Efficient Symbolic Policy Learning with Differentiable Symbolic Expression
Jiaming Guo
Rui Zhang
Shaohui Peng
Qi Yi
Xingui Hu
...
Zidong Du
Xishan Zhang
Ling Li
Qi Guo
Yunji Chen
OffRL
30
5
0
02 Nov 2023
SCPO: Safe Reinforcement Learning with Safety Critic Policy Optimization
Jaafar Mhamed
Shangding Gu
24
0
0
01 Nov 2023
A Multi-Agent Reinforcement Learning Framework for Evaluating the U.S. Ending the HIV Epidemic Plan
Dinesh Sharma
Ankit Shah
Chaitra Gopalappa
40
0
0
01 Nov 2023
Offline RL with Observation Histories: Analyzing and Improving Sample Complexity
Joey Hong
Anca Dragan
Sergey Levine
OffRL
38
5
0
31 Oct 2023
Information-Theoretic Trust Regions for Stochastic Gradient-Based Optimization
Philipp Dahlinger
P. Becker
Maximilian Hüttenrauch
Gerhard Neumann
21
0
0
31 Oct 2023
Amoeba: Circumventing ML-supported Network Censorship via Adversarial Reinforcement Learning
Haoyu Liu
A. Diallo
P. Patras
AAML
16
3
0
31 Oct 2023
Dropout Strategy in Reinforcement Learning: Limiting the Surrogate Objective Variance in Policy Optimization Methods
Zhengpeng Xie
Changdong Yu
Weizheng Qiao
29
1
0
31 Oct 2023
Network Contention-Aware Cluster Scheduling with Reinforcement Learning
Junyeol Ryu
Jeongyoon Eo
GNN
17
0
0
31 Oct 2023
On the Theory of Risk-Aware Agents: Bridging Actor-Critic and Economics
Michal Nauman
Marek Cygan
40
1
0
30 Oct 2023
Robot Control based on Motor Primitives -- A Comparison of Two Approaches
Moses C. Nah
Johannes Lachner
Neville Hogan
26
3
0
28 Oct 2023
Online Decision Mediation
Daniel Jarrett
Alihan Huyuk
M. Schaar
35
2
0
28 Oct 2023
Deep Reinforcement Learning for Weapons to Targets Assignment in a Hypersonic strike
B. Gaudet
K. Drozd
R. Furfaro
11
1
0
27 Oct 2023
Reward Scale Robustness for Proximal Policy Optimization via DreamerV3 Tricks
Ryan Sullivan
Akarsh Kumar
Shengyi Huang
John P. Dickerson
Joseph Suárez
OffRL
24
5
0
26 Oct 2023
DSAC-C: Constrained Maximum Entropy for Robust Discrete Soft-Actor Critic
Dexter Neo
Tsuhan Chen
30
1
0
26 Oct 2023
Fractal Landscapes in Policy Optimization
Tao Wang
Sylvia Herbert
Sicun Gao
34
5
0
24 Oct 2023
A Doubly Robust Approach to Sparse Reinforcement Learning
Wonyoung Hedge Kim
Garud Iyengar
A. Zeevi
25
3
0
23 Oct 2023
Policy Gradient with Kernel Quadrature
Satoshi Hayakawa
Tetsuro Morimura
OffRL
BDL
32
0
0
23 Oct 2023
Absolute Policy Optimization
Weiye Zhao
Feihan Li
Yifan Sun
Rui Chen
Tianhao Wei
Changliu Liu
52
4
0
20 Oct 2023
MARVEL: Multi-Agent Reinforcement-Learning for Large-Scale Variable Speed Limits
Yuhang Zhang
Marcos Quiñones-Grueiro
Zhiyao Zhang
Yanbing Wang
William Barbour
Gautam Biswas
Dan Work
38
5
0
18 Oct 2023
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes
Washim Uddin Mondal
Vaneet Aggarwal
38
9
0
18 Oct 2023
Quantifying Assistive Robustness Via the Natural-Adversarial Frontier
Jerry Zhi-Yang He
Zackory M. Erickson
Daniel S. Brown
Anca Dragan
AAML
29
0
0
16 Oct 2023
End-to-end Offline Reinforcement Learning for Glycemia Control
Tristan Beolet
Alice Adenis
E. Huneker
Maxime Louis
OffRL
38
1
0
16 Oct 2023
DexCatch: Learning to Catch Arbitrary Objects with Dexterous Hands
Fengbo Lan
Shengjie Wang
Yunzhe Zhang
Haotian Xu
Oluwatosin Oseni
Yang Gao
Tao Zhang
47
5
0
13 Oct 2023
Discovering Fatigued Movements for Virtual Character Animation
N. Cheema
Rui Xu
Nam Hee Kim
Perttu Hämäläinen
Vladislav Golyanik
Marc Habermann
Christian Theobalt
Philipp Slusallek
32
4
0
12 Oct 2023
Discerning Temporal Difference Learning
Jianfei Ma
15
0
0
12 Oct 2023
Accountability in Offline Reinforcement Learning: Explaining Decisions with a Corpus of Examples
Hao Sun
Alihan Huyuk
Daniel Jarrett
M. Schaar
OffRL
39
7
0
11 Oct 2023
Deep Reinforcement Learning for Autonomous Cyber Operations: A Survey
Gregory Palmer
Chris Parry
Daniel J.B. Harrold
Chris Willis
AI4CE
23
1
0
11 Oct 2023
Diversity for Contingency: Learning Diverse Behaviors for Efficient Adaptation and Transfer
Finn Rietz
J. A. Stork
33
0
0
11 Oct 2023
Imitation Learning from Observation with Automatic Discount Scheduling
Yuyang Liu
Weijun Dong
Yingdong Hu
Chuan Wen
Zhao-Heng Yin
Chongjie Zhang
Yang Gao
30
6
0
11 Oct 2023
Imitation Learning from Purified Demonstration
Yunke Wang
Minjing Dong
Bo Du
Chang Xu
31
1
0
11 Oct 2023
Reinforcement Learning in the Era of LLMs: What is Essential? What is needed? An RL Perspective on RLHF, Prompting, and Beyond
Hao Sun
OffRL
34
21
0
09 Oct 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
32
5
0
09 Oct 2023
Improved Communication Efficiency in Federated Natural Policy Gradient via ADMM-based Gradient Updates
Guangchen Lan
Han Wang
James Anderson
Christopher G. Brinton
Vaneet Aggarwal
FedML
32
27
0
09 Oct 2023
Increasing Entropy to Boost Policy Gradient Performance on Personalization Tasks
Andrew Starnes
Anton Dereventsov
Clayton Webster
24
0
0
09 Oct 2023
Distributional Soft Actor-Critic with Three Refinements
Jingliang Duan
Wenxuan Wang
Liming Xiao
Jiaxin Gao
Shengbo Eben Li
Chang Liu
Ya-Qin Zhang
Bo Cheng
Keqiang Li
OODD
OffRL
27
2
0
09 Oct 2023
FP3O: Enabling Proximal Policy Optimization in Multi-Agent Cooperation with Parameter-Sharing Versatility
Lang Feng
Dong Xing
Junru Zhang
Gang Pan
34
1
0
08 Oct 2023
Safe Deep Policy Adaptation
Wenli Xiao
Tairan He
John M. Dolan
Guanya Shi
34
9
0
08 Oct 2023
Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning
Jiayu Chen
Zelai Xu
Yunfei Li
Chao Yu
Jiaming Song
Huazhong Yang
Fei Fang
Yu Wang
Yi Wu
34
4
0
07 Oct 2023
Terrain-Aware Quadrupedal Locomotion via Reinforcement Learning
Hao-bin Shi
Qing Zhu
Lei Han
Wanchao Chi
Tingguang Li
Max Q.-H. Meng
40
3
0
07 Oct 2023
Deep Model Predictive Optimization
Jacob Sacks
Rwik Rana
Kevin Huang
Alex Spitzer
Guanya Shi
Byron Boots
48
7
0
06 Oct 2023
Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz
Aaditya K. Singh
DJ Strouse
T. Sandholm
Ruslan Salakhutdinov
Anca D. Dragan
Stephen Marcus McAleer
50
48
0
06 Oct 2023
Reinforcement Learning with Fast and Forgetful Memory
Steven D. Morad
Ryan Kortvelesy
Stephan Liwicki
Amanda Prorok
OffRL
29
4
0
06 Oct 2023
TRAM: Bridging Trust Regions and Sharpness Aware Minimization
Tom Sherborne
Naomi Saphra
Pradeep Dasigi
Hao Peng
32
4
0
05 Oct 2023
Safe Reinforcement Learning via Hierarchical Adaptive Chance-Constraint Safeguards
Zhaorun Chen
Zhuokai Zhao
Tairan He
Binhao Chen
Xuhao Zhao
Liang Gong
Chengliang Liu
29
3
0
05 Oct 2023
Safe Exploration in Reinforcement Learning: A Generalized Formulation and Algorithms
Akifumi Wachi
Wataru Hashimoto
Xun Shen
Kazumune Hashimoto
22
9
0
05 Oct 2023
Deep Reinforcement Learning Algorithms for Hybrid V2X Communication: A Benchmarking Study
Fouzi Boukhalfa
Réda Alami
Mastane Achab
Eric Moulines
M. Bennis
11
1
0
04 Oct 2023
Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
Sara Klein
Simon Weissmann
Leif Döring
29
7
0
04 Oct 2023