1502.05477
Cited By
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing
"Trust Region Policy Optimization"
50 / 3,098 papers shown
Cooperative Multi-Agent Learning for Navigation via Structured State Abstraction
Mohamed K. Abdel-Aziz
Mohammed S. Elbamby
S. Samarakoon
M. Bennis
36
4
0
20 Jun 2023
AdaStop: adaptive statistical testing for sound comparisons of Deep RL agents
Timothée Mathieu
R. D. Vecchia
Alena Shilova
M. Centa
Hector Kohler
Odalric-Ambrym Maillard
Philippe Preux
27
0
0
19 Jun 2023
Integrating Tick-level Data and Periodical Signal for High-frequency Market Making
Jiafa He
Cong Zheng
Can Yang
AIFin
19
0
0
19 Jun 2023
Optimal Execution Using Reinforcement Learning
Cong Zheng
Jiafa He
Can Yang
21
0
0
19 Jun 2023
Deep Reinforcement Learning with Task-Adaptive Retrieval via Hypernetwork
Yonggang Jin
Chenxu Wang
Tianyu Zheng
Liuyu Xiang
Yao-Chun Yang
Junge Zhang
Jie Fu
Zhaofeng He
3DH
45
0
0
19 Jun 2023
Acceleration in Policy Optimization
Veronica Chelu
Tom Zahavy
A. Guez
Doina Precup
Sebastian Flennerhag
56
0
0
18 Jun 2023
Variational Sequential Optimal Experimental Design using Reinforcement Learning
Wanggang Shen
Jiayuan Dong
Xun Huan
23
3
0
17 Jun 2023
Actor-Critic Model Predictive Control
Angel Romero
Yunlong Song
Davide Scaramuzza
52
36
0
16 Jun 2023
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling
Yunfan Li
Yiran Wang
Y. Cheng
Lin F. Yang
OffRL
35
4
0
15 Jun 2023
Datasets and Benchmarks for Offline Safe Reinforcement Learning
Zuxin Liu
Zijian Guo
Haohong Lin
Yi-Fan Yao
Jiacheng Zhu
...
Hanjiang Hu
Wenhao Yu
Tingnan Zhang
Jie Tan
Ding Zhao
OffRL
32
37
0
15 Jun 2023
Generalizable Resource Scaling of 5G Slices using Constrained Reinforcement Learning
Muhammad Sulaiman
Mahdieh Ahmadi
M. A. Salahuddin
R. Boutaba
A. Saleh
45
6
0
15 Jun 2023
Optimal Exploration for Model-Based RL in Nonlinear Systems
Andrew Wagenmaker
Guanya Shi
Kevin G. Jamieson
41
14
0
15 Jun 2023
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models
Lingxi Xie
Longhui Wei
Xiaopeng Zhang
Kaifeng Bi
Xiaotao Gu
Jianlong Chang
Qi Tian
46
7
0
14 Jun 2023
Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes
Luca Sabbioni
Francesco Corda
Marcello Restelli
29
0
0
13 Jun 2023
Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective
Zeyu Zhang
Yi-Hsun Su
Hui Yuan
Yiran Wu
R. Balasubramanian
Qingyun Wu
Huazheng Wang
Mengdi Wang
OffRL
CML
44
4
0
13 Jun 2023
Robust Reinforcement Learning through Efficient Adversarial Herding
Juncheng Dong
Hao-Lun Hsu
Qitong Gao
Vahid Tarokh
Miroslav Pajic
42
4
0
12 Jun 2023
Efficiently Learning the Graph for Semi-supervised Learning
Dravyansh Sharma
Maxwell Jones
43
4
0
12 Jun 2023
A Single-Loop Deep Actor-Critic Algorithm for Constrained Reinforcement Learning with Provable Convergence
Kexuan Wang
An Liu
Baishuo Liu
28
1
0
10 Jun 2023
Near-optimal Conservative Exploration in Reinforcement Learning under Episode-wise Constraints
Donghao Li
Ruiquan Huang
Cong Shen
Jing Yang
39
3
0
09 Jun 2023
Approximate information state based convergence analysis of recurrent Q-learning
Erfan Seyedsalehi
N. Akbarzadeh
Amit Sinha
Aditya Mahajan
27
6
0
09 Jun 2023
Finite-Time Analysis of Minimax Q-Learning for Two-Player Zero-Sum Markov Games: Switching System Approach
Dong-hwan Lee
26
2
0
09 Jun 2023
RLtools: A Fast, Portable Deep Reinforcement Learning Library for Continuous Control
Jonas Eschmann
Dario Albani
Giuseppe Loianno
OffRL
46
5
0
06 Jun 2023
Boosting Offline Reinforcement Learning with Action Preference Query
Qisen Yang
Shenzhi Wang
Matthieu Lin
S. Song
Gao Huang
OffRL
21
9
0
06 Jun 2023
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
Tianying Ji
Yuping Luo
Gang Hua
Xianyuan Zhan
Jianwei Zhang
Huazhe Xu
OffRL
OnRL
47
15
0
05 Jun 2023
For SALE: State-Action Representation Learning for Deep Reinforcement Learning
Scott Fujimoto
Wei-Di Chang
Edward James Smith
S. Gu
Doina Precup
David Meger
OffRL
28
45
0
04 Jun 2023
Fine-Tuning Language Models with Advantage-Induced Policy Alignment
Banghua Zhu
Hiteshi Sharma
Felipe Vieira Frujeri
Shi Dong
Chenguang Zhu
Michael I. Jordan
Jiantao Jiao
OSLM
36
39
0
04 Jun 2023
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space
Anas Barakat
Ilyas Fatkhullin
Niao He
33
11
0
02 Jun 2023
PAGAR: Taming Reward Misalignment in Inverse Reinforcement Learning-Based Imitation Learning with Protagonist Antagonist Guided Adversarial Reward
Weichao Zhou
Wenchao Li
31
0
0
02 Jun 2023
ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages
Andrew Jesson
Chris Xiaoxuan Lu
Gunshi Gupta
Angelos Filos
Jakob N. Foerster
Y. Gal
OffRL
29
5
0
02 Jun 2023
Deep Q-Learning versus Proximal Policy Optimization: Performance Comparison in a Material Sorting Task
Reuf Kozlica
S. Wegenkittl
Simon Hirlaender
OffRL
11
4
0
02 Jun 2023
DVFO: Learning-Based DVFS for Energy-Efficient Edge-Cloud Collaborative Inference
Ziyang Zhang
Yang Zhao
Huan Li
Changyao Lin
Jie Liu
46
14
0
02 Jun 2023
Improving and Benchmarking Offline Reinforcement Learning Algorithms
Bingyi Kang
Xiao Ma
Yi-Ren Wang
Yang Yue
Shuicheng Yan
OffRL
16
9
0
01 Jun 2023
Progressive Learning for Physics-informed Neural Motion Planning
Ruiqi Ni
A. H. Qureshi
23
11
0
01 Jun 2023
Learning for Edge-Weighted Online Bipartite Matching with Robustness Guarantees
Pengfei Li
Jianyi Yang
Shaolei Ren
OffRL
27
4
0
31 May 2023
Efficient Diffusion Policies for Offline Reinforcement Learning
Bingyi Kang
Xiao Ma
Chao Du
Tianyu Pang
Shuicheng Yan
OffRL
42
63
0
31 May 2023
Latent Exploration for Reinforcement Learning
A. Chiappa
Alessandro Marin Vargas
Ann Zixiang Huang
Alexander Mathis
32
13
0
31 May 2023
Representation-Driven Reinforcement Learning
Ofir Nabati
Guy Tennenholtz
Shie Mannor
27
1
0
31 May 2023
ROSARL: Reward-Only Safe Reinforcement Learning
Geraud Nangue Tasse
Tamlin Love
Mark W. Nemecek
Steven D. James
Benjamin Rosman
29
3
0
31 May 2023
On the Linear Convergence of Policy Gradient under Hadamard Parameterization
Jiacai Liu
Jinchi Chen
Ke Wei
31
2
0
31 May 2023
Policy Optimization for Continuous Reinforcement Learning
Hanyang Zhao
Wenpin Tang
D. Yao
OffRL
40
17
0
30 May 2023
Emergent Incident Response for Unmanned Warehouses with Multi-agent Systems*
Yibo Guo
Mingxin Li
Jingting Zong
Mingliang Xu
26
0
0
29 May 2023
Online Nonstochastic Model-Free Reinforcement Learning
Udaya Ghai
Arushi Gupta
Wenhan Xia
Karan Singh
Elad Hazan
OffRL
36
6
0
27 May 2023
Rethinking Adversarial Policies: A Generalized Attack Formulation and Provable Defense in RL
Xiangyu Liu
Souradip Chakraborty
Yanchao Sun
Furong Huang
AAML
28
4
0
27 May 2023
Self-Supervised Reinforcement Learning that Transfers using Random Features
Boyuan Chen
Chuning Zhu
Pulkit Agrawal
Kaipeng Zhang
Abhishek Gupta
OffRL
SSL
41
6
0
26 May 2023
Emergent Agentic Transformer from Chain of Hindsight Experience
Hao Liu
Pieter Abbeel
OffRL
38
25
0
26 May 2023
First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities
Aleksandr Beznosikov
S. Samsonov
Marina Sheshukova
Alexander Gasnikov
A. Naumov
Eric Moulines
54
14
0
25 May 2023
PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning
Jianxiong Li
Xiao Hu
Haoran Xu
Jingjing Liu
Xianyuan Zhan
Ya Zhang
OffRL
OnRL
45
19
0
25 May 2023
Deep Reinforcement Learning with Plasticity Injection
Evgenii Nikishin
Junhyuk Oh
Georg Ostrovski
Clare Lyle
Razvan Pascanu
Will Dabney
André Barreto
OffRL
26
50
0
24 May 2023
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
Sharan Vaswani
A. Kazemi
Reza Babanezhad
Nicolas Le Roux
OffRL
37
3
0
24 May 2023
Neural Lyapunov and Optimal Control
Daniel Layeghi
Steve Tonneau
M. Mistry
21
0
0
24 May 2023