Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1502.05477
Cited By
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Trust Region Policy Optimization"
50 / 1,112 papers shown
Title
Wasserstein Gradient Flows for Optimizing Gaussian Mixture Policies
Hanna Ziesche
Leonel Rozo
26
5
0
17 May 2023
What Matters in Reinforcement Learning for Tractography
Antoine Théberge
Christian Desrosiers
Maxime Descoteaux
Pierre-Marc Jodoin
OffRL
29
2
0
15 May 2023
MIMEx: Intrinsic Rewards from Masked Input Modeling
Toru Lin
Allan Jabri
OffRL
31
6
0
15 May 2023
A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes
Han Zhong
Tong Zhang
35
26
0
15 May 2023
Policy Gradient Algorithms Implicitly Optimize by Continuation
Adrien Bolland
Gilles Louppe
D. Ernst
39
3
0
11 May 2023
Mixture of personality improved Spiking actor network for efficient multi-agent cooperation
Xiyun Li
Ziyi Ni
Jingqing Ruan
Linghui Meng
Jing Shi
Tielin Zhang
Bo Xu
58
4
0
10 May 2023
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
Yulai Zhao
Zhuoran Yang
Zhaoran Wang
Jason D. Lee
43
3
0
08 May 2023
Truncating Trajectories in Monte Carlo Reinforcement Learning
Riccardo Poiani
Alberto Maria Metelli
Marcello Restelli
26
2
0
07 May 2023
Federated Ensemble-Directed Offline Reinforcement Learning
Desik Rengarajan
N. Ragothaman
D. Kalathil
S. Shakkottai
OffRL
32
1
0
04 May 2023
Correcting for Interference in Experiments: A Case Study at Douyin
Vivek F. Farias
Hao Li
Tianyi Peng
Xinyuyang Ren
B. Hassibi
A. Zheng
33
9
0
04 May 2023
Semi-Infinitely Constrained Markov Decision Processes and Efficient Reinforcement Learning
Liangyu Zhang
Yang Peng
Wenhao Yang
Zhihua Zhang
21
1
0
29 Apr 2023
Learning to Extrapolate: A Transductive Approach
Aviv Netanyahu
Abhishek Gupta
Max Simchowitz
Kaipeng Zhang
Pulkit Agrawal
49
15
0
27 Apr 2023
Distance Weighted Supervised Learning for Offline Interaction Data
Joey Hejna
Jensen Gao
Dorsa Sadigh
OffRL
36
13
0
26 Apr 2023
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
Tuomas Haarnoja
Ben Moran
Guy Lever
Sandy H. Huang
Dhruva Tirumala
...
Andrea Huber
N. Hurley
F. Nori
R. Hadsell
N. Heess
50
143
0
26 Apr 2023
Zero-shot Transfer Learning of Driving Policy via Socially Adversarial Traffic Flow
Dongkun Zhang
Jintao Xue
Yuxiang Cui
Yunkai Wang
Eryun Liu
Wei Jing
Junbo Chen
R. Xiong
Yue Wang
41
0
0
25 Apr 2023
Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies
Oscar Li
James Harrison
Jascha Narain Sohl-Dickstein
Virginia Smith
Luke Metz
54
5
0
21 Apr 2023
Synthetic Sample Selection for Generalized Zero-Shot Learning
Shreyank N. Gowda
30
16
0
06 Apr 2023
Meta-Learning with a Geometry-Adaptive Preconditioner
Suhyun Kang
Duhun Hwang
Moonjung Eo
Taesup Kim
Wonjong Rhee
AI4CE
37
15
0
04 Apr 2023
Personalizing Task-oriented Dialog Systems via Zero-shot Generalizable Reward Function
A. B. Siddique
M. H. Maqbool
Kshitija Taywade
H. Foroosh
24
12
0
24 Mar 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
33
0
0
22 Mar 2023
Boundary-aware Supervoxel-level Iteratively Refined Interactive 3D Image Segmentation with Multi-agent Reinforcement Learning
Chaofan Ma
Qisen Xu
Xiangfeng Wang
Bo Jin
Xiaoyun Zhang
Yanfeng Wang
Ya Zhang
37
22
0
19 Mar 2023
Latent-Conditioned Policy Gradient for Multi-Objective Deep Reinforcement Learning
T. Kanazawa
Chetan Gupta
29
0
0
15 Mar 2023
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Han Zheng
Xufang Luo
Pengfei Wei
Xuan Song
Dongsheng Li
Jing Jiang
OffRL
OnRL
18
21
0
14 Mar 2023
Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization
E. Derman
Yevgeniy Men
M. Geist
Shie Mannor
42
1
0
12 Mar 2023
Optimal active particle navigation meets machine learning
Mahdi Nasiri
H. Löwen
B. Liebchen
16
21
0
09 Mar 2023
Mastering Strategy Card Game (Legends of Code and Magic) via End-to-End Policy and Optimistic Smooth Fictitious Play
Wei Xi
Yongxin Zhang
Changnan Xiao
Xuefeng Huang
Shihong Deng
Haowei Liang
Jie Chen
Peng Sun
OffRL
50
8
0
07 Mar 2023
Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control
Amarildo Likmeta
Matteo Sacco
Alberto Maria Metelli
Marcello Restelli
OffRL
21
3
0
04 Mar 2023
Guarded Policy Optimization with Imperfect Online Demonstrations
Zhenghai Xue
Zhenghao Peng
Quanyi Li
Zhihan Liu
Bolei Zhou
OffRL
51
10
0
03 Mar 2023
Model-based Constrained MDP for Budget Allocation in Sequential Incentive Marketing
Shuai Xiao
Le Guo
Zaifan Jiang
Lei Lv
Yuanbo Chen
Jun Zhu
Shuang Yang
24
21
0
02 Mar 2023
Efficient Exploration Using Extra Safety Budget in Constrained Policy Optimization
Haotian Xu
Shengjie Wang
Zhaolei Wang
Yunzhe Zhang
Qing Zhuo
Yang Gao
Tao Zhang
18
0
0
28 Feb 2023
Implicit Poisoning Attacks in Two-Agent Reinforcement Learning: Adversarial Policies for Training-Time Attacks
Mohammad Mohammadi
Jonathan Nöther
Debmalya Mandal
Adish Singla
Goran Radanović
AAML
OffRL
35
9
0
27 Feb 2023
Reinforcement Learning Based Pushing and Grasping Objects from Ungraspable Poses
Hao Zhang
Hongzhuo Liang
Lin Cong
Jianzhi Lyu
Long Zeng
Pingfa Feng
Jian-Wei Zhang
SSL
DRL
26
9
0
26 Feb 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
24
0
0
25 Feb 2023
Sequential Counterfactual Risk Minimization
Houssam Zenati
Eustache Diemert
Matthieu Martin
Julien Mairal
Pierre Gaillard
OffRL
29
3
0
23 Feb 2023
Behavior Proximal Policy Optimization
Zifeng Zhuang
Kun Lei
Jinxin Liu
Donglin Wang
Yilang Guo
OffRL
30
34
0
22 Feb 2023
DSL-Assembly: A Robust and Safe Assembly Strategy
Yi Liu
OffRL
19
0
0
21 Feb 2023
Understanding the effect of varying amounts of replay per step
A. Paul
Videh Raj Nema
8
0
0
20 Feb 2023
Improving Deep Policy Gradients with Value Function Search
Enrico Marchesini
Chris Amato
26
9
0
20 Feb 2023
Reinforcement Learning with Function Approximation: From Linear to Nonlinear
Jihao Long
Jiequn Han
27
5
0
20 Feb 2023
Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning
Yunke Wang
Bo Du
Chang Xu
35
8
0
13 Feb 2023
Graph Learning Based Decision Support for Multi-Aircraft Take-Off and Landing at Urban Air Mobility Vertiports
Prajit K. Kumar
Jhoel Witter
Steve Paul
Karthik Dantu
Souma Chowdhury
19
3
0
12 Feb 2023
An information-theoretic learning model based on importance sampling
Jiangshe Zhang
Lizhen Ji
Fei Gao
Meng-Qian Li
37
0
0
09 Feb 2023
Open Problems and Modern Solutions for Deep Reinforcement Learning
Weiqin Chen
OffRL
18
0
0
05 Feb 2023
Online Reinforcement Learning in Non-Stationary Context-Driven Environments
Pouya Hamadanian
Arash Nasr-Esfahany
Malte Schwarzkopf
Siddartha Sen
MohammadIman Alizadeh
CLL
OffRL
50
0
0
04 Feb 2023
Distributional constrained reinforcement learning for supply chain optimization
J. Berm\údez
Antonio del Rio-Chanona
Calvin Tsay
26
5
0
03 Feb 2023
User-centric Heterogeneous-action Deep Reinforcement Learning for Virtual Reality in the Metaverse over Wireless Networks
Wen-li Yu
Terence Jie Chua
Junfeng Zhao
EgoV
34
18
0
03 Feb 2023
A general Markov decision process formalism for action-state entropy-regularized reward maximization
D. Grytskyy
Jorge Ramírez-Ruiz
R. Moreno-Bote
22
3
0
02 Feb 2023
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
Akhil Agnihotri
R. Jain
Haipeng Luo
21
2
0
02 Feb 2023
Distillation Policy Optimization
Jianfei Ma
OffRL
26
1
0
01 Feb 2023
PAC-Bayesian Soft Actor-Critic Learning
Bahareh Tasdighi
Abdullah Akgul
Manuel Haussmann
Kenny Kazimirzak Brink
M. Kandemir
34
3
0
30 Jan 2023
Previous
1
2
3
4
5
6
...
21
22
23
Next