Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1502.05477
Cited By
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Trust Region Policy Optimization"
50 / 3,098 papers shown
Title
UGAE: A Novel Approach to Non-exponential Discounting
Ariel Kwiatkowski
Vicky Kalogeiton
Julien Pettré
Marie-Paule Cani
OffRL
30
2
0
11 Feb 2023
A Survey on Causal Reinforcement Learning
Yan Zeng
Ruichu Cai
Gang Hua
Libo Huang
Zhifeng Hao
CML
31
27
0
10 Feb 2023
An information-theoretic learning model based on importance sampling
Jiangshe Zhang
Lizhen Ji
Fei Gao
Meng-Qian Li
37
0
0
09 Feb 2023
Optimizing Audio Recommendations for the Long-Term: A Reinforcement Learning Perspective
Lucas Maystre
Daniel Russo
Yu Zhao
OffRL
9
9
0
07 Feb 2023
Target-based Surrogates for Stochastic Optimization
J. Lavington
Sharan Vaswani
Reza Babanezhad
Mark Schmidt
Nicolas Le Roux
60
5
0
06 Feb 2023
Sample Dropout: A Simple yet Effective Variance Reduction Technique in Deep Policy Optimization
Zichuan Lin
Xiapeng Wu
Mingfei Sun
Deheng Ye
Qiang Fu
Wei Yang
Wei Liu
33
3
0
05 Feb 2023
Open Problems and Modern Solutions for Deep Reinforcement Learning
Weiqin Chen
OffRL
23
0
0
05 Feb 2023
Online Reinforcement Learning in Non-Stationary Context-Driven Environments
Pouya Hamadanian
Arash Nasr-Esfahany
Malte Schwarzkopf
Siddartha Sen
MohammadIman Alizadeh
CLL
OffRL
55
0
0
04 Feb 2023
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies
Ilyas Fatkhullin
Anas Barakat
Anastasia Kireeva
Niao He
32
37
0
03 Feb 2023
Distributional constrained reinforcement learning for supply chain optimization
J. Berm\údez
Antonio del Rio-Chanona
Calvin Tsay
26
5
0
03 Feb 2023
User-centric Heterogeneous-action Deep Reinforcement Learning for Virtual Reality in the Metaverse over Wireless Networks
Wen-li Yu
Terence Jie Chua
Junfeng Zhao
EgoV
40
18
0
03 Feb 2023
ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs
Theodore H. Moskovitz
Brendan O'Donoghue
Vivek Veeriah
Sebastian Flennerhag
Satinder Singh
Tom Zahavy
50
19
0
02 Feb 2023
A general Markov decision process formalism for action-state entropy-regularized reward maximization
D. Grytskyy
Jorge Ramírez-Ruiz
R. Moreno-Bote
22
3
0
02 Feb 2023
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
Akhil Agnihotri
R. Jain
Haipeng Luo
29
2
0
02 Feb 2023
Distillation Policy Optimization
Jianfei Ma
OffRL
26
1
0
01 Feb 2023
Learning Cut Selection for Mixed-Integer Linear Programming via Hierarchical Sequence Model
Zhihai Wang
Xijun Li
Jie Wang
Yufei Kuang
M. Yuan
Jianguo Zeng
Yongdong Zhang
Feng Wu
34
40
0
01 Feb 2023
Revisiting Bellman Errors for Offline Model Selection
Joshua P. Zitovsky
Daniel de Marchi
Rishabh Agarwal
Michael R. Kosorok University of North Carolina at Chapel Hill
OffRL
35
5
0
31 Jan 2023
Learning, Fast and Slow: A Goal-Directed Memory-Based Approach for Dynamic Environments
Tan Chong Min John
Mehul Motani
17
2
0
31 Jan 2023
Optimizing DDPM Sampling with Shortcut Fine-Tuning
Ying Fan
Kangwook Lee
21
53
0
31 Jan 2023
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Carlo Alfano
Rui Yuan
Patrick Rebeschini
65
15
0
30 Jan 2023
PAC-Bayesian Soft Actor-Critic Learning
Bahareh Tasdighi
Abdullah Akgul
Manuel Haussmann
Kenny Kazimirzak Brink
M. Kandemir
34
3
0
30 Jan 2023
Learning the Kalman Filter with Fine-Grained Sample Complexity
Xiangyuan Zhang
Bin Hu
Tamer Bacsar
26
16
0
30 Jan 2023
Sample Efficient Deep Reinforcement Learning via Local Planning
Dong Yin
S. Thiagarajan
N. Lazić
Nived Rajaraman
Botao Hao
Csaba Szepesvári
27
4
0
29 Jan 2023
Stochastic Dimension-reduced Second-order Methods for Policy Optimization
Jinsong Liu
Chen Xie
Qinwen Deng
Dongdong Ge
Yi-Li Ye
32
1
0
28 Jan 2023
Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation
Daesol Cho
Seungjae Lee
H. J. Kim
31
14
0
27 Jan 2023
Trust Region-Based Safe Distributional Reinforcement Learning for Multiple Constraints
Dohyeong Kim
Kyungjae Lee
Songhwai Oh
23
10
0
26 Jan 2023
Partial advantage estimator for proximal policy optimization
Xiulei Song
Yi-Fan Jin
Greg Slabaugh
Simon Lucas
OffRL
20
0
0
26 Jan 2023
Joint action loss for proximal policy optimization
Xiulei Song
Yi-Fan Jin
Greg Slabaugh
Simon Lucas
21
0
0
26 Jan 2023
AutoCost: Evolving Intrinsic Cost for Zero-violation Reinforcement Learning
Tairan He
Weiye Zhao
Changliu Liu
OffRL
39
17
0
24 Jan 2023
Constrained Reinforcement Learning for Dexterous Manipulation
A. Jain
Jack Kolb
Harish Ravichandar
24
2
0
24 Jan 2023
Plan To Predict: Learning an Uncertainty-Foreseeing Model for Model-Based Reinforcement Learning
Zifan Wu
Chao Yu
Chong Chen
Jianye Hao
H. Zhuo
27
16
0
20 Jan 2023
Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement Learning
Haoxuan Pan
Deheng Ye
Xiaoming Duan
Qiang Fu
Wei Yang
Jianping He
Mingfei Sun
OffRL
25
2
0
20 Jan 2023
On Multi-Agent Deep Deterministic Policy Gradients and their Explainability for SMARTS Environment
Ansh Mittal
Aditya Malte
19
0
0
20 Jan 2023
Multi-Agent Interplay in a Competitive Survival Environment
Andrea Fanti
23
0
0
19 Jan 2023
DIRECT: Learning from Sparse and Shifting Rewards using Discriminative Reward Co-Training
Philipp Altmann
Thomy Phan
Fabian Ritz
Thomas Gabor
Claudia Linnhoff-Popien
OffRL
35
1
0
18 Jan 2023
DRL-VO: Learning to Navigate Through Crowded Dynamic Scenes Using Velocity Obstacles
Zhanteng Xie
P. Dames
46
61
0
16 Jan 2023
The Role of Baselines in Policy Gradient Optimization
Jincheng Mei
Wesley Chung
Valentin Thomas
Bo Dai
Csaba Szepesvári
Dale Schuurmans
29
16
0
16 Jan 2023
Deep Reinforcement Learning for Autonomous Ground Vehicle Exploration Without A-Priori Maps
Shathushan Sivashangaran
A. Eskandarian
32
4
0
10 Jan 2023
Actor-Director-Critic: A Novel Deep Reinforcement Learning Framework
Zongwei Liu
Yonghong Song
Yuanlin Zhang
OffRL
35
2
0
10 Jan 2023
Transformers as Policies for Variable Action Environments
Niklas Zwingenberger
19
2
0
09 Jan 2023
Extreme Q-Learning: MaxEnt RL without Entropy
Divyansh Garg
Joey Hejna
M. Geist
Stefano Ermon
OffRL
41
66
0
05 Jan 2023
Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization
C. Shi
Zhengling Qi
Jianing Wang
Fan Zhou
OffRL
33
4
0
05 Jan 2023
Scalable Communication for Multi-Agent Reinforcement Learning via Transformer-Based Email Mechanism
Xudong Guo
Daming Shi
Wenhui Fan
22
5
0
05 Jan 2023
Learning a Generic Value-Selection Heuristic Inside a Constraint Programming Solver
Tom Marty
Tristan François
Pierre Tessier
Louis Gautier
Louis-Martin Rousseau
Quentin Cappart
51
7
0
05 Jan 2023
Deep Spectral Q-learning with Application to Mobile Health
Yuhe Gao
C. Shi
R. Song
34
0
0
03 Jan 2023
A Policy Optimization Method Towards Optimal-time Stability
Shengjie Wang
Lan Fengb
Xiang Zheng
Yu-wen Cao
Oluwatosin Oseni
Haotian Xu
Tao Zhang
Yang Gao
39
1
0
02 Jan 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
99
35
0
01 Jan 2023
Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search
Wenqing Zheng
S. Sharan
Zhiwen Fan
Kevin Wang
Yihan Xi
Zhangyang Wang
58
9
0
30 Dec 2022
Transformer in Transformer as Backbone for Deep Reinforcement Learning
Hangyu Mao
Rui Zhao
Hao Chen
Jianye Hao
Yiqun Chen
Dong Li
Junge Zhang
Zhen Xiao
OffRL
36
8
0
30 Dec 2022
Offline Policy Optimization in RL with Variance Regularizaton
Riashat Islam
Samarth Sinha
Homanga Bharadhwaj
Samin Yeasar Arnob
Zhuoran Yang
Animesh Garg
Zhaoran Wang
Lihong Li
Doina Precup
OffRL
26
0
0
29 Dec 2022
Previous
1
2
3
...
16
17
18
...
60
61
62
Next