1502.05477
Cited By
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing "Trust Region Policy Optimization"
50 / 3,098 papers shown
Title
Optimizing Irrigation Efficiency using Deep Reinforcement Learning in the Field
Xianzhong Ding
Wan Du
AI4CE
31
13
0
04 Apr 2023
Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring
Runzhe Wan
Yu Liu
James McQueen
Doug Hains
Rui Song
OffRL
35
4
0
02 Apr 2023
Understanding Reinforcement Learning Algorithms: The Progress from Basic Q-learning to Proximal Policy Optimization
M. Chadi
H. Mousannif
OffRL
21
4
0
31 Mar 2023
Models as Agents: Optimizing Multi-Step Predictions of Interactive Local Models in Model-Based Multi-Agent Reinforcement Learning
Zifan Wu
Chao Yu
Chong Chen
Jianye Hao
H. Zhuo
34
9
0
31 Mar 2023
Personalizing Task-oriented Dialog Systems via Zero-shot Generalizable Reward Function
A. B. Siddique
M. H. Maqbool
Kshitija Taywade
H. Foroosh
24
12
0
24 Mar 2023
Boosting Reinforcement Learning and Planning with Demonstrations: A Survey
Tongzhou Mu
H. Su
OffRL
35
1
0
23 Mar 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
33
0
0
22 Mar 2023
Large AI Models in Health Informatics: Applications, Challenges, and the Future
Jianing Qiu
Lin Li
Jiankai Sun
Jiachuan Peng
Peilun Shi
...
Bo Xiao
Wu Yuan
Ningli Wang
Dong Xu
Benny Lo
AI4MH
LM&MA
42
128
0
21 Mar 2023
Boundary-aware Supervoxel-level Iteratively Refined Interactive 3D Image Segmentation with Multi-agent Reinforcement Learning
Chaofan Ma
Qisen Xu
Xiangfeng Wang
Bo Jin
Xiaoyun Zhang
Yanfeng Wang
Ya Zhang
39
22
0
19 Mar 2023
Interpretable Reinforcement Learning via Neural Additive Models for Inventory Management
Julien N. Siems
Maximilian Schambach
Sebastian Schulze
Johannes Otterbach
22
2
0
18 Mar 2023
On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits
Weitong Zhang
Jiafan He
Zhiyuan Fan
Q. Gu
108
5
0
16 Mar 2023
Latent-Conditioned Policy Gradient for Multi-Objective Deep Reinforcement Learning
T. Kanazawa
Chetan Gupta
31
0
0
15 Mar 2023
Sample-efficient Adversarial Imitation Learning
Dahuin Jung
Hyungyu Lee
Sung-Hoon Yoon
SSL
31
2
0
14 Mar 2023
Adaptive Policy Learning for Offline-to-Online Reinforcement Learning
Han Zheng
Xufang Luo
Pengfei Wei
Xuan Song
Dongsheng Li
Jing Jiang
OffRL
OnRL
18
21
0
14 Mar 2023
Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization
E. Derman
Yevgeniy Men
M. Geist
Shie Mannor
45
1
0
12 Mar 2023
Uncertainty-Aware Instance Reweighting for Off-Policy Learning
Xiaoying Zhang
Junpu Chen
Hongning Wang
Hong Xie
Yang Liu
John C. S. Lui
Hang Li
OffRL
83
4
0
11 Mar 2023
Optimal active particle navigation meets machine learning
Mahdi Nasiri
H. Löwen
B. Liebchen
21
21
0
09 Mar 2023
Policy Mirror Descent Inherently Explores Action Space
Yan Li
Guanghui Lan
OffRL
63
8
0
08 Mar 2023
ConBaT: Control Barrier Transformer for Safe Policy Learning
Yue Meng
Sai H. Vemprala
Rogerio Bonatti
Chuchu Fan
Ashish Kapoor
OffRL
45
3
0
07 Mar 2023
Foundation Models for Decision Making: Problems, Methods, and Opportunities
Sherry Yang
Ofir Nachum
Yilun Du
Jason W. Wei
Pieter Abbeel
Dale Schuurmans
LM&Ro
OffRL
LRM
AI4CE
98
156
0
07 Mar 2023
Mastering Strategy Card Game (Legends of Code and Magic) via End-to-End Policy and Optimistic Smooth Fictitious Play
Wei Xi
Yongxin Zhang
Changnan Xiao
Xuefeng Huang
Shihong Deng
Haowei Liang
Jie Chen
Peng Sun
OffRL
50
8
0
07 Mar 2023
Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning
Daniel Palenicek
M. Lutter
João Carvalho
Jan Peters
32
4
0
07 Mar 2023
Evolutionary Reinforcement Learning: A Survey
Hui Bai
Ran Cheng
Yaochu Jin
OffRL
50
52
0
07 Mar 2023
Safe Reinforcement Learning via Probabilistic Logic Shields
Wen-Chi Yang
G. Marra
Gavin Rens
Luc de Raedt
OffRL
46
30
0
06 Mar 2023
CFlowNets: Continuous Control with Generative Flow Networks
Yinchuan Li
Shuang Luo
Haozhi Wang
Jianye Hao
91
20
0
04 Mar 2023
Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control
Amarildo Likmeta
Matteo Sacco
Alberto Maria Metelli
Marcello Restelli
OffRL
26
3
0
04 Mar 2023
Multi-Target Pursuit by a Decentralized Heterogeneous UAV Swarm using Deep Multi-Agent Reinforcement Learning
Maryam Kouzeghar
Youn-Suk Song
Malika Meghjani
Roland Bouffanais
34
12
0
03 Mar 2023
Guarded Policy Optimization with Imperfect Online Demonstrations
Zhenghai Xue
Zhenghao Peng
Quanyi Li
Zhihan Liu
Bolei Zhou
OffRL
51
10
0
03 Mar 2023
Co-learning Planning and Control Policies Constrained by Differentiable Logic Specifications
Zikang Xiong
Daniel Lawson
Joe Eappen
A. H. Qureshi
Suresh Jagannathan
13
0
0
02 Mar 2023
Model-based Constrained MDP for Budget Allocation in Sequential Incentive Marketing
Shuai Xiao
Le Guo
Zaifan Jiang
Lei Lv
Yuanbo Chen
Jun Zhu
Shuang Yang
30
21
0
02 Mar 2023
Policy Dispersion in Non-Markovian Environment
B. Qu
Xiaofeng Cao
Jielong Yang
Hechang Chen
Chang Yi
Ivor W. Tsang
Yew-Soon Ong
22
0
0
28 Feb 2023
Efficient Exploration Using Extra Safety Budget in Constrained Policy Optimization
Haotian Xu
Shengjie Wang
Zhaolei Wang
Yunzhe Zhang
Qing Zhuo
Yang Gao
Tao Zhang
18
0
0
28 Feb 2023
Taylor TD-learning
Michele Garibbo
Maxime Robeyns
Laurence Aitchison
OffRL
23
1
0
27 Feb 2023
Implicit Poisoning Attacks in Two-Agent Reinforcement Learning: Adversarial Policies for Training-Time Attacks
Mohammad Mohammadi
Jonathan Nöther
Debmalya Mandal
Adish Singla
Goran Radanović
AAML
OffRL
35
9
0
27 Feb 2023
Reinforcement Learning Based Pushing and Grasping Objects from Ungraspable Poses
Hao Zhang
Hongzhuo Liang
Lin Cong
Jianzhi Lyu
Long Zeng
Pingfa Feng
Jian-Wei Zhang
SSL
DRL
34
9
0
26 Feb 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
24
0
0
25 Feb 2023
Sequential Counterfactual Risk Minimization
Houssam Zenati
Eustache Diemert
Matthieu Martin
Julien Mairal
Pierre Gaillard
OffRL
29
3
0
23 Feb 2023
Behavior Proximal Policy Optimization
Zifeng Zhuang
Kun Lei
Jinxin Liu
Donglin Wang
Yilang Guo
OffRL
32
34
0
22 Feb 2023
DSL-Assembly: A Robust and Safe Assembly Strategy
Yi Liu
OffRL
19
0
0
21 Feb 2023
Understanding the effect of varying amounts of replay per step
A. Paul
Videh Raj Nema
8
0
0
20 Feb 2023
Improving Deep Policy Gradients with Value Function Search
Enrico Marchesini
Chris Amato
26
9
0
20 Feb 2023
Reinforcement Learning with Function Approximation: From Linear to Nonlinear
Jihao Long
Jiequn Han
39
5
0
20 Feb 2023
Efficient Exploration via Epistemic-Risk-Seeking Policy Optimization
Brendan O'Donoghue
OffRL
35
6
0
18 Feb 2023
Dual RL: Unification and New Methods for Reinforcement and Imitation Learning
Harshit S. Sikchi
Qinqing Zheng
Amy Zhang
S. Niekum
OffRL
38
19
0
16 Feb 2023
Model-Based Decentralized Policy Optimization
Hao Luo
Jiechuan Jiang
Zongqing Lu
24
0
0
16 Feb 2023
Trust-Region-Free Policy Optimization for Stochastic Policies
Mingfei Sun
Benjamin Ellis
Anuj Mahajan
Sam Devlin
Katja Hofmann
Shimon Whiteson
6
2
0
15 Feb 2023
CERiL: Continuous Event-based Reinforcement Learning
Celyn Walters
Simon Hadfield
OffRL
27
2
0
15 Feb 2023
Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning
Yunke Wang
Bo Du
Chang Xu
38
8
0
13 Feb 2023
Order Matters: Agent-by-agent Policy Optimization
Xihuai Wang
Zheng Tian
Bo Liu
Ying Wen
Jun Wang
Weinan Zhang
33
27
0
13 Feb 2023
Graph Learning Based Decision Support for Multi-Aircraft Take-Off and Landing at Urban Air Mobility Vertiports
Prajit K. Kumar
Jhoel Witter
Steve Paul
Karthik Dantu
Souma Chowdhury
19
3
0
12 Feb 2023