ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1811.03056
  4. Cited By
Policy Certificates: Towards Accountable Reinforcement Learning

Policy Certificates: Towards Accountable Reinforcement Learning

7 November 2018
Christoph Dann
Ashutosh Adhikari
Wei Wei
Jimmy J. Lin
    OffRL
ArXivPDFHTML

Papers citing "Policy Certificates: Towards Accountable Reinforcement Learning"

39 / 39 papers shown
Title
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
78
2
0
10 Oct 2024
Multiple-policy Evaluation via Density Estimation
Multiple-policy Evaluation via Density Estimation
Yilei Chen
Aldo Pacchiano
I. Paschalidis
OffRL
29
0
0
29 Mar 2024
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with
  Uniform PAC Guarantees
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees
Toshinori Kitamura
Tadashi Kozuno
Masahiro Kato
Yuki Ichihara
Soichiro Nishimori
Akiyoshi Sannai
Sho Sonoda
Wataru Kumagai
Yutaka Matsuo
42
2
0
31 Jan 2024
Settling the Sample Complexity of Online Reinforcement Learning
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
98
22
0
25 Jul 2023
Policy Finetuning in Reinforcement Learning via Design of Experiments
  using Offline Data
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data
Ruiqi Zhang
Andrea Zanette
OffRL
OnRL
40
7
0
10 Jul 2023
Fast Rates for Maximum Entropy Exploration
Fast Rates for Maximum Entropy Exploration
D. Tiapkin
Denis Belomestny
Daniele Calandriello
Eric Moulines
Rémi Munos
A. Naumov
Pierre Perrault
Yunhao Tang
Michal Valko
Pierre Menard
44
17
0
14 Mar 2023
Provably Efficient Reinforcement Learning via Surprise Bound
Provably Efficient Reinforcement Learning via Surprise Bound
Hanlin Zhu
Ruosong Wang
Jason D. Lee
OffRL
28
5
0
22 Feb 2023
Learning in POMDPs is Sample-Efficient with Hindsight Observability
Learning in POMDPs is Sample-Efficient with Hindsight Observability
Jonathan Lee
Alekh Agarwal
Christoph Dann
Tong Zhang
34
19
0
31 Jan 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both
  Worlds in Stochastic and Deterministic Environments
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
44
10
0
31 Jan 2023
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Understanding the Complexity Gains of Single-Task RL with a Curriculum
Qiyang Li
Yuexiang Zhai
Yi Ma
Sergey Levine
37
14
0
24 Dec 2022
Opportunistic Episodic Reinforcement Learning
Opportunistic Episodic Reinforcement Learning
Xiaoxiao Wang
Nader Bouacida
Xueying Guo
Xin Liu
21
0
0
24 Oct 2022
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
Zihan Zhang
Yuhang Jiang
Yuanshuo Zhou
Xiangyang Ji
OffRL
26
9
0
15 Oct 2022
Square-root regret bounds for continuous-time episodic Markov decision
  processes
Square-root regret bounds for continuous-time episodic Markov decision processes
Xuefeng Gao
X. Zhou
43
6
0
03 Oct 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of
  Stationary Policies
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies
Zihan Zhang
Xiangyang Ji
S. Du
30
21
0
24 Mar 2022
Sample-Efficient Reinforcement Learning with loglog(T) Switching Cost
Sample-Efficient Reinforcement Learning with loglog(T) Switching Cost
Dan Qiao
Ming Yin
Ming Min
Yu-Xiang Wang
43
28
0
13 Feb 2022
Settling the Horizon-Dependence of Sample Complexity in Reinforcement
  Learning
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning
Yuanzhi Li
Ruosong Wang
Lin F. Yang
25
20
0
01 Nov 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear
  Function Approximation
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation
Zaiwei Chen
S. Khodadadian
S. T. Maguluri
OffRL
63
29
0
26 May 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in
  Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu-Xiang Wang
OffRL
32
19
0
13 May 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear
  Function Approximation
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
32
52
0
24 Mar 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
S. Khodadadian
Zaiwei Chen
S. T. Maguluri
CML
OffRL
71
26
0
18 Feb 2021
Task-Optimal Exploration in Linear Dynamical Systems
Task-Optimal Exploration in Linear Dynamical Systems
Andrew Wagenmaker
Max Simchowitz
Kevin G. Jamieson
22
18
0
10 Feb 2021
A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous
  Q-Learning and TD-Learning Variants
A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants
Zaiwei Chen
S. T. Maguluri
Sanjay Shakkottai
Karthikeyan Shanmugam
OffRL
105
54
0
02 Feb 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear
  Mixture MDP
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
71
36
0
29 Jan 2021
Learning to be Safe: Deep RL with a Safety Critic
Learning to be Safe: Deep RL with a Safety Critic
K. Srinivasan
Benjamin Eysenbach
Sehoon Ha
Jie Tan
Chelsea Finn
OffRL
33
143
0
27 Oct 2020
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs
Jiafan He
Dongruo Zhou
Quanquan Gu
16
37
0
01 Oct 2020
Provably Efficient Reward-Agnostic Navigation with Linear Value
  Iteration
Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration
Andrea Zanette
A. Lazaric
Mykel J. Kochenderfer
Emma Brunskill
22
64
0
18 Aug 2020
Near-Optimal Reinforcement Learning with Self-Play
Near-Optimal Reinforcement Learning with Self-Play
Yunru Bai
Chi Jin
Tiancheng Yu
19
129
0
22 Jun 2020
Task-agnostic Exploration in Reinforcement Learning
Task-agnostic Exploration in Reinforcement Learning
Xuezhou Zhang
Yuzhe Ma
Adish Singla
OffRL
20
49
0
16 Jun 2020
$Q$-learning with Logarithmic Regret
QQQ-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
43
59
0
16 Jun 2020
Model-Based Reinforcement Learning with Value-Targeted Regression
Model-Based Reinforcement Learning with Value-Targeted Regression
Alex Ayoub
Zeyu Jia
Csaba Szepesvári
Mengdi Wang
Lin F. Yang
OffRL
19
299
0
01 Jun 2020
Reinforcement Learning with General Value Function Approximation:
  Provably Efficient Approach via Bounded Eluder Dimension
Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension
Ruosong Wang
Ruslan Salakhutdinov
Lin F. Yang
23
55
0
21 May 2020
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Yaqi Duan
Mengdi Wang
OffRL
18
149
0
21 Feb 2020
Interpretable Off-Policy Evaluation in Reinforcement Learning by
  Highlighting Influential Transitions
Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions
Omer Gottesman
Joseph D. Futoma
Yao Liu
Soanli Parbhoo
Leo Anthony Celi
Emma Brunskill
Finale Doshi-Velez
OffRL
147
55
0
10 Feb 2020
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning
L. Zintgraf
K. Shiarlis
Maximilian Igl
Sebastian Schulze
Y. Gal
Katja Hofmann
Shimon Whiteson
OffRL
11
272
0
18 Oct 2019
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy
  Policies
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies
Yonathan Efroni
Nadav Merlis
Mohammad Ghavamzadeh
Shie Mannor
OffRL
22
68
0
27 May 2019
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and
  Regret Bound
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Lin F. Yang
Mengdi Wang
OffRL
GP
13
283
0
24 May 2019
HARK Side of Deep Learning -- From Grad Student Descent to Automated
  Machine Learning
HARK Side of Deep Learning -- From Grad Student Descent to Automated Machine Learning
O. Gencoglu
M. Gils
E. Guldogan
Chamin Morikawa
Mehmet Süzen
M. Gruber
J. Leinonen
H. Huttunen
11
36
0
16 Apr 2019
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning
  without Domain Knowledge using Value Function Bounds
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Andrea Zanette
Emma Brunskill
OffRL
24
273
0
01 Jan 2019
Agent Embeddings: A Latent Representation for Pole-Balancing Networks
Agent Embeddings: A Latent Representation for Pole-Balancing Networks
Oscar Chang
Robert Kwiatkowski
Siyuan Chen
Hod Lipson
24
6
0
12 Nov 2018
1