ResearchTrend.AI
arXiv:1803.00606

On Oracle-Efficient PAC RL with Rich Observations

1 March 2018
Christoph Dann
Nan Jiang
A. Krishnamurthy
Alekh Agarwal
John Langford
Robert Schapire

Papers citing "On Oracle-Efficient PAC RL with Rich Observations"

31 / 81 papers shown
Title
Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning
Hiroki Furuta
T. Matsushima
Tadashi Kozuno
Y. Matsuo
Sergey Levine
Ofir Nachum
S. Gu
OffRL
58
14
0
23 Mar 2021
Provably Efficient Cooperative Multi-Agent Reinforcement Learning with Function Approximation
Abhimanyu Dubey
Alex Pentland
78
26
0
08 Mar 2021
Model-free Representation Learning and Exploration in Low-rank MDPs
Aditya Modi
Jinglin Chen
A. Krishnamurthy
Nan Jiang
Alekh Agarwal
OffRL
174
81
0
14 Feb 2021
RL for Latent MDPs: Regret Guarantees and a Lower Bound
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
86
80
0
09 Feb 2021
Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature
Kefan Dong
Jiaqi Yang
Tengyu Ma
97
33
0
08 Feb 2021
On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function
Gellert Weisz
Philip Amortila
Barnabás Janzer
Yasin Abbasi-Yadkori
Nan Jiang
Csaba Szepesvári
OffRL
73
20
0
03 Feb 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms
Chi Jin
Qinghua Liu
Sobhan Miryoosefi
OffRL
144
220
0
01 Feb 2021
Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP
Zihan Zhang
Jiaqi Yang
Xiangyang Ji
S. Du
116
41
0
29 Jan 2021
A Provably Efficient Algorithm for Linear Markov Decision Process with Low Switching Cost
Minbo Gao
Tianle Xie
S. Du
Lin F. Yang
84
46
0
02 Jan 2021
Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes
Dongruo Zhou
Quanquan Gu
Csaba Szepesvári
113
209
0
15 Dec 2020
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang
Chi Jin
Zhaoran Wang
Mengdi Wang
Michael I. Jordan
99
18
0
09 Nov 2020
Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective
Dylan J. Foster
Alexander Rakhlin
D. Simchi-Levi
Yunzong Xu
163
78
0
07 Oct 2020
PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning
Alekh Agarwal
Mikael Henaff
Sham Kakade
Wen Sun
OffRL
94
110
0
16 Jul 2020
$Q$-learning with Logarithmic Regret
Kunhe Yang
Lin F. Yang
S. Du
102
62
0
16 Jun 2020
Multi-Agent Reinforcement Learning in Stochastic Networked Systems
Yiheng Lin
Guannan Qu
Longbo Huang
Adam Wierman
99
39
0
11 Jun 2020
Reinforcement Learning with Feedback Graphs
Christoph Dann
Yishay Mansour
M. Mohri
Ayush Sekhari
Karthik Sridharan
51
9
0
07 May 2020
Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon Reinforcement Learning?
Ruosong Wang
S. Du
Lin F. Yang
Sham Kakade
OffRL
95
52
0
01 May 2020
Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning
Fei Feng
Ruosong Wang
W. Yin
S. Du
Lin F. Yang
OffRL, SSL
81
7
0
15 Mar 2020
Learning Near Optimal Policies with Low Inherent Bellman Error
Andrea Zanette
A. Lazaric
Mykel Kochenderfer
Emma Brunskill
OffRL
116
222
0
29 Feb 2020
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
Qiaomin Xie
Yudong Chen
Zhaoran Wang
Zhuoran Yang
175
127
0
17 Feb 2020
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Dipendra Kumar Misra
Mikael Henaff
A. Krishnamurthy
John Langford
95
151
0
13 Nov 2019
Frequentist Regret Bounds for Randomized Least-Squares Value Iteration
Andrea Zanette
David Brandfonbrener
Emma Brunskill
Matteo Pirotta
A. Lazaric
155
132
0
01 Nov 2019
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
190
187
0
28 Oct 2019
Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles
Aditya Modi
Nan Jiang
Ambuj Tewari
Satinder Singh
72
133
0
23 Oct 2019
PAC Reinforcement Learning without Real-World Feedback
Yuren Zhong
A. Deshmukh
Clayton Scott
58
7
0
23 Sep 2019
$\sqrt{n}$-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank
Kefan Dong
Jian-wei Peng
Yining Wang
Yuanshuo Zhou
OffRL
81
36
0
05 Sep 2019
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Alekh Agarwal
Sham Kakade
Jason D. Lee
G. Mahajan
138
321
0
01 Aug 2019
On Value Functions and the Agent-Environment Boundary
Nan Jiang
OffRL
151
21
0
30 May 2019
Information-Theoretic Considerations in Batch Reinforcement Learning
Jinglin Chen
Nan Jiang
OOD, OffRL
193
378
0
01 May 2019
Policy Certificates: Towards Accountable Reinforcement Learning
Christoph Dann
Ashutosh Adhikari
Wei Wei
Jimmy J. Lin
OffRL
170
146
0
07 Nov 2018
Dual Policy Iteration
Wen Sun
Geoffrey J. Gordon
Byron Boots
J. Andrew Bagnell
OffRL
105
57
0
28 May 2018