On Oracle-Efficient PAC RL with Rich Observations

1 March 2018
Christoph Dann
Nan Jiang
A. Krishnamurthy
Alekh Agarwal
John Langford
Robert Schapire

Papers citing "On Oracle-Efficient PAC RL with Rich Observations"

50 / 81 papers shown
Model-free Low-Rank Reinforcement Learning via Leveraged Entry-wise Matrix Estimation
Stefan Stojanovic
Yassir Jedra
Alexandre Proutiere
73
0
0
30 Oct 2024
Learning Infinite-Horizon Average-Reward Linear Mixture MDPs of Bounded Span
Woojin Chae
Kihyuk Hong
Yufan Zhang
Ambuj Tewari
Dabeen Lee
69
1
0
19 Oct 2024
Learning a Fast Mixing Exogenous Block MDP using a Single Trajectory
Alexander Levine
Peter Stone
Amy Zhang
OffRL
112
0
0
03 Oct 2024
Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank
Wenhao Zhan
Scott Fujimoto
Zheqing Zhu
Jason D. Lee
Daniel Jiang
Yonathan Efroni
OffRL
125
0
0
01 Oct 2024
The Central Role of the Loss Function in Reinforcement Learning
Kaiwen Wang
Nathan Kallus
Wen Sun
OffRL
313
10
0
19 Sep 2024
Rethinking State Disentanglement in Causal Reinforcement Learning
Haiyao Cao
Zhen Zhang
Panpan Cai
Yuhang Liu
Jinan Zou
Ehsan Abbasnejad
Biwei Huang
Anton van den Hengel
Javen Qinfeng Shi
CML
60
0
0
24 Aug 2024
Satisficing Exploration for Deep Reinforcement Learning
Dilip Arumugam
Saurabh Kumar
Ramki Gummadi
Benjamin Van Roy
67
1
0
16 Jul 2024
RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation
Jeongyeol Kwon
Shie Mannor
Constantine Caramanis
Yonathan Efroni
OffRL
109
3
0
03 Jun 2024
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF
Tengyang Xie
Dylan J. Foster
Akshay Krishnamurthy
Corby Rosset
Ahmed Hassan Awadallah
Alexander Rakhlin
100
45
0
31 May 2024
Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning
Noah Golowich
Ankur Moitra
Dhruv Rohatgi
OffRL
82
4
0
04 Apr 2024
Horizon-Free Regret for Linear Markov Decision Processes
Zihan Zhang
Jason D. Lee
Yuxin Chen
Simon S. Du
65
3
0
15 Mar 2024
A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage
Kevin Tan
Ziping Xu
OffRL, OnRL
81
5
0
07 Mar 2024
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
Kaiwen Wang
Owen Oertell
Alekh Agarwal
Nathan Kallus
Wen Sun
OffRL
128
12
0
11 Feb 2024
Agnostic Interactive Imitation Learning: New Theory and Practical Algorithms
Yichen Li
Chicheng Zhang
OffRL
74
0
0
28 Dec 2023
Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback
Canzhe Zhao
Ruofeng Yang
Baoxiang Wang
Xuezhou Zhang
Shuai Li
73
3
0
14 Nov 2023
Prospective Side Information for Latent MDPs
Jeongyeol Kwon
Yonathan Efroni
Shie Mannor
Constantine Caramanis
78
6
0
11 Oct 2023
Spectral Entry-wise Matrix Estimation for Low-Rank Reinforcement Learning
Stefan Stojanovic
Yassir Jedra
Alexandre Proutière
70
5
0
10 Oct 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
120
5
0
09 Oct 2023
Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning
Qiwei Di
Heyang Zhao
Jiafan He
Quanquan Gu
OffRL
115
5
0
02 Oct 2023
Efficient Model-Free Exploration in Low-Rank MDPs
Zakaria Mhammedi
Adam Block
Dylan J. Foster
Alexander Rakhlin
OffRL
98
14
0
08 Jul 2023
Context-lumpable stochastic bandits
Chung-Wei Lee
Qinghua Liu
Yasin Abbasi-Yadkori
Chi Jin
Tor Lattimore
Csaba Szepesvári
OffRL
158
2
0
22 Jun 2023
The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning
Kaiwen Wang
Kevin Zhou
Runzhe Wu
Nathan Kallus
Wen Sun
OffRL
84
19
0
25 May 2023
Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL
Zakaria Mhammedi
Dylan J. Foster
Alexander Rakhlin
108
18
0
12 Apr 2023
Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs
Junkai Zhang
Weitong Zhang
Quanquan Gu
64
3
0
17 Mar 2023
Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency
Heyang Zhao
Jiafan He
Dongruo Zhou
Tong Zhang
Quanquan Gu
106
28
0
21 Feb 2023
Model-Based Reinforcement Learning with Multinomial Logistic Function Approximation
Taehyun Hwang
Min Hwan Oh
96
9
0
27 Dec 2022
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Jiafan He
Heyang Zhao
Dongruo Zhou
Quanquan Gu
OffRL
136
55
0
12 Dec 2022
Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
Yuda Song
Yi Zhou
Ayush Sekhari
J. Andrew Bagnell
A. Krishnamurthy
Wen Sun
OffRL, OnRL
97
105
0
13 Oct 2022
On Efficient Online Imitation Learning via Classification
Yichen Li
Chicheng Zhang
95
4
0
26 Sep 2022
Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings
Masatoshi Uehara
Ayush Sekhari
Jason D. Lee
Nathan Kallus
Wen Sun
95
6
0
24 Jun 2022
Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation
Pihe Hu
Yu Chen
Longbo Huang
86
35
0
23 Jun 2022
On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL
Jinglin Chen
Aditya Modi
A. Krishnamurthy
Nan Jiang
Alekh Agarwal
114
26
0
21 Jun 2022
Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation
Christoph Dann
Yishay Mansour
M. Mohri
Ayush Sekhari
Karthik Sridharan
120
53
0
19 Jun 2022
Interaction-Grounded Learning with Action-inclusive Feedback
Tengyang Xie
Akanksha Saran
Dylan J. Foster
Lekan Molu
Ida Momennejad
Nan Jiang
Paul Mineiro
John Langford
91
10
0
16 Jun 2022
Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning
Dilip Arumugam
Benjamin Van Roy
OffRL
80
15
0
04 Jun 2022
Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs
Dongruo Zhou
Quanquan Gu
122
45
0
23 May 2022
Provably Efficient Kernelized Q-Learning
Shuang Liu
H. Su
MLT
100
4
0
21 Apr 2022
Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning
Jibang Wu
Zixuan Zhang
Zhe Feng
Zhaoran Wang
Zhuoran Yang
Michael I. Jordan
Haifeng Xu
91
37
0
22 Feb 2022
Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality
Jiawei Huang
Jinglin Chen
Li Zhao
Tao Qin
Nan Jiang
Tie-Yan Liu
OffRL
106
24
0
14 Feb 2022
Computational-Statistical Gaps in Reinforcement Learning
D. Kane
Sihan Liu
Shachar Lovett
G. Mahajan
57
5
0
11 Feb 2022
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach
Xuezhou Zhang
Yuda Song
Masatoshi Uehara
Mengdi Wang
Alekh Agarwal
Wen Sun
OffRL
142
58
0
31 Jan 2022
Improved Regret Analysis for Variance-Adaptive Linear Bandits and Horizon-Free Linear Mixture MDPs
Yeoneung Kim
Insoon Yang
Kwang-Sung Jun
102
38
0
05 Nov 2021
Provable RL with Exogenous Distractors via Multistep Inverse Dynamics
Yonathan Efroni
Dipendra Kumar Misra
A. Krishnamurthy
Alekh Agarwal
John Langford
OffRL
82
23
0
17 Oct 2021
Representation Learning for Online and Offline RL in Low-rank MDPs
Masatoshi Uehara
Xuezhou Zhang
Wen Sun
OffRL
156
129
0
09 Oct 2021
Reinforcement Learning in Reward-Mixing MDPs
Jeongyeol Kwon
Yonathan Efroni
Constantine Caramanis
Shie Mannor
90
15
0
07 Oct 2021
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning
Tong Zhang
87
65
0
02 Oct 2021
Going Beyond Linear RL: Sample Efficient Neural Function Approximation
Baihe Huang
Kaixuan Huang
Sham Kakade
Jason D. Lee
Qi Lei
Runzhe Wang
Jiaqi Yang
103
8
0
14 Jul 2021
Explore and Control with Adversarial Surprise
Arnaud Fickinger
Natasha Jaques
Samyak Parajuli
Michael Chang
Nicholas Rhinehart
Glen Berseth
Stuart J. Russell
Sergey Levine
73
8
0
12 Jul 2021
Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations
Christoph Dann
Yishay Mansour
M. Mohri
Ayush Sekhari
Karthik Sridharan
OffRL
57
11
0
22 Jun 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
Andrea Zanette
Ching-An Cheng
Alekh Agarwal
110
53
0
24 Mar 2021