Model-based Reinforcement Learning and the Eluder Dimension


7 June 2014
Ian Osband
Benjamin Van Roy

Papers citing "Model-based Reinforcement Learning and the Eluder Dimension"

13 / 13 papers shown
Steering No-Regret Agents in MFGs under Model Uncertainty
Leo Widmer
Jiawei Huang
Niao He
LLMSV
75
1
0
12 Mar 2025
Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation
Long-Fei Li
Yu Zhang
Peng Zhao
Zhi Zhou
152
5
0
17 Jan 2025
RL-STaR: Theoretical Analysis of Reinforcement Learning Frameworks for Self-Taught Reasoner
Fu-Chieh Chang
Yu-Ting Lee
Hui-Ying Shih
Pei-Yuan Wu
OffRL
LRM
373
0
0
31 Oct 2024
HG2P: Hippocampus-inspired High-reward Graph and Model-Free Q-Gradient Penalty for Path Planning and Motion Control
Haoran Wang
Yaoru Sun
Zeshen Tang
Haibo Shi
Chenyuan Jiao
66
0
0
12 Oct 2024
Second Order Bounds for Contextual Bandits with Function Approximation
Aldo Pacchiano
152
4
0
24 Sep 2024
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
Chen Ye
Wei Xiong
Quanquan Gu
Tong Zhang
89
31
0
12 Dec 2022
Near-optimal Reinforcement Learning in Factored MDPs
Ian Osband
Benjamin Van Roy
64
121
0
15 Mar 2014
Generalization and Exploration via Randomized Value Functions
Ian Osband
Benjamin Van Roy
Zheng Wen
67
314
0
04 Feb 2014
(More) Efficient Reinforcement Learning via Posterior Sampling
Ian Osband
Daniel Russo
Benjamin Van Roy
100
529
0
04 Jun 2013
Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems
M. Ibrahimi
Adel Javanmard
Benjamin Van Roy
58
91
0
24 Mar 2013
Online Regret Bounds for Undiscounted Continuous Reinforcement Learning
R. Ortner
D. Ryabko
OffRL
61
85
0
11 Feb 2013
Learning to Optimize Via Posterior Sampling
Daniel Russo
Benjamin Van Roy
137
699
0
11 Jan 2013
X-Armed Bandits
Sébastien Bubeck
Rémi Munos
Gilles Stoltz
Csaba Szepesvari
123
383
0
25 Jan 2010