Learning to Optimize via Information-Directed Sampling

21 March 2014
Daniel Russo
Benjamin Van Roy

Papers citing "Learning to Optimize via Information-Directed Sampling"

50 / 60 papers shown
Toward Efficient Exploration by Large Language Model Agents
Dilip Arumugam
Thomas L. Griffiths
LLMAG
94
1
0
29 Apr 2025
An Information-Theoretic Analysis of Thompson Sampling with Infinite Action Spaces
Amaury Gouverneur
Borja Rodríguez Gálvez
T. Oechtering
Mikael Skoglund
56
0
0
04 Feb 2025
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
Jasmine Bayrooti
Carl Henrik Ek
Amanda Prorok
42
0
0
07 Oct 2024
Value of Information and Reward Specification in Active Inference and POMDPs
Ran Wei
57
3
0
13 Aug 2024
A Unified Confidence Sequence for Generalized Linear Models, with Applications to Bandits
Junghyun Lee
Se-Young Yun
Kwang-Sung Jun
33
4
0
19 Jul 2024
On Bits and Bandits: Quantifying the Regret-Information Trade-off
Itai Shufaro
Nadav Merlis
Nir Weinberger
Shie Mannor
38
0
0
26 May 2024
Causally Abstracted Multi-armed Bandits
Fabio Massimo Zennaro
Nicholas Bishop
Joel Dyer
Yorgos Felekis
Anisoara Calinescu
Michael Wooldridge
Theodoros Damoulas
38
2
0
26 Apr 2024
TS-RSR: A provably efficient approach for batch Bayesian optimization
Zhaolin Ren
Na Li
31
2
0
07 Mar 2024
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
Yingru Li
Jiawei Xu
Lei Han
Zhi-Quan Luo
BDL
OffRL
26
5
0
05 Feb 2024
An Information Theoretic Approach to Interaction-Grounded Learning
Xiaoyan Hu
Farzan Farnia
Ho-fung Leung
VLM
35
2
0
10 Jan 2024
Best Arm Identification in Batched Multi-armed Bandit Problems
Sheng Cao
Simai He
Ruoqing Jiang
Jin Xu
Hongsong Yuan
12
1
0
21 Dec 2023
High Accuracy and Low Regret for User-Cold-Start Using Latent Bandits
David Young
D. Leith
11
0
0
12 May 2023
Bayesian Reinforcement Learning with Limited Cognitive Load
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
OffRL
34
8
0
05 May 2023
Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling
Susobhan Ghosh
Raphael Kim
Prasidh Chhabria
Raaz Dwivedi
Predrag Klasnja
Peng Liao
Kelly Zhang
Susan Murphy
OffRL
22
8
0
11 Apr 2023
Thompson Sampling for Linear Bandit Problems with Normal-Gamma Priors
Björn Lindenberg
Karl-Olof Lindahl
25
0
0
06 Mar 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
Souradip Chakraborty
Amrit Singh Bedi
Alec Koppel
Mengdi Wang
Furong Huang
Dinesh Manocha
24
7
0
28 Jan 2023
Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient
Dylan J. Foster
Noah Golowich
Yanjun Han
OffRL
28
29
0
19 Jan 2023
Bayesian Fixed-Budget Best-Arm Identification
Alexia Atsidakou
S. Katariya
Sujay Sanghavi
B. Kveton
33
11
0
15 Nov 2022
On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
31
4
0
30 Oct 2022
Environment Design for Inverse Reinforcement Learning
Thomas Kleine Buening
Victor Villin
Christos Dimitrakakis
32
1
0
26 Oct 2022
Exploration via Planning for Information about the Optimal Trajectory
Viraj Mehta
I. Char
J. Abbate
R. Conlin
M. Boyer
Stefano Ermon
J. Schneider
Willie Neiswanger
OffRL
27
6
0
06 Oct 2022
Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits
Siddhartha Banerjee
Sean R. Sinclair
Milind Tambe
Lily Xu
Chao Yu
AI4TS
31
6
0
30 Sep 2022
Risk-aware linear bandits with convex loss
Patrick Saux
Odalric-Ambrym Maillard
24
2
0
15 Sep 2022
Learning to Sell a Focal-ancillary Combination
Hanrui Wang
Xiaocheng Li
Kalyan Talluri
19
0
0
23 Jul 2022
Joint Entropy Search for Maximally-Informed Bayesian Optimization
Carl Hvarfner
Frank Hutter
Luigi Nardi
44
36
0
09 Jun 2022
Adaptive Sampling for Discovery
Ziping Xu
Eunjae Shim
Ambuj Tewari
Paul M. Zimmerman
OffRL
19
4
0
30 May 2022
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits
Gergely Neu
Julia Olkhovskaya
Matteo Papini
Ludovic Schwartz
33
16
0
27 May 2022
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
D. Tiapkin
Denis Belomestny
Eric Moulines
A. Naumov
S. Samsonov
Yunhao Tang
Michal Valko
Pierre Menard
31
17
0
16 May 2022
Non-Stationary Bandit Learning via Predictive Sampling
Yueyang Liu
Kuang Xu
Benjamin Van Roy
24
19
0
04 May 2022
Truncated LinUCB for Stochastic Linear Bandits
Yanglei Song
Meng Zhou
52
0
0
23 Feb 2022
Minimax Regret for Partial Monitoring: Infinite Outcomes and Rustichini's Regret
Tor Lattimore
16
16
0
22 Feb 2022
Adaptive Experimentation in the Presence of Exogenous Nonstationary Variation
Chao Qin
Daniel Russo
58
6
0
18 Feb 2022
A PDE-Based Analysis of the Symmetric Two-Armed Bernoulli Bandit
Vladimir A. Kobzar
R. Kohn
23
4
0
11 Feb 2022
Gaussian Imagination in Bandit Learning
Yueyang Liu
Adithya M. Devraj
Benjamin Van Roy
Kuang Xu
34
7
0
06 Jan 2022
The Value of Information When Deciding What to Learn
Dilip Arumugam
Benjamin Van Roy
37
12
0
26 Oct 2021
Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification
James A. Grant
David S. Leslie
44
3
0
29 Sep 2021
Deep Exploration for Recommendation Systems
Zheqing Zhu
Benjamin Van Roy
32
11
0
26 Sep 2021
Metalearning Linear Bandits by Prior Update
Amit Peleg
Naama Pearl
Ron Meir
37
18
0
12 Jul 2021
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Amrit Singh Bedi
Anjaly Parayil
Junyu Zhang
Mengdi Wang
Alec Koppel
38
15
0
15 Jun 2021
Information Directed Sampling for Sparse Linear Bandits
Botao Hao
Tor Lattimore
Wei Deng
25
19
0
29 May 2021
An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning
Dilip Arumugam
Peter Henderson
Pierre-Luc Bacon
24
17
0
10 Mar 2021
Reinforcement Learning, Bit by Bit
Xiuyuan Lu
Benjamin Van Roy
Vikranth Dwaracherla
M. Ibrahimi
Ian Osband
Zheng Wen
30
70
0
06 Mar 2021
An empirical evaluation of active inference in multi-armed bandits
D. Marković
Hrvoje Stojić
Sarah Schwöbel
S. Kiebel
42
34
0
21 Jan 2021
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
27
349
0
30 Dec 2020
Randomized Value Functions via Posterior State-Abstraction Sampling
Dilip Arumugam
Benjamin Van Roy
OffRL
31
7
0
05 Oct 2020
TS-UCB: Improving on Thompson Sampling With Little to No Additional Computation
Jackie Baek
Vivek F. Farias
38
9
0
11 Jun 2020
An Efficient Algorithm For Generalized Linear Bandit: Online Stochastic Gradient Descent and Thompson Sampling
Qin Ding
Cho-Jui Hsieh
James Sharpnack
25
37
0
07 Jun 2020
Information Directed Sampling for Linear Partial Monitoring
Johannes Kirschner
Tor Lattimore
Andreas Krause
24
46
0
25 Feb 2020
Making Sense of Reinforcement Learning and Probabilistic Inference
Brendan O'Donoghue
Ian Osband
Catalin Ionescu
OffRL
27
48
0
03 Jan 2020
Connections Between Mirror Descent, Thompson Sampling and the Information Ratio
Julian Zimmert
Tor Lattimore
22
34
0
28 May 2019