An Information-Theoretic Analysis of Thompson Sampling

21 March 2014

Papers citing "An Information-Theoretic Analysis of Thompson Sampling"

50 / 81 papers shown

Toward Efficient Exploration by Large Language Model Agents Dilip Arumugam Thomas L. Griffiths LLMAG 94 1 0 29 Apr 2025
An Information-Theoretic Analysis of Thompson Sampling with Infinite Action Spaces Amaury Gouverneur Borja Rodríguez Gálvez T. Oechtering Mikael Skoglund 66 0 0 04 Feb 2025
Distributed Thompson sampling under constrained communication Saba Zerefa Zhaolin Ren Haitong Ma Na Li 46 1 0 03 Jan 2025
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding Taiwo A. Adebiyi Bach Do Ruda Zhang 114 2 0 29 Oct 2024
Advances in Preference-based Reinforcement Learning: A Review Youssef Abdelkareem Shady Shehata Fakhri Karray OffRL 56 10 0 21 Aug 2024
On Bits and Bandits: Quantifying the Regret-Information Trade-off Itai Shufaro Nadav Merlis Nir Weinberger Shie Mannor 40 0 0 26 May 2024
Incentivized Exploration via Filtered Posterior Sampling Anand Kalvit Aleksandrs Slivkins Yonatan Gur 29 1 0 20 Feb 2024
Posterior Sampling-based Online Learning for Episodic POMDPs Dengwang Tang Dongze Ye Rahul Jain A. Nayyar Pierluigi Nuzzo OffRL 57 0 0 16 Oct 2023
VITS: Variational Inference Thompson Sampling for contextual bandits Pierre Clavier Tom Huix Alain Durmus 32 3 0 19 Jul 2023
Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits Yuwei Luo Mohsen Bayati 26 1 0 26 Jun 2023
Sequential Best-Arm Identification with Application to Brain-Computer Interface Xiaoping Zhou Botao Hao Jian Kang Tor Lattimore Lexin Li 35 2 0 17 May 2023
Bayesian Reinforcement Learning with Limited Cognitive Load Dilip Arumugam Mark K. Ho Noah D. Goodman Benjamin Van Roy OffRL 39 8 0 05 May 2023
Optimal tests following sequential experiments Karun Adusumilli 38 2 0 30 Apr 2023
Thompson Sampling Regret Bounds for Contextual Bandits with sub-Gaussian rewards Amaury Gouverneur Borja Rodríguez Gálvez T. Oechtering Mikael Skoglund 29 4 0 26 Apr 2023
Simulating Gaussian vectors via randomized dimension reduction and PCA N. Kahalé 35 0 0 14 Apr 2023
Evaluating COVID-19 vaccine allocation policies using Bayesian $m$-top exploration Alexandra Cimpean T. Verstraeten L. Willem N. Hens Ann Nowé Pieter J. K. Libin 26 2 0 30 Jan 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning Souradip Chakraborty Amrit Singh Bedi Alec Koppel Mengdi Wang Furong Huang Dinesh Manocha 24 8 0 28 Jan 2023
Bayesian Fixed-Budget Best-Arm Identification Alexia Atsidakou S. Katariya Sujay Sanghavi Branislav Kveton 35 11 0 15 Nov 2022
AdaChain: A Learned Adaptive Blockchain Chenyuan Wu Bhavana Mehta Mohammad Javad Amiri Ryan Marcus B. T. Loo 23 14 0 03 Nov 2022
On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning Dilip Arumugam Mark K. Ho Noah D. Goodman Benjamin Van Roy 36 4 0 30 Oct 2022
Quantification before Selection: Active Dynamics Preference for Robust Reinforcement Learning Kang Xu Yan Ma Wei Li 48 0 0 23 Sep 2022
Batch Bayesian optimisation via density-ratio estimation with guarantees Rafael Oliveira Louis C. Tiao Fabio Ramos 49 7 0 22 Sep 2022
Thompson Sampling with Virtual Helping Agents Kartikey Pant Amod Hegde K. V. Srinivas 22 0 0 16 Sep 2022
Sample-efficient Safe Learning for Online Nonlinear Control with Control Barrier Functions Wenhao Luo Wen Sun Ashish Kapoor OffRL 48 9 0 29 Jul 2022
Adaptive Sampling for Discovery Ziping Xu Eunjae Shim Ambuj Tewari Paul M. Zimmerman OffRL 22 4 0 30 May 2022
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits Gergely Neu Julia Olkhovskaya Matteo Papini Ludovic Schwartz 44 16 0 27 May 2022
Non-Stationary Bandit Learning via Predictive Sampling Yueyang Liu Kuang Xu Benjamin Van Roy 28 19 0 04 May 2022
An Analysis of Ensemble Sampling Chao Qin Zheng Wen Xiuyuan Lu Benjamin Van Roy 37 22 0 02 Mar 2022
Partial Likelihood Thompson Sampling Han Wu Stefan Wager LM&MA 35 1 0 02 Mar 2022
Thompson Sampling with Unrestricted Delays Han Wu Stefan Wager 42 7 0 24 Feb 2022
Adaptive Experimentation in the Presence of Exogenous Nonstationary Variation Chao Qin Daniel Russo 60 6 0 18 Feb 2022
Fast online inference for nonlinear contextual bandit based on Generative Adversarial Network Yun-Da Tsai Shou-De Lin 51 5 0 17 Feb 2022
Deep Hierarchy in Bandits Joey Hong Branislav Kveton S. Katariya Manzil Zaheer Mohammad Ghavamzadeh 33 20 0 03 Feb 2022
Gaussian Imagination in Bandit Learning Yueyang Liu Adithya M. Devraj Benjamin Van Roy Kuang Xu 40 7 0 06 Jan 2022
Hierarchical Bayesian Bandits Joey Hong Branislav Kveton Manzil Zaheer Mohammad Ghavamzadeh FedML 52 38 0 12 Nov 2021
The Value of Information When Deciding What to Learn Dilip Arumugam Benjamin Van Roy 37 12 0 26 Oct 2021
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning Tong Zhang 27 63 0 02 Oct 2021
Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification James A. Grant David S. Leslie 50 3 0 29 Sep 2021
A Payload Optimization Method for Federated Recommender Systems Farwa K. Khan Adrian Flanagan K. E. Tan Z. Alamgir Muhammad Ammad-ud-din 82 30 0 27 Jul 2021
Metalearning Linear Bandits by Prior Update Amit Peleg Naama Pearl Ron Meir 42 18 0 12 Jul 2021
Bayesian decision-making under misspecified priors with applications to meta-learning Max Simchowitz Christopher Tosh A. Krishnamurthy Daniel J. Hsu Thodoris Lykouris Miroslav Dudík Robert Schapire 42 49 0 03 Jul 2021
Applications of the Free Energy Principle to Machine Learning and Neuroscience Beren Millidge DRL 33 7 0 30 Jun 2021
Online Transfer Learning: Negative Transfer and Effect of Prior Knowledge Xuetong Wu J. Manton U. Aickelin Jingge Zhu CLL OnRL 37 7 0 04 May 2021
An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning Dilip Arumugam Peter Henderson Pierre-Luc Bacon 24 17 0 10 Mar 2021
Constrained Contextual Bandit Learning for Adaptive Radar Waveform Selection C. Thornton R. M. Buehrer A. Martone 22 21 0 09 Mar 2021
Reinforcement Learning, Bit by Bit Xiuyuan Lu Benjamin Van Roy Vikranth Dwaracherla M. Ibrahimi Ian Osband Zheng Wen 30 70 0 06 Mar 2021
Online Multi-Armed Bandits with Adaptive Inference Maria Dimakopoulou Zhimei Ren Zhengyuan Zhou 40 34 0 25 Feb 2021
The Elliptical Potential Lemma for General Distributions with an Application to Linear Thompson Sampling N. Hamidi Mohsen Bayati 22 1 0 16 Feb 2021
Uncertainty quantification and exploration-exploitation trade-off in humans Antonio Candelieri Andrea Ponti Francesco Archetti 21 4 0 05 Feb 2021
An empirical evaluation of active inference in multi-armed bandits D. Marković Hrvoje Stojić Sarah Schwöbel S. Kiebel 46 34 0 21 Jan 2021