PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits

18 May 2018

Papers citing "PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits"

22 / 22 papers shown

Computationally and Sample Efficient Safe Reinforcement Learning Using Adaptive Conformal Prediction Hao Zhou Yanze Zhang Wenhao Luo 39 0 0 22 Mar 2025
Clustering Context in Off-Policy Evaluation Daniel Guzman-Olivares Philipp Schmidt Jacek Golebiowski Artur Bekasov CML OffRL 51 0 0 28 Feb 2025
Stabilizing the Kumaraswamy Distribution Max Wasserman Gonzalo Mateos BDL 47 0 0 01 Oct 2024
The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback Ruitao Chen Liwei Wang 72 1 0 18 May 2024
Thompson Sampling in Partially Observable Contextual Bandits Hongju Park Mohamad Kazem Shirani Faradonbeh 31 2 0 15 Feb 2024
Thompson sampling for zero-inflated count outcomes with an application to the Drink Less mobile health study Xueqing Liu Nina Deliu Tanujit Chakraborty Lauren Bell Bibhas Chakraborty 16 1 0 24 Nov 2023
Overcoming Prior Misspecification in Online Learning to Rank Javad Azizi Ofer Meshi M. Zoghi Maryam Karimzadehgan 35 1 0 25 Jan 2023
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits Gergely Neu Julia Olkhovskaya Matteo Papini Ludovic Schwartz 33 16 0 27 May 2022
Reward-Biased Maximum Likelihood Estimation for Neural Contextual Bandits Yu-Heng Hung Ping-Chun Hsieh 18 2 0 08 Mar 2022
An Experimental Design Approach for Regret Minimization in Logistic Bandits Blake Mason Kwang-Sung Jun Lalit P. Jain 23 10 0 04 Feb 2022
Adversarial Gradient Driven Exploration for Deep Click-Through Rate Prediction Kailun Wu Zhangming Chan Weijie Bian Lejian Ren Shiming Xiang Shuguang Han Hongbo Deng Bo Zheng 16 12 0 21 Dec 2021
Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification James A. Grant David S. Leslie 44 3 0 29 Sep 2021
Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models Runzhe Wan Linjuan Ge Rui Song 36 28 0 13 Aug 2021
Regret Bounds for Generalized Linear Bandits under Parameter Drift Louis Faury Yoan Russac Marc Abeille Clément Calauzènes 15 10 0 09 Mar 2021
Exploration in Online Advertising Systems with Deep Uncertainty-Aware Learning Chao Du Zhifeng Gao Shuo Yuan Lining Gao Z. Li Yifan Zeng Xiaoqiang Zhu Jian Xu Kun Gai Kuang-chih Lee 25 18 0 25 Nov 2020
Instance-Wise Minimax-Optimal Algorithms for Logistic Bandits Marc Abeille Louis Faury Clément Calauzènes 96 37 0 23 Oct 2020
Reward-Biased Maximum Likelihood Estimation for Linear Stochastic Bandits Yu-Heng Hung Ping-Chun Hsieh Xi Liu P. R. Kumar 16 15 0 08 Oct 2020
Effects of Model Misspecification on Bayesian Bandits: Case Studies in UX Optimization Mack Sweeney M. Adelsberg Kathryn B. Laskey C. Domeniconi 21 1 0 07 Oct 2020
An Efficient Algorithm For Generalized Linear Bandit: Online Stochastic Gradient Descent and Thompson Sampling Qin Ding Cho-Jui Hsieh James Sharpnack 25 37 0 07 Jun 2020
Improved Optimistic Algorithms for Logistic Bandits Louis Faury Marc Abeille Clément Calauzènes Olivier Fercoq 17 85 0 18 Feb 2020
Dueling Posterior Sampling for Preference-Based Reinforcement Learning Ellen R. Novoseller Yibing Wei Yanan Sui Yisong Yue J. W. Burdick 25 59 0 04 Aug 2019
Online Sampling from Log-Concave Distributions Holden Lee Oren Mangoubi Nisheeth K. Vishnoi 18 3 0 21 Feb 2019