Double Thompson Sampling for Dueling Bandits

25 April 2016

Papers citing "Double Thompson Sampling for Dueling Bandits"


Title | Authors | Tag | Counts | Date
Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents | Fanzeng Xia, Hao Liu, Yisong Yue, Tongxin Li | - | 69 1 0 | 03 Jan 2025
Online Bandit Learning with Offline Preference Data for Improved RLHF | Akhil Agnihotri, Rahul Jain, Deepak Ramachandran, Zheng Wen | OffRL | 42 2 0 | 13 Jun 2024
Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback | Qiwei Di, Jiafan He, Quanquan Gu | - | 31 1 0 | 16 Apr 2024
Reinforcement Learning from Human Feedback with Active Queries | Kaixuan Ji, Jiafan He, Quanquan Gu | - | 26 17 0 | 14 Feb 2024
Direct Preference-Based Evolutionary Multi-Objective Optimization with Dueling Bandit | Tian Huang, Ke Li, Ke Li | - | 31 1 0 | 23 Nov 2023
Borda Regret Minimization for Generalized Linear Dueling Bandits | Yue Wu, Tao Jin, Hao Lou, Farzad Farnoud, Quanquan Gu | - | 34 11 0 | 15 Mar 2023
ANACONDA: An Improved Dynamic Regret Algorithm for Adaptive Non-Stationary Dueling Bandits | Thomas Kleine Buening, Aadirupa Saha | - | 48 6 0 | 25 Oct 2022
An Asymptotically Optimal Batched Algorithm for the Dueling Bandit Problem | Arpit Agarwal, R. Ghuge, V. Nagarajan | - | 25 1 0 | 25 Sep 2022
Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences | Aadirupa Saha, Pierre Gaillard | - | 38 8 0 | 14 Feb 2022
Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability | Aadirupa Saha, A. Krishnamurthy | - | 42 35 0 | 24 Nov 2021
Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits | Aadirupa Saha, Shubham Gupta | - | 33 10 0 | 06 Nov 2021
Preference learning along multiple criteria: A game-theoretic perspective | Kush S. Bhatia, A. Pananjady, Peter L. Bartlett, Anca Dragan, Martin J. Wainwright | - | 35 13 0 | 05 May 2021
KLUCB Approach to Copeland Bandits | Nischal Agrawal, P. Chaporkar | - | 16 1 0 | 07 Feb 2019
Multi-dueling Bandits with Dependent Arms | Yanan Sui, Vincent Zhuang, J. W. Burdick, Yisong Yue | - | 25 80 0 | 29 Apr 2017
Preferential Bayesian Optimization | Javier I. González, Zhenwen Dai, Andreas C. Damianou, Neil D. Lawrence | - | 23 110 0 | 12 Apr 2017