Double Thompson Sampling for Dueling Bandits


25 April 2016
Huasen Wu
Xin Liu
arXiv: 1604.07101

Papers citing "Double Thompson Sampling for Dueling Bandits"

15 / 15 papers shown
Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents
Fanzeng Xia
Hao Liu
Yisong Yue
Tongxin Li
67
1
0
03 Jan 2025
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
42
2
0
13 Jun 2024
Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback
Qiwei Di
Jiafan He
Quanquan Gu
31
1
0
16 Apr 2024
Reinforcement Learning from Human Feedback with Active Queries
Kaixuan Ji
Jiafan He
Quanquan Gu
26
17
0
14 Feb 2024
Direct Preference-Based Evolutionary Multi-Objective Optimization with Dueling Bandit
Tian Huang
Ke Li
Ke Li
31
1
0
23 Nov 2023
Borda Regret Minimization for Generalized Linear Dueling Bandits
Yue Wu
Tao Jin
Hao Lou
Farzad Farnoud
Quanquan Gu
34
11
0
15 Mar 2023
ANACONDA: An Improved Dynamic Regret Algorithm for Adaptive Non-Stationary Dueling Bandits
Thomas Kleine Buening
Aadirupa Saha
48
6
0
25 Oct 2022
An Asymptotically Optimal Batched Algorithm for the Dueling Bandit Problem
Arpit Agarwal
R. Ghuge
V. Nagarajan
25
1
0
25 Sep 2022
Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences
Aadirupa Saha
Pierre Gaillard
36
8
0
14 Feb 2022
Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability
Aadirupa Saha
A. Krishnamurthy
39
35
0
24 Nov 2021
Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits
Aadirupa Saha
Shubham Gupta
33
10
0
06 Nov 2021
Preference learning along multiple criteria: A game-theoretic perspective
Kush S. Bhatia
A. Pananjady
Peter L. Bartlett
Anca Dragan
Martin J. Wainwright
32
13
0
05 May 2021
KLUCB Approach to Copeland Bandits
Nischal Agrawal
P. Chaporkar
13
1
0
07 Feb 2019
Multi-dueling Bandits with Dependent Arms
Yanan Sui
Vincent Zhuang
J. W. Burdick
Yisong Yue
22
80
0
29 Apr 2017
Preferential Bayesian Optimization
Javier I. González
Zhenwen Dai
Andreas C. Damianou
Neil D. Lawrence
23
110
0
12 Apr 2017