Multi-dueling Bandits with Dependent Arms

29 April 2017

Papers citing "Multi-dueling Bandits with Dependent Arms"

Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents Fanzeng Xia Hao Liu Yisong Yue Tongxin Li 69 1 0 03 Jan 2025
A Minimaximalist Approach to Reinforcement Learning from Human Feedback Gokul Swamy Christoph Dann Rahul Kidambi Zhiwei Steven Wu Alekh Agarwal OffRL 51 96 0 08 Jan 2024
Adaptive Learning based Upper-Limb Rehabilitation Training System with Collaborative Robot JunSeo Lim Kaibo He Zeji Yi Chen Hou Chen Zhang Yanan Sui Luming Li 22 3 0 18 May 2023
Dueling Bandits: From Two-dueling to Multi-dueling Yihan Du Siwei Wang Longbo Huang 19 3 0 16 Nov 2022
ANACONDA: An Improved Dynamic Regret Algorithm for Adaptive Non-Stationary Dueling Bandits Thomas Kleine Buening Aadirupa Saha 51 6 0 25 Oct 2022
Dueling Convex Optimization with General Preferences Aadirupa Saha Tomer Koren Yishay Mansour 30 3 0 27 Sep 2022
An Asymptotically Optimal Batched Algorithm for the Dueling Bandit Problem Arpit Agarwal R. Ghuge V. Nagarajan 25 1 0 25 Sep 2022
Exploiting Correlation to Achieve Faster Learning Rates in Low-Rank Preference Bandits Suprovat Ghoshal Aadirupa Saha 25 11 0 23 Feb 2022
Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability Aadirupa Saha A. Krishnamurthy 42 35 0 24 Nov 2021
Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits Aadirupa Saha Shubham Gupta 33 10 0 06 Nov 2021
Efficient Exploration in Binary and Preferential Bayesian Optimization T. Fauvel M. Chalk 30 7 0 18 Oct 2021
ROIAL: Region of Interest Active Learning for Characterizing Exoskeleton Gait Preference Landscapes Kejun Li Maegan Tucker Erdem Biyik Ellen R. Novoseller J. W. Burdick Yanan Sui Dorsa Sadigh Yisong Yue Aaron D. Ames 23 32 0 09 Nov 2020
Human Preference-Based Learning for High-dimensional Optimization of Exoskeleton Walking Gaits Maegan Tucker Myra Cheng Ellen R. Novoseller Richard Cheng Yisong Yue J. W. Burdick Aaron D. Ames 35 38 0 13 Mar 2020
Information Directed Sampling for Linear Partial Monitoring Johannes Kirschner Tor Lattimore Andreas Krause 24 46 0 25 Feb 2020
KLUCB Approach to Copeland Bandits Nischal Agrawal P. Chaporkar 16 1 0 07 Feb 2019
Stagewise Safe Bayesian Optimization with Gaussian Processes Yanan Sui Vincent Zhuang J. W. Burdick Yisong Yue 27 139 0 20 Jun 2018