ResearchTrend.AI
DP-Dueling: Learning from Preference Feedback without Compromising User Privacy

22 March 2024
Aadirupa Saha
Hilal Asi

Papers citing "DP-Dueling: Learning from Preference Feedback without Compromising User Privacy"

22 / 22 papers shown
Multi-Player Approaches for Dueling Bandits
Or Raveh
Junya Honda
Masashi Sugiyama
108
1
0
25 May 2024
Faster Convergence with Multiway Preferences
Aadirupa Saha
Vitaly Feldman
Tomer Koren
Yishay Mansour
57
1
0
19 Dec 2023
Dueling Optimization with a Monotone Adversary
Avrim Blum
Meghal Gupta
Gene Li
N. Manoj
Aadirupa Saha
Yuanyuan Yang
23
6
0
18 Nov 2023
Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism
Zihao Li
Zhuoran Yang
Mengdi Wang
OffRL
91
60
0
29 May 2023
ANACONDA: An Improved Dynamic Regret Algorithm for Adaptive Non-Stationary Dueling Bandits
Thomas Kleine Buening
Aadirupa Saha
65
7
0
25 Oct 2022
Private Online Prediction from Experts: Separations and Faster Rates
Hilal Asi
Vitaly Feldman
Tomer Koren
Kunal Talwar
FedML
60
20
0
24 Oct 2022
Dueling Convex Optimization with General Preferences
Aadirupa Saha
Tomer Koren
Yishay Mansour
47
3
0
27 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM
886
13,176
0
04 Mar 2022
Exploiting Correlation to Achieve Faster Learning Rates in Low-Rank Preference Bandits
Suprovat Ghoshal
Aadirupa Saha
53
12
0
23 Feb 2022
Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences
Aadirupa Saha
Pierre Gaillard
61
7
0
14 Feb 2022
Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability
Aadirupa Saha
A. Krishnamurthy
87
38
0
24 Nov 2021
Dueling RL: Reinforcement Learning with Trajectory Preferences
Aldo Pacchiano
Aadirupa Saha
Jonathan Lee
86
90
0
08 Nov 2021
APReL: A Library for Active Preference-based Reward Learning Algorithms
Erdem Biyik
Aditi Talati
Dorsa Sadigh
58
37
0
16 Aug 2021
A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
Qinghua Liu
Tiancheng Yu
Yu Bai
Chi Jin
91
122
0
04 Oct 2020
Best-item Learning in Random Utility Models with Subset Choices
Aadirupa Saha
Aditya Gopalan
25
8
0
19 Feb 2020
Provable Self-Play Algorithms for Competitive Reinforcement Learning
Yu Bai
Chi Jin
SSL
145
149
0
10 Feb 2020
Preference-Based Learning for Exoskeleton Gait Optimization
Maegan Tucker
Ellen R. Novoseller
Claudia K. Kann
Yanan Sui
Yisong Yue
J. W. Burdick
Aaron D. Ames
112
90
0
26 Sep 2019
Dueling Posterior Sampling for Preference-Based Reinforcement Learning
Ellen R. Novoseller
Yibing Wei
Yanan Sui
Yisong Yue
J. W. Burdick
73
64
0
04 Aug 2019
Differentially Private Contextual Linear Bandits
R. Shariff
Or Sheffet
76
120
0
28 Sep 2018
Preference-based Online Learning with Dueling Bandits: A Survey
Viktor Bengs
R. Busa-Fekete
Adil El Mesaoudi-Paul
Eyke Hüllermeier
102
114
0
30 Jul 2018
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
218
3,365
0
12 Jun 2017
Reducing Dueling Bandits to Cardinal Bandits
Nir Ailon
Thorsten Joachims
Zohar Karnin
171
140
0
14 May 2014