Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback

2 January 2019
Chicheng Zhang
Alekh Agarwal
Hal Daumé
John Langford
S. Negahban
arXiv:1901.00301

Papers citing "Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback"

17 / 17 papers shown
Warm Starting of CMA-ES for Contextual Optimization Problems
Yuta Sekino
Kento Uchida
Shinichi Shirakawa
111
0
0
18 Feb 2025
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
191
2
0
13 Jun 2024
Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits
Siddhartha Banerjee
Sean R. Sinclair
Milind Tambe
Lily Xu
Chao Yu
AI4TS
152
7
0
30 Sep 2022
Active Learning with Logged Data
Songbai Yan
Kamalika Chaudhuri
T. Javidi
117
27
0
25 Feb 2018
A Contextual Bandit Bake-off
A. Bietti
Alekh Agarwal
John Langford
364
105
0
12 Feb 2018
Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback
Khanh Nguyen
Hal Daumé
Jordan L. Boyd-Graber
65
138
0
24 Jul 2017
Corralling a Band of Bandit Algorithms
Alekh Agarwal
Haipeng Luo
Behnam Neyshabur
Robert Schapire
146
157
0
19 Dec 2016
Conservative Contextual Linear Bandits
Abbas Kazerouni
Mohammad Ghavamzadeh
Y. Abbasi
Benjamin Van Roy
132
98
0
19 Nov 2016
Bandit Structured Prediction for Learning from Partial Feedback in Statistical Machine Translation
Artem Sokolov
Stefan Riezler
Tanguy Urvoy
50
22
0
18 Jan 2016
Active Learning from Weak and Strong Labelers
Chicheng Zhang
Kamalika Chaudhuri
53
103
0
09 Oct 2015
Normalized Online Learning
Stéphane Ross
Paul Mineiro
John Langford
146
69
0
09 Aug 2014
Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
Alekh Agarwal
Daniel J. Hsu
Satyen Kale
John Langford
Lihong Li
Robert Schapire
OffRL
396
510
0
04 Feb 2014
Thompson Sampling for Contextual Bandits with Linear Payoffs
Shipra Agrawal
Navin Goyal
195
1,004
0
15 Sep 2012
Efficient Optimal Learning for Contextual Bandits
Miroslav Dudík
Daniel J. Hsu
Satyen Kale
Nikos Karampatziakis
John Langford
L. Reyzin
Tong Zhang
192
302
0
13 Jun 2011
Online Importance Weight Aware Updates
Nikos Karampatziakis
John Langford
174
79
0
06 Nov 2010
Contextual Bandit Algorithms with Supervised Learning Guarantees
A. Beygelzimer
John Langford
Lihong Li
L. Reyzin
Robert Schapire
OffRL
199
326
0
22 Feb 2010
Domain Adaptation: Learning Bounds and Algorithms
Yishay Mansour
M. Mohri
Afshin Rostamizadeh
298
801
0
19 Feb 2009