1901.00301
Cited By
Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback
2 January 2019
Chicheng Zhang
Alekh Agarwal
Hal Daumé
John Langford
S. Negahban
Papers citing
"Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback"
17 / 17 papers shown
Title
Warm Starting of CMA-ES for Contextual Optimization Problems
Yuta Sekino
Kento Uchida
Shinichi Shirakawa
111
0
0
18 Feb 2025
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
191
2
0
13 Jun 2024
Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits
Siddhartha Banerjee
Sean R. Sinclair
Milind Tambe
Lily Xu
Chao Yu
AI4TS
152
7
0
30 Sep 2022
Active Learning with Logged Data
Songbai Yan
Kamalika Chaudhuri
T. Javidi
117
27
0
25 Feb 2018
A Contextual Bandit Bake-off
A. Bietti
Alekh Agarwal
John Langford
364
105
0
12 Feb 2018
Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback
Khanh Nguyen
Hal Daumé
Jordan L. Boyd-Graber
65
138
0
24 Jul 2017
Corralling a Band of Bandit Algorithms
Alekh Agarwal
Haipeng Luo
Behnam Neyshabur
Robert Schapire
146
157
0
19 Dec 2016
Conservative Contextual Linear Bandits
Abbas Kazerouni
Mohammad Ghavamzadeh
Y. Abbasi
Benjamin Van Roy
132
98
0
19 Nov 2016
Bandit Structured Prediction for Learning from Partial Feedback in Statistical Machine Translation
Artem Sokolov
Stefan Riezler
Tanguy Urvoy
50
22
0
18 Jan 2016
Active Learning from Weak and Strong Labelers
Chicheng Zhang
Kamalika Chaudhuri
53
103
0
09 Oct 2015
Normalized Online Learning
Stéphane Ross
Paul Mineiro
John Langford
146
69
0
09 Aug 2014
Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
Alekh Agarwal
Daniel J. Hsu
Satyen Kale
John Langford
Lihong Li
Robert Schapire
OffRL
396
510
0
04 Feb 2014
Thompson Sampling for Contextual Bandits with Linear Payoffs
Shipra Agrawal
Navin Goyal
195
1,004
0
15 Sep 2012
Efficient Optimal Learning for Contextual Bandits
Miroslav Dudík
Daniel J. Hsu
Satyen Kale
Nikos Karampatziakis
John Langford
L. Reyzin
Tong Zhang
192
302
0
13 Jun 2011
Online Importance Weight Aware Updates
Nikos Karampatziakis
John Langford
174
79
0
06 Nov 2010
Contextual Bandit Algorithms with Supervised Learning Guarantees
A. Beygelzimer
John Langford
Lihong Li
L. Reyzin
Robert Schapire
OffRL
199
326
0
22 Feb 2010
Domain Adaptation: Learning Bounds and Algorithms
Yishay Mansour
M. Mohri
Afshin Rostamizadeh
298
801
0
19 Feb 2009