
Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

5 July 2021
Dylan J. Foster
A. Krishnamurthy

Papers citing "Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination"

37 / 37 papers shown
Sparse Nonparametric Contextual Bandits
Hamish Flynn
Julia Olkhovskaya
Paul Rognon-Vael
51
0
0
20 Mar 2025
An Information-Theoretic Analysis of Thompson Sampling with Infinite Action Spaces
Amaury Gouverneur
Borja Rodríguez Gálvez
T. Oechtering
Mikael Skoglund
56
0
0
04 Feb 2025
A Complete Characterization of Learnability for Stochastic Noisy Bandits
Steve Hanneke
Kun Wang
40
0
0
20 Jan 2025
How Does Variance Shape the Regret in Contextual Bandits?
Zeyu Jia
Jian Qian
Alexander Rakhlin
Chen-Yu Wei
35
4
0
16 Oct 2024
The Central Role of the Loss Function in Reinforcement Learning
Kaiwen Wang
Nathan Kallus
Wen Sun
OffRL
59
7
0
19 Sep 2024
Provably Efficient Interactive-Grounded Learning with Personalized Reward
Mengxiao Zhang
Yuheng Zhang
Haipeng Luo
Paul Mineiro
34
0
0
31 May 2024
Policy Gradient with Active Importance Sampling
Matteo Papini
Giorgio Manganini
Alberto Maria Metelli
Marcello Restelli
OffRL
25
1
0
09 May 2024
REBEL: Reinforcement Learning via Regressing Relative Rewards
Zhaolin Gao
Jonathan D. Chang
Wenhao Zhan
Owen Oertell
Gokul Swamy
Kianté Brantley
Thorsten Joachims
J. Andrew Bagnell
Jason D. Lee
Wen Sun
OffRL
43
31
0
25 Apr 2024
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Corby Rosset
Ching-An Cheng
Arindam Mitra
Michael Santacroce
Ahmed Hassan Awadallah
Tengyang Xie
152
114
0
04 Apr 2024
Optimistic Information Directed Sampling
Gergely Neu
Matteo Papini
Ludovic Schwartz
50
2
0
23 Feb 2024
On the Performance of Empirical Risk Minimization with Smoothed Data
Adam Block
Alexander Rakhlin
Abhishek Shetty
47
3
0
22 Feb 2024
Efficient Contextual Bandits with Uninformed Feedback Graphs
Mengxiao Zhang
Yuheng Zhang
Haipeng Luo
Paul Mineiro
21
4
0
12 Feb 2024
Contextual Multinomial Logit Bandits with General Value Functions
Mengxiao Zhang
Haipeng Luo
27
1
0
12 Feb 2024
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
Kaiwen Wang
Owen Oertell
Alekh Agarwal
Nathan Kallus
Wen Sun
OffRL
88
12
0
11 Feb 2024
Best-of-Both-Worlds Algorithms for Linear Contextual Bandits
Yuko Kuroki
Alberto Rumi
Taira Tsuchiya
Fabio Vitale
Nicolò Cesa-Bianchi
36
5
0
24 Dec 2023
Efficient and Interpretable Bandit Algorithms
Subhojyoti Mukherjee
Ruihao Zhu
B. Kveton
FAtt
23
2
0
23 Oct 2023
Online Learning in Contextual Second-Price Pay-Per-Click Auctions
Mengxiao Zhang
Haipeng Luo
35
4
0
08 Oct 2023
Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits
Haolin Liu
Chen-Yu Wei
Julian Zimmert
30
9
0
02 Sep 2023
Stochastic Graph Bandit Learning with Side-Observations
Xueping Gong
Jiheng Zhang
34
1
0
29 Aug 2023
AdaptEx: A Self-Service Contextual Bandit Platform
W. Black
Ercüment Ilhan
A. Marchini
Vilda K. Markeviciute
21
3
0
08 Aug 2023
The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning
Kaiwen Wang
Kevin Zhou
Runzhe Wu
Nathan Kallus
Wen Sun
OffRL
31
17
0
25 May 2023
Neural Exploitation and Exploration of Contextual Bandits
Yikun Ban
Yuchen Yan
A. Banerjee
Jingrui He
42
8
0
05 May 2023
First- and Second-Order Bounds for Adversarial Linear Contextual Bandits
Julia Olkhovskaya
J. Mayo
T. Erven
Gergely Neu
Chen-Yu Wei
59
10
0
01 May 2023
Thompson Sampling Regret Bounds for Contextual Bandits with sub-Gaussian rewards
Amaury Gouverneur
Borja Rodríguez Gálvez
T. Oechtering
Mikael Skoglund
24
4
0
26 Apr 2023
Smoothed Analysis of Sequential Probability Assignment
Alankrita Bhatt
Nika Haghtalab
Abhishek Shetty
32
9
0
08 Mar 2023
Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation
Orin Levy
Alon Cohen
Asaf B. Cassel
Yishay Mansour
27
4
0
02 Mar 2023
Practical Contextual Bandits with Feedback Graphs
Mengxiao Zhang
Yuheng Zhang
Olga Vrousgou
Haipeng Luo
Paul Mineiro
9
7
0
17 Feb 2023
Eluder-based Regret for Stochastic Contextual MDPs
Orin Levy
Asaf B. Cassel
Alon Cohen
Yishay Mansour
33
5
0
27 Nov 2022
Conditionally Risk-Averse Contextual Bandits
Mónika Farsang
Paul Mineiro
Wangda Zhang
28
2
0
24 Oct 2022
Contextual Bandits with Smooth Regret: Efficient Learning in Continuous Action Spaces
Yinglun Zhu
Paul Mineiro
15
16
0
12 Jul 2022
Contextual Bandits with Large Action Spaces: Made Practical
Yinglun Zhu
Dylan J. Foster
John Langford
Paul Mineiro
35
29
0
12 Jul 2022
Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits
Gergely Neu
Julia Olkhovskaya
Matteo Papini
Ludovic Schwartz
33
16
0
27 May 2022
Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization
Peng Zhao
Yu-Jie Zhang
Lijun Zhang
Zhi-Hua Zhou
30
45
0
29 Dec 2021
First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach
Andrew Wagenmaker
Yifang Chen
Max Simchowitz
S. Du
Kevin G. Jamieson
73
36
0
07 Dec 2021
Minimax Rates for Conditional Density Estimation via Empirical Entropy
Blair Bilodeau
Dylan J. Foster
Daniel M. Roy
22
21
0
21 Sep 2021
On the benefits of maximum likelihood estimation for Regression and Forecasting
Pranjal Awasthi
Abhimanyu Das
Rajat Sen
A. Suresh
AI4TS
21
10
0
18 Jun 2021
First-Order Bayesian Regret Analysis of Thompson Sampling
Sébastien Bubeck
Mark Sellke
8
16
0
02 Feb 2019