Tracking the Best Expert in Non-stationary Stochastic Environments

2 December 2017

Papers citing "Tracking the Best Expert in Non-stationary Stochastic Environments"

42 / 42 papers shown

Title
Improved Impossible Tuning and Lipschitz-Adaptive Universal Online Learning with Gradient Variations Kei Takemura Ryuta Matsuno Keita Sakuma 24 0 0 27 May 2025
Variance-Dependent Regret Bounds for Non-stationary Linear Bandits Zhiyong Wang Jize Xie Yi Chen J. C. Lui Dongruo Zhou 82 1 0 15 Mar 2024
Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling Zheqing Zhu Yueyang Liu Xu Kuang Benjamin Van Roy AI4TS 68 0 0 11 Oct 2023
Learning to Schedule in Non-Stationary Wireless Networks With Unknown Statistics Quang Minh Nguyen E. Modiano 50 5 0 04 Aug 2023
Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach Yu-Hu Yan Peng Zhao Zhi-Hua Zhou 72 9 0 17 Jul 2023
Stability-penalty-adaptive follow-the-regularized-leader: Sparsity, game-dependency, and best-of-both-worlds Taira Tsuchiya Shinji Ito Junya Honda 70 8 0 26 May 2023
Energy Regularized RNNs for Solving Non-Stationary Bandit Problems Michael Rotman Lior Wolf 64 1 0 12 Mar 2023
MNL-Bandit in non-stationary environments Ayoub Foussoul Vineet Goyal Varun Gupta 74 3 0 04 Mar 2023
Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits Yue Kang Cho-Jui Hsieh T. C. Lee 76 1 0 18 Feb 2023
Adapting to Continuous Covariate Shift via Online Density Ratio Estimation Yu Zhang Zhenyu Zhang Peng Zhao Masashi Sugiyama OOD 85 13 0 06 Feb 2023
Learning to Price Supply Chain Contracts against a Learning Retailer Xuejun Zhao Ruihao Zhu W. Haskell OffRL 94 0 0 02 Nov 2022
One Arrow, Two Kills: An Unified Framework for Achieving Optimal Regret Guarantees in Sleeping Bandits Pierre Gaillard Aadirupa Saha Soham Dan 69 3 0 26 Oct 2022
Online Bilevel Optimization: Regret Analysis of Online Alternating Gradient Methods Davoud Ataee Tarzanagh Parvin Nazari Bojian Hou Li Shen Laura Balzano 154 12 0 06 Jul 2022
No-Regret Learning in Time-Varying Zero-Sum Games Mengxiao Zhang Peng Zhao Haipeng Luo Zhi Zhou 98 40 0 30 Jan 2022
Using Non-Stationary Bandits for Learning in Repeated Cournot Games with Non-Stationary Demand Kshitija Taywade Brent Harrison J. Goldsmith 74 3 0 03 Jan 2022
Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization Peng Zhao Yu Zhang Lijun Zhang Zhi Zhou 124 50 0 29 Dec 2021
The Pareto Frontier of model selection for general Contextual Bandits T. V. Marinov Julian Zimmert 100 22 0 25 Oct 2021
Online estimation and control with optimal pathlength regret Gautam Goel Babak Hassibi 90 3 0 24 Oct 2021
Finite-time Analysis of Globally Nonstationary Multi-Armed Bandits Junpei Komiyama Edouard Fouché Junya Honda 81 6 0 23 Jul 2021
Low-Regret Active learning Cenk Baykal Lucas Liebenwein Dan Feldman Daniela Rus UQCV 79 3 0 06 Apr 2021
Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-box Approach Chen-Yu Wei Haipeng Luo OffRL 189 108 0 10 Feb 2021
Non-stationary Online Learning with Memory and Non-stochastic Control Peng Zhao Yu-Hu Yan Yu Wang Zhi Zhou 87 48 0 07 Feb 2021
Impossible Tuning Made Possible: A New Expert Algorithm and Its Applications Liyu Chen Haipeng Luo Chen-Yu Wei 123 45 0 01 Feb 2021
Generalized non-stationary bandits Anne Gael Manegueu Alexandra Carpentier Yi Yu 82 10 0 01 Feb 2021
Taking a hint: How to leverage loss predictors in contextual bandits? Chen-Yu Wei Haipeng Luo Alekh Agarwal 174 27 0 04 Mar 2020
Combinatorial Semi-Bandit in the Non-Stationary Environment Wei Chen Liwei Wang Haoyu Zhao Kai Zheng 91 18 0 10 Feb 2020
Online Second Price Auction with Semi-bandit Feedback Under the Non-Stationary Setting Haoyu Zhao Wei Chen 49 13 0 14 Nov 2019
Adaptive and Efficient Algorithms for Tracking the Best Expert Shiyin Lu Lijun Zhang 48 1 0 05 Sep 2019
Bandit Convex Optimization in Non-stationary Environments Peng Zhao G. Wang Lijun Zhang Zhi Zhou 112 44 0 29 Jul 2019
Distribution-dependent and Time-uniform Bounds for Piecewise i.i.d Bandits Subhojyoti Mukherjee Odalric-Ambrym Maillard 76 11 0 30 May 2019
Equipping Experts/Bandits with Long-term Memory Kai Zheng Haipeng Luo Ilias Diakonikolas Liwei Wang OffRL 69 15 0 30 May 2019
Hedging the Drift: Learning to Optimize under Non-Stationarity Wang Chi Cheung D. Simchi-Levi Ruihao Zhu 112 92 0 04 Mar 2019
A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal, and Parameter-free Yifang Chen Chung-Wei Lee Haipeng Luo Chen-Yu Wei 166 135 0 03 Feb 2019
Improved Path-length Regret Bounds for Bandits Sébastien Bubeck Yuanzhi Li Haipeng Luo Chen-Yu Wei 123 46 0 29 Jan 2019
Decentralized Online Learning: Take Benefits from Others' Data without Sharing Your Own to Track Global Trend Wendi Wu Zongren Li Yawei Zhao Chenkai Yu P. Zhao Ji Liu FedML 103 17 0 29 Jan 2019
Proximal Online Gradient is Optimum for Dynamic Regret Yawei Zhao Shuang Qiu Ji Liu 123 7 0 08 Oct 2018
Learning to Optimize under Non-Stationarity Wang Chi Cheung D. Simchi-Levi Ruihao Zhu 97 139 0 06 Oct 2018
Dynamic Ensemble Active Learning: A Non-Stationary Bandit with Expert Advice Kunkun Pang Mingzhi Dong Yang Wu Timothy M. Hospedales 74 18 0 29 Sep 2018
A Change-Detection based Framework for Piecewise-stationary Multi-Armed Bandit Problem Fang Liu Joohyung Lee Ness B. Shroff 90 119 0 08 Nov 2017
Efficient Contextual Bandits in Non-stationary Worlds Haipeng Luo Chen-Yu Wei Alekh Agarwal John Langford 91 133 0 05 Aug 2017
Online Learning with Automata-based Expert Sequences M. Mohri Scott Yang OffRL 49 1 0 29 Apr 2017
Optimal Exploration-Exploitation in a Multi-Armed-Bandit Problem with Non-stationary Rewards Omar Besbes Y. Gur A. Zeevi 90 127 0 13 May 2014