Learning Adversarial Markov Decision Processes with Delayed Feedback

29 December 2020

Papers citing "Learning Adversarial Markov Decision Processes with Delayed Feedback"

10 / 10 papers shown

| Title | Authors | Tag | | | | Date |
|---|---|---|---|---|---|---|
| Identifying Predictions That Influence the Future: Detecting Performative Concept Drift in Data Streams | Brandon Gower-Winter, Georg Krempl, Sergey Dragomiretskiy, Tineke Jelsma, Arno Siebes | | 88 | 0 | 0 | 13 Dec 2024 |
| Biased Dueling Bandits with Stochastic Delayed Feedback | Bongsoo Yi, Yue Kang, Yao Li | | 30 | 1 | 0 | 26 Aug 2024 |
| Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization | D. Tiapkin, Evgenii Chzhen, Gilles Stoltz | | 74 | 0 | 0 | 08 Jul 2024 |
| Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback | Asaf B. Cassel, Haipeng Luo, Aviv A. Rosenberg, Dmitry Sotnikov | OffRL | 29 | 3 | 0 | 13 May 2024 |
| A Reduction-based Framework for Sequential Decision Making with Delayed Feedback | Yunchang Yang, Hangshi Zhong, Tianhao Wu, B. Liu, Liwei Wang, S. Du | OffRL | 27 | 8 | 0 | 03 Feb 2023 |
| Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation | Uri Sherman, Tomer Koren, Yishay Mansour | | 32 | 12 | 0 | 30 Jan 2023 |
| Nearly Optimal Policy Optimization with Stable at Any Time Guarantee | Tianhao Wu, Yunchang Yang, Han Zhong, Liwei Wang, S. Du, Jiantao Jiao | | 45 | 14 | 0 | 21 Dec 2021 |
| Nonstochastic Bandits with Composite Anonymous Feedback | Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Claudio Gentile, Yishay Mansour | | 101 | 39 | 0 | 06 Dec 2021 |
| Reinforcement Learning for Feedback-Enabled Cyber Resilience | Yunhan Huang, Linan Huang, Quanyan Zhu | | 16 | 65 | 0 | 02 Jul 2021 |
| Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints | Chi Jin, Zhuoran Yang, Zhaoran Wang | OffRL | 117 | 166 | 0 | 06 Jan 2021 |