Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits

3 June 2021

David A. Hirshberg

Papers citing "Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits"

Title | Authors | Tag | Counts | Date
SNPL: Simultaneous Policy Learning and Evaluation for Safe Multi-Objective Policy Improvement | Brian Cho, Ana-Roxana Pop, Ariel Evince, Nathan Kallus | OffRL | 51 0 0 | 17 Mar 2025
Inference with the Upper Confidence Bound Algorithm | K. Khamaru, Cun-Hui Zhang | | 48 0 0 | 08 Aug 2024
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning | Weidong Liu, Jiyuan Tu, Yichen Zhang, Xi Chen | OffRL | 26 2 0 | 04 Oct 2023
Online learning in bandits with predicted context | Yongyi Guo, Ziping Xu, Susan Murphy | | 26 4 0 | 26 Jul 2023
On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation | Thanh Nguyen-Tang, Ming Yin, Sunil R. Gupta, Svetha Venkatesh, R. Arora | OffRL | 58 16 0 | 23 Nov 2022
Contextual Bandits in a Survey Experiment on Charitable Giving: Within-Experiment Outcomes versus Policy Learning | Susan Athey, Undral Byambadalai, Vitor Hadad, Sanath Kumar Krishnamurthy, Weiwen Leung, Joseph Jay Williams | | 40 13 0 | 22 Nov 2022
Anytime-valid off-policy inference for contextual bandits | Ian Waudby-Smith, Lili Wu, Aaditya Ramdas, Nikos Karampatziakis, Paul Mineiro | OffRL | 45 25 0 | 19 Oct 2022
Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency | Wenlong Mou, Martin J. Wainwright, Peter L. Bartlett | OffRL | 41 11 0 | 26 Sep 2022
Best Arm Identification with Contextual Information under a Small Gap | Masahiro Kato, Masaaki Imaizumi, Takuya Ishihara, T. Kitagawa | | 27 2 0 | 15 Sep 2022
Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization | Thanh Nguyen-Tang, Sunil R. Gupta, A. Nguyen, Svetha Venkatesh | OffRL | 34 29 0 | 27 Nov 2021
Dynamic Selection in Algorithmic Decision-making | Jin Li, Ye Luo, Xiaowei Zhang | | 29 2 0 | 28 Aug 2021
Policy Learning with Adaptively Collected Data | Ruohan Zhan, Zhimei Ren, Susan Athey, Zhengyuan Zhou | OffRL | 45 27 0 | 05 May 2021
Online Multi-Armed Bandits with Adaptive Inference | Maria Dimakopoulou, Zhimei Ren, Zhengyuan Zhou | | 32 34 0 | 25 Feb 2021