Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits

3 June 2021
Ruohan Zhan
Vitor Hadad
David A. Hirshberg
Susan Athey
OffRL

Papers citing "Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits"

13 / 13 papers shown
SNPL: Simultaneous Policy Learning and Evaluation for Safe Multi-Objective Policy Improvement
Brian Cho
Ana-Roxana Pop
Ariel Evince
Nathan Kallus
OffRL
51
0
0
17 Mar 2025
Inference with the Upper Confidence Bound Algorithm
K. Khamaru
Cun-Hui Zhang
48
0
0
08 Aug 2024
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
Weidong Liu
Jiyuan Tu
Yichen Zhang
Xi Chen
OffRL
26
2
0
04 Oct 2023
Online learning in bandits with predicted context
Yongyi Guo
Ziping Xu
Susan Murphy
26
4
0
26 Jul 2023
On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation
Thanh Nguyen-Tang
Ming Yin
Sunil R. Gupta
Svetha Venkatesh
R. Arora
OffRL
58
16
0
23 Nov 2022
Contextual Bandits in a Survey Experiment on Charitable Giving: Within-Experiment Outcomes versus Policy Learning
Susan Athey
Undral Byambadalai
Vitor Hadad
Sanath Kumar Krishnamurthy
Weiwen Leung
Joseph Jay Williams
40
13
0
22 Nov 2022
Anytime-valid off-policy inference for contextual bandits
Ian Waudby-Smith
Lili Wu
Aaditya Ramdas
Nikos Karampatziakis
Paul Mineiro
OffRL
45
25
0
19 Oct 2022
Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency
Wenlong Mou
Martin J. Wainwright
Peter L. Bartlett
OffRL
41
11
0
26 Sep 2022
Best Arm Identification with Contextual Information under a Small Gap
Masahiro Kato
Masaaki Imaizumi
Takuya Ishihara
T. Kitagawa
27
2
0
15 Sep 2022
Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization
Thanh Nguyen-Tang
Sunil R. Gupta
A. Nguyen
Svetha Venkatesh
OffRL
34
29
0
27 Nov 2021
Dynamic Selection in Algorithmic Decision-making
Jin Li
Ye Luo
Xiaowei Zhang
29
2
0
28 Aug 2021
Policy Learning with Adaptively Collected Data
Ruohan Zhan
Zhimei Ren
Susan Athey
Zhengyuan Zhou
OffRL
45
27
0
05 May 2021
Online Multi-Armed Bandits with Adaptive Inference
Maria Dimakopoulou
Zhimei Ren
Zhengyuan Zhou
32
34
0
25 Feb 2021