Safe Exploration for Optimizing Contextual Bandits

2 February 2020

Maarten de Rijke

Papers citing "Safe Exploration for Optimizing Contextual Bandits"

Title
Constrained Online Decision-Making: A Unified Framework | Haichen Hu, David Simchi-Levi, Navid Azizan | 39 0 0 | 11 May 2025
Proximal Ranking Policy Optimization for Practical Safety in Counterfactual Learning to Rank | Shashank Gupta, Harrie Oosterhuis, Maarten de Rijke | OffRL | 51 0 0 | 15 Sep 2024
Constrained Policy Optimization for Controlled Self-Learning in Conversational AI Systems | Mohammad Kachuee, Sungjin Lee | 76 4 0 | 17 Sep 2022
Doubly-Robust Estimation for Correcting Position-Bias in Click Feedback for Unbiased Learning to Rank | Harrie Oosterhuis | CML | 54 27 0 | 31 Mar 2022