ResearchTrend.AI
Safe Exploration for Optimizing Contextual Bandits


2 February 2020
R. Jagerman
Ilya Markov
Maarten de Rijke
    OffRL

Papers citing "Safe Exploration for Optimizing Contextual Bandits"

4 / 4 papers shown
Title
Constrained Online Decision-Making: A Unified Framework
Haichen Hu
David Simchi-Levi
Navid Azizan
41
0
0
11 May 2025
Proximal Ranking Policy Optimization for Practical Safety in Counterfactual Learning to Rank
Shashank Gupta
Harrie Oosterhuis
Maarten de Rijke
OffRL
51
0
0
15 Sep 2024
Constrained Policy Optimization for Controlled Self-Learning in Conversational AI Systems
Mohammad Kachuee
Sungjin Lee
76
4
0
17 Sep 2022
Doubly-Robust Estimation for Correcting Position-Bias in Click Feedback for Unbiased Learning to Rank
Harrie Oosterhuis
CML
54
27
0
31 Mar 2022