ResearchTrend.AI
Optimal Baseline Corrections for Off-Policy Contextual Bandits

9 May 2024
Shashank Gupta
Olivier Jeunen
Harrie Oosterhuis
Maarten de Rijke
arXiv: 2405.05736

Papers citing "Optimal Baseline Corrections for Off-Policy Contextual Bandits"

8 / 8 papers shown
Counterfactual Inference under Thompson Sampling
Olivier Jeunen
OffRL
LRM
35
0
0
03 Apr 2025
A Simple and Effective Reinforcement Learning Method for Text-to-Image Diffusion Fine-tuning
Shashank Gupta
Chaitanya Ahuja
Tsung-Yu Lin
Sreya Dutta Roy
Harrie Oosterhuis
Maarten de Rijke
Satya Narayan Shukla
46
1
0
02 Mar 2025
Proximal Ranking Policy Optimization for Practical Safety in Counterfactual Learning to Rank
Shashank Gupta
Harrie Oosterhuis
Maarten de Rijke
OffRL
32
0
0
15 Sep 2024
A Simpler Alternative to Variational Regularized Counterfactual Risk Minimization
Hua Chang Bakker
Shashank Gupta
Harrie Oosterhuis
OffRL
28
0
0
15 Sep 2024
Practical and Robust Safety Guarantees for Advanced Counterfactual Learning to Rank
Shashank Gupta
Harrie Oosterhuis
Maarten de Rijke
43
6
0
29 Jul 2024
Multi-Objective Recommendation via Multivariate Policy Learning
Olivier Jeunen
Jatin Mandav
Ivan Potapov
Nakul Agarwal
Sourabh Vaid
Wenzhe Shi
Aleksei Ustimenko
OffRL
21
3
0
03 May 2024
Safe Deployment for Counterfactual Learning to Rank with Exposure-Based Risk Minimization
Shashank Gupta
Harrie Oosterhuis
Maarten de Rijke
32
14
0
26 Apr 2023
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
340
1,960
0
04 May 2020