ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2009.12947
  4. Cited By
Learning from eXtreme Bandit Feedback

Learning from eXtreme Bandit Feedback

27 September 2020
Romain Lopez
Inderjit S. Dhillon
Michael I. Jordan
    OffRL
ArXivPDFHTML

Papers citing "Learning from eXtreme Bandit Feedback"

6 / 6 papers shown
Title
On (Normalised) Discounted Cumulative Gain as an Off-Policy Evaluation
  Metric for Top-$n$ Recommendation
On (Normalised) Discounted Cumulative Gain as an Off-Policy Evaluation Metric for Top-nnn Recommendation
Olivier Jeunen
Ivan Potapov
Aleksei Ustimenko
ELM
OffRL
27
11
0
27 Jul 2023
Constrained Policy Optimization for Controlled Self-Learning in
  Conversational AI Systems
Constrained Policy Optimization for Controlled Self-Learning in Conversational AI Systems
Mohammad Kachuee
Sungjin Lee
73
4
0
17 Sep 2022
Supporting Massive DLRM Inference Through Software Defined Memory
Supporting Massive DLRM Inference Through Software Defined Memory
E. K. Ardestani
Changkyu Kim
Seung Jae Lee
Luoshang Pan
Valmiki Rampersad
...
Krishnakumar Nair
Maxim Naumov
Christopher Peterson
M. Smelyanskiy
Vijay Rao
BDL
33
20
0
21 Oct 2021
On component interactions in two-stage recommender systems
On component interactions in two-stage recommender systems
Jiri Hron
K. Krauth
Michael I. Jordan
Niki Kilbertus
CML
LRM
40
31
0
28 Jun 2021
Learning Representations for Counterfactual Inference
Learning Representations for Counterfactual Inference
Fredrik D. Johansson
Uri Shalit
David Sontag
CML
OOD
BDL
232
719
0
12 May 2016
Off-Policy Actor-Critic
Off-Policy Actor-Critic
T. Degris
Martha White
R. Sutton
OffRL
CML
163
220
0
22 May 2012
1