2009.12947
Cited By
Learning from eXtreme Bandit Feedback
27 September 2020
Romain Lopez
Inderjit S. Dhillon
Michael I. Jordan
OffRL
Papers citing
"Learning from eXtreme Bandit Feedback"
6 / 6 papers shown
Title
On (Normalised) Discounted Cumulative Gain as an Off-Policy Evaluation Metric for Top-n Recommendation
Olivier Jeunen
Ivan Potapov
Aleksei Ustimenko
ELM
OffRL
27
11
0
27 Jul 2023
Constrained Policy Optimization for Controlled Self-Learning in Conversational AI Systems
Mohammad Kachuee
Sungjin Lee
73
4
0
17 Sep 2022
Supporting Massive DLRM Inference Through Software Defined Memory
E. K. Ardestani
Changkyu Kim
Seung Jae Lee
Luoshang Pan
Valmiki Rampersad
...
Krishnakumar Nair
Maxim Naumov
Christopher Peterson
M. Smelyanskiy
Vijay Rao
BDL
33
20
0
21 Oct 2021
On component interactions in two-stage recommender systems
Jiri Hron
K. Krauth
Michael I. Jordan
Niki Kilbertus
CML
LRM
40
31
0
28 Jun 2021
Learning Representations for Counterfactual Inference
Fredrik D. Johansson
Uri Shalit
David Sontag
CML
OOD
BDL
232
719
0
12 May 2016
Off-Policy Actor-Critic
T. Degris
Martha White
R. Sutton
OffRL
CML
163
220
0
22 May 2012