Off-policy evaluation for slate recommendation

16 May 2016
Adith Swaminathan
A. Krishnamurthy
Alekh Agarwal
Miroslav Dudík
John Langford
Damien Jose
I. Zitouni
    CML
    OffRL

Papers citing "Off-policy evaluation for slate recommendation"

49 / 49 papers shown
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
Masanori Nojima
OffRL
42
0
0
02 May 2025
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
OffRL
41
5
0
22 Feb 2024
Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction
Haruka Kiyohara
Masahiro Nomura
Yuta Saito
27
6
0
03 Feb 2024
Individualized Policy Evaluation and Learning under Clustered Network Interference
Yi Zhang
Kosuke Imai
OffRL
42
1
0
04 Nov 2023
Representation Learning in Low-rank Slate-based Recommender Systems
Yijia Dai
Wen Sun
OffRL
30
0
0
10 Sep 2023
Distributional Off-Policy Evaluation for Slate Recommendations
Shreyas Chaudhari
David Arbour
Georgios Theocharous
N. Vlassis
OffRL
46
0
0
27 Aug 2023
On (Normalised) Discounted Cumulative Gain as an Off-Policy Evaluation Metric for Top-$n$ Recommendation
Olivier Jeunen
Ivan Potapov
Aleksei Ustimenko
ELM
OffRL
32
11
0
27 Jul 2023
Leveraging Factored Action Spaces for Efficient Offline Reinforcement Learning in Healthcare
Shengpu Tang
Maggie Makar
Michael Sjoding
Finale Doshi-Velez
Jenna Wiens
OffRL
65
40
0
02 May 2023
Recommender Systems: A Primer
P. Castells
Dietmar Jannach
OffRL
34
5
0
06 Feb 2023
SPEED: Experimental Design for Policy Evaluation in Linear Heteroscedastic Bandits
Subhojyoti Mukherjee
Qiaomin Xie
Josiah P. Hanna
R. Nowak
OffRL
58
5
0
29 Jan 2023
Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality
Ying Jin
Zhimei Ren
Zhuoran Yang
Zhaoran Wang
OffRL
40
25
0
19 Dec 2022
Multi-Task Off-Policy Learning from Bandit Feedback
Joey Hong
Branislav Kveton
S. Katariya
Manzil Zaheer
Mohammad Ghavamzadeh
OffRL
37
10
0
09 Dec 2022
Bayesian Counterfactual Mean Embeddings and Off-Policy Evaluation
Diego Martinez-Taboada
Dino Sejdinovic
CML
OffRL
27
0
0
02 Nov 2022
Off-policy evaluation for learning-to-rank via interpolating the item-position model and the position-based model
Alexander K. Buchholz
Ben London
Giuseppe Di Benedetto
Thorsten Joachims
OffRL
26
2
0
15 Oct 2022
Offline Evaluation of Reward-Optimizing Recommender Systems: The Case of Simulation
Imad Aouali
Amine Benhalloum
Martin Bompaire
Benjamin Heymann
Olivier Jeunen
D. Rohde
Otmane Sakhi
Flavian Vasile
OffRL
11
2
0
18 Sep 2022
Constrained Policy Optimization for Controlled Self-Learning in Conversational AI Systems
Mohammad Kachuee
Sungjin Lee
76
4
0
17 Sep 2022
Inverse Propensity Score based offline estimator for deterministic ranking lists using position bias
Nick Wood
Sumit Sidana
OffRL
14
0
0
31 Aug 2022
Scalable and Robust Self-Learning for Skill Routing in Large-Scale Conversational AI Systems
Mohammad Kachuee
Jinseok Nam
Sarthak Ahuja
J. Won
Sungjin Lee
34
5
0
14 Apr 2022
Off-Policy Evaluation in Embedded Spaces
Jaron J. R. Lee
David Arbour
Georgios Theocharous
OffRL
25
3
0
05 Mar 2022
Safe Exploration for Efficient Policy Evaluation and Comparison
Runzhe Wan
Branislav Kveton
Rui Song
OffRL
38
10
0
26 Feb 2022
Offline Reinforcement Learning for Mobile Notifications
Yiping Yuan
A. Muralidharan
Preetam Nandy
Miao Cheng
Prakruthi Prabhakar
OffRL
36
9
0
04 Feb 2022
Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model
Haruka Kiyohara
Yuta Saito
Tatsuya Matsuhiro
Yusuke Narita
N. Shimizu
Yasuo Yamamoto
OffRL
26
42
0
03 Feb 2022
Edge-Compatible Reinforcement Learning for Recommendations
James E. Kostas
Philip S. Thomas
Georgios Theocharous
OffRL
23
0
0
10 Dec 2021
Contextual Bandit Applications in Customer Support Bot
Sandra Sajeev
Jade Huang
Nikos Karampatziakis
Matthew Hall
Sebastian Kochman
Weizhu Chen
30
10
0
06 Dec 2021
Safe Data Collection for Offline and Online Policy Learning
Ruihao Zhu
Branislav Kveton
OffRL
21
5
0
08 Nov 2021
Value Penalized Q-Learning for Recommender Systems
Chengqian Gao
Ke Xu
Kuangqi Zhou
Lanqing Li
Xueqian Wang
Bo Yuan
P. Zhao
OffRL
54
20
0
15 Oct 2021
The Benchmark Lottery
Mostafa Dehghani
Yi Tay
A. Gritsenko
Zhe Zhao
N. Houlsby
Fernando Diaz
Donald Metzler
Oriol Vinyals
44
89
0
14 Jul 2021
On component interactions in two-stage recommender systems
Jiri Hron
K. Krauth
Michael I. Jordan
Niki Kilbertus
CML
LRM
42
31
0
28 Jun 2021
Finding Valid Adjustments under Non-ignorability with Minimal DAG Knowledge
Abhin Shah
Karthikeyan Shanmugam
Kartik Ahuja
CML
38
13
0
22 Jun 2021
Policy Learning with Adaptively Collected Data
Ruohan Zhan
Zhimei Ren
Susan Athey
Zhengyuan Zhou
OffRL
45
27
0
05 May 2021
Benchmarks for Deep Off-Policy Evaluation
Justin Fu
Mohammad Norouzi
Ofir Nachum
George Tucker
Ziyun Wang
...
Yutian Chen
Aviral Kumar
Cosmin Paduraru
Sergey Levine
T. Paine
ELM
OffRL
35
100
0
30 Mar 2021
Split-Treatment Analysis to Rank Heterogeneous Causal Effects for Prospective Interventions
Yanbo Xu
Divyat Mahajan
Liz Manrao
Amit Sharma
Emre Kıcıman
CML
15
2
0
11 Nov 2020
Carousel Personalization in Music Streaming Apps with Contextual Bandits
Walid Bendada
Guillaume Salha-Galvan
Théo Bontempelli
29
56
0
14 Sep 2020
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
24
73
0
17 Aug 2020
Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions
James McInerney
B. Brost
Praveen Chandar
Rishabh Mehrotra
Ben Carterette
BDL
CML
OffRL
121
55
0
25 Jul 2020
Design and Evaluation of Personalized Free Trials
Hema Yoganarasimhan
E. Barzegary
Abhishek Pani
15
11
0
24 Jun 2020
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning
Ming Yin
Yu Wang
OffRL
29
80
0
29 Jan 2020
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning
Cameron Voloshin
Hoang Minh Le
Nan Jiang
Yisong Yue
OffRL
32
152
0
15 Nov 2019
Infinite-horizon Off-Policy Policy Evaluation with Multiple Behavior Policies
Xinyun Chen
Lu Wang
Yizhe Hang
Heng Ge
H. Zha
OffRL
14
5
0
10 Oct 2019
Causal Modeling for Fairness in Dynamical Systems
Elliot Creager
David Madras
T. Pitassi
R. Zemel
29
67
0
18 Sep 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
13
328
0
10 Jun 2019
Empirical Likelihood for Contextual Bandits
Nikos Karampatziakis
John Langford
Paul Mineiro
OffRL
23
9
0
07 Jun 2019
Reinforcement Learning for Slate-based Recommender Systems: A Tractable Decomposition and Practical Methodology
Eugene Ie
Vihan Jain
Jing Wang
Sanmit Narvekar
Ritesh Agarwal
...
Vince Gatto
Paul Covington
Jim McFadden
Tushar Chandra
Craig Boutilier
OffRL
24
69
0
29 May 2019
Top-K Off-Policy Correction for a REINFORCE Recommender System
Minmin Chen
Alex Beutel
Paul Covington
Sagar Jain
Francois Belletti
Ed H. Chi
CML
OffRL
33
474
0
06 Dec 2018
Counterfactual Mean Embeddings
Krikamol Muandet
Motonobu Kanagawa
Sorawit Saengkyongam
S. Marukatat
CML
OffRL
26
39
0
22 May 2018
Offline Evaluation of Ranking Policies with Click Models
Shuai Li
Yasin Abbasi-Yadkori
Branislav Kveton
S. Muthukrishnan
Vishwa Vinay
Zheng Wen
CML
OffRL
10
65
0
27 Apr 2018
Semiparametric Contextual Bandits
A. Krishnamurthy
Zhiwei Steven Wu
Vasilis Syrgkanis
33
44
0
12 Mar 2018
Beyond Greedy Ranking: Slate Optimization via List-CVAE
Ray Jiang
Sven Gowal
Timothy A. Mann
Danilo Jimenez Rezende
24
49
0
05 Mar 2018
More Robust Doubly Robust Off-policy Evaluation
Mehrdad Farajtabar
Yinlam Chow
Mohammad Ghavamzadeh
OffRL
17
264
0
10 Feb 2018