ResearchTrend.AI
Top-K Off-Policy Correction for a REINFORCE Recommender System

6 December 2018
Minmin Chen
Alex Beutel
Paul Covington
Sagar Jain
Francois Belletti
Ed H. Chi
    CML
    OffRL

Papers citing "Top-K Off-Policy Correction for a REINFORCE Recommender System"

50 / 187 papers shown
DARLR: Dual-Agent Offline Reinforcement Learning for Recommender Systems with Dynamic Reward
Yi Zhang
Ruihong Qiu
Xuwei Xu
Jiajun Liu
Sen Wang
OffRL
34
0
0
12 May 2025
xMTF: A Formula-Free Model for Reinforcement-Learning-Based Multi-Task Fusion in Recommender Systems
Yang Cao
Changhao Zhang
Xiaoshuang Chen
Kaiqiao Zhan
Ben Wang
28
0
0
08 Apr 2025
User Feedback Alignment for LLM-powered Exploration in Large-scale Recommendation Systems
Jianling Wang
Yifan Liu
Yinghao Sun
Xuejian Ma
Yueqi Wang
...
Onkar Dalal
Ed Chi
Lichan Hong
Ningren Han
Haokai Lu
31
0
0
07 Apr 2025
Counterfactual Inference under Thompson Sampling
Olivier Jeunen
OffRL
LRM
35
0
0
03 Apr 2025
Finding Interest Needle in Popularity Haystack: Improving Retrieval by Modeling Item Exposure
Amit Jaspal
Rahul Agarwal
32
0
0
31 Mar 2025
LapSum -- One Method to Differentiate Them All: Ranking, Sorting and Top-k Selection
Łukasz Struski
Michał B. Bednarczyk
Igor T. Podolak
Jacek Tabor
BDL
62
0
0
08 Mar 2025
Improving Retrospective Language Agents via Joint Policy Gradient Optimization
Xueyang Feng
Bo Lan
Quanyu Dai
Lei Wang
Jiakai Tang
X. Chen
Zhenhua Dong
Zhicheng Dou
LLMAG
67
0
0
03 Mar 2025
Conversational Planning for Personal Plans
Konstantina Christakopoulou
Iris Qu
John Canny
Andrew Goodridge
Cj Adams
Minmin Chen
Maja Matarić
LLMAG
LM&Ro
62
0
0
26 Feb 2025
Producers Equilibria and Dynamics in Engagement-Driven Recommender Systems
Krishna Acharya
Varun Vangala
Jingyan Wang
Juba Ziani
99
3
0
21 Feb 2025
GraCo -- A Graph Composer for Integrated Circuits
Stefan Uhlich
Andrea Bonetti
Arun Venkitaraman
Ali Momeni
Ryoga Matsuo
Chia-Yu Hsieh
Eisaku Ohbuchi
Lorenzo Servadei
GNN
95
0
0
21 Nov 2024
Primal-Dual Spectral Representation for Off-policy Evaluation
Yang Hu
Tianyi Chen
Na Li
Kai Wang
Bo Dai
OffRL
32
0
0
23 Oct 2024
Ranking Policy Learning via Marketplace Expected Value Estimation From Observational Data
Ehsan Ebrahimzadeh
Nikhil Monga
Hang Gao
Alex Cozzi
Abraham Bagherjeiran
CML
OffRL
27
0
0
06 Oct 2024
Minimizing Live Experiments in Recommender Systems: User Simulation to Evaluate Preference Elicitation Policies
Chih-Wei Hsu
Martin Mladenov
Ofer Meshi
James Pine
Hubert Pham
...
Xujian Liang
Anton Polishko
Li Yang
Ben Scheetz
Craig Boutilier
OffRL
25
2
0
26 Sep 2024
FedSlate:A Federated Deep Reinforcement Learning Recommender System
Yongxin Deng
Xihe Qiu
Xiaoyu Tan
Yaochu Jin
FedML
96
0
0
23 Sep 2024
Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation
Luo Ji
Gao Liu
Mingyang Yin
Hongxia Yang
Jingren Zhou
23
0
0
11 Sep 2024
Adaptive User Journeys in Pharma E-Commerce with Reinforcement Learning: Insights from SwipeRx
Ana Fernández del Río
Michael Brennan Leong
Paulo Saraiva
Ivan Nazarov
Aditya Rastogi
Moiz Hassan
Dexian Tang
África Periánez
OffRL
OnRL
42
2
0
15 Aug 2024
Adaptive Behavioral AI: Reinforcement Learning to Enhance Pharmacy Services
Ana Fernández del Río
Michael Brennan Leong
Paulo Saraiva
Ivan Nazarov
Aditya Rastogi
Moiz Hassan
Dexian Tang
África Periánez
OffRL
24
3
0
14 Aug 2024
Learned Ranking Function: From Short-term Behavior Predictions to Long-term User Satisfaction
Yi Wu
Daryl Chang
Jennifer She
Zhe Zhao
Li Wei
Lukasz Heldt
27
0
0
12 Aug 2024
Optimizing Novelty of Top-k Recommendations using Large Language Models and Reinforcement Learning
Amit Sharma
Hua Li
Xue Li
Jian Jiao
LRM
39
0
0
20 Jun 2024
Low-Redundant Optimization for Large Language Model Alignment
Zhipeng Chen
Kun Zhou
Wayne Xin Zhao
Jingyuan Wang
Ji-Rong Wen
39
2
0
18 Jun 2024
Adaptively Learning to Select-Rank in Online Platforms
Jingyuan Wang
Perry Dong
Ying Jin
Ruohan Zhan
Zhengyuan Zhou
CML
35
0
0
07 Jun 2024
DEER: A Delay-Resilient Framework for Reinforcement Learning with Variable Delays
Bo Xia
Yilun Kong
Yongzhe Chang
Bo Yuan
Zhiheng Li
Xueqian Wang
Bin Liang
OffRL
50
3
0
05 Jun 2024
SUBER: An RL Environment with Simulated Human Behavior for Recommender Systems
Nathan Corecco
Giorgio Piatti
Luca A. Lanzendörfer
Flint Xiaofeng Fan
Roger Wattenhofer
OffRL
29
2
0
01 Jun 2024
LLMs for User Interest Exploration in Large-scale Recommendation Systems
Jianling Wang
Haokai Lu
Yifan Liu
He Ma
Yueqi Wang
...
Ningren Han
Shuchao Bi
Lexi Baugher
Ed H. Chi
Minmin Chen
45
5
0
25 May 2024
Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning
Otmane Sakhi
Imad Aouali
Pierre Alquier
Nicolas Chopin
OffRL
43
1
0
23 May 2024
Optimal Baseline Corrections for Off-Policy Contextual Bandits
Shashank Gupta
Olivier Jeunen
Harrie Oosterhuis
Maarten de Rijke
33
7
0
09 May 2024
Multi-Objective Recommendation via Multivariate Policy Learning
Olivier Jeunen
Jatin Mandav
Ivan Potapov
Nakul Agarwal
Sourabh Vaid
Wenzhe Shi
Aleksei Ustimenko
OffRL
18
3
0
03 May 2024
Cache-Aware Reinforcement Learning in Large-Scale Recommender Systems
Xiaoshuang Chen
Gengrui Zhang
Yao Wang
Yulin Wu
Shuo Su
Kaiqiao Zhan
Ben Wang
OffRL
19
2
0
23 Apr 2024
Towards a Theoretical Understanding of Two-Stage Recommender Systems
Amit Kumar Jaiswal
29
2
0
23 Feb 2024
EasyRL4Rec: An Easy-to-use Library for Reinforcement Learning Based Recommender Systems
Yuanqing Yu
Chongming Gao
Jiawei Chen
Heng Tang
Yuefeng Sun
Qian Chen
Weizhi Ma
Min Zhang
OffRL
42
2
0
23 Feb 2024
UOEP: User-Oriented Exploration Policy for Enhancing Long-Term User Experiences in Recommender Systems
Changshuo Zhang
Sirui Chen
Xiao Zhang
Sunhao Dai
Weijie Yu
Jun Xu
OffRL
35
1
0
17 Jan 2024
Towards Off-Policy Reinforcement Learning for Ranking Policies with Human Feedback
Teng Xiao
Suhang Wang
OffRL
33
8
0
17 Jan 2024
MultiSlot ReRanker: A Generic Model-based Re-Ranking Framework in Recommendation Systems
Q. Xiao
A. Muralidharan
B. Tiwana
Johnson Jia
Fedor Borisyuk
Aman Gupta
Dawn Woodard
OffRL
19
1
0
11 Jan 2024
Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint
Zhipeng Chen
Kun Zhou
Wayne Xin Zhao
Junchen Wan
Fuzheng Zhang
Di Zhang
Ji-Rong Wen
KELM
39
32
0
11 Jan 2024
An Adaptive Framework of Geographical Group-Specific Network on O2O Recommendation
Luo Ji
Jiayu Mao
Hailong Shi
Qian Li
Yunfei Chu
Hongxia Yang
16
0
0
28 Dec 2023
Reinforcement Unlearning
Dayong Ye
Tianqing Zhu
Congcong Zhu
Derui Wang
Zewei Shi
Sheng Shen
Wanlei Zhou
Jason Xue
MU
28
7
0
26 Dec 2023
Adversarial Batch Inverse Reinforcement Learning: Learn to Reward from Imperfect Demonstration for Interactive Recommendation
Jialin Liu
Xinyan Su
Zeyu He
Xiangyu Zhao
Jun Li
OffRL
23
0
0
30 Oct 2023
A General Neural Causal Model for Interactive Recommendation
Jialin Liu
Xinyan Su
Peng Zhou
Xiangyu Zhao
Jun Li
CML
13
0
0
30 Oct 2023
Model-enhanced Contrastive Reinforcement Learning for Sequential Recommendation
Chengpeng Li
Zhengyi Yang
Jizhi Zhang
Jiancan Wu
Dingxian Wang
Xiangnan He
Xiang Wang
OffRL
29
1
0
25 Oct 2023
Deep Reinforcement Learning for Autonomous Cyber Operations: A Survey
Gregory Palmer
Chris Parry
Daniel J.B. Harrold
Chris Willis
AI4CE
21
1
0
11 Oct 2023
Epsilon non-Greedy: A Bandit Approach for Unbiased Recommendation via Uniform Data
S.M.F. Sani
Seyed Abbas Hosseini
Hamid R. Rabiee
OffRL
27
1
0
07 Oct 2023
A General Offline Reinforcement Learning Framework for Interactive Recommendation
Teng Xiao
Donglin Wang
OffRL
34
73
0
01 Oct 2023
Maximum diffusion reinforcement learning
Thomas A. Berrueta
Allison Pinosky
Todd D. Murphey
AI4CE
DiffM
14
5
0
26 Sep 2023
Ad-load Balancing via Off-policy Learning in a Content Marketplace
Hitesh Sagtani
M. Jhawar
Rishabh Mehrotra
Olivier Jeunen
OffRL
25
6
0
19 Sep 2023
Modeling Recommender Ecosystems: Research Challenges at the Intersection of Mechanism Design, Reinforcement Learning and Generative Models
Craig Boutilier
Martin Mladenov
Guy Tennenholtz
OffRL
CML
44
8
0
08 Sep 2023
Model-free Reinforcement Learning with Stochastic Reward Stabilization for Recommender Systems
Tianchi Cai
Shenliao Bao
Jiyan Jiang
Shiji Zhou
Wenpeng Zhang
Lihong Gu
Jinjie Gu
Guannan Zhang
OffRL
31
2
0
25 Aug 2023
Master-slave Deep Architecture for Top-K Multi-armed Bandits with Non-linear Bandit Feedback and Diversity Constraints
Han Huang
Li Shen
Deheng Ye
Wei Liu
17
0
0
24 Aug 2023
Learning from Negative User Feedback and Measuring Responsiveness for Sequential Recommenders
Yueqi Wang
Yoni Halpern
Shuo Chang
Jingchen Feng
Elaine Ya Le
...
Minxue Huang
Shan Li
Alex Beutel
Yaping Zhang
Shuchao Bi
26
8
0
23 Aug 2023
On the Opportunities and Challenges of Offline Reinforcement Learning for Recommender Systems
Xiaocong Chen
Siyu Wang
Julian McAuley
Dietmar Jannach
Lina Yao
OffRL
21
5
0
22 Aug 2023
Towards Validating Long-Term User Feedbacks in Interactive Recommendation Systems
Hojoon Lee
Dongyoon Hwang
Kyushik Min
Jaegul Choo
18
6
0
22 Aug 2023