Local Policy Improvement for Recommender Systems


22 December 2022
Dawen Liang
N. Vlassis
    OffRL
arXiv: 2212.11431

Papers cited by "Local Policy Improvement for Recommender Systems"

36 / 36 papers shown
Supervised Advantage Actor-Critic for Recommender Systems
Xin Xin
Alexandros Karatzoglou
Ioannis Arapakis
J. Jose
OffRL
49
30
0
05 Nov 2021
Evaluating the Robustness of Off-Policy Evaluation
Yuta Saito
Takuma Udagawa
Haruka Kiyohara
Kazuki Mogi
Yusuke Narita
Kei Tateno
ELM
OffRL
25
38
0
31 Aug 2021
Offline RL Without Off-Policy Evaluation
David Brandfonbrener
William F. Whitney
Rajesh Ranganath
Joan Bruna
OffRL
73
169
0
16 Jun 2021
Offline Reinforcement Learning as One Big Sequence Modeling Problem
Michael Janner
Qiyang Li
Sergey Levine
OffRL
118
675
0
03 Jun 2021
Decision Transformer: Reinforcement Learning via Sequence Modeling
Lili Chen
Kevin Lu
Aravind Rajeswaran
Kimin Lee
Aditya Grover
Michael Laskin
Pieter Abbeel
A. Srinivas
Igor Mordatch
OffRL
110
1,638
0
02 Jun 2021
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
Paria Rashidinejad
Banghua Zhu
Cong Ma
Jiantao Jiao
Stuart J. Russell
OffRL
208
289
0
22 Mar 2021
Off-policy Bandits with Deficient Support
Noveen Sachdeva
Yi-Hsun Su
Thorsten Joachims
OffRL
172
75
0
16 Jun 2020
Self-Supervised Reinforcement Learning for Recommender Systems
Xin Xin
Alexandros Karatzoglou
Ioannis Arapakis
J. Jose
SSL
OffRL
128
201
0
10 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
131
1,809
0
08 Jun 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
542
2,022
0
04 May 2020
Behavior Regularized Offline Reinforcement Learning
Yifan Wu
George Tucker
Ofir Nachum
OffRL
85
684
0
26 Nov 2019
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
Xue Bin Peng
Aviral Kumar
Grace Zhang
Sergey Levine
OffRL
128
556
0
01 Oct 2019
BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
Fei Sun
Jun Liu
Jian Wu
Changhua Pei
Xiao Lin
Wenwu Ou
Peng Jiang
BDL
HAI
174
2,166
0
14 Apr 2019
Imitation-Regularized Offline Learning
Yifei Ma
Yu Wang
Balakrishnan Narayanaswamy
OffRL
53
22
0
15 Jan 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
210
1,604
0
07 Dec 2018
Top-K Off-Policy Correction for a REINFORCE Recommender System
Minmin Chen
Alex Beutel
Paul Covington
Sagar Jain
Francois Belletti
Ed H. Chi
CML
OffRL
114
479
0
06 Dec 2018
Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding
Jiaxi Tang
Ke Wang
BDL
66
1,702
0
19 Sep 2018
Self-Attentive Sequential Recommendation
Wang-Cheng Kang
Julian McAuley
HAI
BDL
147
2,429
0
20 Aug 2018
A Simple Convolutional Generative Network for Next Item Recommendation
Fajie Yuan
Alexandros Karatzoglou
Ioannis Arapakis
J. Jose
Xiangnan He
56
549
0
15 Aug 2018
Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning
Xiangyu Zhao
Li Zhang
Zhuoye Ding
Long Xia
Jiliang Tang
Dawei Yin
84
332
0
19 Feb 2018
Variational Autoencoders for Collaborative Filtering
Dawen Liang
Rahul G. Krishnan
Matthew D. Hoffman
Tony Jebara
BDL
179
1,239
0
16 Feb 2018
Offline A/B testing for Recommender Systems
Alexandre Gilotte
Clément Calauzènes
Thomas Nedelec
A. Abraham
Simon Dollé
OffRL
67
221
0
22 Jan 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
448
19,006
0
20 Jul 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
110
1,322
0
30 May 2017
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
413
576
0
04 Apr 2016
Recommendations as Treatments: Debiasing Learning and Evaluation
Tobias Schnabel
Adith Swaminathan
Ashudeep Singh
Navin Chandak
Thorsten Joachims
CML
160
685
0
17 Feb 2016
Variational Inference: A Review for Statisticians
David M. Blei
A. Kucukelbir
Jon D. McAuliffe
BDL
250
4,787
0
04 Jan 2016
Modeling User Exposure in Recommendation
Dawen Liang
Laurent Charlin
James McInerney
David M. Blei
95
390
0
23 Oct 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
277
6,764
0
19 Feb 2015
Counterfactual Risk Minimization: Learning from Logged Bandit Feedback
Adith Swaminathan
Thorsten Joachims
OffRL
117
166
0
09 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.6K
150,006
0
22 Dec 2014
Counterfactual Reasoning and Learning Systems
Léon Bottou
J. Peters
J. Q. Candela
Denis Xavier Charles
D. M. Chickering
Elon Portugaly
Dipankar Ray
Patrice Y. Simard
Edward Snelson
CML
OffRL
371
783
0
11 Sep 2012
Collaborative Filtering and the Missing at Random Assumption
Benjamin M. Marlin
R. Zemel
S. Roweis
Malcolm Slaney
71
313
0
20 Jun 2012
Doubly Robust Policy Evaluation and Learning
Miroslav Dudík
John Langford
Lihong Li
OffRL
331
697
0
23 Mar 2011
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
448
2,949
0
28 Feb 2010
Learning from Logged Implicit Exploration Data
Alexander L. Strehl
John Langford
Sham Kakade
Lihong Li
OffRL
179
255
0
27 Feb 2010