arXiv: 2407.13163
ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems

18 July 2024
Yi Zhang
Ruihong Qiu
Jiajun Liu
Sen Wang
    OffRL

Papers cited by "ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems"

20 / 20 papers shown
KuaiRec: A Fully-observed Dataset and Insights for Evaluating Recommender Systems
Chongming Gao
Shijun Li
Wenqiang Lei
Jiawei Chen
Biao Li
Peng Jiang
Xiangnan He
Jiaxin Mao
Tat-Seng Chua
57
135
0
22 Feb 2022
What are the Statistical Limits of Offline RL with Linear Function Approximation?
Ruosong Wang
Dean Phillips Foster
Sham Kakade
OffRL
117
163
0
22 Oct 2020
Critic Regularized Regression
Ziyu Wang
Alexander Novikov
Konrad Zolna
Jost Tobias Springenberg
Scott E. Reed
...
Noah Y. Siegel
J. Merel
Çağlar Gülçehre
N. Heess
Nando de Freitas
OffRL
134
320
0
26 Jun 2020
Self-Supervised Reinforcement Learning for Recommender Systems
Xin Xin
Alexandros Karatzoglou
Ioannis Arapakis
J. Jose
SSL
OffRL
112
200
0
10 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
121
1,780
0
08 Jun 2020
MOPO: Model-based Offline Policy Optimization
Tianhe Yu
G. Thomas
Lantao Yu
Stefano Ermon
James Zou
Sergey Levine
Chelsea Finn
Tengyu Ma
OffRL
74
759
0
27 May 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
499
1,994
0
04 May 2020
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
76
939
0
19 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
102
1,044
0
03 Jun 2019
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto
David Meger
Doina Precup
OffRL
BDL
183
1,586
0
07 Dec 2018
Self-Attentive Sequential Recommendation
Wang-Cheng Kang
Julian McAuley
HAI
BDL
120
2,390
0
20 Aug 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
243
8,236
0
04 Jan 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
285
18,685
0
20 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
526
129,831
0
12 Jun 2017
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
Ryan J. Lowe
Yi Wu
Aviv Tamar
J. Harb
Pieter Abbeel
Igor Mordatch
125
4,441
0
07 Jun 2017
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
Huifeng Guo
Ruiming Tang
Yunming Ye
Zhenguo Li
Xiuqiang He
99
2,625
0
13 Mar 2017
Recommendations as Treatments: Debiasing Learning and Evaluation
Tobias Schnabel
Adith Swaminathan
Ashudeep Singh
Navin Chandak
Thorsten Joachims
CML
129
679
0
17 Feb 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
170
8,805
0
04 Feb 2016
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
146
7,590
0
22 Sep 2015
Counterfactual Risk Minimization: Learning from Logged Bandit Feedback
Adith Swaminathan
Thorsten Joachims
OffRL
83
167
0
09 Feb 2015