2109.08331
Cited By
Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation
17 September 2021
Haruka Kiyohara
K. Kawakami
Yuta Saito
OffRL
Papers citing "Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation"
15 / 15 papers shown
Generative Auto-Bidding with Value-Guided Explorations
Jingtong Gao
Yewen Li
Shuai Mao
Peng Jiang
Nan Jiang
...
Fei Pan
Peng Jiang
Kun Gai
Bo An
Xiangyu Zhao
OffRL
46
0
0
20 Apr 2025
AutoOPE: Automated Off-Policy Estimator Selection
Nicolò Felicioni
Michael Benigni
Maurizio Ferrari Dacrema
OffRL
24
1
0
26 Jun 2024
Hyperparameter Optimization Can Even be Harmful in Off-Policy Learning and How to Deal with It
Yuta Saito
Masahiro Nomura
OffRL
50
2
0
23 Apr 2024
Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction
Haruka Kiyohara
Masahiro Nomura
Yuta Saito
25
5
0
03 Feb 2024
Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation
Haruka Kiyohara
Ren Kishimoto
K. Kawakami
Ken Kobayashi
Kazuhide Nakata
Yuta Saito
OffRL
32
9
0
30 Nov 2023
SCOPE-RL: A Python Library for Offline Reinforcement Learning and Off-Policy Evaluation
Haruka Kiyohara
Ren Kishimoto
K. Kawakami
Ken Kobayashi
Kazuhide Nakata
Yuta Saito
OffRL
ELM
37
4
0
30 Nov 2023
Off-Policy Evaluation of Ranking Policies under Diverse User Behavior
Haruka Kiyohara
Masatoshi Uehara
Yusuke Narita
N. Shimizu
Yasuo Yamamoto
Yuta Saito
OffRL
CML
30
8
0
26 Jun 2023
User Behavior Simulation with Large Language Model based Agents
Lei Wang
Jingsen Zhang
Hao-ran Yang
Zhiyuan Chen
Jiakai Tang
...
Wayne Xin Zhao
Jun Xu
Zhicheng Dou
Jun Wang
Ji-Rong Wen
LM&Ro
LLMAG
27
40
0
05 Jun 2023
Policy-Adaptive Estimator Selection for Off-Policy Evaluation
Takuma Udagawa
Haruka Kiyohara
Yusuke Narita
Yuta Saito
Keisuke Tateno
OffRL
27
23
0
25 Nov 2022
Synthetic Data-Based Simulators for Recommender Systems: A Survey
Elizaveta Stavinova
A. Grigorievskiy
A. Volodkevich
P. Chunaev
Klavdiya Olegovna Bochenina
D. Bugaychenko
SyDa
34
7
0
22 Jun 2022
Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model
Haruka Kiyohara
Yuta Saito
Tatsuya Matsuhiro
Yusuke Narita
N. Shimizu
Yasuo Yamamoto
OffRL
24
42
0
03 Feb 2022
COMBO: Conservative Offline Model-Based Policy Optimization
Tianhe Yu
Aviral Kumar
Rafael Rafailov
Aravind Rajeswaran
Sergey Levine
Chelsea Finn
OffRL
222
419
0
16 Feb 2021
NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning
Rongjun Qin
Songyi Gao
Xingyuan Zhang
Zhen Xu
Shengkai Huang
Zewen Li
Weinan Zhang
Yang Yu
OffRL
140
79
0
01 Feb 2021
Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions
James McInerney
B. Brost
Praveen Chandar
Rishabh Mehrotra
Ben Carterette
BDL
CML
OffRL
121
55
0
25 Jul 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
343
1,963
0
04 May 2020