ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2109.08331
  4. Cited By
Accelerating Offline Reinforcement Learning Application in Real-Time
  Bidding and Recommendation: Potential Use of Simulation

Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation

17 September 2021
Haruka Kiyohara
K. Kawakami
Yuta Saito
    OffRL
ArXivPDFHTML

Papers citing "Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation"

15 / 15 papers shown
Title
Generative Auto-Bidding with Value-Guided Explorations
Generative Auto-Bidding with Value-Guided Explorations
Jingtong Gao
Yewen Li
Shuai Mao
Peng Jiang
Nan Jiang
...
Fei Pan
Peng Jiang
Kun Gai
Bo An
Xiangyu Zhao
OffRL
46
0
0
20 Apr 2025
AutoOPE: Automated Off-Policy Estimator Selection
AutoOPE: Automated Off-Policy Estimator Selection
Nicolò Felicioni
Michael Benigni
Maurizio Ferrari Dacrema
OffRL
24
1
0
26 Jun 2024
Hyperparameter Optimization Can Even be Harmful in Off-Policy Learning
  and How to Deal with It
Hyperparameter Optimization Can Even be Harmful in Off-Policy Learning and How to Deal with It
Yuta Saito
Masahiro Nomura
OffRL
50
2
0
23 Apr 2024
Off-Policy Evaluation of Slate Bandit Policies via Optimizing
  Abstraction
Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction
Haruka Kiyohara
Masahiro Nomura
Yuta Saito
25
5
0
03 Feb 2024
Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy
  Evaluation
Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation
Haruka Kiyohara
Ren Kishimoto
K. Kawakami
Ken Kobayashi
Kazuhide Nakata
Yuta Saito
OffRL
32
9
0
30 Nov 2023
SCOPE-RL: A Python Library for Offline Reinforcement Learning and
  Off-Policy Evaluation
SCOPE-RL: A Python Library for Offline Reinforcement Learning and Off-Policy Evaluation
Haruka Kiyohara
Ren Kishimoto
K. Kawakami
Ken Kobayashi
Kazuhide Nakata
Yuta Saito
OffRL
ELM
37
4
0
30 Nov 2023
Off-Policy Evaluation of Ranking Policies under Diverse User Behavior
Off-Policy Evaluation of Ranking Policies under Diverse User Behavior
Haruka Kiyohara
Masatoshi Uehara
Yusuke Narita
N. Shimizu
Yasuo Yamamoto
Yuta Saito
OffRL
CML
30
8
0
26 Jun 2023
User Behavior Simulation with Large Language Model based Agents
User Behavior Simulation with Large Language Model based Agents
Lei Wang
Jingsen Zhang
Hao-ran Yang
Zhiyuan Chen
Jiakai Tang
...
Wayne Xin Zhao
Jun Xu
Zhicheng Dou
Jun Wang
Ji-Rong Wen
LM&Ro
LLMAG
27
40
0
05 Jun 2023
Policy-Adaptive Estimator Selection for Off-Policy Evaluation
Policy-Adaptive Estimator Selection for Off-Policy Evaluation
Takuma Udagawa
Haruka Kiyohara
Yusuke Narita
Yuta Saito
Keisuke Tateno
OffRL
27
23
0
25 Nov 2022
Synthetic Data-Based Simulators for Recommender Systems: A Survey
Synthetic Data-Based Simulators for Recommender Systems: A Survey
Elizaveta Stavinova
A. Grigorievskiy
A. Volodkevich
P. Chunaev
Klavdiya Olegovna Bochenina
D. Bugaychenko
SyDa
34
7
0
22 Jun 2022
Doubly Robust Off-Policy Evaluation for Ranking Policies under the
  Cascade Behavior Model
Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model
Haruka Kiyohara
Yuta Saito
Tatsuya Matsuhiro
Yusuke Narita
N. Shimizu
Yasuo Yamamoto
OffRL
24
42
0
03 Feb 2022
COMBO: Conservative Offline Model-Based Policy Optimization
COMBO: Conservative Offline Model-Based Policy Optimization
Tianhe Yu
Aviral Kumar
Rafael Rafailov
Aravind Rajeswaran
Sergey Levine
Chelsea Finn
OffRL
222
419
0
16 Feb 2021
NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning
NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning
Rongjun Qin
Songyi Gao
Xingyuan Zhang
Zhen Xu
Shengkai Huang
Zewen Li
Weinan Zhang
Yang Yu
OffRL
140
79
0
01 Feb 2021
Counterfactual Evaluation of Slate Recommendations with Sequential
  Reward Interactions
Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions
James McInerney
B. Brost
Praveen Chandar
Rishabh Mehrotra
Ben Carterette
BDL
CML
OffRL
121
55
0
25 Jul 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on
  Open Problems
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
343
1,963
0
04 May 2020
1