Towards An Efficient LLM Training Paradigm for CTR Prediction

2 March 2025
Allen Lin
Renqin Cai
Yun He
Hanchao Yu
Jing Qian
Rui Li
Qifan Wang
James Caverlee
Abstract

Large Language Models (LLMs) have demonstrated tremendous potential as the next generation of ranking-based recommendation systems. Many recent works have shown that LLMs can significantly outperform conventional click-through-rate (CTR) prediction approaches. Despite such promising results, the computational inefficiency inherent in the current training paradigm makes it particularly challenging to train LLMs for ranking-based recommendation tasks on large datasets. To train LLMs for CTR prediction, most existing studies adopt the prevalent "sliding-window" paradigm: given a sequence of m user interactions, a unique training prompt is constructed for each interaction by designating it as the prediction target, with its preceding n interactions serving as context. The sliding-window paradigm thus incurs an overall complexity of O(mn^2) that scales linearly with the length of user interactions. Consequently, directly adopting this strategy to train LLMs can result in prohibitively high training costs as the length of interactions grows. To alleviate this computational inefficiency, we propose a novel training paradigm, Dynamic Target Isolation (DTI), that structurally parallelizes the training of k (where k >> 1) target interactions. Furthermore, we identify two major bottlenecks - hidden-state leakage and positional bias overfitting - that limit DTI to scaling up to only a small value of k (e.g., 5), and we propose a computationally light solution to effectively tackle each. Through extensive experiments on three widely adopted public CTR datasets, we empirically show that DTI reduces training time by an average of 92% (e.g., from 70.5 hrs to 5.31 hrs) without compromising CTR prediction performance.
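To make the cost argument concrete, below is a minimal sketch (not the authors' code) of the sliding-window prompt construction described in the abstract: one prompt per target interaction, each carrying up to n preceding interactions as context. The function names and prompt template are illustrative assumptions; only the construction pattern and its O(mn^2) cost follow from the abstract.

```python
# Sketch of the "sliding-window" training-prompt construction for CTR prediction.
# For a user with m interactions, one prompt is built per prediction target,
# with the preceding n interactions as context. Since self-attention cost is
# quadratic in prompt length, total training cost scales as O(m * n^2).
# `build_prompt` and the prompt wording are hypothetical, for illustration only.

from typing import List


def build_prompt(context: List[str], target: str) -> str:
    """Format n context interactions plus one prediction target as a prompt."""
    history = "; ".join(context)
    return f"User history: {history}. Will the user click on: {target}?"


def sliding_window_prompts(interactions: List[str], n: int) -> List[str]:
    """Build one training prompt per target interaction (m - 1 prompts total)."""
    prompts = []
    for t in range(1, len(interactions)):
        context = interactions[max(0, t - n):t]  # up to n preceding interactions
        prompts.append(build_prompt(context, interactions[t]))
    return prompts


if __name__ == "__main__":
    history = [f"item_{i}" for i in range(8)]  # toy sequence with m = 8
    for p in sliding_window_prompts(history, n=3):
        print(p)
```

The redundancy is visible in the output: consecutive prompts repeat almost the same context window, which is the duplication DTI avoids by packing many (k) targets into a single training sequence, subject to the hidden-state leakage and positional-bias issues the paper addresses.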

View on arXiv
@article{lin2025_2503.01001,
  title={Towards An Efficient LLM Training Paradigm for CTR Prediction},
  author={Allen Lin and Renqin Cai and Yun He and Hanchao Yu and Jing Qian and Rui Li and Qifan Wang and James Caverlee},
  journal={arXiv preprint arXiv:2503.01001},
  year={2025}
}