Predicting user behavior is essential for intelligent assistant services, yet deep learning models often struggle to capture long-tailed behaviors. Large language models (LLMs), pretrained on vast corpora rich in behavioral knowledge, offer promise. However, existing fine-tuning approaches tend to overfit to frequent "anchor" behaviors, reducing their ability to predict less common "tail" behaviors. In this paper, we introduce BehaviorLM, a progressive fine-tuning approach that addresses this issue. In the first stage, the LLM is fine-tuned on anchor behaviors while preserving its general behavioral knowledge. In the second stage, it is fine-tuned on a difficulty-balanced subset of all behaviors, improving tail-behavior prediction without sacrificing anchor performance. Experimental results on two real-world datasets demonstrate that BehaviorLM robustly predicts both anchor and tail behaviors and effectively leverages the LLM's behavioral knowledge to master tail behavior prediction from only a few examples.
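The two-stage recipe can be read as a data-selection procedure layered over standard supervised fine-tuning. Below is a minimal Python sketch of that logic under stated assumptions: BehaviorSample, the difficulty score (e.g., per-sample loss under the base model), fine_tune, top_k, and per_behavior are all hypothetical names for illustration, not the paper's actual implementation.

    import random
    from collections import defaultdict
    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class BehaviorSample:
        behavior: str      # behavior label, e.g. "open_maps"
        text: str          # serialized behavior-history prompt
        difficulty: float  # assumed precomputed, e.g. base-model loss on this sample

    def anchor_behaviors(samples: List[BehaviorSample], top_k: int) -> set:
        """Treat the top_k most frequent behavior labels as 'anchor' behaviors."""
        counts = defaultdict(int)
        for s in samples:
            counts[s.behavior] += 1
        return {b for b, _ in sorted(counts.items(), key=lambda kv: -kv[1])[:top_k]}

    def difficulty_balanced_subset(samples: List[BehaviorSample],
                                   per_behavior: int) -> List[BehaviorSample]:
        """Stage-2 data: keep each behavior's hardest examples so anchor and
        tail behaviors contribute equally to fine-tuning."""
        by_behavior = defaultdict(list)
        for s in samples:
            by_behavior[s.behavior].append(s)
        subset = []
        for group in by_behavior.values():
            group.sort(key=lambda s: s.difficulty, reverse=True)
            subset.extend(group[:per_behavior])
        random.shuffle(subset)
        return subset

    def progressive_fine_tune(samples: List[BehaviorSample],
                              fine_tune: Callable[[List[BehaviorSample]], None],
                              top_k: int = 50, per_behavior: int = 100) -> None:
        anchors = anchor_behaviors(samples, top_k)
        # Stage 1: fine-tune on frequent (anchor) behaviors only.
        fine_tune([s for s in samples if s.behavior in anchors])
        # Stage 2: fine-tune on a difficulty-balanced subset of all behaviors.
        fine_tune(difficulty_balanced_subset(samples, per_behavior))

In practice, fine_tune would wrap an ordinary SFT loop over a base LLM, and the difficulty score could be each sample's loss under that model before stage 2; both choices are assumptions here rather than details taken from the paper.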
@article{meng2025_2505.17682,
  title   = {Tuning Language Models for Robust Prediction of Diverse User Behaviors},
  author  = {Fanjin Meng and Jingtao Ding and Jiahui Gong and Chen Yang and Hong Chen and Zuojian Wang and Haisheng Lu and Yong Li},
  journal = {arXiv preprint arXiv:2505.17682},
  year    = {2025}
}