Leveraging Large Language Models (LLMs) for generative recommendation has attracted significant research interest, and item tokenization is a critical step in this paradigm. Item tokenization assigns each item an identifier that the LLM uses to encode user history and generate the next item. Existing approaches use either token-sequence identifiers, which represent an item as a sequence of discrete tokens, or single-token identifiers, which represent an item with an ID embedding or a semantic embedding. Token-sequence identifiers suffer from the local optima problem in beam search and from low generation efficiency, because the identifier is generated step by step. In contrast, single-token identifiers fail to capture rich semantics or encode Collaborative Filtering (CF) information, resulting in suboptimal performance.
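The local optima problem mentioned above can be illustrated with a minimal sketch. It assumes a hypothetical two-token identifier space with made-up probabilities; the tables `FIRST` and `SECOND` and the function names are illustrative only and are not taken from the paper. With a narrow beam, the prefix of the most probable identifier is pruned at the first step, so the search returns a less probable item.

```python
from itertools import product

# Hypothetical toy model (assumption, not from the paper):
# every item identifier is a sequence of exactly two tokens.
FIRST = {"a": 0.6, "b": 0.4}                      # P(first token)
SECOND = {                                        # P(second token | first token)
    "a": {"x": 0.55, "y": 0.45},
    "b": {"x": 0.9, "y": 0.1},
}


def sequence_prob(seq):
    """Joint probability of a full two-token identifier."""
    first, second = seq
    return FIRST[first] * SECOND[first][second]


def beam_search(beam_width):
    """Step-by-step generation that keeps only `beam_width` prefixes."""
    # Step 1: rank first tokens and prune to the beam width.
    beams = sorted(FIRST.items(), key=lambda kv: kv[1], reverse=True)[:beam_width]
    # Step 2: expand only the surviving prefixes.
    candidates = [
        ((t1, t2), p1 * p2)
        for t1, p1 in beams
        for t2, p2 in SECOND[t1].items()
    ]
    return max(candidates, key=lambda c: c[1])


def exhaustive_search():
    """Score every complete identifier; feasible only for a toy space."""
    candidates = [(seq, sequence_prob(seq)) for seq in product(FIRST, ("x", "y"))]
    return max(candidates, key=lambda c: c[1])


if __name__ == "__main__":
    print("beam width 1:", beam_search(1))
    print("beam width 2:", beam_search(2))
    print("exhaustive:  ", exhaustive_search())
```

In this toy setting, a beam of width 1 commits to the first token `a` and can no longer reach the identifier `("b", "x")`, which has the highest joint probability. Widening the beam recovers it here, but at the cost of more candidates per generation step, which relates to the efficiency concern raised in the abstract.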
@article{lin2025_2502.10833,
  title={Order-agnostic Identifier for Large Language Model-based Generative Recommendation},
  author={Xinyu Lin and Haihan Shi and Wenjie Wang and Fuli Feng and Qifan Wang and See-Kiong Ng and Tat-Seng Chua},
  journal={arXiv preprint arXiv:2502.10833},
  year={2025}
}