MERGE: Next-Generation Item Indexing Paradigm for Large-Scale Streaming Recommendation

Jing Yan
Yimeng Bai
Zongyu Liu
Yahui Liu
Junwei Wang
Jingze Huang
Haoda Li
Sihao Ding
Shaohui Ruan
Yang Zhang
Main: 8 Pages
6 Figures
Bibliography: 2 Pages
3 Tables
Appendix: 1 Page
Abstract

Item indexing maps a large corpus of items into compact discrete representations and is critical for both discriminative and generative recommender systems. Existing Vector Quantization (VQ)-based approaches, however, struggle with the highly skewed and non-stationary item distributions common in industrial streaming recommenders, which leads to poor assignment accuracy, imbalanced cluster occupancy, and insufficient cluster separation. To address these challenges, we propose MERGE, a next-generation item indexing paradigm that adaptively constructs clusters from scratch, dynamically monitors cluster occupancy, and forms hierarchical index structures via fine-to-coarse merging. Extensive experiments demonstrate that MERGE significantly improves assignment accuracy, cluster uniformity, and cluster separation compared with existing indexing methods. Online A/B tests show substantial gains in key business metrics, highlighting its potential as a foundational indexing approach for large-scale recommendation.
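
The abstract names three ingredients (building clusters from scratch, tracking cluster occupancy, and fine-to-coarse merging) without giving algorithmic detail. The toy sketch below is only an illustration of how such ingredients could fit together; it is not the MERGE algorithm. It assumes cosine similarity, a fixed similarity threshold, a hard occupancy cap, and running-mean centroids. The names `Cluster`, `assign_stream`, `merge_level`, `sim_threshold`, and `max_occupancy` are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class Cluster:
    """A cluster with a running-mean centroid and an occupancy counter."""
    def __init__(self, vec):
        self.centroid = list(vec)
        self.count = 1

    def add(self, vec):
        self.count += 1
        # Incremental mean update of the centroid.
        self.centroid = [c + (v - c) / self.count
                         for c, v in zip(self.centroid, vec)]

def assign_stream(items, sim_threshold, max_occupancy=None):
    """Assumed rule: join the most similar non-full cluster if it is
    similar enough, otherwise open a new cluster (starting from none)."""
    clusters, assignments = [], []
    for vec in items:
        best, best_sim = None, -1.0
        for idx, c in enumerate(clusters):
            if max_occupancy is not None and c.count >= max_occupancy:
                continue  # occupancy cap reached: skip this cluster
            s = cosine(vec, c.centroid)
            if s > best_sim:
                best, best_sim = idx, s
        if best is None or best_sim < sim_threshold:
            clusters.append(Cluster(vec))
            assignments.append(len(clusters) - 1)
        else:
            clusters[best].add(vec)
            assignments.append(best)
    return clusters, assignments

def merge_level(clusters, sim_threshold):
    """Assumed fine-to-coarse step: group fine centroids into coarser
    clusters using a looser threshold. Returns (coarse, parent_of_fine)."""
    return assign_stream([c.centroid for c in clusters], sim_threshold)

items = [[1, 0], [0.99, 0.05], [0, 1], [0.05, 0.99], [0.8, 0.6]]
fine, fine_ids = assign_stream(items, sim_threshold=0.95)
coarse, parent = merge_level(fine, sim_threshold=0.7)
# Two-level index per item: (coarse id, fine id).
index = [(parent[f], f) for f in fine_ids]
```

In this sketch, each item ends up with a two-level code, which loosely mirrors the hierarchical index structure the abstract mentions. A real system would replace the linear scan over clusters with an approximate nearest-neighbor lookup and would need a policy for the drifting distributions the abstract describes.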
