Evolution of ReID: From Early Methods to LLM Integration

Person re-identification (ReID) has evolved from handcrafted feature-based methods to deep learning approaches and, more recently, to models incorporating large language models (LLMs). Early methods struggled with variations in lighting, pose, and viewpoint, but deep learning addressed these issues by learning robust visual features. Building on this, LLMs now enable ReID systems to integrate semantic and contextual information through natural language. This survey traces that full evolution and offers one of the first comprehensive reviews of ReID approaches that leverage LLMs, where textual descriptions are used as privileged information to improve visual matching. A key contribution is the use of dynamic, identity-specific prompts generated by GPT-4o, which enhance the alignment between images and text in vision-language ReID systems. Experimental results show that these descriptions improve accuracy, especially in complex or ambiguous cases. To support further research, we release a large set of GPT-4o-generated descriptions for standard ReID datasets. By bridging computer vision and natural language processing, this survey offers a unified perspective on the field's development and outlines key future directions such as better prompt design, cross-modal transfer learning, and real-world adaptability.
@article{bhuiyan2025_2506.13039,
  title={Evolution of ReID: From Early Methods to LLM Integration},
  author={Amran Bhuiyan and Mizanur Rahman and Md Tahmid Rahman Laskar and Aijun An and Jimmy Xiangji Huang},
  journal={arXiv preprint arXiv:2506.13039},
  year={2025}
}