FingerVeinSyn-5M: A Million-Scale Dataset and Benchmark for Finger Vein Recognition

A major challenge in finger vein recognition is the lack of large-scale public datasets. Existing datasets contain few identities and limited samples per finger, restricting the advancement of deep learning-based methods. To address this, we introduce FVeinSyn, a synthetic generator capable of producing diverse finger vein patterns with rich intra-class variations. Using FVeinSyn, we created FingerVeinSyn-5M -- the largest available finger vein dataset -- containing 5 million samples from 50,000 unique fingers, each with 100 variations including shift, rotation, scale, roll, varying exposure levels, skin scattering blur, optical blur, and motion blur. FingerVeinSyn-5M is also the first dataset to offer fully annotated finger vein images, supporting deep learning applications in this field. Models pretrained on FingerVeinSyn-5M and fine-tuned with minimal real data achieve an average 53.91% performance gain across multiple benchmarks. The dataset is publicly available at: this https URL.
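
The dataset layout described above (50,000 fingers, 100 variations each) can be illustrated with a minimal sketch that samples one set of variation parameters per image. This is not the FVeinSyn generator: the names `VariationParams`, `sample_variation`, and `sample_finger_variations`, the per-finger seeding scheme, and every numeric range are hypothetical choices made for illustration only. The abstract names the variation types but gives no ranges.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass(frozen=True)
class VariationParams:
    """One intra-class variation; all fields and ranges are illustrative assumptions."""
    shift_px: tuple[int, int]     # in-plane translation (dx, dy)
    rotation_deg: float           # in-plane rotation
    scale: float                  # isotropic zoom factor
    roll_deg: float               # rotation about the finger's long axis
    exposure_gain: float          # multiplicative brightness change
    scatter_blur_sigma: float     # skin scattering blur strength
    optical_blur_sigma: float     # lens defocus blur strength
    motion_blur_len: int          # motion blur kernel length in pixels


def sample_variation(rng: random.Random) -> VariationParams:
    # Ranges below are placeholders, not values from the paper.
    return VariationParams(
        shift_px=(rng.randint(-10, 10), rng.randint(-10, 10)),
        rotation_deg=rng.uniform(-5.0, 5.0),
        scale=rng.uniform(0.9, 1.1),
        roll_deg=rng.uniform(-15.0, 15.0),
        exposure_gain=rng.uniform(0.7, 1.3),
        scatter_blur_sigma=rng.uniform(0.0, 2.0),
        optical_blur_sigma=rng.uniform(0.0, 1.5),
        motion_blur_len=rng.randint(1, 7),
    )


def sample_finger_variations(finger_id: int, n: int = 100) -> list[VariationParams]:
    # Seeding by finger ID keeps each identity's variations reproducible.
    rng = random.Random(finger_id)
    return [sample_variation(rng) for _ in range(n)]


NUM_FINGERS = 50_000
VARIATIONS_PER_FINGER = 100
TOTAL_SAMPLES = NUM_FINGERS * VARIATIONS_PER_FINGER  # 5,000,000 images
```

Calling `sample_finger_variations(7)` returns 100 parameter sets for a single identity; in a real pipeline each set would drive one rendering of that finger's vein pattern.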
@article{wang2025_2506.03635,
  title={FingerVeinSyn-5M: A Million-Scale Dataset and Benchmark for Finger Vein Recognition},
  author={Yinfan Wang and Jie Gui and Baosheng Yu and Qi Li and Zhenan Sun and Juho Kannala and Guoying Zhao},
  journal={arXiv preprint arXiv:2506.03635},
  year={2025}
}