Long-Tailed Visual Recognition via Permutation-Invariant Head-to-Tail Feature Fusion

The imbalanced distribution of long-tailed data presents a significant challenge for deep learning models, causing them to prioritize head classes while neglecting tail classes. Two key factors contributing to low recognition accuracy are a deformed representation space and a biased classifier, both of which stem from insufficient semantic information in tail classes. To address these issues, we propose permutation-invariant and head-to-tail feature fusion (PI-H2T), a highly adaptable method. PI-H2T enhances the representation space through permutation-invariant representation fusion (PIF), yielding more clustered features and automatic class margins. Additionally, it adjusts the biased classifier by transferring semantic information from head to tail classes via head-to-tail fusion (H2TF), improving tail-class diversity. Theoretical analysis and experiments show that PI-H2T optimizes both the representation space and decision boundaries. Its plug-and-play design ensures seamless integration into existing methods, providing a straightforward path to further performance improvements. Extensive experiments on long-tailed benchmarks confirm the effectiveness of PI-H2T.
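
The abstract does not spell out the fusion operation, so the following PyTorch sketch illustrates one plausible reading of head-to-tail fusion: each tail-class feature is blended with a randomly drawn head-class feature while the tail label is kept, so the mixed feature injects head-class semantics into the tail class. The function name h2t_fuse and the mixing weight lam are illustrative assumptions, not the authors' implementation.

import torch

def h2t_fuse(tail_feats, head_feats, lam=0.7):
    # Hypothetical head-to-tail fusion: blend each tail-class feature
    # with a randomly sampled head-class feature. The tail label is
    # retained, so the fused feature enriches tail-class diversity
    # with head-class semantics. The value of lam (how much of the
    # original tail feature is preserved) is an assumed example.
    idx = torch.randint(0, head_feats.size(0), (tail_feats.size(0),))
    return lam * tail_feats + (1.0 - lam) * head_feats[idx]

# Toy usage: 4 tail-class features and 16 head-class features, dim 128.
tail = torch.randn(4, 128)
head = torch.randn(16, 128)
fused = h2t_fuse(tail, head)
print(fused.shape)  # torch.Size([4, 128])

In this reading, fusion happens in feature space rather than pixel space, which is consistent with the paper's framing of adjusting the representation space and the classifier rather than augmenting raw images.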
@article{li2025_2506.00625,
  title   = {Long-Tailed Visual Recognition via Permutation-Invariant Head-to-Tail Feature Fusion},
  author  = {Mengke Li and Zhikai Hu and Yang Lu and Weichao Lan and Yiu-ming Cheung and Hui Huang},
  journal = {arXiv preprint arXiv:2506.00625},
  year    = {2025}
}