Long-Tailed Visual Recognition via Permutation-Invariant Head-to-Tail Feature Fusion

The imbalanced distribution of long-tailed data presents a significant challenge for deep learning models, causing them to prioritize head classes while neglecting tail classes. Two key factors contributing to low recognition accuracy are a deformed representation space and a biased classifier, both of which stem from insufficient semantic information in tail classes. To address these issues, we propose permutation-invariant and head-to-tail feature fusion (PI-H2T), a highly adaptable method. PI-H2T enhances the representation space through permutation-invariant representation fusion (PIF), yielding more clustered features and automatic class margins. Additionally, it adjusts the biased classifier by transferring semantic information from head to tail classes via head-to-tail fusion (H2TF), improving tail-class diversity. Theoretical analysis and experiments show that PI-H2T optimizes both the representation space and decision boundaries. Its plug-and-play design ensures seamless integration into existing methods, providing a straightforward path to further performance improvements. Extensive experiments on long-tailed benchmarks confirm the effectiveness of PI-H2T.
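
The abstract does not spell out the fusion operation, so the following PyTorch sketch illustrates one plausible reading of head-to-tail fusion: each tail-class feature is blended with a randomly drawn head-class feature while the tail label is kept, so the mixed feature injects head-class semantics into the tail class. The function name h2t_fuse and the mixing weight lam are illustrative assumptions, not the authors' implementation.

import torch

def h2t_fuse(tail_feats, head_feats, lam=0.7):
    # Hypothetical head-to-tail fusion: blend each tail-class feature
    # with a randomly sampled head-class feature. The tail label is
    # retained, so the fused feature enriches tail-class diversity
    # with head-class semantics. The value of lam (how much of the
    # original tail feature is preserved) is an assumed example.
    idx = torch.randint(0, head_feats.size(0), (tail_feats.size(0),))
    return lam * tail_feats + (1.0 - lam) * head_feats[idx]

# Toy usage: 4 tail-class features and 16 head-class features, dim 128.
tail = torch.randn(4, 128)
head = torch.randn(16, 128)
fused = h2t_fuse(tail, head)
print(fused.shape)  # torch.Size([4, 128])

In this reading, fusion happens in feature space rather than pixel space, which is consistent with the paper's framing of adjusting the representation space and the classifier rather than augmenting raw images.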
@article{li2025_2506.00625,
  title   = {Long-Tailed Visual Recognition via Permutation-Invariant Head-to-Tail Feature Fusion},
  author  = {Mengke Li and Zhikai Hu and Yang Lu and Weichao Lan and Yiu-ming Cheung and Hui Huang},
  journal = {arXiv preprint arXiv:2506.00625},
  year    = {2025}
}