
Equally Critical: Samples, Targets, and Their Mappings in Datasets

Main: 9 Pages
18 Figures
Bibliography: 3 Pages
10 Tables
Appendix: 18 Pages
Abstract

Data inherently possesses dual attributes: samples and targets. For targets, knowledge distillation has been widely employed to accelerate model convergence, primarily relying on teacher-generated soft target supervision. Conversely, recent advances in data-efficient learning have emphasized sample optimization techniques, such as dataset distillation, while neglecting the critical role of targets. This dichotomy motivates our investigation into how samples and targets jointly influence training dynamics. To address this gap, we first establish a taxonomy of existing paradigms through the lens of sample-target interactions, categorizing them into distinct sample-to-target mapping strategies. Building on this foundation, we then propose a novel unified loss framework to assess their impact on training efficiency. Through extensive empirical studies of our proposed strategies, we comprehensively analyze how variations in target and sample types, quantities, and qualities influence model training, providing six key insights for improving training efficacy.
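
As a rough illustration of the "teacher-generated soft target supervision" mentioned in the abstract, the sketch below blends a hard-label cross-entropy term with a soft-target term computed from teacher logits. It is a generic knowledge-distillation-style loss written for this note, not the unified loss framework the paper proposes; the function names, the mixing weight `alpha`, and the `temperature` parameter are all assumptions introduced here.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, student_logits, temperature=1.0):
    """Cross-entropy between a target distribution and the student's softmax."""
    student_probs = softmax(student_logits, temperature)
    return -sum(
        t * math.log(p) for t, p in zip(target_probs, student_probs) if t > 0
    )


def one_hot(label, num_classes):
    """Hard target: all probability mass on the ground-truth class."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def mixed_target_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Hypothetical blend of hard-label and soft-target supervision.

    alpha = 0 uses only the hard label; alpha = 1 uses only the
    teacher's softened distribution. This is an assumed form for
    illustration, not the loss defined in the paper.
    """
    hard = cross_entropy(one_hot(label, len(student_logits)), student_logits)
    soft_targets = softmax(teacher_logits, temperature)
    soft = cross_entropy(soft_targets, student_logits, temperature)
    return (1 - alpha) * hard + alpha * soft


if __name__ == "__main__":
    student = [1.0, 0.5, -0.5]
    teacher = [2.0, 1.0, -1.0]
    print(mixed_target_loss(student, teacher, label=0))
```

Setting `alpha` to 0 or 1 recovers the two extremes (pure hard-label training or pure soft-target training), which is one simple way to view a single sample being mapped to different kinds of targets.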

@article{yang2025_2506.01987,
  title={Equally Critical: Samples, Targets, and Their Mappings in Datasets},
  author={Runkang Yang and Peng Sun and Xinyi Shang and Yi Tang and Tao Lin},
  journal={arXiv preprint arXiv:2506.01987},
  year={2025}
}