
Retro: Reusing teacher projection head for efficient embedding distillation on Lightweight Models via Self-supervised Learning
Papers citing "Retro: Reusing teacher projection head for efficient embedding distillation on Lightweight Models via Self-supervised Learning"