
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
Papers citing "Align before Fuse: Vision and Language Representation Learning with Momentum Distillation"
50 / 1,231 papers shown
Title |
---|
Attentive Mask CLIP (Yifan Yang, Weiquan Huang, Yixuan Wei, Houwen Peng, Xinyang Jiang, ... Fangyun Wei, Yin Wang, Han Hu, Lili Qiu, Yuqing Yang) |