Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training

Papers citing "Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training"