
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment
Papers citing "CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment"
8 / 58 papers shown
Title | Authors |
---|---|
Movie Description | Anna Rohrbach, Atousa Torabi, Marcus Rohrbach, Niket Tandon, C. Pal, Hugo Larochelle, Aaron Courville, Bernt Schiele |