
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation
Yi Wang
Conghui He
Ping Luo
Ziwei Liu
Yu Qiao
Papers citing "InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation"
50 / 75 papers shown
Title |
---|