
v1v2 (latest)
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Papers citing "Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts"
50 / 871 papers shown
Title |
---|
![]() Grounded Language-Image Pre-training Liunian Harold Li Pengchuan Zhang Haotian Zhang Jianwei Yang Chunyuan Li ...Lu Yuan Lei Zhang Lei Li Kai-Wei Chang Jianfeng Gao |