Title |
---|
EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model. Yuxuan Zhang, Tianheng Cheng, Lianghui Zhu, Lei Liu, Heng Liu, Longjin Ran, Xiaoxin Chen, Wenyu Liu, Xinggang Wang |
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text. Qingyun Li, Zhe Chen, Weiyun Wang, Wenhai Wang, Shenglong Ye, ..., Dahua Lin, Yu Qiao, Botian Shi, Conghui He, Jifeng Dai |