Title |
---|
![]() MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal
Dataset with One Trillion Tokens Anas Awadalla Le Xue Oscar Lo Manli Shu Hannah Lee ...Silvio Savarese Caiming Xiong Ran Xu Yejin Choi Ludwig Schmidt |
![]() OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images
Interleaved with Text Qingyun Li Zhe Chen Weiyun Wang Wenhai Wang Shenglong Ye ...Dahua Lin Yu Qiao Botian Shi Conghui He Jifeng Dai |
![]() Llama 2: Open Foundation and Fine-Tuned Chat Models Hugo Touvron Louis Martin Kevin R. Stone Peter Albert Amjad Almahairi ...Sharan Narang Aurelien Rodriguez Robert Stojnic Sergey Edunov Thomas Scialom |