2406.10228
Cited By
VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
14 June 2024
Chenyu Zhou
Mengdan Zhang
Peixian Chen
Chaoyou Fu
Yunhang Shen
Xiawu Zheng
Xing Sun
Rongrong Ji
VLM
Papers citing
"VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models"
4 / 4 papers shown
Title
mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model
Anwen Hu
Yaya Shi
Haiyang Xu
Jiabo Ye
Qinghao Ye
Mingshi Yan
Chenliang Li
Qi Qian
Ji Zhang
Fei Huang
MLLM
66
25
0
30 Nov 2023
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality
Tristan Thrush
Ryan Jiang
Max Bartolo
Amanpreet Singh
Adina Williams
Douwe Kiela
Candace Ross
CoGe
82
415
0
07 Apr 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
601
9,009
0
28 Jan 2022
From Recognition to Cognition: Visual Commonsense Reasoning
Rowan Zellers
Yonatan Bisk
Ali Farhadi
Yejin Choi
LRM
BDL
OCL
ReLM
133
873
0
27 Nov 2018