2404.01331
Cited By
LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model
29 March 2024
Musashi Hinck
Matthew Lyle Olson
David Cobbley
Shao-Yen Tseng
Vasudev Lal
VLM
Papers citing "LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model"
6 / 6 papers shown
Title
Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals
Phillip Howard
Kathleen C. Fraser
Anahita Bhiwandiwalla
S. Kiritchenko
103
12
0
30 May 2024
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
Shengbang Tong
Zhuang Liu
Yuexiang Zhai
Yi-An Ma
Yann LeCun
Saining Xie
VLM
MLLM
79
320
0
11 Jan 2024
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
312
4,253
0
09 Jun 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM
ReLM
LRM
250
1,230
0
20 Sep 2022
TextCaps: a Dataset for Image Captioning with Reading Comprehension
Oleksii Sidorov
Ronghang Hu
Marcus Rohrbach
Amanpreet Singh
58
411
0
24 Mar 2020
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
322
3,224
0
02 Dec 2016