Why do LLaVA Vision-Language Models Reply to Images in English?
arXiv 2407.02333 · 2 July 2024
Musashi Hinck, Carolin Holtermann, M. L. Olson, Florian Schneider, Sungduk Yu, Anahita Bhiwandiwalla, Anne Lauscher, Shaoyen Tseng, Vasudev Lal
Tags: VLM
Papers citing "Why do LLaVA Vision-Language Models Reply to Images in English?" (9 papers)
Behind Maya: Building a Multilingual Vision Language Model (13 May 2025)
Nahid Alam, Karthik Reddy Kanjula, Surya Guthikonda, Timothy Chung, Bala Krishna S Vegesna, ..., Isha Chaturvedi, Genta Indra Winata, Ashvanth.S, Snehanshu Mukherjee, Alham Fikri Aji
Tags: MLLM, VLM

Breaking Language Barriers in Visual Language Models via Multilingual Textual Regularization (28 Mar 2025)
Iñigo Pikabea, Iñaki Lacunza, Oriol Pareras, Carlos Escolano, Aitor Gonzalez-Agirre, Javier Hernando, Marta Villegas
Tags: VLM

WildChat: 1M ChatGPT Interaction Logs in the Wild (2 May 2024)
Wenting Zhao, Xiang Ren, Jack Hessel, Claire Cardie, Yejin Choi, Yuntian Deng

What Is Missing in Multilingual Visual Reasoning and How to Fix It (3 Mar 2024)
Yueqi Song, Simran Khanuja, Graham Neubig
Tags: VLM, LRM

Do Llamas Work in English? On the Latent Language of Multilingual Transformers (16 Feb 2024)
Chris Wendler, V. Veselovsky, Giovanni Monea, Robert West

Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models (12 Feb 2024)
Siddharth Karamcheti, Suraj Nair, Ashwin Balakrishna, Percy Liang, Thomas Kollar, Dorsa Sadigh
Tags: MLLM, VLM

Semantic and Expressive Variation in Image Captions Across Languages (22 Oct 2023)
Andre Ye, Sebastin Santy, Jena D. Hwang, Amy X. Zhang, Ranjay Krishna
Tags: VLM

Linearly Mapping from Image to Text Space (30 Sep 2022)
Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick
Tags: VLM

Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset (25 May 2022)
Ashish V. Thapliyal, Jordi Pont-Tuset, Xi Chen, Radu Soricut
Tags: VGen