arXiv:2310.14356
Semantic and Expressive Variation in Image Captions Across Languages
22 October 2023
Andre Ye
Sebastin Santy
Jena D. Hwang
Amy X. Zhang
Ranjay Krishna
VLM
Papers citing "Semantic and Expressive Variation in Image Captions Across Languages"
12 / 12 papers shown
Multi3Hate: Multimodal, Multilingual, and Multicultural Hate Speech Detection with Vision-Language Models
Minh Duc Bui
K. Wense
Anne Lauscher
VLM
28
1
0
06 Nov 2024
No Detail Left Behind: Revisiting Self-Retrieval for Fine-Grained Image Captioning
Manu Gaur
Darshan Singh
Makarand Tapaswi
115
1
0
04 Sep 2024
Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use
Imad Eddine Toubal
Aditya Avinash
N. Alldrin
Jan Dlabal
Wenlei Zhou
...
Chun-Ta Lu
Howard Zhou
Ranjay Krishna
Ariel Fuxman
Tom Duerig
VLM
75
7
0
05 Mar 2024
Do Llamas Work in English? On the Latent Language of Multilingual Transformers
Chris Wendler
V. Veselovsky
Giovanni Monea
Robert West
56
95
0
16 Feb 2024
Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing
Yong Cao
Wenyan Li
Jiaang Li
Yifei Yuan
Antonia Karamolegkou
Daniel Hershcovich
VLM
20
7
0
08 Feb 2024
Identifying the Correlation Between Language Distance and Cross-Lingual Transfer in a Multilingual Representation Space
Fred Philippy
Siwen Guo
Shohreh Haddadan
33
7
0
03 May 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
270
4,229
0
30 Jan 2023
Multilingual Multimodal Learning with Machine Translated Text
Chen Qiu
Dan Oneaţă
Emanuele Bugliarello
Stella Frank
Desmond Elliott
45
13
0
24 Oct 2022
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset
Ashish V. Thapliyal
Jordi Pont-Tuset
Xi Chen
Radu Soricut
VGen
84
72
0
25 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
Visually Grounded Reasoning across Languages and Cultures
Fangyu Liu
Emanuele Bugliarello
E. Ponti
Siva Reddy
Nigel Collier
Desmond Elliott
VLM
LRM
106
168
0
28 Sep 2021
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
Krishna Srinivasan
K. Raman
Jiecao Chen
Michael Bendersky
Marc Najork
VLM
208
310
0
02 Mar 2021