How Do Large Vision-Language Models See Text in Image? Unveiling the Distinctive Role of OCR Heads

21 May 2025

Papers citing "How Do Large Vision-Language Models See Text in Image? Unveiling the Distinctive Role of OCR Heads"


Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
Seil Kang, Jinyeong Kim, Junhyeok Kim, Seong Jae Hwang
VLM, 08 Mar 2025