Context-Independent OCR with Multimodal LLMs: Effects of Image Resolution and Visual Complexity

31 March 2025

Papers citing "Context-Independent OCR with Multimodal LLMs: Effects of Image Resolution and Visual Complexity"


Title | Authors | Tags | Metrics | Date
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning | Ling Fu, Biao Yang, Zhebin Kuang, Jiajun Song, Yuzhe Li, ..., Jingqun Tang, Wei Chen, Lianwen Jin, Yunxing Liu, Xiang Bai | - | 83 22 0 | 31 Dec 2024
VTimeLLM: Empower LLM to Grasp Video Moments | Bin Huang, Xin Wang, Hong Chen, Zihan Song, Wenwu Zhu | MLLM | 124 125 0 | 30 Nov 2023
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions | Wenbo Hu, Y. Xu, Yuante Li, W. Li, Zhe Chen, Zhuowen Tu | MLLM, VLM | 73 132 0 | 19 Aug 2023