Mysterious Projections: Multimodal LLMs Gain Domain-Specific Visual Capabilities Without Richer Cross-Modal Projections

26 February 2024

Papers citing "Mysterious Projections: Multimodal LLMs Gain Domain-Specific Visual Capabilities Without Richer Cross-Modal Projections"


LLMs Can Compensate for Deficiencies in Visual Representations
Sho Takishita, Jay Gala, Abdelrahman Mohamed, Kentaro Inui, Yova Kementchedjhieva
VLM | 54 | 0 | 0 | 05 Jun 2025

Visual Large Language Models for Generalized and Specialized Applications
Yifan Li, Zhixin Lai, Wentao Bao, Zhen Tan, Anh Dao, Kewei Sui, Jiayi Shen, Dong Liu, Huan Liu, Yu Kong
VLM | 177 | 15 | 0 | 06 Jan 2025

Phase Diagram of Vision Large Language Models Inference: A Perspective from Interaction across Image and Instruction
Houjing Wei, Hakaze Cho, Yuting Shi
MLLM | 109 | 0 | 0 | 01 Nov 2024

CALF: Aligning LLMs for Time Series Forecasting via Cross-modal Fine-Tuning
Peiyuan Liu, Hang Guo, Tao Dai, Naiqi Li, Jigang Bao, Xudong Ren, Yong Jiang, Shu-Tao Xia
AI4TS | 172 | 31 | 0 | 12 Mar 2024