See What You Are Told: Visual Attention Sink in Large Multimodal Models

5 March 2025

Papers citing "See What You Are Told: Visual Attention Sink in Large Multimodal Models"

Title; authors; tags; date

Mitigate Language Priors in Large Vision-Language Models by Cross-Images Contrastive Decoding
Jianfei Zhao, Feng Zhang, X. Sun, Chong Feng; MLLM; 15 May 2025

Don't Deceive Me: Mitigating Gaslighting through Attention Reallocation in LMMs
Pengkun Jiao, Bin Zhu, Jingjing Chen, Chong-Wah Ngo, Yu Jiang; 13 Apr 2025

The Power of One: A Single Example is All it Takes for Segmentation in VLMs
Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal, James J. Little; MLLM, VLM; 13 Mar 2025

Visual Attention Never Fades: Selective Progressive Attention ReCalibration for Detailed Image Captioning in Multimodal Large Language Models
Mingi Jung, Saehuyng Lee, Eunji Kim, Sungroh Yoon; 03 Feb 2025

Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation
Chanyoung Kim, Dayun Ju, Woojung Han, Ming-Hsuan Yang, Seong Jae Hwang; VLM, VOS; 26 Nov 2024