Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
Nick Jiang, Anish Kachinthaya, Suzie Petryk, Yossi Gandelsman (3 October 2024) [VLM]

Papers citing "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations"

50 of 55 papers shown

Investigating Mechanisms for In-Context Vision Language Binding
Darshana Saravanan, Makarand Tapaswi, Vineet Gandhi (28 May 2025)

Debiasing CLIP: Interpreting and Correcting Bias in Attention Heads
Wei Jie Yeo, Rui Mao, Moloud Abdar, Erik Cambria, Ranjan Satapathy (23 May 2025)

Pixels Versus Priors: Controlling Knowledge Priors in Vision-Language Models through Visual Counterfacts
Michal Golovanevsky, William Rudman, Michael Lepori, Amir Bar, Ritambhara Singh, Carsten Eickhoff (21 May 2025)

Mitigating Hallucinations via Inter-Layer Consistency Aggregation in Large Vision-Language Models
Kai Tang, Jinhao You, Xiuqi Ge, Hanze Li, Yichen Guo, Xiande Huang (18 May 2025) [MLLM]

VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Zongxia Li, Xiyang Wu, Guangyao Shi, Yubin Qin, Hongyang Du, Tianyi Zhou, Dinesh Manocha, Jordan Lee Boyd-Graber (2 May 2025) [MLLM]

LogicQA: Logical Anomaly Detection with Vision Language Model Generated Questions
Yejin Kwon, Daeun Moon, Youngje Oh, Hyunsoo Yoon (26 Mar 2025)

TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation
Victor Shea-Jay Huang, Le Zhuo, Yi Xin, Zhaokai Wang, Peng Gao, Hongsheng Li (10 Mar 2025) [DiffM]

EAZY: Eliminating Hallucinations in LVLMs by Zeroing out Hallucinatory Image Tokens
Liwei Che, Tony Qingze Liu, Jing Jia, Weiyi Qin, Ruixiang Tang, Vladimir Pavlovic (10 Mar 2025) [MLLM, VLM]

Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
Seil Kang, Jinyeong Kim, Junhyeok Kim, Seong Jae Hwang (8 Mar 2025) [VLM]

Treble Counterfactual VLMs: A Causal Approach to Hallucination
Li Li, Jiashu Qu, Yuxiao Zhou, Yuehan Qin, Tiankai Yang, Yue Zhao (8 Mar 2025)

Forgotten Polygons: Multimodal Large Language Models are Shape-Blind
William Rudman, Michal Golovanesky, Amir Bar, Vedant Palit, Yann LeCun, Carsten Eickhoff, Ritambhara Singh (21 Feb 2025) [LRM]

SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders
Bartosz Cywiński, Kamil Deja (29 Jan 2025) [DiffM]

Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models
Ido Cohen, Daniela Gottesman, Mor Geva, Raja Giryes (18 Dec 2024) [VLM]

Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens
Zhangqi Jiang, Junkai Chen, Beier Zhu, Tingjin Luo, Yankun Shen, Xu Yang (23 Nov 2024)

Towards Interpreting Visual Information Processing in Vision-Language Models
Clement Neo, Luke Ong, Philip Torr, Mor Geva, David M. Krueger, Fazl Barez (9 Oct 2024)

Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang, Teng Wang, Haigang Zhang, Ping Lu, Feng Zheng (16 Jul 2024) [MLLM, LRM, VLM]

What Do VLMs NOTICE? A Mechanistic Interpretability Pipeline for Gaussian-Noise-free Text-Image Corruption and Evaluation
Michal Golovanevsky, William Rudman, Vedant Palit, Ritambhara Singh, Carsten Eickhoff (24 Jun 2024)

MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model
Jiahao Huo, Yibo Yan, Boren Hu, Yutao Yue, Xuming Hu (17 Jun 2024) [LRM, MLLM]

Interpreting the Second-Order Effects of Neurons in CLIP
Yossi Gandelsman, Alexei A. Efros, Jacob Steinhardt (6 Jun 2024) [MILM]

Hallucination of Multimodal Large Language Models: A Survey
Zechen Bai, Pichao Wang, Tianjun Xiao, Tong He, Zongbo Han, Zheng Zhang, Mike Zheng Shou (29 Apr 2024) [VLM, LRM]

A Multimodal Automated Interpretability Agent
Tamar Rott Shaham, Sarah Schwettmann, Franklin Wang, Achyuta Rajaram, Evan Hernandez, Jacob Andreas, Antonio Torralba (22 Apr 2024)

Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models
Weihang Su, Changyue Wang, Qingyao Ai, Hu Yiran, Zhijing Wu, Yujia Zhou, Yiqun Liu (11 Mar 2024) [HILM]

TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space
Shaolei Zhang, Tian Yu, Yang Feng (27 Feb 2024) [HILM, KELM]

CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models
Ziyue Wang, Chi Chen, Zihao Wan, Zhaolu Kang, Qidong Yan, ..., Xiaoyue Mi, Peng Li, Ning Ma, Maosong Sun, Yang Liu (21 Feb 2024)

INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection
Chao Chen, Kai-Chun Liu, Ze Chen, Yi Gu, Yue-bo Wu, Mingyuan Tao, Zhihang Fu, Jieping Ye (6 Feb 2024) [HILM]

Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
Asma Ghandeharioun, Avi Caciularu, Adam Pearce, Lucas Dixon, Mor Geva (11 Jan 2024)

LLM Factoscope: Uncovering LLMs' Factual Discernment through Inner States Analysis
Jinwen He, Yujia Gong, Kai-xiang Chen, Zijin Lin, Chengán Wei, Yue Zhao (27 Dec 2023)

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Qidong Huang, Xiao-wen Dong, Pan Zhang, Bin Wang, Conghui He, Jiaqi Wang, Dahua Lin, Weiming Zhang, Neng H. Yu (29 Nov 2023) [MLLM]

Woodpecker: Hallucination Correction for Multimodal Large Language Models
Shukang Yin, Chaoyou Fu, Sirui Zhao, Tong Xu, Hao Wang, Dianbo Sui, Yunhang Shen, Ke Li, Xingguo Sun, Enhong Chen (24 Oct 2023) [VLM, MLLM]

Interpreting and Controlling Vision Foundation Models via Text Explanations
Haozhe Chen, Junfeng Yang, Carl Vondrick, Chengzhi Mao (16 Oct 2023)

Interpreting CLIP's Image Representation via Text-Based Decomposition
Yossi Gandelsman, Alexei A. Efros, Jacob Steinhardt (9 Oct 2023) [VLM]

Improved Baselines with Visual Instruction Tuning
Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee (5 Oct 2023) [VLM, MLLM]

Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
Yiyang Zhou, Chenhang Cui, Jaehong Yoon, Linjun Zhang, Zhun Deng, Chelsea Finn, Mohit Bansal, Huaxiu Yao (1 Oct 2023) [MLLM]

Sparse Autoencoders Find Highly Interpretable Features in Language Models
Hoagy Cunningham, Aidan Ewart, Logan Riggs, R. Huben, Lee Sharkey (15 Sep 2023) [MILM]

Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP
Vedant Palit, Rohan Pandey, Aryaman Arora, Paul Pu Liang (27 Aug 2023)

Multimodal Neurons in Pretrained Text-Only Transformers
Sarah Schwettmann, Neil Chowdhury, Samuel J. Klein, David Bau, Antonio Torralba (3 Aug 2023) [MILM]

Overthinking the Truth: Understanding how Language Models Process False Demonstrations
Danny Halawi, Jean-Stanislas Denain, Jacob Steinhardt (18 Jul 2023)

A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation
Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu (8 Jul 2023) [HILM]

Rosetta Neurons: Mining the Common Units in a Model Zoo
Amil Dravid, Yossi Gandelsman, Alexei A. Efros, Assaf Shocher (15 Jun 2023)

Evaluating Object Hallucination in Large Vision-Language Models
Yifan Li, Yifan Du, Kun Zhou, Jinpeng Wang, Wayne Xin Zhao, Ji-Rong Wen (17 May 2023) [MLLM, LRM]

Towards Automated Circuit Discovery for Mechanistic Interpretability
Arthur Conmy, Augustine N. Mavor-Parker, Aengus Lynch, Stefan Heimersheim, Adrià Garriga-Alonso (28 Apr 2023)

The Internal State of an LLM Knows When It's Lying
A. Azaria, Tom Michael Mitchell (26 Apr 2023) [HILM]

Eliciting Latent Predictions from Transformers with the Tuned Lens
Nora Belrose, Zach Furman, Logan Smith, Danny Halawi, Igor V. Ostrovsky, Lev McKinney, Stella Biderman, Jacob Steinhardt (14 Mar 2023)

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li, Dongxu Li, Silvio Savarese, Steven C. H. Hoi (30 Jan 2023) [VLM, MLLM]

Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
Kenneth Li, Aspen K. Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg (24 Oct 2022) [MILM]

Locating and Editing Factual Associations in GPT
Kevin Meng, David Bau, A. Andonian, Yonatan Belinkov (10 Feb 2022) [KELM]

Survey of Hallucination in Natural Language Generation
Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, D. Su, ..., Delong Chen, Wenliang Dai, Ho Shu Chan, Andrea Madotto, Pascale Fung (8 Feb 2022) [HILM, LRM]

What if This Modified That? Syntactic Interventions via Counterfactual Embeddings
Mycal Tucker, Peng Qian, R. Levy (28 May 2021)

Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers
Hila Chefer, Shir Gur, Lior Wolf (29 Mar 2021) [ViT]

Learning Transferable Visual Models From Natural Language Supervision
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya A. Ramesh, Gabriel Goh, ..., Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever (26 Feb 2021) [CLIP, VLM]