
Generalizable Multimodal Large Language Model Editing via Invariant Trajectory Learning

Jiajie Su
Haoyuan Wang
Xiaohua Feng
Yunshan Ma
Xiaobo Xia
Yuyuan Li
Xiaolin Zheng
Jianmao Xiao
Chaochao Chen
Main:8 Pages
6 Figures
Bibliography:3 Pages
6 Tables
Appendix:6 Pages
Abstract

Knowledge editing has emerged as a crucial technique for efficiently correcting incorrect or outdated knowledge in large language models (LLMs). Existing editing methods rely on a rigid mapping from parameter or module modifications to outputs, which limits their generalization in multimodal LLMs (MLLMs). In this paper, we reformulate MLLM editing as an out-of-distribution (OOD) generalization problem, where the goal is to distinguish semantic shift from factual shift and thus achieve robust editing across diverse cross-modal prompts. The key challenge of this OOD problem lies in identifying invariant causal trajectories that generalize accurately while suppressing spurious correlations. To address it, we propose ODEdit, a plug-and-play, invariant-learning-based framework that optimizes a tripartite OOD risk objective to simultaneously enhance editing reliability, locality, and generality. We further introduce an edit trajectory invariant learning method, which integrates a total variation penalty into the risk minimization objective to stabilize edit trajectories against environmental variations. Theoretical analysis and extensive experiments demonstrate the effectiveness of ODEdit.
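
To make the idea of a total variation penalty inside a risk minimization objective concrete, here is a minimal sketch of one generic invariant-learning formulation. It is not taken from the paper and should not be read as ODEdit's objective. It assumes a squared-error risk per environment, a hypothetical scalar "dummy" scale w fixed at 1 on the predictions, and an L1-style penalty on the gradient of each environment's risk with respect to w. The names `env_risk_and_grad`, `tv_penalized_objective`, and `lam` are illustrative only.

```python
def env_risk_and_grad(preds, targets):
    """Squared-error risk of one environment and its gradient with
    respect to a dummy scale w, evaluated at w = 1.

    Assumption: risk(w) = mean((w * pred - target) ** 2), so
    d risk / d w at w = 1 equals mean(2 * (pred - target) * pred).
    """
    n = len(preds)
    risk = sum((p - t) ** 2 for p, t in zip(preds, targets)) / n
    grad = sum(2.0 * (p - t) * p for p, t in zip(preds, targets)) / n
    return risk, grad


def tv_penalized_objective(envs, lam=1.0):
    """Average over environments of risk + lam * |gradient|.

    The absolute value (rather than the square) of the gradient is the
    L1-style, total-variation-like penalty in this sketch. `envs` is a
    list of (predictions, targets) pairs, one per environment.
    """
    total = 0.0
    for preds, targets in envs:
        risk, grad = env_risk_and_grad(preds, targets)
        total += risk + lam * abs(grad)
    return total / len(envs)
```

In this sketch, a predictor that fits every environment equally well has zero gradient everywhere and pays no penalty, while a predictor that relies on environment-specific correlations is penalized in the environments where those correlations fail.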
