Balancing Preservation and Modification: A Region and Semantic Aware Metric for Instruction-Based Image Editing

Instruction-based image editing, which aims to modify an image faithfully according to an instruction while leaving irrelevant content unchanged, has made significant progress. However, a comprehensive metric for assessing editing quality is still lacking. Existing metrics either rely on costly human evaluation, which hinders large-scale evaluation, or are adapted from other tasks and lose task-specific concerns. As a result, they fail to evaluate both instruction-based modification and preservation of irrelevant regions, which biases the evaluation. To address this, we introduce Balancing Preservation and Modification (BPM), a metric tailored to instruction-based image editing that explicitly disentangles the image into editing-relevant and editing-irrelevant regions and considers each separately. We first identify and locate the editing-relevant regions, then assess editing quality in two tiers: the Region-Aware Judge evaluates whether the position and size of the edited region align with the instruction, and the Semantic-Aware Judge assesses instruction compliance within editing-relevant regions as well as content preservation within irrelevant regions, yielding a comprehensive and interpretable quality assessment. Moreover, the editing-relevant region localization in BPM can be integrated into image editing approaches to improve editing quality, demonstrating its broad applicability. We verify the effectiveness of BPM on comprehensive instruction-editing data, and the results show the highest alignment with human evaluation among the compared metrics. Code is available at: this https URL
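As a rough illustration of the two-tier idea, the toy sketch below scores an edit on small grayscale images using plain Python lists. It is not the BPM implementation: the function names, the IoU-based region check, the pixel-difference preservation check, the externally supplied `modification_score`, and the weighting in `toy_score` are all assumptions made for this example only.

```python
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]
Mask = List[List[bool]]    # True marks an editing-relevant pixel


def changed_mask(src: Image, edited: Image, thresh: float = 0.1) -> Mask:
    """Mark pixels whose intensity changed by more than `thresh`."""
    return [[abs(a - b) > thresh for a, b in zip(row_s, row_e)]
            for row_s, row_e in zip(src, edited)]


def region_score(expected: Mask, actual: Mask) -> float:
    """Hypothetical region-aware check: IoU of expected vs. actually changed region."""
    inter = union = 0
    for row_x, row_a in zip(expected, actual):
        for x, a in zip(row_x, row_a):
            inter += int(x and a)
            union += int(x or a)
    return inter / union if union else 1.0


def preservation_score(src: Image, edited: Image, expected: Mask) -> float:
    """Hypothetical preservation check: 1 - mean abs. difference outside the mask."""
    diffs = [abs(a - b)
             for row_s, row_e, row_m in zip(src, edited, expected)
             for a, b, m in zip(row_s, row_e, row_m)
             if not m]
    return 1.0 - sum(diffs) / len(diffs) if diffs else 1.0


def toy_score(src: Image, edited: Image, expected: Mask,
              modification_score: float, w: float = 0.5) -> float:
    """Combine the tiers (assumed formula, not taken from the paper).

    `modification_score` in [0, 1] stands in for a semantic judgment of how
    well the edited region follows the instruction; it must be supplied by
    the caller, e.g. from a vision-language model.
    """
    region = region_score(expected, changed_mask(src, edited))
    semantic = w * modification_score + (1.0 - w) * preservation_score(src, edited, expected)
    return region * semantic


# Usage: a 4x4 black image where the instruction targets the top-left 2x2 block.
src = [[0.0] * 4 for _ in range(4)]
expected = [[r < 2 and c < 2 for c in range(4)] for r in range(4)]
edited = [[1.0 if expected[r][c] else 0.0 for c in range(4)] for r in range(4)]
score = toy_score(src, edited, expected, modification_score=0.8)
```

In this sketch the region term acts as a gate, so an edit that lands in the wrong place scores low even if the untouched pixels are perfectly preserved. Multiplying rather than adding is one design choice among several.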