COMPKE: Complex Question Answering under Knowledge Editing

1 June 2025
Keyuan Cheng
Zijian Kan
Zhixian He
Zhuoran Zhang
Muhammad Asif Ali
Ke Xu
Lijie Hu
Di Wang
    KELM
Main:11 Pages
10 Figures
Bibliography:3 Pages
11 Tables
Appendix:6 Pages
Abstract

Knowledge editing, which efficiently modifies the knowledge stored in large language models, has garnered considerable attention. Current benchmarks primarily use multi-hop question answering to assess and analyze newly injected or updated knowledge. However, we argue that these benchmarks fail to effectively evaluate how well the updated models apply this knowledge in real-life scenarios, particularly when questions require complex reasoning involving one-to-many relationships or multi-step logical intersections. To fill this gap, we introduce a new benchmark, COMPKE: Complex Question Answering under Knowledge Editing, which includes 11,924 complex questions that reflect real-life situations. We conduct an extensive evaluation of four knowledge editing methods on COMPKE, revealing that their effectiveness varies notably across different models. For instance, MeLLo attains an accuracy of 39.47 on GPT-4O-MINI, but this drops sharply to 3.83 on QWEN2.5-3B. We further investigate the underlying causes of these disparities from both methodological and model-specific perspectives. The datasets are available at this https URL.
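
To make the two question types named in the abstract concrete, the following is a minimal Python sketch of a one-to-many lookup and a logical intersection over a toy fact store, before and after an edit. It is an illustration only, not the benchmark's format or any editing method evaluated in the paper: the entities (`AcmeCorp`, `OrchestraX`, `Alice`, `Bob`, `Carol`), the relation names, and the helpers `lookup`, `hop`, `apply_edit`, and `intersect_query` are all hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy fact store: (subject, relation) -> set of objects.
# A set-valued entry models a one-to-many relationship.
FactKey = Tuple[str, str]
KnowledgeBase = Dict[FactKey, Set[str]]


def lookup(kb: KnowledgeBase, subject: str, relation: str) -> Set[str]:
    """Return every object linked to `subject` by `relation` (possibly empty)."""
    return set(kb.get((subject, relation), set()))


def hop(kb: KnowledgeBase, subjects: Iterable[str], relation: str) -> Set[str]:
    """Follow `relation` from each subject and pool the results."""
    result: Set[str] = set()
    for subject in subjects:
        result |= lookup(kb, subject, relation)
    return result


def apply_edit(kb: KnowledgeBase, subject: str, relation: str,
               new_objects: Iterable[str]) -> KnowledgeBase:
    """Return a copy of `kb` where one fact is overwritten (a toy 'edit')."""
    edited = {key: set(values) for key, values in kb.items()}
    edited[(subject, relation)] = set(new_objects)
    return edited


def intersect_query(kb: KnowledgeBase, first: FactKey, second: FactKey) -> Set[str]:
    """Answer 'which entities satisfy both conditions' via set intersection."""
    return lookup(kb, *first) & lookup(kb, *second)


# Entirely fictional facts, used only for illustration.
kb: KnowledgeBase = {
    ("AcmeCorp", "founded_by"): {"Alice", "Bob"},
    ("OrchestraX", "members"): {"Bob", "Carol"},
    ("Alice", "citizen_of"): {"Freedonia"},
    ("Bob", "citizen_of"): {"Sylvania"},
}

# One-to-many followed by a hop: countries of all founders.
countries_before = hop(kb, lookup(kb, "AcmeCorp", "founded_by"), "citizen_of")

# Intersection: founders of AcmeCorp who are also members of OrchestraX.
both_before = intersect_query(kb, ("AcmeCorp", "founded_by"), ("OrchestraX", "members"))

# Edit one fact; the answer to the intersection question must change with it.
edited_kb = apply_edit(kb, "AcmeCorp", "founded_by", {"Alice", "Carol"})
both_after = intersect_query(edited_kb, ("AcmeCorp", "founded_by"), ("OrchestraX", "members"))

print(sorted(countries_before), sorted(both_before), sorted(both_after))
```

In this sketch a single edited fact changes the answer to the intersection question from `Bob` to `Carol`, which is the kind of downstream consequence that a question with a single chain of hops would not necessarily exercise.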

@article{cheng2025_2506.00829,
  title={COMPKE: Complex Question Answering under Knowledge Editing},
  author={Keyuan Cheng and Zijian Kan and Zhixian He and Zhuoran Zhang and Muhammad Asif Ali and Ke Xu and Lijie Hu and Di Wang},
  journal={arXiv preprint arXiv:2506.00829},
  year={2025}
}