
DependEval: Benchmarking LLMs for Repository Dependency Understanding

Abstract

While large language models (LLMs) have shown considerable promise in code generation, real-world software development demands advanced repository-level reasoning: understanding dependencies and project structures, and managing multi-file changes. However, the ability of LLMs to effectively comprehend and handle complex code repositories has yet to be fully explored. To address these challenges, we introduce DependEval, a hierarchical benchmark designed to evaluate repository dependency understanding. The benchmark is built from 15,576 repositories collected from real-world websites and evaluates models on three core tasks: Dependency Recognition, Repository Construction, and Multi-file Editing, across 8 programming languages drawn from actual code repositories. Our evaluation of over 25 LLMs reveals substantial performance gaps and provides valuable insights into repository-level code understanding.

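The abstract only names the three tasks; as a rough illustration, the minimal sketch below (in Python, with hypothetical names such as DependencyItem and score_dependency_order, not the paper's actual harness) shows how a Dependency Recognition item might be scored by comparing a model's predicted file ordering against a ground-truth dependency order.

# A minimal sketch (not DependEval's actual evaluation code) of scoring a
# Dependency Recognition item: the model orders a repository's files by
# dependency, and its answer is compared against a gold ordering.
# All names here (DependencyItem, score_dependency_order) are hypothetical.
from dataclasses import dataclass
from typing import List


@dataclass
class DependencyItem:
    repo_name: str
    files: List[str]       # files shown to the model
    gold_order: List[str]  # ground-truth dependency order


def score_dependency_order(item: DependencyItem, predicted: List[str]) -> float:
    """Return 1.0 for an exact match with the gold order, otherwise the
    fraction of adjacent gold pairs whose relative order is preserved."""
    if predicted == item.gold_order:
        return 1.0
    pos = {f: i for i, f in enumerate(predicted) if f in item.gold_order}
    pairs = list(zip(item.gold_order, item.gold_order[1:]))
    kept = sum(1 for a, b in pairs if a in pos and b in pos and pos[a] < pos[b])
    return kept / len(pairs) if pairs else 0.0


# Example usage with a toy two-file repository.
item = DependencyItem(
    repo_name="toy_repo",
    files=["utils.py", "main.py"],
    gold_order=["utils.py", "main.py"],
)
print(score_dependency_order(item, ["utils.py", "main.py"]))  # 1.0
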
@article{du2025_2503.06689,
  title={DependEval: Benchmarking LLMs for Repository Dependency Understanding},
  author={Junjia Du and Yadi Liu and Hongcheng Guo and Jiawei Wang and Haojian Huang and Yunyi Ni and Zhoujun Li},
  journal={arXiv preprint arXiv:2503.06689},
  year={2025}
}