CODEMENV: Benchmarking Large Language Models on Code Migration

1 June 2025
Keyuan Cheng
Xudong Shen
Yihao Yang
Tengyue Wang
Yang Cao
Muhammad Asif Ali
Hanbin Wang
Lijie Hu
Di Wang
Main: 9 pages · 5 figures · 8 tables · Bibliography: 2 pages · Appendix: 14 pages
Abstract

Large language models (LLMs) have shown remarkable capabilities across various software engineering tasks; however, their effectiveness in code migration (adapting code to run in different environments) remains insufficiently studied. In this work, we introduce CODEMENV: Code Migration Across Environment, a new benchmark specifically designed to assess LLMs' abilities in code migration scenarios. CODEMENV consists of 922 examples spanning 19 Python and Java packages and covers three core tasks: (1) identifying functions incompatible with specific versions, (2) detecting changes in function definitions, and (3) adapting code to target environments. Experimental evaluation of seven LLMs on CODEMENV yields an average pass@1 rate of 26.50%, with GPT-4o achieving the highest score at 43.84%. Key findings include: (i) LLMs tend to be more proficient with newer function versions, which aids in migrating legacy code, and (ii) LLMs sometimes exhibit logical inconsistencies, identifying function changes that are irrelevant to the intended migration environment. The datasets are available at this https URL.
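The third task (adapting code to a target environment) can be illustrated with a small version-conditional sketch. The example below is not drawn from the benchmark; it uses the fact that `math.comb` only exists in Python ≥ 3.8 as a stand-in for the kind of package-version incompatibility the benchmark tests.

```python
import sys

def comb_compat(n, k):
    """Binomial coefficient, migrated so it also runs on target
    environments with Python < 3.8, where math.comb does not exist."""
    if sys.version_info >= (3, 8):
        from math import comb  # added to the stdlib in Python 3.8
        return comb(n, k)
    # Fallback for older environments: compute from factorials.
    from math import factorial
    return factorial(n) // (factorial(k) * factorial(n - k))

print(comb_compat(10, 3))  # 120
```

A correct migration must both detect the version-dependent API (tasks 1 and 2) and produce an equivalent implementation for the target environment (task 3).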

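For reference, pass@1 is commonly reported via the unbiased pass@k estimator of Chen et al. (2021). Whether CODEMENV draws one sample per problem or uses this estimator is not stated here, so the sketch below is only a reminder of the standard definition, not a description of the paper's exact protocol.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations of which c pass,
    is correct: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # every size-k draw must include a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k=1 this reduces to c/n, the fraction of passing samples.
print(pass_at_k(10, 3, 1))  # 0.3
```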
@article{cheng2025_2506.00894,
  title={CODEMENV: Benchmarking Large Language Models on Code Migration},
  author={Keyuan Cheng and Xudong Shen and Yihao Yang and Tengyue Wang and Yang Cao and Muhammad Asif Ali and Hanbin Wang and Lijie Hu and Di Wang},
  journal={arXiv preprint arXiv:2506.00894},
  year={2025}
}