Compiler auto-tuning optimizes pass sequences to improve performance metrics such as Intermediate Representation (IR) instruction count. Although recent advances leveraging Large Language Models (LLMs) have shown promise in automating compiler tuning, two significant challenges remain: the absence of high-quality reasoning datasets for agent training, and limited effective interaction with the compilation environment. In this work, we introduce Compiler-R1, the first reinforcement learning (RL)-driven framework specifically designed to augment LLM capabilities for compiler auto-tuning. Compiler-R1 features a curated, high-quality reasoning dataset and a novel two-stage end-to-end RL training pipeline, enabling efficient environment exploration and learning through an outcome-based reward. Extensive experiments across seven datasets demonstrate that Compiler-R1 achieves an average 8.46% IR instruction count reduction compared to opt -Oz, showcasing the strong potential of RL-trained LLMs for compiler optimization. Our code and datasets are publicly available at this https URL.
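As a rough illustration of the evaluation metric described above, the sketch below compares a candidate pass sequence against opt -Oz by IR instruction count. The function names (count_ir_instructions, oz_relative_reduction) are hypothetical, the instruction-counting heuristic is an approximation rather than the paper's actual measurement, and the exact way passes are passed to opt (individual flags vs. -passes=...) depends on the LLVM version.

```python
import subprocess
import tempfile

def count_ir_instructions(ll_path: str) -> int:
    """Rough heuristic: count instruction lines in textual LLVM IR.

    Lines inside function bodies that are not labels, comments, or the
    closing brace are treated as instructions. A robust implementation
    would parse the IR (e.g. with llvmlite) instead of scanning text.
    """
    count = 0
    in_function = False
    with open(ll_path) as f:
        for line in f:
            stripped = line.strip()
            if stripped.startswith("define"):
                in_function = True
            elif in_function:
                if stripped == "}":
                    in_function = False
                elif stripped and not stripped.endswith(":") and not stripped.startswith(";"):
                    count += 1
    return count

def oz_relative_reduction(input_ll: str, pass_sequence: list[str]) -> float:
    """Fractional IR instruction count reduction of a candidate pass
    sequence relative to `opt -Oz` on one module (positive is better).

    `pass_sequence` is given as opt command-line arguments; its exact
    form (legacy pass flags or a single -passes=... string) depends on
    the installed LLVM version.
    """
    with tempfile.NamedTemporaryFile(suffix=".ll") as oz_out, \
         tempfile.NamedTemporaryFile(suffix=".ll") as cand_out:
        subprocess.run(["opt", "-S", "-Oz", input_ll, "-o", oz_out.name], check=True)
        subprocess.run(["opt", "-S", *pass_sequence, input_ll, "-o", cand_out.name], check=True)
        oz_count = count_ir_instructions(oz_out.name)
        cand_count = count_ir_instructions(cand_out.name)
        return (oz_count - cand_count) / max(oz_count, 1)
```

In an outcome-based RL setup of the kind the abstract describes, a quantity like this relative reduction could serve directly as the episode reward, though the paper's precise reward formulation is not specified here.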
@article{pan2025_2506.15701,
  title={Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning},
  author={Haolin Pan and Hongyu Lin and Haoran Luo and Yang Liu and Kaichun Yao and Libo Zhang and Mingjie Xing and Yanjun Wu},
  journal={arXiv preprint arXiv:2506.15701},
  year={2025}
}