TAG-INSTRUCT: Controlled Instruction Complexity Enhancement through Structure-based Augmentation

Main: 8 pages; Bibliography: 4 pages; Appendix: 10 pages; 11 figures; 7 tables

Abstract

High-quality instruction data is crucial for developing large language models (LLMs), yet existing approaches struggle to effectively control instruction complexity. We present TAG-INSTRUCT, a novel framework that enhances instruction complexity through structured semantic compression and controlled difficulty augmentation. Unlike previous prompt-based methods operating on raw text, TAG-INSTRUCT compresses instructions into a compact tag space and systematically enhances complexity through RL-guided tag expansion. Through extensive experiments, we show that TAG-INSTRUCT outperforms existing instruction complexity augmentation approaches. Our analysis reveals that operating in tag space provides superior controllability and stability across different instruction synthesis frameworks.

@article{zhu2025_2505.18557,
  title={TAG-INSTRUCT: Controlled Instruction Complexity Enhancement through Structure-based Augmentation},
  author={He Zhu and Zhiwen Ruan and Junyou Su and Xingwei He and Yun Chen and Wenjia Zhang and Guanhua Chen},
  journal={arXiv preprint arXiv:2505.18557},
  year={2025}
}