Ace-CEFR -- A Dataset for Automated Evaluation of the Linguistic Difficulty of Conversational Texts for LLM Applications

8 pages main text, 1 page bibliography, 3 pages appendix; 2 figures, 5 tables
Abstract
There is an unmet need to evaluate the linguistic difficulty of short, conversational passages of text, particularly for training and filtering Large Language Models (LLMs). We introduce Ace-CEFR, a dataset of English conversational text passages expert-annotated with their corresponding text difficulty levels. We experiment with several models on Ace-CEFR, including Transformer-based models and LLMs. We show that models trained on Ace-CEFR can measure text difficulty more accurately than human experts, with latency suitable for production environments. Finally, we release the Ace-CEFR dataset to the public for research and development.
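To make the modeling setup concrete, below is a minimal sketch of one approach the abstract describes: fine-tuning a Transformer-based regressor to predict a difficulty score for short passages. The dataset fields ("text", "level"), the toy examples, and the numeric CEFR-to-score mapping are illustrative assumptions, not the released Ace-CEFR schema; the paper's actual models and training details may differ.

```python
# Sketch: fine-tune a Transformer regression head on CEFR-labeled passages.
# Assumes CEFR bands map to a 1-6 scale (A1=1 ... C2=6); this mapping is an
# illustrative choice, not taken from the paper.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CEFR_TO_SCORE = {"A1": 1.0, "A2": 2.0, "B1": 3.0, "B2": 4.0, "C1": 5.0, "C2": 6.0}

# Toy stand-in examples; real training would load the Ace-CEFR dataset.
examples = [
    {"text": "I like dogs. Do you like dogs?", "level": "A1"},
    {"text": "Could you recommend a good book about local history?", "level": "B1"},
    {"text": "The ramifications of the policy remain hotly contested.", "level": "C1"},
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1, problem_type="regression"
)

def collate(batch):
    # Tokenize a batch of passages and attach continuous difficulty labels.
    enc = tokenizer([b["text"] for b in batch], padding=True,
                    truncation=True, return_tensors="pt")
    enc["labels"] = torch.tensor([[CEFR_TO_SCORE[b["level"]]] for b in batch])
    return enc

loader = DataLoader(examples, batch_size=2, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):
    for batch in loader:
        loss = model(**batch).loss  # MSE loss for a single-output regression head
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Inference: the model emits a continuous score that can be thresholded
# back into CEFR bands, which is what makes it cheap enough for
# production filtering compared with per-passage human rating.
model.eval()
with torch.no_grad():
    enc = tokenizer("Where is the train station?", return_tensors="pt")
    print(model(**enc).logits.item())
```

Treating the CEFR scale as an ordered numeric target rather than six unordered classes is one common design choice for difficulty estimation, since it penalizes predictions by how far they miss rather than treating all confusions equally.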
Citation:

@article{kogan2025_2506.14046,
  title={Ace-CEFR -- A Dataset for Automated Evaluation of the Linguistic Difficulty of Conversational Texts for LLM Applications},
  author={David Kogan and Max Schumacher and Sam Nguyen and Masanori Suzuki and Melissa Smith and Chloe Sophia Bellows and Jared Bernstein},
  journal={arXiv preprint arXiv:2506.14046},
  year={2025}
}