Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability

18 June 2025
Yusuke Sakai
Hidetaka Kamigaito
Taro Watanabe
Author contacts: sakai.yusuke.sr9@is.naist.jp, kamigaito.h@is.naist.jp, taro@is.naist.jp
arXiv: 2506.15629 (abs / PDF / HTML)
Main: 10 pages; bibliography: 8 pages; appendix: 2 pages; 4 figures; 10 tables
Abstract

In generative commonsense reasoning tasks such as CommonGen, generative large language models (LLMs) compose sentences that include all given concepts. However, when instruction-following ability is also considered, a prompt may specify an order for the concepts, and LLMs must then generate sentences that adhere to that order. To evaluate this, we propose Ordered CommonGen, a benchmark designed to assess the compositional generalization and instruction-following abilities of LLMs. The benchmark measures ordered coverage, which checks whether the concepts are generated in the specified order, enabling simultaneous evaluation of both abilities. In a comprehensive analysis of 36 LLMs, we found that while LLMs generally understand the intent of the instructions, biases toward particular concept-order patterns often lead to low-diversity outputs or identical outputs even when the concept order is changed. Moreover, even the most instruction-compliant LLM achieved only about 75% ordered coverage, highlighting the need to improve both instruction-following and compositional generalization capabilities.
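To make the ordered-coverage idea concrete, below is a minimal sketch of how such a metric could be computed, assuming a greedy left-to-right subsequence match on exact lowercase tokens. This is an illustrative assumption, not the authors' implementation: the paper's actual definition may differ, for example by allowing morphological inflections of the concepts. The function name ordered_coverage and the tokenization scheme are hypothetical.

import re

def ordered_coverage(concepts: list[str], sentence: str) -> float:
    """Fraction of concepts matched in the specified order.

    Hypothetical sketch: greedily scans the sentence tokens left to
    right, counting a concept only if it appears at or after the
    position of the previously matched concept.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    next_pos = 0   # earliest token index the next concept may occupy
    matched = 0
    for concept in concepts:
        try:
            idx = tokens.index(concept.lower(), next_pos)
        except ValueError:
            continue  # concept missing, or only appears out of order
        matched += 1
        next_pos = idx + 1
    return matched / len(concepts)

# Example: all three concepts appear in the requested order -> 1.0
print(ordered_coverage(["dog", "ball", "park"],
                       "A dog chased a ball across the park."))
# Reordering the request breaks the match for "dog" -> 2/3
print(ordered_coverage(["ball", "dog", "park"],
                       "A dog chased a ball across the park."))

Under this greedy scheme, a concept that appears in the sentence but before an earlier-matched concept counts as unmatched, which is what lets a single number capture both coverage and order compliance.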

@article{sakai2025_2506.15629,
  title={Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability},
  author={Yusuke Sakai and Hidetaka Kamigaito and Taro Watanabe},
  journal={arXiv preprint arXiv:2506.15629},
  year={2025}
}