Large Language Models (LLMs) are expected to be predictable and trustworthy to support reliable decision-making systems. Yet current LLMs often show inconsistencies in their judgments. In this work, we examine logical preference consistency as a foundational requirement for building more dependable LLM systems, ensuring stable and coherent decision-making while minimizing erratic or contradictory outputs. To quantify logical preference consistency, we propose a universal evaluation framework based on three fundamental properties: transitivity, commutativity, and negation invariance. Through extensive experimentation across diverse LLMs, we demonstrate that these properties serve as strong indicators of judgment robustness. Furthermore, we introduce a data refinement and augmentation technique, REPAIR, that enhances logical consistency while maintaining alignment with human preferences. Finally, we show that improving consistency leads to better performance in LLM-driven logic-based algorithms, reinforcing stability and coherence in decision-making systems.
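To make the three properties concrete, the following is a minimal, hypothetical sketch of how one might score an LLM judge's pairwise preference judgments for transitivity, commutativity, and negation invariance. The data layout (`prefers` / `neg_prefers` dictionaries) and the exact metric definitions are illustrative assumptions, not the paper's formulation.

```python
# Illustrative sketch (assumed metrics, not the paper's exact definitions):
# `prefers[(a, b)]` is True iff the judge, asked "is a better than b?", said yes;
# an entry is assumed to exist for every ordered pair of distinct items.
from itertools import combinations


def transitivity_rate(prefers):
    """Fraction of item triads whose judgments contain no preference cycle."""
    items = sorted({x for pair in prefers for x in pair})
    triads = list(combinations(items, 3))
    if not triads:
        return 1.0
    acyclic = 0
    for a, b, c in triads:
        ab, bc, ca = prefers[(a, b)], prefers[(b, c)], prefers[(c, a)]
        # A cycle arises exactly when all three judgments point the same way
        # around the triad (a > b > c > a, or the reverse cycle).
        if not (ab == bc == ca):
            acyclic += 1
    return acyclic / len(triads)


def commutativity_rate(prefers):
    """Fraction of unordered pairs whose verdict does not depend on the order
    in which the two items are presented to the judge."""
    pairs = [(a, b) for (a, b) in prefers if a < b]
    stable = sum(prefers[(a, b)] != prefers[(b, a)] for a, b in pairs)
    return stable / len(pairs) if pairs else 1.0


def negation_invariance_rate(prefers, neg_prefers):
    """Fraction of queries where the answer to the negated question
    ("is a worse than b?", stored in `neg_prefers`) is the logical
    complement of the answer to the original question."""
    keys = prefers.keys() & neg_prefers.keys()
    invariant = sum(prefers[k] != neg_prefers[k] for k in keys)
    return invariant / len(keys) if keys else 1.0


if __name__ == "__main__":
    # Toy judgments over three candidate responses A, B, C.
    prefers = {("A", "B"): True, ("B", "A"): False,
               ("B", "C"): True, ("C", "B"): False,
               ("A", "C"): True, ("C", "A"): False}
    neg_prefers = {k: not v for k, v in prefers.items()}  # perfectly consistent judge
    print(transitivity_rate(prefers),                       # 1.0
          commutativity_rate(prefers),                      # 1.0
          negation_invariance_rate(prefers, neg_prefers))   # 1.0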
@article{liu2025_2410.02205,
  title={Aligning with Logic: Measuring, Evaluating and Improving Logical Preference Consistency in Large Language Models},
  author={Yinhong Liu and Zhijiang Guo and Tianya Liang and Ehsan Shareghi and Ivan Vulić and Nigel Collier},
  journal={arXiv preprint arXiv:2410.02205},
  year={2025}
}