171

From Disagreement to Understanding: The Case for Ambiguity Detection in NLI

Main:7 Pages
7 Figures
Bibliography:3 Pages
Abstract

This position paper argues that annotation disagreement in Natural Language Inference (NLI) is not mere noise but often reflects meaningful interpretive variation, especially when triggered by ambiguity in the premise or hypothesis. While underspecified guidelines and annotator behavior can contribute to variation, content-based ambiguity offers a process-independent signal of divergent human perspectives. We call for a shift toward ambiguity-aware NLI by systematically identifying ambiguous input pairs and classifying ambiguity types. To support this, we present a unified framework that integrates existing taxonomies and illustrate key ambiguity subtypes through concrete examples. These examples reveal how ambiguity shapes annotator decisions and motivate the need for targeted detection methods that better align models with human interpretation. A key limitation is the lack of datasets annotated for ambiguity and subtypes. We propose addressing this gap through new annotated resources and unsupervised approaches to ambiguity detection -- paving the way for more robust, explainable, and human-aligned NLI systems.

View on arXiv
Comments on this paper