
CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models

27 May 2025
Xiaqiang Tang, Jian Li, Keyu Hu, Du Nan, Xiaolong Li, Xi Zhang, Weigao Sun, Sihong Xie
arXiv: 2505.20767 (abs · PDF · HTML)
Main: 7 pages · 15 figures · Bibliography: 3 pages · 7 tables · Appendix: 9 pages
Abstract

Faithfulness hallucinations are claims generated by a Large Language Model (LLM) that are not supported by the context provided to it. In the absence of assessment standards, existing benchmarks focus on "factual statements" that rephrase source materials while overlooking "cognitive statements" that involve inferences drawn from the given context. Consequently, evaluating and detecting hallucinations in cognitive statements remains challenging. Inspired by how evidence is assessed in the legal domain, we design a rigorous framework to assess different levels of faithfulness of cognitive statements and introduce the CogniBench dataset, from which we derive insightful statistics. To keep pace with rapidly evolving LLMs, we further develop an automatic annotation pipeline that scales easily across different models. This results in the large-scale CogniBench-L dataset, which facilitates training accurate detectors for both factual and cognitive hallucinations. We release our model and datasets at: this https URL
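
The factual/cognitive distinction in the abstract can be made concrete with a small example. The sketch below checks whether a claim is entailed by its source context using an off-the-shelf NLI model; it is not the authors' CogniBench detector, and the model choice (roberta-large-mnli), the example context, and the entailment-only criterion are illustrative assumptions. A factual restatement of the context is typically entailed, while a cognitive inference that goes beyond it often is not, which is exactly the gap the paper's finer-grained faithfulness levels target.

# Minimal sketch: flagging claims unsupported by a source context with an
# off-the-shelf NLI model. NOT the CogniBench detector; the model name,
# example texts, and entailment criterion are illustrative assumptions.
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

context = ("The company reported revenue of $2.1B in Q4, "
           "up from $1.8B a year earlier.")

claims = {
    "factual statement":   "Q4 revenue was $2.1B.",           # rephrases the context
    "cognitive statement": "The company will keep growing.",  # inference beyond it
}

for kind, claim in claims.items():
    # Treat the context as the NLI premise and the claim as the hypothesis;
    # anything the model does not label ENTAILMENT is flagged for review.
    result = nli([{"text": context, "text_pair": claim}])[0]
    verdict = "faithful" if result["label"] == "ENTAILMENT" else "needs review"
    print(f"{kind}: {result['label']} -> {verdict}")

A detector trained on CogniBench-L would replace this generic entailment check; the point here is only that rephrasings and inferences behave differently under context-grounded verification.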

@article{tang2025_2505.20767,
  title={CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models},
  author={Xiaqiang Tang and Jian Li and Keyu Hu and Du Nan and Xiaolong Li and Xi Zhang and Weigao Sun and Sihong Xie},
  journal={arXiv preprint arXiv:2505.20767},
  year={2025}
}