Generating Benchmarks for Factuality Evaluation of Language Models

13 July 2023
Dor Muhlgay
Ori Ram
Inbal Magar
Yoav Levine
Nir Ratner
Yonatan Belinkov
Omri Abend
Kevin Leyton-Brown
Amnon Shashua
Y. Shoham
    HILM
arXiv: 2307.06908

Papers citing "Generating Benchmarks for Factuality Evaluation of Language Models"

21 / 21 papers shown
Efficient Shapley Value-based Non-Uniform Pruning of Large Language Models
Chuan Sun
Han Yu
Lizhen Cui
Xiaoxiao Li
108
0
0
03 May 2025
Steering off Course: Reliability Challenges in Steering Language Models
Patrick Queiroz Da Silva
Hari Sethuraman
Dheeraj Rajagopal
Hannaneh Hajishirzi
Sachin Kumar
LLMSV
31
1
0
06 Apr 2025
OAEI-LLM-T: A TBox Benchmark Dataset for Understanding Large Language Model Hallucinations in Ontology Matching
Zhangcheng Qiang
Kerry Taylor
Weiqing Wang
Jing Jiang
52
0
0
25 Mar 2025
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
Ziwei Ji
L. Yu
Yeskendir Koishekenov
Yejin Bang
Anthony Hartshorn
Alan Schelten
Cheng Zhang
Pascale Fung
Nicola Cancedda
53
1
0
18 Mar 2025
HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models
Xinyan Jiang
Hang Ye
Yongxin Zhu
Xiaoying Zheng
Zikang Chen
Jun Gong
49
0
0
17 Mar 2025
OnionEval: An Unified Evaluation of Fact-conflicting Hallucination for Small-Large Language Models
Chongren Sun
Y. Li
Di Wu
Benoit Boulet
HILM
LRM
88
1
0
22 Jan 2025
From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization
Catarina G. Belem
Pouya Pezeshkpour
Hayate Iso
Seiji Maekawa
Nikita Bhutani
Estevam R. Hruschka
HILM
73
1
0
17 Oct 2024
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
Yuzhe Gu
Ziwei Ji
Wenwei Zhang
Chengqi Lyu
Dahua Lin
Kai Chen
HILM
39
5
0
05 Jul 2024
Refiner: Restructure Retrieval Content Efficiently to Advance Question-Answering Capabilities
Zhonghao Li
Xuming Hu
Aiwei Liu
Kening Zheng
S. Huang
Hui Xiong
RALM
115
8
0
17 Jun 2024
REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy
Haw-Shiuan Chang
Nanyun Peng
Mohit Bansal
Anil Ramakrishna
Tagyoung Chung
HILM
42
2
0
11 Jun 2024
DR-RAG: Applying Dynamic Document Relevance to Retrieval-Augmented Generation for Question-Answering
Zijian Hei
Weiling Liu
Wenjie Ou
Juyi Qiao
Junming Jiao
Guowen Song
Ting Tian
Yi Lin
RALM
46
5
0
11 Jun 2024
Large Language Models Meet NLP: A Survey
Libo Qin
Qiguang Chen
Xiachong Feng
Yang Wu
Yongheng Zhang
Hai-Tao Zheng
Min Li
Wanxiang Che
Philip S. Yu
ALM
LM&MA
ELM
LRM
52
47
0
21 May 2024
Fast Adversarial Attacks on Language Models In One GPU Minute
Vinu Sankar Sadasivan
Shoumik Saha
Gaurang Sriramanan
Priyatham Kattakinda
Atoosa Malemir Chegini
S. Feizi
MIALM
43
34
0
23 Feb 2024
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang
Yafu Li
Leyang Cui
Deng Cai
Lemao Liu
...
Longyue Wang
A. Luu
Wei Bi
Freda Shi
Shuming Shi
RALM
LRM
HILM
46
522
0
03 Sep 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
339
12,003
0
04 Mar 2022
The Factual Inconsistency Problem in Abstractive Text Summarization: A Survey
Yi-Chong Huang
Xiachong Feng
Xiaocheng Feng
Bing Qin
HILM
136
105
0
30 Apr 2021
Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics
Artidoro Pagnoni
Vidhisha Balachandran
Yulia Tsvetkov
HILM
231
305
0
27 Apr 2021
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu Liu
Yizhe Zhang
Chris Brockett
Yi Mao
Zhifang Sui
Weizhu Chen
W. Dolan
HILM
228
144
0
18 Apr 2021
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
269
346
0
01 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
279
1,996
0
31 Dec 2020
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
419
2,588
0
03 Sep 2019