ResearchTrend.AI
arXiv: 2401.00396
RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models
31 December 2023
Cheng Niu
Yuanhao Wu
Juno Zhu
Siliang Xu
Kashun Shum
Randy Zhong
Juntong Song
Tong Zhang
    HILM

Papers citing "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"

31 / 31 papers shown
RePCS: Diagnosing Data Memorization in LLM-Powered Retrieval-Augmented Generation
Le Vu Anh
Nguyen Viet Anh
Mehmet Dik
Luong Van Nghia
43
0
0
18 Jun 2025
GaRAGe: A Benchmark with Grounding Annotations for RAG Evaluation
Ionut Teodor Sorodoc
Leonardo F. R. Ribeiro
Rexhina Blloshmi
Christopher Davis
Adria de Gispert
21
0
0
09 Jun 2025
Beyond Facts: Evaluating Intent Hallucination in Large Language Models
Yijie Hao
Haofei Yu
Jiaxuan You
HILM LRM
40
0
0
06 Jun 2025
When to Trust Context: Self-Reflective Debates for Context Reliability
Zeqi Zhou
Fang Wu
Shayan Talaei
Haokai Zhao
Cheng Meixin
Tinson Xu
Amin Saberi
Yejin Choi
HILM
63
0
0
06 Jun 2025
CLATTER: Comprehensive Entailment Reasoning for Hallucination Detection
Ron Eliav
Arie Cattan
Eran Hirsch
Shahaf Bassan
Elias Stengel-Eskin
Mohit Bansal
Ido Dagan
LRM
92
0
0
05 Jun 2025
Data-efficient Meta-models for Evaluation of Context-based Questions and Answers in LLMs
Julia Belikova
Konstantin Polev
Rauf Parchiev
Dmitry Simakov
58
0
0
29 May 2025
Retrieval-Augmented Generation: A Comprehensive Survey of Architectures, Enhancements, and Robustness Frontiers
Chaitanya Sharma
RALM 3DV
44
0
0
28 May 2025
Faithfulness-Aware Uncertainty Quantification for Fact-Checking the Output of Retrieval Augmented Generation
Ekaterina Fadeeva
Aleksandr Rubashevskii
Roman Vashurin
Shehzaad Dhuliawala
Artem Shelmanov
Timothy Baldwin
Preslav Nakov
Mrinmaya Sachan
Maxim Panov
HILM
77
0
0
27 May 2025
CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models
Xiaqiang Tang
Jian Li
Keyu Hu
Du Nan
Xiaolong Li
Xi Zhang
Weigao Sun
Sihong Xie
HILM
55
0
0
27 May 2025
EcoSafeRAG: Efficient Security through Context Analysis in Retrieval-Augmented Generation
Ruobing Yao
Yifei Zhang
Shuang Song
Neng Gao
Chenyang Tu
SILM
80
1
0
16 May 2025
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards
Manveer Singh Tamber
F. S. Bao
Chenyu Xu
Ge Luo
Suleman Kazi
Minseok Bae
Miaoran Li
Ofer Mendelevitch
Renyi Qu
Jimmy J. Lin
VLM
76
1
0
07 May 2025
FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents
Nandan Thakur
Jimmy J. Lin
Sam Havens
Michael Carbin
Omar Khattab
Andrew Drozdov
130
5
0
17 Apr 2025
SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared Task on Hallucinations and Related Observable Overgeneration Mistakes
Raúl Vázquez
Timothee Mickus
Elaine Zosa
Teemu Vahtola
Jörg Tiedemann
...
Liane Guillou
Ona de Gibert
Jaione Bengoetxea
Joseph Attieh
Marianna Apidianaki
HILM VLM LRM
168
1
0
16 Apr 2025
DioR: Adaptive Cognitive Detection and Contextual Retrieval Optimization for Dynamic Retrieval-Augmented Generation
Hanghui Guo
Jia Zhu
Shimin Di
Weijie Shi
Zhangze Chen
Jiajie Xu
120
0
0
14 Apr 2025
LRAGE: Legal Retrieval Augmented Generation Evaluation Tool
Minhu Park
Hongseok Oh
Eunkyung Choi
Wonseok Hwang
AILaw RALM ELM
177
0
0
02 Apr 2025
Human Cognition Inspired RAG with Knowledge Graph for Complex Problem Solving
Yao Cheng
Yibo Zhao
Jiapeng Zhu
Yunxing Liu
Xingwu Sun
Xiang Li
RALM ReLM
117
0
0
09 Mar 2025
LettuceDetect: A Hallucination Detection Framework for RAG Applications
Adam Kovacs
Gábor Recski
69
5
0
24 Feb 2025
Evaluating Step-by-step Reasoning Traces: A Survey
Jinu Lee
Julia Hockenmaier
LRM ELM
164
2
0
17 Feb 2025
Verify with Caution: The Pitfalls of Relying on Imperfect Factuality Metrics
Ameya Godbole
Robin Jia
HILM
145
2
0
24 Jan 2025
ACORD: An Expert-Annotated Retrieval Dataset for Legal Contract Drafting
Steven H. Wang
Maksim Zubkov
Kexin Fan
Sarah Harrell
Yuyang Sun
Wei Chen
Andreas Plesner
Roger Wattenhofer
AILaw
135
2
0
11 Jan 2025
Improving Model Factuality with Fine-grained Critique-based Evaluator
Yiqing Xie
Wenxuan Zhou
Pradyot Prakash
Di Jin
Yuning Mao
...
Sinong Wang
Han Fang
Carolyn Rose
Daniel Fried
Hejia Zhang
HILM
170
8
0
24 Oct 2024
FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs
F. S. Bao
Miaoran Li
Renyi Qu
Ge Luo
Erana Wan
...
Ruixuan Tu
Chenyu Xu
Matthew Gonzales
Ofer Mendelevitch
Amin Ahmad
VLM HILM
93
7
0
17 Oct 2024
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems
Nandan Thakur
Suleman Kazi
Ge Luo
Jimmy J. Lin
Amin Ahmad
VLM RALM
210
7
0
17 Oct 2024
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
Zhongxiang Sun
Xiaoxue Zang
Kai Zheng
Yang Song
Jun Xu
Xiao Zhang
Weijie Yu
Yang Song
Han Li
140
17
0
15 Oct 2024
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
175
1
0
09 Oct 2024
TLDR: Token-Level Detective Reward Model for Large Vision Language Models
Deqing Fu
Tong Xiao
Rui Wang
Wang Zhu
Pengchuan Zhang
Guan Pang
Robin Jia
Lawrence Chen
166
7
0
07 Oct 2024
Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval
Pengcheng Jiang
Cao Xiao
Minhao Jiang
Parminder Bhatia
Taha A. Kass-Hout
Jimeng Sun
Jiawei Han
RALM AI4MH
199
7
0
06 Oct 2024
LRP4RAG: Detecting Hallucinations in Retrieval-Augmented Generation via Layer-wise Relevance Propagation
Haichuan Hu
Yuhan Sun
Xiaochen Xie
Quanjun Zhang
111
6
0
28 Aug 2024
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework
Kunlun Zhu
Yifan Luo
Dingling Xu
Ruobing Wang
Shi Yu
...
Yishan Li
Zhiyuan Liu
Xu Han
Zhiyuan Liu
Maosong Sun
223
21
0
02 Aug 2024
R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models
Shangqing Tu
Yuanchun Wang
Jifan Yu
Yuyang Xie
Yaran Shi
Xiaozhi Wang
Jing Zhang
Lei Hou
Juanzi Li
ELM
100
4
0
17 Jun 2024
Mafin: Enhancing Black-Box Embeddings with Model Augmented Fine-Tuning
Mingtian Zhang
Shawn Lan
Peter Hayes
David Barber
130
3
0
19 Feb 2024