2305.16739
Cited By
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function
26 May 2023
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
HILM
Papers citing
"AlignScore: Evaluating Factual Consistency with a Unified Alignment Function"
45 / 45 papers shown
Title
Reranking-based Generation for Unbiased Perspective Summarization
Narutatsu Ri
Nicholas Deas
Kathleen McKeown
OffRL
12
0
0
19 Jun 2025
Re-Initialization Token Learning for Tool-Augmented Large Language Models
Chenghao Li
Liu Liu
B. Yu
Jiayan Qiu
Yibing Zhan
LLMAG
CLL
KELM
38
0
0
17 Jun 2025
CLATTER: Comprehensive Entailment Reasoning for Hallucination Detection
Ron Eliav
Arie Cattan
Eran Hirsch
Shahaf Bassan
Elias Stengel-Eskin
Mohit Bansal
Ido Dagan
LRM
83
0
0
05 Jun 2025
A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization
Sarvesh Soni
Dina Demner-Fushman
118
3
0
04 Jun 2025
QQSUM: A Novel Task and Model of Quantitative Query-Focused Summarization for Review-based Product Question Answering
A. Tang
Xiuzhen Zhang
M. Dinh
Zhuang Li
RALM
62
0
0
04 Jun 2025
Shaking to Reveal: Perturbation-Based Detection of LLM Hallucinations
Jinyuan Luo
Zhen Fang
Yixuan Li
Seongheon Park
Ling Chen
AAML
HILM
54
0
0
03 Jun 2025
Towards Multi-dimensional Evaluation of LLM Summarization across Domains and Languages
Hyangsuk Min
Yuho Lee
Minjeong Ban
Jiaqi Deng
Nicole Hee-Yeon Kim
Taewon Yun
Hang Su
Jason (Jinglun) Cai
Hwanjun Song
ELM
25
0
0
31 May 2025
VeriTrail: Closed-Domain Hallucination Detection with Traceability
Dasha Metropolitansky
Jonathan Larson
HILM
56
0
0
27 May 2025
Faithfulness-Aware Uncertainty Quantification for Fact-Checking the Output of Retrieval Augmented Generation
Ekaterina Fadeeva
Aleksandr Rubashevskii
Roman Vashurin
Shehzaad Dhuliawala
Artem Shelmanov
Timothy Baldwin
Preslav Nakov
Mrinmaya Sachan
Maxim Panov
HILM
69
0
0
27 May 2025
Retrieval Augmented Generation-based Large Language Models for Bridging Transportation Cybersecurity Legal Knowledge Gaps
Khandakar Ashrafi Akbar
Md Nahiyan Uddin
Latifur Khan
Trayce Hockstad
Mizanur Rahman
M. Chowdhury
B. Thuraisingham
AILaw
RALM
242
0
0
23 May 2025
Long-Form Information Alignment Evaluation Beyond Atomic Facts
Danna Zheng
Mirella Lapata
Jeff Z. Pan
HILM
70
0
0
21 May 2025
LEXam: Benchmarking Legal Reasoning on 340 Law Exams
Yu Fan
Jingwei Ni
Jakob Merane
Etienne Salimbeni
Yang Tian
...
Mrinmaya Sachan
Alexander Stremitzer
Christoph Engel
Elliott Ash
Joel Niklaus
AILaw
ELM
126
0
0
19 May 2025
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards
Manveer Singh Tamber
F. S. Bao
Chenyu Xu
Ge Luo
Suleman Kazi
Minseok Bae
Miaoran Li
Ofer Mendelevitch
Renyi Qu
Jimmy J. Lin
VLM
68
1
0
07 May 2025
RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation
Aviv Slobodkin
Hagai Taitelbaum
Yonatan Bitton
Brian Gordon
Michal Sokolik
Nitzan Bitton-Guetta
Almog Gueta
Royi Rassin
Itay Laish
Dani Lischinski
EGVM
VGen
101
0
0
24 Apr 2025
Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results
Andrea Santilli
Adam Goliński
Michael Kirchhof
Federico Danieli
Arno Blaas
Miao Xiong
Luca Zappella
Sinead Williamson
67
3
0
18 Apr 2025
Unequal Opportunities: Examining the Bias in Geographical Recommendations by Large Language Models
Shiran Dudy
Thulasi Tholeti
R. Ramachandranpillai
Muhammad Ali
Toby Jia-Jun Li
Ricardo Baeza-Yates
115
1
0
16 Mar 2025
Leveraging Retrieval Augmented Generative LLMs For Automated Metadata Description Generation to Enhance Data Catalogs
Mayank Singh
Abhijeet Kumar
Sasidhar Donaparthi
Gayatri Karambelkar
119
0
0
12 Mar 2025
GraphCheck: Breaking Long-Term Text Barriers with Extracted Knowledge Graph-Powered Fact-Checking
Yingjian Chen
Haoran Liu
Yinhong Liu
Rui Yang
Han Yuan
...
Pengyuan Zhou
Qingyu Chen
James Caverlee
Irene Li
HILM
133
0
0
23 Feb 2025
Position: Beyond Assistance - Reimagining LLMs as Ethical and Adaptive Co-Creators in Mental Health Care
Abeer Badawi
Md Tahmid Rahman Laskar
J. Huang
Shaina Raza
Elham Dolatabadi
AI4MH
57
0
0
21 Feb 2025
Investigating the Impact of Quantization Methods on the Safety and Reliability of Large Language Models
Artyom Kharinaev
Viktor Moskvoretskii
Egor Shvetsov
Kseniia Studenikina
Bykov Mikhail
Evgeny Burnaev
MQ
96
0
0
18 Feb 2025
Evaluating Step-by-step Reasoning Traces: A Survey
Jinu Lee
Julia Hockenmaier
LRM
ELM
153
2
0
17 Feb 2025
MIH-TCCT: Mitigating Inconsistent Hallucinations in LLMs via Event-Driven Text-Code Cyclic Training
Xinxin You
Xien Liu
Qixin Sun
Huan Zhang
Kaiyin Zhou
Shaohui Liu
Guoping Hu
Shijin Wang
Si Liu
Ji Wu
182
0
0
13 Feb 2025
Context-Aware Hierarchical Merging for Long Document Summarization
Litu Ou
Mirella Lapata
MoMe
535
1
0
03 Feb 2025
Beyond correlation: The Impact of Human Uncertainty in Measuring the Effectiveness of Automatic Evaluation and LLM-as-a-Judge
Aparna Elangovan
Jongwoo Ko
Lei Xu
Mahsa Elyasi
Ling Liu
S. Bodapati
Dan Roth
125
6
0
28 Jan 2025
RELexED: Retrieval-Enhanced Legal Summarization with Exemplar Diversity
T. Y. S. S. Santosh
Chen Jia
Patrick Goroncy
Matthias Grabmair
AILaw
98
1
0
23 Jan 2025
Lived Experience Not Found: LLMs Struggle to Align with Experts on Addressing Adverse Drug Reactions from Psychiatric Medication Use
Mohit Chandra
Siddharth Sriraman
Gaurav Verma
Harneet Singh Khanuja
Jose Suarez Campayo
Zihang Li
Michael L. Birnbaum
M. D. Choudhury
AI4MH
109
7
0
08 Jan 2025
Finer: Investigating and Enhancing Fine-Grained Visual Concept Recognition in Large Vision Language Models
Jeonghwan Kim
Heng Ji
MLLM
106
2
0
08 Jan 2025
SummExecEdit: A Factual Consistency Benchmark in Summarization with Executable Edits
Onkar Thorat
Philippe Laban
Chien-Sheng Wu
HILM
152
1
0
17 Dec 2024
Retrieval-Augmented Generation with Estimation of Source Reliability
Jeongyeon Hwang
Junyoung Park
Hyejin Park
Dongwoo Kim
Sangdon Park
Jungseul Ok
RALM
98
1
0
30 Oct 2024
FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs
F. S. Bao
Miaoran Li
Renyi Qu
Ge Luo
Erana Wan
...
Ruixuan Tu
Chenyu Xu
Matthew Gonzales
Ofer Mendelevitch
Amin Ahmad
VLM
HILM
89
7
0
17 Oct 2024
A Little Human Data Goes A Long Way
Dhananjay Ashok
Jonathan May
SyDa
120
4
0
17 Oct 2024
Decomposition Dilemmas: Does Claim Decomposition Boost or Burden Fact-Checking Performance?
Qisheng Hu
Quanyu Long
Wenya Wang
405
9
0
17 Oct 2024
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge
Han Wang
Archiki Prasad
Elias Stengel-Eskin
Joey Tianyi Zhou
161
11
0
11 Sep 2024
Claim Verification in the Age of Large Language Models: A Survey
A. Dmonte
Roland Oruche
Marcos Zampieri
Prasad Calyam
Isabelle Augenstein
181
11
0
26 Aug 2024
STORYSUMM: Evaluating Faithfulness in Story Summarization
Melanie Subbiah
Faisal Ladhak
Akankshya Mishra
Griffin Adams
Lydia B. Chilton
Kathleen McKeown
136
4
0
09 Jul 2024
Learning to Refine with Fine-Grained Natural Language Feedback
Manya Wadhwa
Xinyu Zhao
Junyi Jessy Li
Greg Durrett
92
16
0
02 Jul 2024
PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection
Jooyoung Lee
Toshini Agrawal
Adaku Uchendu
Thai V. Le
Jinghui Chen
Dongwon Lee
183
1
0
24 Jun 2024
Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph
Roman Vashurin
Ekaterina Fadeeva
Artem Vazhentsev
Akim Tsvigun
Daniil Vasilev
...
Timothy Baldwin
Preslav Nakov
Maxim Panov
Artem Shelmanov
HILM
182
28
0
21 Jun 2024
FoRAG: Factuality-optimized Retrieval Augmented Generation for Web-enhanced Long-form Question Answering
Tianchi Cai
Zhiwen Tan
Xierui Song
Tao Sun
Jiyan Jiang
Yunqi Xu
Yinger Zhang
Jinjie Gu
80
7
0
19 Jun 2024
Detecting Response Generation Not Requiring Factual Judgment
Ryohei Kamei
Daiki Shiono
Reina Akama
Jun Suzuki
HILM
57
0
0
14 Jun 2024
WisPerMed at "Discharge Me!": Advancing Text Generation in Healthcare with Large Language Models, Dynamic Expert Selection, and Priming Techniques on MIMIC-IV
Hendrik Damm
T. M. G. Pakull
Bahadir Eryilmaz
Helmut Becker
Ahmad Idrissi-Yaghir
Henning Schafer
Sergej Schultenkämper
Christoph M. Friedrich
69
3
0
18 May 2024
A Survey of Automatic Hallucination Evaluation on Natural Language Generation
Siya Qi
Yulan He
Zheng Yuan
LRM
HILM
99
1
0
18 Apr 2024
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
Neeloy Chakraborty
Melkior Ornik
Katherine Driggs-Campbell
LRM
241
12
0
25 Mar 2024
LLMs with Industrial Lens: Deciphering the Challenges and Prospects -- A Survey
Ashok Urlana
Charaka Vinayak Kumar
Ajeet Kumar Singh
B. Garlapati
S. Chalamala
Rahul Mishra
122
8
0
22 Feb 2024
ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference
Tianchi Cai
Xierui Song
Jiyan Jiang
Fei Teng
Jinjie Gu
Guannan Zhang
ALM
80
5
0
05 Dec 2023