$Q^{2}$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering

arXiv: 2104.08202

16 April 2021
Or Honovich
Leshem Choshen
Roee Aharoni
Ella Neeman
Idan Szpektor
Omri Abend
    HILM

Papers citing "$Q^{2}$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering"

50 / 112 papers shown
TWIZ-v2: The Wizard of Multimodal Conversational-Stimulus
Rafael Ferreira
Diogo Tavares
Diogo Glória-Silva
Rodrigo Valerio
João Bordalo
Ines Simoes
Vasco Ramos
David Semedo
João Magalhães
24
4
0
03 Oct 2023
A Novel Computational and Modeling Foundation for Automatic Coherence Assessment
Aviya Maimon
Reut Tsarfaty
80
5
0
01 Oct 2023
Efficient Benchmarking of Language Models
Yotam Perlitz
Elron Bandel
Ariel Gera
Ofir Arviv
L. Ein-Dor
Eyal Shnarch
Noam Slonim
Michal Shmueli-Scheuer
Leshem Choshen
ALM
24
24
0
22 Aug 2023
Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering
Vaibhav Adlakha
Parishad BehnamGhader
Xing Han Lù
Nicholas Meade
Siva Reddy
25
120
0
31 Jul 2023
A Dialogue System for Assessing Activities of Daily Living: Improving Consistency with Grounded Knowledge
Zhecheng Sheng
Raymond L. Finzel
M. Lucke
Sheena Dufresne
Maria Gini
Serguei V. S. Pakhomov
32
0
0
15 Jul 2023
Neural models for Factual Inconsistency Classification with Explanations
Tathagata Raha
Mukund Choudhary
Abhinav Menon
Harshit Gupta
KV Aditya Srivatsa
Manish Gupta
Vasudeva Varma
19
3
0
15 Jun 2023
With a Little Push, NLI Models can Robustly and Efficiently Predict Faithfulness
Julius Steen
Juri Opitz
Anette Frank
K. Markert
HILM
25
9
0
26 May 2023
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
HILM
21
180
0
26 May 2023
The Dangers of trusting Stochastic Parrots: Faithfulness and Trust in Open-domain Conversational Question Answering
Sabrina Chiesurin
Dimitris Dimakopoulos
Marco Antonio Sobrevilla Cabezudo
Arash Eshghi
Ioannis V. Papaioannou
Verena Rieser
Ioannis Konstas
HILM
27
25
0
25 May 2023
Transferring Visual Attributes from Natural Language to Verified Image Generation
Rodrigo Valerio
João Bordalo
Michal Yarom
Yonatan Bitton
Idan Szpektor
João Magalhães
29
5
0
24 May 2023
MuLER: Detailed and Scalable Reference-based Evaluation
Taelin Karidi
Leshem Choshen
Gal Patel
Omri Abend
40
0
0
24 May 2023
AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient Content
Shuyang Cao
Lu Wang
30
5
0
24 May 2023
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Sina J. Semnani
Violet Z. Yao
He Zhang
M. Lam
KELM
AI4MH
30
72
0
23 May 2023
LM vs LM: Detecting Factual Errors via Cross Examination
Roi Cohen
May Hamri
Mor Geva
Amir Globerson
HILM
41
120
0
22 May 2023
SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization Evaluation
Elizabeth Clark
Shruti Rijhwani
Sebastian Gehrmann
Joshua Maynez
Roee Aharoni
Vitaly Nikolaev
Thibault Sellam
Aditya Siddhant
Dipanjan Das
Ankur P. Parikh
32
38
0
22 May 2023
Pointwise Mutual Information Based Metric and Decoding Strategy for Faithful Generation in Document Grounded Dialogs
Yatin Nandwani
Vineet Kumar
Dinesh Raghu
Sachindra Joshi
Luis Lastras
30
6
0
20 May 2023
TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models
Zorik Gekhman
Jonathan Herzig
Roee Aharoni
Chen Elkind
Idan Szpektor
HILM
ELM
29
71
0
18 May 2023
What You See is What You Read? Improving Text-Image Alignment Evaluation
Michal Yarom
Yonatan Bitton
Soravit Changpinyo
Roee Aharoni
Jonathan Herzig
Oran Lang
E. Ofek
Idan Szpektor
EGVM
59
74
0
17 May 2023
ZARA: Improving Few-Shot Self-Rationalization for Small Language Models
Wei-Lin Chen
An-Zi Yen
Cheng-Kuang Wu
Hen-Hsen Huang
Hsin-Hsi Chen
ReLM
LRM
24
10
0
12 May 2023
q2d: Turning Questions into Dialogs to Teach Models How to Search
Yonatan Bitton
Shlomi Cohen-Ganor
Ido Hakimi
Yoad Lewenberg
Roee Aharoni
Enav Weinreb
51
4
0
27 Apr 2023
Task-oriented Document-Grounded Dialog Systems by HLTPR@RWTH for DSTC9 and DSTC10
David Thulke
Nico Daheim
Christian Dugast
Hermann Ney
38
6
0
14 Apr 2023
Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images
Nitzan Bitton-Guetta
Yonatan Bitton
Jack Hessel
Ludwig Schmidt
Yuval Elovici
Gabriel Stanovsky
Roy Schwartz
VLM
121
66
0
13 Mar 2023
WiCE: Real-World Entailment for Claims in Wikipedia
Ryo Kamoi
Tanya Goyal
Juan Diego Rodriguez
Greg Durrett
41
81
0
02 Mar 2023
"Why is this misleading?": Detecting News Headline Hallucinations with Explanations
Jiaming Shen
Jialu Liu
Daniel Finnie
N. Rahmati
Michael Bendersky
Marc Najork
30
19
0
12 Feb 2023
Opportunities and Challenges in Neural Dialog Tutoring
Jakub Macina
Nico Daheim
Lingzhi Wang
Tanmay Sinha
Manu Kapur
Iryna Gurevych
Mrinmaya Sachan
24
26
0
24 Jan 2023
Poor Man's Quality Estimation: Predicting Reference-Based MT Metrics Without the Reference
Vilém Zouhar
S. Dhuliawala
Wangchunshu Zhou
Nico Daheim
Tom Kocmi
Yuchen Eleanor Jiang
Mrinmaya Sachan
18
9
0
21 Jan 2023
mFACE: Multilingual Summarization with Factual Consistency Evaluation
Roee Aharoni
Shashi Narayan
Joshua Maynez
Jonathan Herzig
Elizabeth Clark
Mirella Lapata
HILM
27
44
0
20 Dec 2022
WeCheck: Strong Factual Consistency Checker via Weakly Supervised Learning
Wenhao Wu
Wei Li
Xinyan Xiao
Jiachen Liu
Sujian Li
Yajuan Lv
HILM
28
4
0
20 Dec 2022
BUMP: A Benchmark of Unfaithful Minimal Pairs for Meta-Evaluation of Faithfulness Metrics
Liang Ma
Shuyang Cao
Robert L. Logan IV
Di Lu
Shihao Ran
Kecheng Zhang
Joel R. Tetreault
A. Jaimes
17
6
0
20 Dec 2022
Multilingual Sequence-to-Sequence Models for Hebrew NLP
Matan Eyal
Hila Noga
Roee Aharoni
Idan Szpektor
Reut Tsarfaty
36
4
0
19 Dec 2022
RISE: Leveraging Retrieval Techniques for Summarization Evaluation
David C. Uthus
Jianmo Ni
RALM
16
0
0
17 Dec 2022
Harnessing Knowledge and Reasoning for Human-Like Natural Language Generation: A Brief Review
Jiangjie Chen
Yanghua Xiao
44
4
0
07 Dec 2022
Converge to the Truth: Factual Error Correction via Iterative Constrained Editing
Jiangjie Chen
Rui Xu
Wenyuan Zeng
Changzhi Sun
Lei Li
Yanghua Xiao
KELM
43
9
0
22 Nov 2022
Controllable Factuality in Document-Grounded Dialog Systems Using a Noisy Channel Model
Nico Daheim
David Thulke
Christian Dugast
Hermann Ney
HILM
19
4
0
31 Oct 2022
Controlled Text Reduction
Aviv Slobodkin
Paul Roit
Eran Hirsch
Ori Ernst
Ido Dagan
47
10
0
24 Oct 2022
On the Limitations of Reference-Free Evaluations of Generated Text
Daniel Deutsch
Rotem Dror
Dan Roth
40
45
0
22 Oct 2022
Social Biases in Automatic Evaluation Metrics for NLG
Mingqi Gao
Xiaojun Wan
30
3
0
17 Oct 2022
RARR: Researching and Revising What Language Models Say, Using Language Models
Luyu Gao
Zhuyun Dai
Panupong Pasupat
Anthony Chen
Arun Tejasvi Chaganty
...
Vincent Zhao
Ni Lao
Hongrae Lee
Da-Cheng Juan
Kelvin Guu
HILM
KELM
41
257
0
17 Oct 2022
Retrieval Augmentation for T5 Re-ranker using External Sources
Kai Hui
Tao Chen
Zhen Qin
Honglei Zhuang
Fernando Diaz
Michael Bendersky
Donald Metzler
RALM
LRM
28
1
0
11 Oct 2022
Hierarchical3D Adapters for Long Video-to-text Summarization
Pinelopi Papalampidi
Mirella Lapata
VGen
31
12
0
10 Oct 2022
MaXM: Towards Multilingual Visual Question Answering
Soravit Changpinyo
Linting Xue
Michal Yarom
Ashish V. Thapliyal
Idan Szpektor
J. Amelot
Xi Chen
Radu Soricut
33
8
0
12 Sep 2022
SMART: Sentences as Basic Units for Text Evaluation
Reinald Kim Amplayo
Peter J. Liu
Yao-Min Zhao
Shashi Narayan
32
21
0
01 Aug 2022
QA Is the New KR: Question-Answer Pairs as Knowledge Bases
Wenhu Chen
William W. Cohen
Michiel de Jong
Nitish Gupta
Alessandro Presta
Pat Verga
John Wieting
27
7
0
01 Jul 2022
Conditional Generation with a Question-Answering Blueprint
Shashi Narayan
Joshua Maynez
Reinald Kim Amplayo
Kuzman Ganchev
Annie Louis
Fantine Huot
Anders Sandholm
Dipanjan Das
Mirella Lapata
61
47
0
01 Jul 2022
Counterfactual Data Augmentation improves Factuality of Abstractive Summarization
Dheeraj Rajagopal
Siamak Shakeri
Cicero Nogueira dos Santos
Eduard H. Hovy
Chung-Ching Chang
HILM
74
10
0
25 May 2022
All You May Need for VQA are Image Captions
Soravit Changpinyo
Doron Kukliansky
Idan Szpektor
Xi Chen
Nan Ding
Radu Soricut
32
70
0
04 May 2022
FaithDial: A Faithful Benchmark for Information-Seeking Dialogue
Nouha Dziri
Ehsan Kamalloo
Sivan Milton
Osmar Zaiane
Mo Yu
E. Ponti
Siva Reddy
HILM
29
87
0
22 Apr 2022
Stretching Sentence-pair NLI Models to Reason over Long Documents and Clusters
Tal Schuster
Sihao Chen
S. Buthpitiya
Alex Fabrikant
Donald Metzler
26
41
0
15 Apr 2022
ASQA: Factoid Questions Meet Long-Form Answers
Ivan Stelmakh
Yi Luan
Bhuwan Dhingra
Ming-Wei Chang
29
160
0
12 Apr 2022
TRUE: Re-evaluating Factual Consistency Evaluation
Or Honovich
Roee Aharoni
Jonathan Herzig
Hagai Taitelbaum
Doron Kukliansky
Vered Cohen
Thomas Scialom
Idan Szpektor
Avinatan Hassidim
Yossi Matias
HILM
35
3
0
11 Apr 2022