1904.03371
Evaluating Coherence in Dialogue Systems using Entailment
6 April 2019
Nouha Dziri
Ehsan Kamalloo
K. Mathewson
Osmar Zaiane
Papers citing
"Evaluating Coherence in Dialogue Systems using Entailment"
26 / 26 papers shown
Title
Consistency in Language Models: Current Landscape, Challenges, and Future Directions
Jekaterina Novikova
Carol Anderson
Borhane Blili-Hamelin
Subhabrata Majumdar
HILM
73
0
0
01 May 2025
Evaluating Evaluation Metrics -- The Mirage of Hallucination Detection
Atharva Kulkarni
Yuan-kang Zhang
Joel Ruben Antony Moniz
Xiou Ge
Bo-Hsiang Tseng
Dhivya Piraviperumal
Shri Kiran Srinivasan
Hong-ye Yu
HILM
86
0
0
25 Apr 2025
PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations
Ruosen Li
Teerth Patel
Xinya Du
LLMAG
ALM
73
97
0
03 Jan 2025
Cohesive Conversations: Enhancing Authenticity in Multi-Agent Simulated Dialogues
Kuanchao Chu
Yi-Pei Chen
Hideki Nakayama
LLMAG
44
2
0
13 Jul 2024
ConvoCache: Smart Re-Use of Chatbot Responses
Conor Atkins
Ian D. Wood
M. Kâafar
Hassan Jameel Asghar
Nardine Basta
Michal Kepkowski
48
0
0
26 Jun 2024
CHARP: Conversation History AwaReness Probing for Knowledge-grounded Dialogue Systems
Abbas Ghaddar
David Alfonso-Hermelo
Philippe Langlais
Mehdi Rezagholizadeh
Boxing Chen
Prasanna Parthasarathi
41
0
0
24 May 2024
Efficient LLM Comparative Assessment: a Product of Experts Framework for Pairwise Comparisons
Adian Liusie
Vatsal Raina
Yassir Fathullah
Mark Gales
43
10
0
09 May 2024
Inconsistent dialogue responses and how to recover from them
Mian Zhang
Lifeng Jin
Linfeng Song
Haitao Mi
Dong Yu
39
1
0
18 Jan 2024
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
Yang Liu
Dan Iter
Yichong Xu
Shuohang Wang
Ruochen Xu
Chenguang Zhu
ELM
ALM
LM&MA
83
1,090
0
29 Mar 2023
PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment
Chen Zhang
L. F. D’Haro
Qiquan Zhang
Thomas Friedrichs
Haizhou Li
33
7
0
18 Dec 2022
Towards a Unified Multi-Dimensional Evaluator for Text Generation
Ming Zhong
Yang Liu
Da Yin
Yuning Mao
Yizhu Jiao
Peng Liu
Chenguang Zhu
Heng Ji
Jiawei Han
ELM
50
258
0
13 Oct 2022
Investigating Reasons for Disagreement in Natural Language Inference
Nan-Jiang Jiang
M. de Marneffe
27
26
0
07 Sep 2022
Dialogue Evaluation with Offline Reinforcement Learning
Nurul Lubis
Christian Geishauser
Hsien-Chin Lin
Carel van Niekerk
Michael Heck
Shutong Feng
Milica Gašić
OffRL
27
4
0
02 Sep 2022
Improving Personality Consistency in Conversation by Persona Extending
Yifan Liu
Wei Wei
Jiayi Liu
Xian-Ling Mao
Rui Fang
Dangyang Chen
35
24
0
23 Aug 2022
SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation
Longxuan Ma
Ziyu Zhuang
Weinan Zhang
Mingda Li
Ting Liu
41
4
0
17 Aug 2022
A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation
Yu Cao
Wei Bi
Meng Fang
Shuming Shi
Dacheng Tao
37
48
0
21 Apr 2022
Can Visual Dialogue Models Do Scorekeeping? Exploring How Dialogue Representations Incrementally Encode Shared Knowledge
Brielen Madureira
David Schlangen
28
4
0
14 Apr 2022
Survey of Hallucination in Natural Language Generation
Ziwei Ji
Nayeon Lee
Rita Frieske
Tiezheng Yu
D. Su
...
Delong Chen
Wenliang Dai
Ho Shu Chan
Andrea Madotto
Pascale Fung
HILM
LRM
82
2,254
0
08 Feb 2022
Automatically Exposing Problems with Neural Dialog Models
Dian Yu
Kenji Sagae
31
9
0
14 Sep 2021
How to Evaluate Your Dialogue Models: A Review of Approaches
Xinmeng Li
Wansen Wu
Long Qin
Quanjun Yin
ELM
30
8
0
03 Aug 2021
DynaEval: Unifying Turn and Dialogue Level Evaluation
Chen Zhang
Yiming Chen
L. F. D’Haro
Yan Zhang
Thomas Friedrichs
Grandee Lee
Haizhou Li
24
73
0
02 Jun 2021
Profile Consistency Identification for Open-domain Dialogue Agents
Haoyu Song
Yan Wang
Weinan Zhang
Zhengyu Zhao
Ting Liu
Xiaojiang Liu
24
29
0
21 Sep 2020
Probing Neural Dialog Models for Conversational Understanding
Abdelrhman Saleh
Tovly Deutsch
Stephen Casper
Yonatan Belinkov
Stuart M. Shieber
21
13
0
07 Jun 2020
Generating Persona Consistent Dialogues by Exploiting Natural Language Inference
Haoyu Song
Weinan Zhang
Jingwen Hu
Ting Liu
27
73
0
14 Nov 2019
Hierarchical Reinforcement Learning for Open-Domain Dialog
Abdelrhman Saleh
Natasha Jaques
Asma Ghandeharioun
J. Shen
Rosalind W. Picard
OffRL
14
59
0
17 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
304
7,005
0
20 Apr 2018