FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation

25 October 2022
Chen Zhang
L. F. D’Haro
Qiquan Zhang
Thomas Friedrichs
Haizhou Li

Papers citing "FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation"

39 / 39 papers shown
Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations
Karthik Gopalakrishnan
Behnam Hedayatnia
Qinlang Chen
Anna Gottardi
Sanjeev Kwatra
Anu Venkatesh
Raefer Gabriel
Dilek Z. Hakkani-Tür
AI4MH
BDL
65
330
0
23 Aug 2023
DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations
Sarik Ghazarian
Nuan Wen
Aram Galstyan
Nanyun Peng
41
40
0
18 Mar 2022
FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows
Jianqiao Zhao
Yanyang Li
Wanyu Du
Yangfeng Ji
Dong Yu
Michael R. Lyu
Liwei Wang
43
4
0
14 Feb 2022
Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents
Eric Michael Smith
Orion Hsu
Rebecca Qian
Stephen Roller
Y-Lan Boureau
Jason Weston
55
67
0
12 Jan 2022
MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation
Chen Zhang
L. F. D’Haro
Thomas Friedrichs
Haizhou Li
ELM
42
18
0
14 Dec 2021
Automatic Evaluation and Moderation of Open-domain Dialogue Systems
Chen Zhang
João Sedoc
L. F. D’Haro
Rafael E. Banchs
Alexander I. Rudnicky
38
37
0
03 Nov 2021
Revisiting Self-Training for Few-Shot Learning of Language Model
Yiming Chen
Yan Zhang
Chen Zhang
Grandee Lee
Ran Cheng
Haizhou Li
37
42
0
04 Oct 2021
Perturbation CheckLists for Evaluating NLG Evaluation Metrics
Ananya B. Sai
Tanay Dixit
D. Y. Sheth
S. Mohan
Mitesh M. Khapra
AAML
121
58
0
13 Sep 2021
Beyond Goldfish Memory: Long-Term Open-Domain Conversation
Jing Xu
Arthur Szlam
Jason Weston
RALM
33
250
0
15 Jul 2021
A Comprehensive Assessment of Dialog Evaluation Metrics
Yi-Ting Yeh
M. Eskénazi
Shikib Mehri
53
107
0
07 Jun 2021
Conversations Are Not Flat: Modeling the Dynamic Information Flow across Dialogue Utterances
Zekang Li
Jinchao Zhang
Zhengcong Fei
Yang Feng
Jie Zhou
38
57
0
04 Jun 2021
DynaEval: Unifying Turn and Dialogue Level Evaluation
Chen Zhang
Yiming Chen
L. F. D’Haro
Yan Zhang
Thomas Friedrichs
Grandee Lee
Haizhou Li
39
73
0
02 Jun 2021
Retrieval Augmentation Reduces Hallucination in Conversation
Kurt Shuster
Spencer Poff
Moya Chen
Douwe Kiela
Jason Weston
HILM
69
721
0
15 Apr 2021
I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling
Yixin Nie
Mary Williamson
Joey Tianyi Zhou
Douwe Kiela
Jason Weston
53
83
0
24 Dec 2020
Overview of the Ninth Dialog System Technology Challenge: DSTC9
Chulaka Gunasekara
Seokhwan Kim
L. F. D’Haro
Abhinav Rastogi
Yun-Nung Chen
...
A. Geramifard
Satwik Kottur
Seungwhan Moon
Shivani Poddar
R. Subba
86
75
0
12 Nov 2020
Deconstruct to Reconstruct a Configurable Evaluation Metric for Open-Domain Dialogue Systems
Vitou Phy
Yang Zhao
Akiko Aizawa
24
55
0
01 Nov 2020
GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems
Lishan Huang
Zheng Ye
Jinghui Qin
Liang Lin
Xiaodan Liang
28
103
0
08 Oct 2020
Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining
Ananya B. Sai
Akash Kumar Mohankumar
Siddharth Arora
Mitesh M. Khapra
33
74
0
23 Sep 2020
Dialogue Response Ranking Training with Large-Scale Human Feedback Data
Xiang Gao
Yizhe Zhang
Michel Galley
Chris Brockett
Bill Dolan
ALM
52
105
0
15 Sep 2020
Multi-Task Learning with Deep Neural Networks: A Survey
M. Crawshaw
CVBM
151
615
0
10 Sep 2020
Unsupervised Evaluation of Interactive Dialog with DialoGPT
Shikib Mehri
M. Eskénazi
40
177
0
23 Jun 2020
Learning an Unreferenced Metric for Online Dialogue Evaluation
Koustuv Sinha
Prasanna Parthasarathi
Jasmine Wang
Ryan J. Lowe
William L. Hamilton
Joelle Pineau
OffRL
48
84
0
01 May 2020
USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation
Shikib Mehri
M. Eskénazi
41
220
0
01 May 2020
Recipes for building an open-domain chatbot
Stephen Roller
Emily Dinan
Naman Goyal
Da Ju
Mary Williamson
...
Myle Ott
Kurt Shuster
Eric Michael Smith
Y-Lan Boureau
Jason Weston
ALM
107
1,001
0
28 Apr 2020
Towards a Human-like Open-Domain Chatbot
Daniel De Freitas
Minh-Thang Luong
David R. So
Jamie Hall
Noah Fiedel
...
Zi Yang
Apoorv Kulshreshtha
Gaurav Nemade
Yifeng Lu
Quoc V. Le
61
931
0
27 Jan 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke
Sam Gross
Francisco Massa
Adam Lerer
James Bradbury
...
Sasank Chilamkurthy
Benoit Steiner
Lu Fang
Junjie Bai
Soumith Chintala
ODL
242
42,038
0
03 Dec 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
121
7,386
0
02 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
398
24,160
0
26 Jul 2019
Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems
Asma Ghandeharioun
J. Shen
Natasha Jaques
Craig Ferguson
Noah J. Jones
Àgata Lapedriza
Rosalind W. Picard
47
91
0
21 Jun 2019
Survey on Evaluation Methods for Dialogue Systems
Jan Deriu
Álvaro Rodrigo
Arantxa Otegi
Guillermo Echegoyen
S. Rosset
Eneko Agirre
Mark Cieliebak
50
280
0
10 May 2019
Better Automatic Evaluation of Open-Domain Dialogue Systems with Contextualized Embeddings
Sarik Ghazarian
Johnny Tian-Zheng Wei
Aram Galstyan
Nanyun Peng
36
90
0
24 Apr 2019
What makes a good conversation? How controllable attributes affect human judgments
A. See
Stephen Roller
Douwe Kiela
Jason Weston
70
287
0
22 Feb 2019
The Second Conversational Intelligence Challenge (ConvAI2)
Emily Dinan
V. Logacheva
Valentin Malykh
Alexander H. Miller
Kurt Shuster
...
Alexander I. Rudnicky
Jason Williams
Joelle Pineau
Andrey Kravchenko
Jason Weston
DRL
84
363
0
31 Jan 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
943
93,936
0
11 Oct 2018
Coherence Models for Dialogue
Alessandra Cervone
Evgeny A. Stepanov
Giuseppe Riccardi
49
24
0
21 Jun 2018
Personalizing Dialogue Agents: I have a dog, do you have pets too?
Saizheng Zhang
Emily Dinan
Jack Urbanek
Arthur Szlam
Douwe Kiela
Jason Weston
80
1,442
0
22 Jan 2018
DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset
Yanran Li
Hui Su
Xiaoyu Shen
Wenjie Li
Ziqiang Cao
Shuzi Niu
48
1,291
0
11 Oct 2017
RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems
Chongyang Tao
Lili Mou
Dongyan Zhao
Rui Yan
51
217
0
11 Jan 2017
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
Chia-Wei Liu
Ryan J. Lowe
Iulian Serban
Michael Noseworthy
Laurent Charlin
Joelle Pineau
84
1,292
0
25 Mar 2016