How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

25 March 2016
Chia-Wei Liu
Ryan J. Lowe
Iulian Serban
Michael Noseworthy
Laurent Charlin
Joelle Pineau

Papers citing "How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation"

50 / 293 papers shown
Generating Relevant and Coherent Dialogue Responses using Self-separated Conditional Variational AutoEncoders
Bin Sun
Shaoxiong Feng
Yiwei Li
Jiamou Liu
Kan Li
13
31
0
07 Jun 2021
Emotion-aware Chat Machine: Automatic Emotional Response Generation for Human-like Emotional Interaction
Wei Wei
Jiayi Liu
Xian-Ling Mao
G. Guo
Feida Zhu
Pan Zhou
Yuchong Hu
53
56
0
06 Jun 2021
DynaEval: Unifying Turn and Dialogue Level Evaluation
Chen Zhang
Yiming Chen
L. F. D’Haro
Yan Zhang
Thomas Friedrichs
Grandee Lee
Haizhou Li
24
73
0
02 Jun 2021
HERALD: An Annotation Efficient Method to Detect User Disengagement in Social Conversations
Weixin Liang
Kai-Hui Liang
Zhou Yu
42
15
0
01 Jun 2021
OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics
Jian Guan
Zhexin Zhang
Zhuoer Feng
Zitao Liu
Wenbiao Ding
Xiaoxi Mao
Changjie Fan
Minlie Huang
20
60
0
19 May 2021
Empathetic Dialog Generation with Fine-Grained Intents
Yubo Xie
P. Pu
VLM
27
26
0
14 May 2021
Semi-Supervised Variational Reasoning for Medical Dialogue Generation
Dongdong Li
Zhaochun Ren
Pengjie Ren
Zhumin Chen
M. Fan
Jun Ma
Maarten de Rijke
BDL
DRL
OffRL
MedIm
32
48
0
13 May 2021
Recent Advances in Deep Learning Based Dialogue Systems: A Systematic Survey
Jinjie Ni
Tom Young
Vlad Pandelea
Fuzhao Xue
Min Zhang
54
268
0
10 May 2021
LEGOEval: An Open-Source Toolkit for Dialogue System Evaluation via Crowdsourcing
Yu Li
Josh Arnold
Feifan Yan
Weiyan Shi
Zhou Yu
ELM
31
11
0
05 May 2021
Meta-evaluation of Conversational Search Evaluation Metrics
Zeyang Liu
K. Zhou
Max L. Wilson
ELM
32
17
0
27 Apr 2021
Code Structure Guided Transformer for Source Code Summarization
Shuzheng Gao
Cuiyun Gao
Yulan He
Jichuan Zeng
L. Nie
Xin Xia
Michael R. Lyu
22
96
0
19 Apr 2021
Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation
Max Bartolo
Tristan Thrush
Robin Jia
Sebastian Riedel
Pontus Stenetorp
Douwe Kiela
AAML
28
103
0
18 Apr 2021
Crossing the Conversational Chasm: A Primer on Natural Language Processing for Multilingual Task-Oriented Dialogue Systems
E. Razumovskaia
Goran Glavaš
Olga Majewska
Edoardo Ponti
Anna Korhonen
Ivan Vulić
33
32
0
17 Apr 2021
$Q^{2}$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering
Or Honovich
Leshem Choshen
Roee Aharoni
Ella Neeman
Idan Szpektor
Omri Abend
HILM
36
138
0
16 Apr 2021
Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems
Derek Chen
Howard Chen
Yi Yang
A. Lin
Zhou Yu
17
65
0
01 Apr 2021
Advances and Challenges in Conversational Recommender Systems: A Survey
Chongming Gao
Wenqiang Lei
Xiangnan He
Maarten de Rijke
Tat-Seng Chua
138
273
0
23 Jan 2021
Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach
Ashish Sharma
Inna Wanyin Lin
Adam S. Miner
David C. Atkins
Tim Althoff
AI4MH
25
140
0
19 Jan 2021
CRSLab: An Open-Source Toolkit for Building Conversational Recommender System
Kun Zhou
Xiaolei Wang
Yuanhang Zhou
Chenzhang Shang
Yuan Cheng
Wayne Xin Zhao
Yaliang Li
Ji-Rong Wen
35
63
0
04 Jan 2021
Writing Polishment with Simile: Task, Dataset and A Neural Approach
Jiayi Zhang
Zhi Cui
Xiaoqiang Xia
Yalong Guo
Yanran Li
Chen Wei
Jianwei Cui
20
17
0
15 Dec 2020
Target Guided Emotion Aware Chat Machine
Wei Wei
Jiayi Liu
Xian-Ling Mao
G. Guo
Feida Zhu
Pan Zhou
Yuchong Hu
Shanshan Feng
30
24
0
15 Nov 2020
Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts
Ece Takmaz
Mario Giulianelli
Sandro Pezzelle
Arabella J. Sinclair
Raquel Fernández
20
26
0
09 Nov 2020
Exploring Question-Specific Rewards for Generating Deep Questions
Yuxi Xie
Liangming Pan
Dongzhe Wang
Min-Yen Kan
Yansong Feng
53
27
0
02 Nov 2020
Deconstruct to Reconstruct a Configurable Evaluation Metric for Open-Domain Dialogue Systems
Vitou Phy
Yang Zhao
Akiko Aizawa
14
55
0
01 Nov 2020
PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction
Xinyao Ma
Maarten Sap
Hannah Rashkin
Yejin Choi
40
73
0
26 Oct 2020
An Evaluation Protocol for Generative Conversational Systems
Seolhwa Lee
Heuiseok Lim
João Sedoc
ELM
35
10
0
24 Oct 2020
Self-Supervised Contrastive Learning for Efficient User Satisfaction Prediction in Conversational Agents
Mohammad Kachuee
Hao Yuan
Young-Bum Kim
Sungjin Lee
27
25
0
21 Oct 2020
PARENTing via Model-Agnostic Reinforcement Learning to Correct Pathological Behaviors in Data-to-Text Generation
Clément Rebuffel
Laure Soulier
Geoffrey Scoutheeten
Patrick Gallinari
8
9
0
21 Oct 2020
Local Knowledge Powered Conversational Agents
Sashank Santhanam
Ming-Yu Liu
Raul Puri
M. Shoeybi
M. Patwary
Bryan Catanzaro
29
4
0
20 Oct 2020
Cue Me In: Content-Inducing Approaches to Interactive Story Generation
Faeze Brahman
Alexandru Petrusca
Snigdha Chaturvedi
LRM
24
20
0
20 Oct 2020
What is More Likely to Happen Next? Video-and-Language Future Event Prediction
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
33
72
0
15 Oct 2020
Reformulating Unsupervised Style Transfer as Paraphrase Generation
Kalpesh Krishna
John Wieting
Mohit Iyyer
30
238
0
12 Oct 2020
Plan ahead: Self-Supervised Text Planning for Paragraph Completion Task
Dongyeop Kang
Eduard H. Hovy
LRM
42
24
0
11 Oct 2020
Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions
Bodhisattwa Prasad Majumder
Harsh Jhamtani
Taylor Berg-Kirkpatrick
Julian McAuley
30
85
0
07 Oct 2020
Regularizing Dialogue Generation by Imitating Implicit Scenarios
Shaoxiong Feng
Xuancheng Ren
Hongshen Chen
Bin Sun
Kan Li
Xu Sun
18
20
0
05 Oct 2020
Generating Dialogue Responses from a Semantic Latent Space
Wei-Jen Ko
Avik Ray
Yilin Shen
Hongxia Jin
VLM
20
6
0
04 Oct 2020
MIME: MIMicking Emotions for Empathetic Response Generation
Navonil Majumder
Pengfei Hong
Shanshan Peng
Jiankun Lu
Deepanway Ghosal
Alexander Gelbukh
Rada Mihalcea
Soujanya Poria
25
200
0
04 Oct 2020
Predicting User Engagement Status for Online Evaluation of Intelligent Assistants
Rui Meng
Zhen Yue
A. Glass
21
2
0
01 Oct 2020
Pchatbot: A Large-Scale Dataset for Personalized Chatbot
Hongjin Qian
Xiaohe Li
Hanxun Zhong
Yu Guo
Yueyuan Ma
Yutao Zhu
Zhanliang Liu
Zhicheng Dou
Ji-Rong Wen
41
43
0
28 Sep 2020
Enhancing Dialogue Generation via Multi-Level Contrastive Learning
Xin Li
Piji Li
Yan Wang
Xiaojiang Liu
Wai Lam
26
5
0
19 Sep 2020
GLUCOSE: GeneraLized and COntextualized Story Explanations
N. Mostafazadeh
Aditya Kalyanpur
Lori Moon
David W. Buchanan
Lauren Berkowitz
Or Biran
Jennifer Chu-Carroll
32
121
0
16 Sep 2020
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation
Jian Guan
Minlie Huang
29
69
0
16 Sep 2020
Zero-Resource Knowledge-Grounded Dialogue Generation
Linxiao Li
Can Xu
Wei Wu
Yufan Zhao
Xueliang Zhao
Chongyang Tao
36
70
0
29 Aug 2020
A Survey of Evaluation Metrics Used for NLG Systems
Ananya B. Sai
Akash Kumar Mohankumar
Mitesh M. Khapra
ELM
33
230
0
27 Aug 2020
Opinion-aware Answer Generation for Review-driven Question Answering in E-Commerce
Yang Deng
Wenxuan Zhang
Wai Lam
16
31
0
27 Aug 2020
CoreGen: Contextualized Code Representation Learning for Commit Message Generation
L. Nie
Cuiyun Gao
Zhicong Zhong
Wai Lam
Yang Liu
Zenglin Xu
29
46
0
14 Jul 2020
Generating Informative Dialogue Responses with Keywords-Guided Networks
Heng-Da Xu
Xian-Ling Mao
Zewen Chi
Jing-Jing Zhu
Fanshu Sun
Heyan Huang
BDL
14
5
0
03 Jul 2020
Evaluation of Text Generation: A Survey
Asli Celikyilmaz
Elizabeth Clark
Jianfeng Gao
ELM
LM&MA
19
378
0
26 Jun 2020
Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions
Stephen Roller
Y-Lan Boureau
Jason Weston
Antoine Bordes
Emily Dinan
...
Kurt Shuster
Eric Michael Smith
Arthur Szlam
Jack Urbanek
Mary Williamson
LLMAG
AI4CE
28
51
0
22 Jun 2020
Towards Unified Dialogue System Evaluation: A Comprehensive Analysis of Current Evaluation Protocols
Sarah E. Finch
Jinho Choi
ELM
29
67
0
10 Jun 2020
Report from the NSF Future Directions Workshop, Toward User-Oriented Agents: Research Directions and Challenges
M. Eskénazi
Tiancheng Zhao
LLMAG
AI4TS
AI4CE
36
9
0
10 Jun 2020