1603.08023
Cited By
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
25 March 2016
Chia-Wei Liu
Ryan J. Lowe
Iulian Serban
Michael Noseworthy
Laurent Charlin
Joelle Pineau
Papers citing
"How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation"
50 / 292 papers shown
Title
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
Tomasz Korbak
Hady ElSahar
Germán Kruszewski
Marc Dymetman
CLL
27
51
0
01 Jun 2022
Commonsense and Named Entity Aware Knowledge Grounded Dialogue Generation
Deeksha Varshney
Akshara Prabhakar
Asif Ekbal
29
18
0
27 May 2022
A Question-Answer Driven Approach to Reveal Affirmative Interpretations from Verbal Negations
Md Mosharaf Hossain
L. Holman
Anusha Kakileti
T. Kao
N. Brito
A. Mathews
Eduardo Blanco
32
3
0
23 May 2022
Computational Storytelling and Emotions: A Survey
Yusuke Mori
Hiroaki Yamane
Yusuke Mukuta
Tatsuya Harada
45
2
0
23 May 2022
CORAL: Contextual Response Retrievability Loss Function for Training Dialog Generation Models
Bishal Santra
Ravi Ghadia
Manish Gupta
Pawan Goyal
OffRL
23
0
0
21 May 2022
Target-Guided Dialogue Response Generation Using Commonsense and Data Augmentation
Prakhar Gupta
Harsh Jhamtani
Jeffrey P. Bigham
49
12
0
19 May 2022
Near-Negative Distinction: Giving a Second Life to Human Evaluation Datasets
Philippe Laban
Chien-Sheng Wu
Wenhao Liu
Caiming Xiong
43
5
0
13 May 2022
Vector Representations of Idioms in Conversational Systems
Tosin Adewumi
F. Liwicki
Marcus Liwicki
50
8
0
07 May 2022
Balancing Multi-Domain Corpora Learning for Open-Domain Response Generation
Yujie Xing
Jason (Jinglun) Cai
Nils Barlaug
Peng Liu
J. Gulla
31
4
0
05 May 2022
State-of-the-art in Open-domain Conversational AI: A Survey
Tosin Adewumi
F. Liwicki
Marcus Liwicki
32
15
0
02 May 2022
COSPLAY: Concept Set Guided Personalized Dialogue Generation Across Both Party Personas
Chengshi Xu
Pijian Li
Wei Wang
Haoran Yang
Siyun Wang
Chuangbai Xiao
38
26
0
02 May 2022
What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation
Sarik Ghazarian
Behnam Hedayatnia
Alexandros Papangelis
Yang Liu
Dilek Z. Hakkani-Tür
30
19
0
25 Mar 2022
Towards Large-Scale Interpretable Knowledge Graph Reasoning for Dialogue Systems
Yi-Lin Tuan
Sajjad Beygi
Maryam Fazel-Zarandi
Qiaozi Gao
Alessandra Cervone
William Yang Wang
LRM
29
23
0
20 Mar 2022
Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges
Shikib Mehri
Jinho Choi
L. F. D’Haro
Jan Deriu
M. Eskénazi
...
David Traum
Yi-Ting Yeh
Zhou Yu
Yizhe Zhang
Chen Zhang
34
21
0
18 Mar 2022
RoMe: A Robust Metric for Evaluating Natural Language Generation
Md. Rony
Liubov Kovriguina
Debanjan Chaudhuri
Ricardo Usbeck
Jens Lehmann
22
12
0
17 Mar 2022
Conversational Recommendation: A Grand AI Challenge
Dietmar Jannach
L. Chen
34
18
0
17 Mar 2022
Probing the Robustness of Trained Metrics for Conversational Dialogue Systems
Jan Deriu
Don Tuggener
Pius von Däniken
Mark Cieliebak
AAML
19
9
0
28 Feb 2022
Rethinking and Refining the Distinct Metric
Siyang Liu
Sahand Sabour
Yinhe Zheng
Pei Ke
Xiaoyan Zhu
Minlie Huang
36
11
0
28 Feb 2022
FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows
Jianqiao Zhao
Yanyang Li
Wanyu Du
Yangfeng Ji
Dong Yu
M. Lyu
Liwei Wang
33
4
0
14 Feb 2022
Red Teaming Language Models with Language Models
Ethan Perez
Saffron Huang
Francis Song
Trevor Cai
Roman Ring
John Aslanides
Amelia Glaese
Nat McAleese
G. Irving
AAML
13
611
0
07 Feb 2022
Conversational Agents: Theory and Applications
M. Wahde
M. Virgolin
LLMAG
32
25
0
07 Feb 2022
Towards Personalized Answer Generation in E-Commerce via Multi-Perspective Preference Modeling
Yang Deng
Yaliang Li
Wenxuan Zhang
Bolin Ding
W. Lam
30
36
0
27 Dec 2021
Ditch the Gold Standard: Re-evaluating Conversational Question Answering
Huihan Li
Tianyu Gao
Manan Goenka
Danqi Chen
24
21
0
16 Dec 2021
MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation
Chen Zhang
L. F. D’Haro
Thomas Friedrichs
Haizhou Li
ELM
25
18
0
14 Dec 2021
Understanding and Improving the Exemplar-based Generation for Open-domain Conversation
Seungju Han
Beomsu Kim
Seokjun Seo
Enkhbayar Erdenee
Buru Chang
36
3
0
13 Dec 2021
Am I Me or You? State-of-the-Art Dialogue Models Cannot Maintain an Identity
Kurt Shuster
Jack Urbanek
Arthur Szlam
Jason Weston
HILM
24
24
0
10 Dec 2021
CO-STAR: Conceptualisation of Stereotypes for Analysis and Reasoning
Teyun Kwon
Anandha Gopalan
30
2
0
01 Dec 2021
Learning to Predict Persona Information for Dialogue Personalization without Explicit Persona Description
Wangchunshu Zhou
Qifei Li
Chenle Li
21
9
0
30 Nov 2021
Automatic Evaluation and Moderation of Open-domain Dialogue Systems
Chen Zhang
João Sedoc
L. F. D’Haro
Rafael E. Banchs
Alexander I. Rudnicky
22
36
0
03 Nov 2021
A Systematic Investigation of Commonsense Knowledge in Large Language Models
Xiang Lorraine Li
A. Kuncoro
Jordan Hoffmann
Cyprien de Masson d'Autume
Phil Blunsom
Aida Nematzadeh
LRM
25
58
0
31 Oct 2021
EmpBot: A T5-based Empathetic Chatbot focusing on Sentiments
Emmanouil Zaranis
Georgios Paraskevopoulos
Athanasios Katsamanis
Alexandros Potamianos
30
9
0
30 Oct 2021
I Do Not Understand What I Cannot Define: Automatic Question Generation With Pedagogically-Driven Content Selection
Tim Steuer
Anna Filighera
Tobias Meuser
Christoph Rensing
24
10
0
08 Oct 2021
Simulated Annealing for Emotional Dialogue Systems
Chengzhang Dong
Chenyang Huang
Osmar Zaïane
Lili Mou
34
5
0
22 Sep 2021
A Plug-and-Play Method for Controlled Text Generation
Damian Pascual
Béni Egressy
Clara Meister
Ryan Cotterell
Roger Wattenhofer
27
89
0
20 Sep 2021
Conversational Multi-Hop Reasoning with Neural Commonsense Knowledge and Symbolic Logic Rules
Forough Arabshahi
Jennifer Lee
Antoine Bosselut
Yejin Choi
Tom Michael Mitchell
LRM
24
17
0
17 Sep 2021
Identifying Untrustworthy Samples: Data Filtering for Open-domain Dialogues with Bayesian Optimization
Lei Shen
Haolan Zhan
Xin Shen
Hongshen Chen
Xiaofang Zhao
Xiao-Dan Zhu
43
17
0
14 Sep 2021
Perturbation CheckLists for Evaluating NLG Evaluation Metrics
Ananya B. Sai
Tanay Dixit
D. Y. Sheth
S. Mohan
Mitesh M. Khapra
AAML
116
58
0
13 Sep 2021
Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation
Zechen Bai
Yuta Nakashima
Noa Garcia
68
43
0
13 Sep 2021
CEM: Commonsense-aware Empathetic Response Generation
Sahand Sabour
Chujie Zheng
Minlie Huang
28
149
0
13 Sep 2021
Generating Personalized Dialogue via Multi-Task Meta-Learning
Jing Yang Lee
Kong Aik Lee
W. Gan
33
14
0
07 Aug 2021
How to Evaluate Your Dialogue Models: A Review of Approaches
Xinmeng Li
Wansen Wu
Long Qin
Quanjun Yin
ELM
30
8
0
03 Aug 2021
WeaSuL: Weakly Supervised Dialogue Policy Learning: Reward Estimation for Multi-turn Dialogue
Anant Khandelwal
OffRL
24
6
0
01 Aug 2021
An Evaluation of Generative Pre-Training Model-based Therapy Chatbot for Caregivers
Lu Wang
Munif Ishad Mujib
Jake Williams
G. Demiris
Jina Huh-Yoo
AI4MH
32
32
0
28 Jul 2021
Increasing Faithfulness in Knowledge-Grounded Dialogue with Controllable Features
Hannah Rashkin
David Reitter
Gaurav Singh Tomar
Dipanjan Das
172
101
0
14 Jul 2021
Productivity, Portability, Performance: Data-Centric Python
Yiheng Wang
Yao Zhang
Yanzhang Wang
Yan Wan
Jiao Wang
Zhongyuan Wu
Yuhao Yang
Bowen She
56
95
0
01 Jul 2021
All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text
Elizabeth Clark
Tal August
Sofia Serrano
Nikita Haduong
Suchin Gururangan
Noah A. Smith
DeLMO
54
398
0
30 Jun 2021
Do Encoder Representations of Generative Dialogue Models Encode Sufficient Information about the Task?
Prasanna Parthasarathi
J. Pineau
Sarath Chandar
13
2
0
20 Jun 2021
Synthesizing Adversarial Negative Responses for Robust Response Ranking and Evaluation
Prakhar Gupta
Yulia Tsvetkov
Jeffrey P. Bigham
42
22
0
10 Jun 2021
A Comprehensive Assessment of Dialog Evaluation Metrics
Yi-Ting Yeh
M. Eskénazi
Shikib Mehri
36
105
0
07 Jun 2021
GTM: A Generative Triple-Wise Model for Conversational Question Generation
Lei Shen
Fandong Meng
Jinchao Zhang
Yang Feng
Jie Zhou
19
13
0
07 Jun 2021