Towards Unified Dialogue System Evaluation: A Comprehensive Analysis of Current Evaluation Protocols

10 June 2020

Papers citing "Towards Unified Dialogue System Evaluation: A Comprehensive Analysis of Current Evaluation Protocols"

28 / 28 papers shown

Towards conversational assistants for health applications: using ChatGPT to generate conversations about heart failure Anuja Tayal Devika Salunke Barbara Di Eugenio Paula Allen-Meares Eulalia P Abril Olga Garcia-Bedoya Carolyn Dickens Andrew D. Boyd LM&MA AI4MH 46 0 0 06 May 2025
Unstructured Text Enhanced Open-domain Dialogue System: A Systematic Survey Longxuan Ma Mingda Li Weinan Zhang Jiapeng Li Ting Liu 45 16 0 14 Nov 2024
Mixed-Session Conversation with Egocentric Memory Jihyoung Jang Taeyoung Kim Hyounghun Kim 28 0 0 03 Oct 2024
It Couldn't Help But Overhear: On the Limits of Modelling Meta-Communicative Grounding Acts with Supervised Learning Brielen Madureira David Schlangen 35 0 0 02 May 2024
DiQAD: A Benchmark Dataset for End-to-End Open-domain Dialogue Assessment Yukun Zhao Lingyong Yan Weiwei Sun Chong Meng Shuaiqiang Wang Zhicong Cheng Zhaochun Ren Dawei Yin ELM 20 0 0 25 Oct 2023
Exploring the Impact of Human Evaluator Group on Chat-Oriented Dialogue Evaluation Sarah E. Finch James D. Finch Jinho D. Choi 23 0 0 14 Sep 2023
ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems Sarik Ghazarian Yijia Shao Rujun Han Aram Galstyan Nanyun Peng 27 7 0 12 May 2023
A-CAP: Anticipation Captioning with Commonsense Knowledge D. Vo Quoc-An Luong Akihiro Sugimoto Hideki Nakayama 24 2 0 13 Apr 2023
Rewarding Chatbots for Real-World Engagement with Millions of Users R. Irvine D. Boubert Vyas Raina Adian Liusie Ziyi Zhu ... Valentin Assassi Christie-Carol Beauchamp Xiaoding Lu Thomas Rialan W. Beauchamp ALM 22 35 0 10 Mar 2023
Don't Forget Your ABC's: Evaluating the State-of-the-Art in Chat-Oriented Dialogue Systems Sarah E. Finch James D. Finch Jinho D. Choi 35 12 0 18 Dec 2022
Keep Me Updated! Memory Management in Long-term Conversations Sanghwan Bae Donghyun Kwak Soyoung Kang Min Young Lee Sungdong Kim Yuin Jeong Hyeri Kim Sang-Woo Lee W. Park Nako Sung 40 46 0 17 Oct 2022
Evaluating Conversational Recommender Systems: A Landscape of Research Dietmar Jannach 12 26 0 25 Aug 2022
SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation Longxuan Ma Ziyu Zhuang Weinan Zhang Mingda Li Ting Liu 26 4 0 17 Aug 2022
A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception Keenan I. Jones Enes Altuncu V. N. Franqueira Yi-Chia Wang Shujun Li DeLMO 36 3 0 11 Aug 2022
Relevance in Dialogue: Is Less More? An Empirical Comparison of Existing Metrics, and a Novel Simple Metric Ian Berlot-Attwell Frank Rudzicz 15 1 0 03 Jun 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents Hyunwoo J. Kim Youngjae Yu Liwei Jiang Ximing Lu Daniel Khashabi Gunhee Kim Yejin Choi Maarten Sap 20 117 0 25 May 2022
Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models Sanghwan Bae Donghyun Kwak Sungdong Kim Dong-hyun Ham Soyoung Kang Sang-Woo Lee W. Park ALM 25 37 0 30 Apr 2022
Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges Shikib Mehri Jinho Choi L. F. D’Haro Jan Deriu M. Eskénazi ... David Traum Yi-Ting Yeh Zhou Yu Yizhe Zhang Chen Zhang 30 21 0 18 Mar 2022
Achieving Reliable Human Assessment of Open-Domain Dialogue Systems Tianbo Ji Yvette Graham Gareth J. F. Jones Chenyang Lyu Qun Liu ALM 31 39 0 11 Mar 2022
Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents Eric Michael Smith Orion Hsu Rebecca Qian Stephen Roller Y-Lan Boureau Jason Weston 23 66 0 12 Jan 2022
Task-oriented Dialogue Systems: performance vs. quality-optima, a review Ryan Fellows Hisham Ihshaish Steve Battle Ciaran Haines Peter Mayhew J. Ignacio Deza 14 5 0 21 Dec 2021
What Went Wrong? Explaining Overall Dialogue Quality through Utterance-Level Impacts James D. Finch Sarah E. Finch Jinho D. Choi 14 1 0 31 Oct 2021
Actionable Conversational Quality Indicators for Improving Task-Oriented Dialog Systems Michael Higgins Dominic Widdows Chris Brew Gwen Christian Andrew Maurer ... Akshay Hazare George Bonev Beth Ann Hockey Kristen Howell Joe Bradley 14 0 0 22 Sep 2021
Code-switched inspired losses for generic spoken dialog representations E. Chapuis Pierre Colombo Matthieu Labeau Chloe Clave 19 12 0 27 Aug 2021
DynaEval: Unifying Turn and Dialogue Level Evaluation Chen Zhang Yiming Chen L. F. D’Haro Yan Zhang Thomas Friedrichs Grandee Lee Haizhou Li 24 73 0 02 Jun 2021
Towards Standard Criteria for human evaluation of Chatbots: A Survey Hongru Liang Huaqing Li 18 13 0 24 May 2021
LEGOEval: An Open-Source Toolkit for Dialogue System Evaluation via Crowdsourcing Yu Li Josh Arnold Feifan Yan Weiyan Shi Zhou Yu ELM 26 11 0 05 May 2021
Human-like informative conversations: Better acknowledgements using conditional mutual information Ashwin Paranjape Christopher D. Manning 15 10 0 16 Apr 2021