RADDLE: An Evaluation Benchmark and Analysis Platform for Robust
Task-oriented Dialog Systems

29 December 2020

Papers citing "RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems"

Title | Authors | Tags | Counts | Date
Noise-BERT: A Unified Perturbation-Robust Framework with Noise Alignment Pre-training for Noisy Slot Filling Task | Jinxu Zhao, Guanting Dong, Yueyan Qiu, Tingfeng Hui, Xiaoshuai Song, Daichi Guo, Weiran Xu | - | 29 1 0 | 22 Feb 2024
Revisit Input Perturbation Problems for LLMs: A Unified Robustness Evaluation Framework for Noisy Slot Filling Task | Guanting Dong, Jinxu Zhao, Tingfeng Hui, Daichi Guo, Wenlong Wan, ..., Yueyan Qiu, Zhuoma Gongque, Keqing He, Zechen Wang, Weiran Xu | AAML | 35 20 0 | 10 Oct 2023
Robust Question Answering against Distribution Shifts with Test-Time Adaptation: An Empirical Study | Hai Ye, Yuyang Ding, Juntao Li, Hwee Tou Ng | OOD, TTA | 29 9 0 | 09 Feb 2023
Sources of Noise in Dialogue and How to Deal with Them | Derek Chen, Zhou Yu | - | 24 2 0 | 06 Dec 2022
Are Current Task-oriented Dialogue Systems Able to Satisfy Impolite Users? | Zhiqiang Hu, Roy Ka-Wei Lee, Nancy F. Chen | - | 32 4 0 | 24 Oct 2022
Evaluating Out-of-Distribution Performance on Document Image Classifiers | Stefan Larson, Gordon Lim, Yutong Ai, David Kuang, Kevin Leach | OODD, OOD | 37 18 0 | 14 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review | Dieuwke Hupkes, Mario Giulianelli, Verna Dankers, Mikel Artetxe, Yanai Elazar, ..., Leila Khalatbari, Maria Ryskina, Rita Frieske, Ryan Cotterell, Zhijing Jin | - | 127 94 0 | 06 Oct 2022
"Do you follow me?": A Survey of Recent Approaches in Dialogue State Tracking | Léo Jacqmin, L. Rojas-Barahona, Benoit Favre | - | 43 27 0 | 29 Jul 2022
Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering | Aditya Gupta, Jiacheng Xu, Shyam Upadhyay, Diyi Yang, Manaal Faruqui | - | 37 33 0 | 08 Jun 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman | ELM | 299 6,984 0 | 20 Apr 2018