RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems

29 December 2020
Baolin Peng
Chunyuan Li
Zhu Zhang
Chenguang Zhu
Jinchao Li
Jianfeng Gao

Papers citing "RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems"

10 / 10 papers shown
Noise-BERT: A Unified Perturbation-Robust Framework with Noise Alignment Pre-training for Noisy Slot Filling Task
Jinxu Zhao
Guanting Dong
Yueyan Qiu
Tingfeng Hui
Xiaoshuai Song
Daichi Guo
Weiran Xu
29
1
0
22 Feb 2024
Revisit Input Perturbation Problems for LLMs: A Unified Robustness Evaluation Framework for Noisy Slot Filling Task
Guanting Dong
Jinxu Zhao
Tingfeng Hui
Daichi Guo
Wenlong Wan
...
Yueyan Qiu
Zhuoma Gongque
Keqing He
Zechen Wang
Weiran Xu
AAML
35
20
0
10 Oct 2023
Robust Question Answering against Distribution Shifts with Test-Time Adaptation: An Empirical Study
Hai Ye
Yuyang Ding
Juntao Li
Hwee Tou Ng
OOD
TTA
29
9
0
09 Feb 2023
Sources of Noise in Dialogue and How to Deal with Them
Derek Chen
Zhou Yu
24
2
0
06 Dec 2022
Are Current Task-oriented Dialogue Systems Able to Satisfy Impolite Users?
Zhiqiang Hu
Roy Ka-Wei Lee
Nancy F. Chen
32
4
0
24 Oct 2022
Evaluating Out-of-Distribution Performance on Document Image Classifiers
Stefan Larson
Gordon Lim
Yutong Ai
David Kuang
Kevin Leach
OODD
OOD
37
18
0
14 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
127
94
0
06 Oct 2022
"Do you follow me?": A Survey of Recent Approaches in Dialogue State Tracking
Léo Jacqmin
L. Rojas-Barahona
Benoit Favre
43
27
0
29 Jul 2022
Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering
Aditya Gupta
Jiacheng Xu
Shyam Upadhyay
Diyi Yang
Manaal Faruqui
37
33
0
08 Jun 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018