ResearchTrend.AI
Achieving Reliable Human Assessment of Open-Domain Dialogue Systems

11 March 2022
Tianbo Ji
Yvette Graham
Gareth J. F. Jones
Chenyang Lyu
Qun Liu
    ALM
arXiv: 2203.05899

Papers citing "Achieving Reliable Human Assessment of Open-Domain Dialogue Systems"

29 / 29 papers shown
OnRL-RAG: Real-Time Personalized Mental Health Dialogue System
Ahsan Bilal
Beiyu Lin
OffRL
RALM
AI4MH
46
1
0
02 Apr 2025
Communication is All You Need: Persuasion Dataset Construction via Multi-LLM Communication
Weicheng Ma
Hefan Zhang
Ivory Yang
Shiyu Ji
Joice Chen
...
Shubham Mohole
Ethan Gearey
Michael Macy
Saeed Hassanpour
Soroush Vosoughi
64
0
0
13 Feb 2025
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Qiyuan Zhang
Yufei Wang
Tiezheng YU
Yuxin Jiang
Chuhan Wu
...
Xin Jiang
Lifeng Shang
Ruiming Tang
Fuyuan Lyu
Chen Ma
31
4
0
07 Oct 2024
Language Portability Strategies for Open-domain Dialogue with Pre-trained Language Models from High to Low Resource Languages
Ahmed Njifenjou
Virgile Sucal
Bassam Jabaian
Fabrice Lefèvre
39
0
0
01 Jul 2024
Role-Play Zero-Shot Prompting with Large Language Models for Open-Domain Human-Machine Conversation
Ahmed Njifenjou
Virgile Sucal
Bassam Jabaian
Fabrice Lefèvre
AI4CE
ALM
30
3
0
26 Jun 2024
ComperDial: Commonsense Persona-grounded Dialogue Dataset and Benchmark
Hiromi Wakaki
Yuki Mitsufuji
Yoshinori Maeda
Yukiko Nishimura
Silin Gao
Mengjie Zhao
Keiichi Yamada
Antoine Bosselut
41
0
0
17 Jun 2024
HumanRankEval: Automatic Evaluation of LMs as Conversational Assistants
Milan Gritta
Gerasimos Lampouras
Ignacio Iacobacci
ALM
32
1
0
15 May 2024
Reliable LLM-based User Simulator for Task-Oriented Dialogue Systems
Ivan Sekulić
Silvia Terragni
Victor Guimaraes
Nghia Khau
Bruna Guedes
Modestas Filipavicius
A. Manso
Roland Mathis
46
5
0
20 Feb 2024
Findings of the First Workshop on Simulating Conversational Intelligence in Chat
Yvette Graham
Mohammed Rameez Qureshi
Haider Khalid
Gerasimos Lampouras
Ignacio Iacobacci
Qun Liu
LRM
20
0
0
09 Feb 2024
UniRQR: A Unified Model for Retrieval Decision, Query, and Response Generation in Internet-Based Knowledge Dialogue Systems
Zhongtian Hu
Yangqi Chen
Meng Zhao
Ronghan Li
Lifang Wang
RALM
22
0
0
11 Jan 2024
Rethinking Response Evaluation from Interlocutor's Eye for Open-Domain Dialogue Systems
Tsuta Yuma
Naoki Yoshinaga
Shoetsu Sato
Masashi Toyoda
31
1
0
04 Jan 2024
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators
Chen Zhang
L. F. D’Haro
Yiming Chen
Malu Zhang
Haizhou Li
ELM
21
29
0
24 Dec 2023
CoAScore: Chain-of-Aspects Prompting for NLG Evaluation
Peiyuan Gong
Jiaxin Mao
ELM
54
10
0
16 Dec 2023
PRODIGy: a PROfile-based DIalogue Generation dataset
Daniela Occhipinti
Serra Sinem Tekiroğlu
Marco Guerini
21
3
0
09 Nov 2023
A Multilingual Virtual Guide for Self-Attachment Technique
Alicia Jiayun Law
Ruoyu Hu
Lisa Alazraki
Anandha Gopalan
Neophytos Polydorou
A. Edalat
11
3
0
25 Oct 2023
xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark
Chen Zhang
L. F. D’Haro
Chengguang Tang
Ke Shi
Guohua Tang
Haizhou Li
ELM
43
9
0
13 Oct 2023
RADE: Reference-Assisted Dialogue Evaluation for Open-Domain Dialogue
Zhengliang Shi
Weiwei Sun
Shuo Zhang
Zhen Zhang
Pengjie Ren
Z. Ren
16
8
0
15 Sep 2023
Exploring the Impact of Human Evaluator Group on Chat-Oriented Dialogue Evaluation
Sarah E. Finch
James D. Finch
Jinho Choi
31
0
0
14 Sep 2023
Toward More Accurate and Generalizable Evaluation Metrics for Task-Oriented Dialogs
A. Komma
Nagesh Panyam Chandrasekarasastry
Timothy Leffel
Anuj Kumar Goyal
A. Metallinou
Spyros Matsoukas
Aram Galstyan
33
3
0
06 Jun 2023
SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation
Junkai Zhou
Liang Pang
Huawei Shen
Xueqi Cheng
27
9
0
18 May 2023
A Paradigm Shift: The Future of Machine Translation Lies with Large Language Models
Chenyang Lyu
Zefeng Du
Jitao Xu
Yitao Duan
Minghao Wu
Teresa Lynn
Alham Fikri Aji
Derek F. Wong
Siyou Liu
Longyue Wang
58
25
0
02 May 2023
Building Multimodal AI Chatbots
Mingyu Lee
29
3
0
21 Apr 2023
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Percy Liang
LM&MA
ALM
58
99
0
19 Dec 2022
Don't Forget Your ABC's: Evaluating the State-of-the-Art in Chat-Oriented Dialogue Systems
Sarah E. Finch
James D. Finch
Jinho Choi
38
12
0
18 Dec 2022
Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue Systems
Songbo Hu
Ivan Vulić
Fangyu Liu
Anna Korhonen
39
0
0
07 Nov 2022
Keep Me Updated! Memory Management in Long-term Conversations
Sanghwan Bae
Donghyun Kwak
Soyoung Kang
Min Young Lee
Sungdong Kim
Yuin Jeong
Hyeri Kim
Sang-Woo Lee
W. Park
Nako Sung
43
47
0
17 Oct 2022
QAScore -- An Unsupervised Unreferenced Metric for the Question Generation Evaluation
Tianbo Ji
Chenyang Lyu
Gareth J. F. Jones
Liting Zhou
Yvette Graham
25
21
0
09 Oct 2022
State-of-the-art in Open-domain Conversational AI: A Survey
Tosin P. Adewumi
F. Liwicki
Marcus Liwicki
32
15
0
02 May 2022
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016