ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2503.23848
  4. Cited By
SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development

SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development

31 March 2025
Minghan Wang
Ye Bai
Yanjie Wang
Thuy-Trang Vu
Ehsan Shareghi
Gholamreza Haffari
ArXiv (abs)PDFHTML

Papers citing "SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development"

27 / 27 papers shown
Title
Reference-free Evaluation Metrics for Text Generation: A Survey
Reference-free Evaluation Metrics for Text Generation: A Survey
Takumi Ito
Kees van Deemter
Jun Suzuki
ELM
127
2
0
21 Jan 2025
Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model
  with Frozen LLM
Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
Xiong Wang
Yangze Li
Chaoyou Fu
Yunhang Shen
Lei Xie
Ke Li
Xing Sun
Long Ma
AuLLMMLLM
154
40
0
01 Nov 2024
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation
Qinglin Zhang
Luyao Cheng
Chong Deng
Qian Chen
Wen Wang
...
Jiaqing Liu
Hai Yu
Chaohong Tan
Zhihao Du
Shiliang Zhang
SyDaBDLAuLLMVLM
146
20
0
23 Oct 2024
VoiceBench: Benchmarking LLM-Based Voice Assistants
VoiceBench: Benchmarking LLM-Based Voice Assistants
Yiming Chen
Xianghu Yue
Chen Zhang
Xiaoxue Gao
R. Tan
Haoyang Li
ELMAuLLM
118
29
0
22 Oct 2024
Moshi: a speech-text foundation model for real-time dialogue
Moshi: a speech-text foundation model for real-time dialogue
Alexandre Défossez
Laurent Mazaré
Manu Orsini
Amélie Royer
P. Pérez
Hervé Jégou
Edouard Grave
Neil Zeghidour
AuLLM
165
150
0
17 Sep 2024
The T05 System for The VoiceMOS Challenge 2024: Transfer Learning from
  Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic
  Speech
The T05 System for The VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech
Kaito Baba
Wataru Nakata
Yuki Saito
Hiroshi Saruwatari
VLM
109
17
0
14 Sep 2024
Generating Data with Text-to-Speech and Large-Language Models for
  Conversational Speech Recognition
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition
Samuele Cornell
Jordan Darefsky
Zhiyao Duan
Shinji Watanabe
SyDa
93
5
0
17 Aug 2024
HoLLMwood: Unleashing the Creativity of Large Language Models in
  Screenwriting via Role Playing
HoLLMwood: Unleashing the Creativity of Large Language Models in Screenwriting via Role Playing
Jing Chen
Xinyu Zhu
Cheng Yang
Chufan Shi
Yadong Xi
...
Junjie Wang
Jiashu Pu
Rongsheng Zhang
Yujiu Yang
Tian Feng
90
9
0
17 Jun 2024
DreamFrame: Enhancing Video Understanding via Automatically Generated QA and Style-Consistent Keyframes
DreamFrame: Enhancing Video Understanding via Automatically Generated QA and Style-Consistent Keyframes
Zhende Song
Chenchen Wang
Jiamu Sheng
C. Zhang
Gang Yu
Jiayuan Fan
Tao Chen
VGen
91
20
0
03 Mar 2024
Qwen-Audio: Advancing Universal Audio Understanding via Unified
  Large-Scale Audio-Language Models
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models
Yunfei Chu
Jin Xu
Xiaohuan Zhou
Qian Yang
Shiliang Zhang
Zhijie Yan
Chang Zhou
Jingren Zhou
AuLLM
150
351
0
14 Nov 2023
Towards human-like spoken dialogue generation between AI agents from
  written dialogue
Towards human-like spoken dialogue generation between AI agents from written dialogue
Kentaro Mitsui
Yukiya Hono
Kei Sawada
88
14
0
02 Oct 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALMOSLMELM
634
4,460
0
09 Jun 2023
PLACES: Prompting Language Models for Social Conversation Synthesis
PLACES: Prompting Language Models for Social Conversation Synthesis
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Seokhwan Kim
Andrew Rosenbaum
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
107
86
0
07 Feb 2023
SpeechLMScore: Evaluating speech generation using speech language model
SpeechLMScore: Evaluating speech generation using speech language model
Soumi Maiti
Yifan Peng
Takaaki Saeki
Shinji Watanabe
ALM
78
32
0
08 Dec 2022
Robust Speech Recognition via Large-Scale Weak Supervision
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
233
3,780
0
06 Dec 2022
Weakly Supervised Data Augmentation Through Prompting for Dialogue
  Understanding
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Andrew Rosenbaum
Seokhwan Kim
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
110
35
0
25 Oct 2022
Towards a Unified Multi-Dimensional Evaluator for Text Generation
Towards a Unified Multi-Dimensional Evaluator for Text Generation
Ming Zhong
Yang Liu
Da Yin
Yuning Mao
Yizhu Jiao
Peng Liu
Chenguang Zhu
Heng Ji
Jiawei Han
ELM
115
276
0
13 Oct 2022
CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing
CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing
Andrew Rosenbaum
Saleh Soltan
Wael Hamza
Amir Saffari
Macro Damonte
Isabel Groves
100
32
0
13 Oct 2022
Using Large Language Models to Simulate Multiple Humans and Replicate
  Human Subject Studies
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
Gati Aher
RosaI. Arriaga
Adam Tauman Kalai
207
405
0
18 Aug 2022
DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech
DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech
Keon Lee
Kyumin Park
Daeyoung Kim
LM&MA
115
46
0
03 Jul 2022
Open Source MagicData-RAMC: A Rich Annotated Mandarin
  Conversational(RAMC) Speech Dataset
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset
Zehui Yang
Yifan Chen
Lei Luo
Runyan Yang
Lingxuan Ye
...
Yaohui Jin
Qingqing Zhang
Pengyuan Zhang
Lei Xie
Yonghong Yan
69
51
0
31 Mar 2022
STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly
  Voice Agent
STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent
Yuki Saito
Yuto Nishimura
Shinnosuke Takamichi
Kentaro Tachibana
Hiroshi Saruwatari
126
12
0
28 Mar 2022
RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis
RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis
Rohola Zandie
Mohammad H. Mahoor
Julia Madsen
Eshrat S. Emamian
66
25
0
15 Jun 2021
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis
  with Graph-based Multi-modal Context Modeling
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-based Multi-modal Context Modeling
Jingbei Li
Yi Meng
Chenyi Li
Zhiyong Wu
Helen Meng
Chao Weng
Jane Polak Scowcroft
93
24
0
11 Jun 2021
SpeechBrain: A General-Purpose Speech Toolkit
SpeechBrain: A General-Purpose Speech Toolkit
Mirco Ravanelli
Titouan Parcollet
Peter William VanHarn Plantinga
Aku Rouhe
Samuele Cornell
...
William Aris
Hwidong Na
Yan Gao
R. Mori
Yoshua Bengio
141
770
0
08 Jun 2021
Common Voice: A Massively-Multilingual Speech Corpus
Common Voice: A Massively-Multilingual Speech Corpus
Rosana Ardila
Megan Branson
Kelly Davis
Michael Henretty
M. Kohler
Josh Meyer
Reuben Morais
Lindsay Saunders
Francis M. Tyers
Gregor Weber
VLM
120
1,625
0
13 Dec 2019
The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset,
  task and baselines
The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines
Jon Barker
Shinji Watanabe
Emmanuel Vincent
J. Trmal
73
686
0
28 Mar 2018
1