2207.01063
Cited By
v3 (latest)
DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech
3 July 2022
Keon Lee
Kyumin Park
Daeyoung Kim
LM&MA
Papers citing "DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech"
32 / 32 papers shown
Title
Do We Still Need Audio? Rethinking Speaker Diarization with a Text-Based Approach Using Multiple Prediction Models
Peilin Wu
Jinho Choi
18
0
0
12 Jun 2025
DialogueAgents: A Hybrid Agent-Based Speech Synthesis Framework for Multi-Party Dialogue
Xuzhao Li
Duyi Pan
Hongru Xiao
Jiawei Han
Jing Tang
Jiabao Ma
Wenjie Wang
Bo Cheng
68
1
0
20 Apr 2025
SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development
Minghan Wang
Ye Bai
Yanjie Wang
Thuy-Trang Vu
Ehsan Shareghi
Gholamreza Haffari
97
0
0
31 Mar 2025
Contextual Speech Extraction: Leveraging Textual History as an Implicit Cue for Target Speech Extraction
Minsu Kim
Rodrigo Mira
Honglie Chen
Stavros Petridis
Maja Pantic
115
0
0
13 Mar 2025
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens
Xiang Wang
Mingqi Jiang
Zejun Ma
Ziyu Zhang
Shixuan Liu
...
Zhifei Li
Xie Chen
Lei Xie
Yu Guo
Wei Xue
132
22
0
03 Mar 2025
DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models
Weihao Wu
Zhiwei Lin
Yixuan Zhou
Jingbei Li
Rui Niu
Qinghua Wu
Songjun Cao
Long Ma
Zhiyong Wu
DiffM
84
0
0
27 Feb 2025
DeSTA2: Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data
Ke-Han Lu
Zhehuai Chen
Szu-Wei Fu
Chao-Han Huck Yang
Jagadeesh Balam
Boris Ginsburg
Yu-Te Wang
Hung-yi Lee
AuLLM
SyDa
170
16
0
28 Jan 2025
Retrieval-Augmented Dialogue Knowledge Aggregation for Expressive Conversational Speech Synthesis
Rui Liu
Zhenqi Jia
F. Bao
Hong Li
77
2
0
11 Jan 2025
OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios
Xize Cheng
Dongjie Fu
Xiaoda Yang
Minghui Fang
Ruofan Hu
...
Rongjie Huang
Linjun Li
Yu Chen
Tao Jin
Zhou Zhao
123
1
0
03 Jan 2025
Intra- and Inter-modal Context Interaction Modeling for Conversational Speech Synthesis
Zhenqi Jia
Rui Liu
68
1
0
25 Dec 2024
Emphasis Rendering for Conversational Text-to-Speech with Multi-modal Multi-scale Context Modeling
Rui Liu
Zhenqi Jia
Jie Yang
Yifan Hu
Hong Li
96
2
0
12 Oct 2024
Beyond Single-Audio: Advancing Multi-Audio Processing in Audio Large Language Models
Yiming Chen
Xianghu Yue
Xiaoxue Gao
Chen Zhang
L. F. D’Haro
R. Tan
Haizhou Li
AuLLM
139
2
0
27 Sep 2024
Improving Robustness of Diffusion-Based Zero-Shot Speech Synthesis via Stable Formant Generation
C. Han
Seokgi Lee
Gyuhyeon Nam
Gyeongsu Chae
DiffM
464
0
0
14 Sep 2024
PRESENT: Zero-Shot Text-to-Prosody Control
Perry Lam
Huayun Zhang
Nancy F. Chen
Berrak Sisman
Dorien Herremans
83
0
0
13 Aug 2024
Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation
Yinghao Aaron Li
Xilin Jiang
Jordan Darefsky
Ge Zhu
N. Mesgarani
92
4
0
13 Aug 2024
Generative Expressive Conversational Speech Synthesis
Rui Liu
Yifan Hu
Yi Ren
Xiang Yin
Haizhou Li
119
6
0
31 Jul 2024
J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling
Wataru Nakata
Kentaro Seki
Hitomi Yanaka
Yuki Saito
Shinnosuke Takamichi
Hiroshi Saruwatari
AuLLM
60
2
0
22 Jul 2024
Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset
Rui Liu
Haolin Zuo
Zheng Lian
Xiaofen Xing
Björn W. Schuller
Haizhou Li
95
6
0
03 Jul 2024
Can LLMs Understand the Implication of Emphasized Sentences in Dialogue?
Guan-Ting Lin
Hung-yi Lee
94
6
0
16 Jun 2024
DisfluencySpeech -- Single-Speaker Conversational Speech Dataset with Paralanguage
Kyra Wang
Dorien Herremans
112
0
0
13 Jun 2024
Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation
Se Jin Park
Chae Won Kim
Hyeongseop Rha
Minsu Kim
Joanna Hong
Jeong Hun Yeo
Yong Man Ro
CVBM
AuLLM
91
14
0
12 Jun 2024
Multi-Sample Dynamic Time Warping for Few-Shot Keyword Spotting
Kevin Wilkinghoff
Alessia Cornaggia
75
0
0
23 Apr 2024
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
Leying Zhang
Yao Qian
Long Zhou
Shujie Liu
Dongmei Wang
...
Yanmin Qian
Jinyu Li
Lei He
Sheng Zhao
Michael Zeng
63
2
0
10 Apr 2024
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling
Rui Liu
Yifan Hu
Yi Ren
Xiang Yin
Haizhou Li
97
19
0
19 Dec 2023
CONCSS: Contrastive-based Context Comprehension for Dialogue-appropriate Prosody in Conversational Speech Synthesis
Yayue Deng
Jinlong Xue
Yukang Jia
Qifei Li
Yichen Han
Fengping Wang
Yingming Gao
Dengfeng Ke
Ya Li
89
7
0
16 Dec 2023
UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions
Siddhant Arora
Hayato Futami
Jee-weon Jung
Yifan Peng
Roshan S. Sharma
Yosuke Kashiwagi
E. Tsunoo
Karen Livescu
Shinji Watanabe
ELM
68
9
0
04 Oct 2023
Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech
Rui Liu
Bin Liu
Haizhou Li
53
3
0
21 Sep 2023
ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios
Yuyue Wang
Huanhou Xiao
Yihan Wu
Ruihua Song
36
0
0
20 May 2023
TACos: Learning Temporally Structured Embeddings for Few-Shot Keyword Spotting with Dynamic Time Warping
Kevin Wilkinghoff
Alessia Cornaggia
AI4TS
74
2
0
18 May 2023
M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis
Jinlong Xue
Yayue Deng
Fengping Wang
Ya Li
Yingming Gao
J. Tao
Jianqing Sun
Jiaen Liang
57
10
0
03 May 2023
Cross-speaker Emotion Transfer by Manipulating Speech Style Latents
Suhee Jo
Younggun Lee
Yookyung Shin
Yeongtae Hwang
Taesu Kim
45
4
0
15 Mar 2023
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis
Yifan Hu
Rui Liu
Guanglai Gao
Haizhou Li
383
8
0
27 Oct 2022