arXiv: 2305.02269
M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis
3 May 2023
Jinlong Xue
Yayue Deng
Fengping Wang
Ya Li
Yingming Gao
J. Tao
Jianqing Sun
Jiaen Liang
Papers citing "M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis"
7 / 7 papers shown
CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching
Leying Zhang
Y. Qian
Xiaofei Wang
Manthan Thakker
Dongmei Wang
...
Haibin Wu
Yuxuan Hu
Jinyu Li
Yanmin Qian
Sheng Zhao
35
0
0
01 Jun 2025
Retrieval Augmented Generation in Prompt-based Text-to-Speech Synthesis with Context-Aware Contrastive Language-Audio Pretraining
Jinlong Xue
Yayue Deng
Yingming Gao
Ya Li
RALM
VLM
136
7
0
06 Jun 2024
Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model
Jinlong Xue
Yayue Deng
Yicheng Han
Yingming Gao
Ya Li
95
4
0
06 Jun 2024
Pheme: Efficient and Conversational Speech Generation
Paweł Budzianowski
Taras Sereda
Tomasz Cichy
Ivan Vulić
78
7
0
05 Jan 2024
CONCSS: Contrastive-based Context Comprehension for Dialogue-appropriate Prosody in Conversational Speech Synthesis
Yayue Deng
Jinlong Xue
Yukang Jia
Qifei Li
Yichen Han
Fengping Wang
Yingming Gao
Dengfeng Ke
Ya Li
89
7
0
16 Dec 2023
Towards human-like spoken dialogue generation between AI agents from written dialogue
Kentaro Mitsui
Yukiya Hono
Kei Sawada
88
14
0
02 Oct 2023
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Yinghao Aaron Li
Cong Han
Vinay S. Raghavan
Gavin Mischler
N. Mesgarani
VLM
DiffM
145
126
0
13 Jun 2023