DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training

31 July 2023

Papers citing "DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training"

9 / 9 papers shown

Title
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation Ji-Hoon Kim Hong-Sun Yang Yoon-Cheol Ju Il-Hwan Kim Byeong-Yeol Kim Joon Son Chung BDL 54 0 0 31 Dec 2024
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector Deok-Hyeon Cho Hyung-Seok Oh Seung-Bin Kim Seong-Whan Lee 46 4 0 04 Nov 2024
ParaTTS: Learning Linguistic and Prosodic Cross-sentence Information in Paragraph-based TTS Liumeng Xue Frank Soong Shaofei Zhang Linfu Xie 27 23 0 14 Sep 2022
Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems Hyun-Wook Yoon Ohsung Kwon Hoyeon Lee Ryuichi Yamamoto Eunwoo Song Jae-Min Kim Min-Jae Hwang 31 15 0 30 Jun 2022
Diffusion-LM Improves Controllable Text Generation Xiang Lisa Li John Thickstun Ishaan Gulrajani Percy Liang Tatsunori B. Hashimoto AI4CE 173 777 0 27 May 2022
Revisiting Over-Smoothness in Text to Speech Yi Ren Xu Tan Tao Qin Zhou Zhao Tie-Yan Liu 70 61 0 26 Feb 2022
Disentangling Style and Speaker Attributes for TTS Style Transfer Xiaochun An Frank Soong Lei Xie 56 18 0 24 Jan 2022
Emphasis control for parallel neural TTS Shreyas Seshadri T. Raitio D. Castellani Jiangchuan Li 52 11 0 06 Oct 2021
Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS T. Raitio Jiangchuan Li Shreyas Seshadri 32 22 0 06 Oct 2021