MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis

17 January 2022

Yinjiao Lei, Shan Yang, Xinsheng Wang, Lei Xie


Papers citing "MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis"


Title
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector | Deok-Hyeon Cho, Hyung-Seok Oh, Seung-Bin Kim, Seong-Whan Lee | 59 7 0 | 04 Nov 2024
Towards Multi-Scale Style Control for Expressive Speech Synthesis | Xiang Li, Changhe Song, Jingbei Li, Zhiyong Wu, Jia Jia, Helen Meng | 30 47 0 | 08 Apr 2021
AdaSpeech: Adaptive Text to Speech for Custom Voice | Mingjian Chen, Xu Tan, Bohan Li, Yanqing Liu, Tao Qin, Sheng Zhao, Tie-Yan Liu | VLM DiffM 60 189 0 | 01 Mar 2021
Controllable Emotion Transfer For End-to-End Speech Synthesis | Tao Li, Shan Yang, Liumeng Xue, Lei Xie | 33 73 0 | 17 Nov 2020
Fine-grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis | Yinjiao Lei, Shan Yang, Lei Xie | 35 55 0 | 17 Nov 2020
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis | C. Chien, Hung-yi Lee | 37 36 0 | 12 Nov 2020
Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement | Daxin Tan, Tan Lee | 58 21 0 | 08 Nov 2020
Emotion controllable speech synthesis using emotion-unlabeled dataset with the assistance of cross-domain speech emotion recognition | Xiong Cai, Dongyang Dai, Zhiyong Wu, Xiang Li, Jingbei Li, Helen Meng | 32 66 0 | 26 Oct 2020
Unsupervised Style and Content Separation by Minimizing Mutual Information for Speech Synthesis | Ting-Yao Hu, A. Shrivastava, Oncel Tuzel, C. Dhir | 19 30 0 | 09 Mar 2020
Emotional speech synthesis with rich and granularized control | Seyun Um, Sangshin Oh, Kyungguen Byun, Inseon Jang, C. Ahn, Hong-Goo Kang | 8 89 0 | 05 Nov 2019
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis | Eric Battenberg, RJ Skerry-Ryan, Soroosh Mariooryad, Daisy Stanton, David Kao, Matt Shannon, Tom Bagby | 43 114 0 | 23 Oct 2019
Fine-grained robust prosody transfer for single-speaker neural text-to-speech | V. Klimkov, S. Ronanki, Jonas Rohnke, Thomas Drugman | AI4TS 29 82 0 | 04 Jul 2019
End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training | Peng Wu, Zhenhua Ling, Li-Juan Liu, Yuan Jiang, Hong-Chuan Wu, Lirong Dai | 16 72 0 | 26 Jun 2019
Adjusting Pleasure-Arousal-Dominance for Continuous Emotional Text-to-speech Synthesizer | Azam Rabiee, Tae-Ho Kim, Soo-Young Lee | 6 10 0 | 13 Jun 2019
CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network | V. Wan, Chun-an Chan, Tom Kenter, Jakub Vít, R. Clark | 29 75 0 | 17 May 2019
Learning latent representations for style control and transfer in end-to-end speech synthesis | Ya-Jie Zhang, Shifeng Pan, Lei He, Zhenhua Ling | BDL SSL DRL 30 227 0 | 11 Dec 2018
Robust and fine-grained prosody control of end-to-end speech synthesis | Younggun Lee, Jonathan Le Roux | 29 147 0 | 06 Nov 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis | Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, ..., Ye Jia, Zhiwen Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang | BDL 34 275 0 | 16 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova | VLM SSL SSeg 753 93,936 0 | 11 Oct 2018
Predicting Expressive Speaking Style From Text In End-To-End Speech Synthesis | Daisy Stanton, Yuxuan Wang, RJ Skerry-Ryan | 34 122 0 | 04 Aug 2018
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron | RJ Skerry-Ryan, Eric Battenberg, Y. Xiao, Yuxuan Wang, Daisy Stanton, Joel Shor, Ron J. Weiss, R. Clark, Rif A. Saurous | 37 550 0 | 24 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis | Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Y. Xiao, Fei Ren, Ye Jia, Rif A. Saurous | 52 822 0 | 23 Mar 2018
Efficient Neural Audio Synthesis | Nal Kalchbrenner, Erich Elsen, Karen Simonyan, Seb Noury, Norman Casagrande, Edward Lockhart, Florian Stimberg, Aaron van den Oord, Sander Dieleman, Koray Kavukcuoglu | 61 866 0 | 23 Feb 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions | Jonathan Shen, Ruoming Pang, Ron J. Weiss, M. Schuster, Navdeep Jaitly, ..., Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu | 59 2,684 0 | 16 Dec 2017
Emotional End-to-End Neural Speech Synthesizer | Younggun Lee, Azam Rabiee, Soo-Young Lee | 39 105 0 | 15 Nov 2017
Tacotron: Towards End-to-End Speech Synthesis | Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, ..., Samy Bengio, Quoc V. Le, Yannis Agiomyrgiannakis, R. Clark, Rif A. Saurous | 120 1,817 0 | 29 Mar 2017