iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre

29 June 2022

Papers citing "iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre"

Title
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector Deok-Hyeon Cho Hyung-Seok Oh Seung-Bin Kim Seong-Whan Lee 106 8 0 04 Nov 2024
Text-driven Emotional Style Control and Cross-speaker Style Transfer in Neural TTS Yookyung Shin Younggun Lee Suhee Jo Yeongtae Hwang Taesu Kim 63 14 0 13 Jul 2022
Disentangling Style and Speaker Attributes for TTS Style Transfer Xiaochun An Frank Soong Lei Xie 121 18 0 24 Jan 2022
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech Pengfei Wu Junjie Pan Chenchang Xu Junhui Zhang Lin Wu Xiang Yin Zejun Ma 33 16 0 08 Oct 2021
A study on the efficacy of model pre-training in developing neural text-to-speech system Guangyan Zhang Yichong Leng Daxin Tan Ying Qin Kaitao Song Xu Tan Sheng Zhao Tan Lee 47 2 0 08 Oct 2021
Cross-speaker emotion disentangling and transfer for end-to-end speech synthesis Tao Li Xinsheng Wang Qicong Xie Zhichao Wang Linfu Xie 46 47 0 14 Sep 2021
Applying the Information Bottleneck Principle to Prosodic Representation Learning Guangyan Zhang Ying Qin Daxin Tan Tan Lee 57 4 0 05 Aug 2021
Cross-speaker Style Transfer with Prosody Bottleneck in Neural Speech Synthesis Shifeng Pan Lei He 62 23 0 27 Jul 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 93 359 0 29 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech Jaehyeon Kim Jungil Kong Juhee Son DRL 114 882 0 11 Jun 2021
Controllable Emotion Transfer For End-to-End Speech Synthesis Tao Li Shan Yang Liumeng Xue Lei Xie 56 74 0 17 Nov 2020
Fine-grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis Yinjiao Lei Shan Yang Lei Xie 60 56 0 17 Nov 2020
Emotion controllable speech synthesis using emotion-unlabeled dataset with the assistance of cross-domain speech emotion recognition Xiong Cai Dongyang Dai Zhiyong Wu Xiang Li Jingbei Li Helen Meng 41 67 0 26 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong Jaehyeon Kim Jaekyoung Bae 177 1,931 0 12 Oct 2020
Controllable neural text-to-speech synthesis using intuitive prosodic features T. Raitio Ramya Rasipuram D. Castellani 56 66 0 14 Sep 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction Adrian Łańcucki 68 340 0 11 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech Yi Ren Chenxu Hu Xu Tan Tao Qin Sheng Zhao Zhou Zhao Tie-Yan Liu 105 1,396 0 08 Jun 2020
CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech S. Karlapati Alexis Moinet Arnaud Joly V. Klimkov Daniel Sáez-Trigueros Thomas Drugman 36 67 0 30 Apr 2020
Unsupervised Speech Decomposition via Triple Information Bottleneck Kaizhi Qian Yang Zhang Shiyu Chang David D. Cox M. Hasegawa-Johnson 77 184 0 23 Apr 2020
Emotional speech synthesis with rich and granularized control Seyun Um Sangshin Oh Kyungguen Byun Inseon Jang C. Ahn Hong-Goo Kang 52 90 0 05 Nov 2019
Multi-Reference Neural TTS Stylization with Adversarial Cycle Consistency M. Whitehill Shuang Ma Daniel J. McDuff Yale Song 66 35 0 25 Oct 2019
End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training Peng Wu Zhenhua Ling Li-Juan Liu Yuan Jiang Hong-Chuan Wu Lirong Dai 42 72 0 26 Jun 2019
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss Kaizhi Qian Yang Zhang Shiyu Chang Xuesong Yang M. Hasegawa-Johnson 78 465 0 14 May 2019
Multi-reference Tacotron by Intercross Training for Style Disentangling, Transfer and Control in Speech Synthesis Yanyao Bian Changbin Chen Yongguo Kang Zhenglin Pan 42 46 0 04 Apr 2019
Exploring Transfer Learning for Low Resource Emotional TTS Noé Tits Kevin El Haddad Thierry Dutoit 48 61 0 14 Jan 2019
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis Ye Jia Yu Zhang Ron J. Weiss Quan Wang Jonathan Shen ... Zhiwen Chen Patrick Nguyen Ruoming Pang Ignacio López Moreno Yonghui Wu 254 830 0 12 Jun 2018
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron RJ Skerry-Ryan Eric Battenberg Y. Xiao Yuxuan Wang Daisy Stanton Joel Shor Ron J. Weiss R. Clark Rif A. Saurous 54 554 0 24 Mar 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis Yuxuan Wang Daisy Stanton Yu Zhang RJ Skerry-Ryan Eric Battenberg Joel Shor Y. Xiao Fei Ren Ye Jia Rif A. Saurous 66 826 0 23 Mar 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions Jonathan Shen Ruoming Pang Ron J. Weiss M. Schuster Navdeep Jaitly ... Yuxuan Wang RJ Skerry-Ryan Rif A. Saurous Yannis Agiomyrgiannakis Yonghui Wu 79 2,697 0 16 Dec 2017
Emotional End-to-End Neural Speech Synthesizer Younggun Lee Azam Rabiee Soo-Young Lee 56 106 0 15 Nov 2017
Neural Discrete Representation Learning Aaron van den Oord Oriol Vinyals Koray Kavukcuoglu BDL SSL OCL 226 5,008 0 02 Nov 2017
Generalized End-to-End Loss for Speaker Verification Li Wan Quan Wang Alan Papir Ignacio López Moreno VLM 68 926 0 28 Oct 2017
Attention Is All You Need Ashish Vaswani Noam M. Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan Gomez Lukasz Kaiser Illia Polosukhin 3DV 692 131,526 0 12 Jun 2017
Deep Voice 2: Multi-Speaker Neural Text-to-Speech Sercan O. Arik G. Diamos Andrew Gibiansky John Miller Kainan Peng Ming-Yu Liu Jonathan Raiman Yanqi Zhou 72 496 0 24 May 2017
Tacotron: Towards End-to-End Speech Synthesis Yuxuan Wang RJ Skerry-Ryan Daisy Stanton Yonghui Wu Ron J. Weiss ... Samy Bengio Quoc V. Le Yannis Agiomyrgiannakis R. Clark Rif A. Saurous 155 1,823 0 29 Mar 2017
Categorical Reparameterization with Gumbel-Softmax Eric Jang S. Gu Ben Poole BDL 317 5,364 0 03 Nov 2016
The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables Chris J. Maddison A. Mnih Yee Whye Teh BDL 186 2,531 0 02 Nov 2016
Perceptual Losses for Real-Time Style Transfer and Super-Resolution Justin Johnson Alexandre Alahi Li Fei-Fei SupR 234 10,247 0 27 Mar 2016
Domain-Adversarial Training of Neural Networks Yaroslav Ganin E. Ustinova Hana Ajakan Pascal Germain Hugo Larochelle François Laviolette M. Marchand Victor Lempitsky GAN OOD 372 9,486 0 28 May 2015
Adam: A Method for Stochastic Optimization Diederik P. Kingma Jimmy Ba ODL 1.8K 150,039 0 22 Dec 2014