U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning

6 October 2023

Jian Cong

Lei Xie

Papers citing "U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning"

22 / 22 papers shown

Title
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models Minki Kang Wooseok Han Sung Ju Hwang Eunho Yang DiffM 84 19 0 23 May 2023
Improving Prosody for Cross-Speaker Style Transfer by Semi-Supervised Style Extractor and Hierarchical Modeling in Speech Synthesis Chunyu Qiang Peng Yang Hao Che Ying Zhang Xiaorui Wang Zhong-ming Wang 72 9 0 14 Mar 2023
SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech Byoung Jin Choi Myeonghun Jeong Joun Yeop Lee N. Kim 104 13 0 30 Nov 2022
Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models Minki Kang Dong Min Sung Ju Hwang DiffM 98 50 0 17 Nov 2022
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis Tao Li Xinsheng Wang Qicong Xie Zhichao Wang Ming Jiang Linfu Xie 90 16 0 04 Jul 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone Edresson Casanova Julian Weber C. Shulby Arnaldo Cândido Júnior Eren Golge M. Ponti 238 415 0 04 Dec 2021
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations Hyeong-Seok Choi Juheon Lee W. Kim Jie Hwan Lee Hoon Heo Kyogu Lee 100 158 0 27 Oct 2021
Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation Dong Min Dong Bok Lee Eunho Yang Sung Ju Hwang 125 175 0 06 Jun 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech Vadim Popov Ivan Vovk Vladimir Gogoryan Tasnima Sadekova Mikhail Kudinov DiffM 110 543 0 13 May 2021
SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model Edresson Casanova C. Shulby Eren Golge Nicolas Müller F. S. Oliveira Arnaldo Cândido Júnior A. S. Soares S. Aluísio M. Ponti 58 100 0 02 Apr 2021
Controllable Emotion Transfer For End-to-End Speech Synthesis Tao Li Shan Yang Liumeng Xue Lei Xie 79 74 0 17 Nov 2020
Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset Kun Zhou Berrak Sisman Rui Liu Haizhou Li 89 192 0 28 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong Jaehyeon Kim Jaekyoung Bae 179 1,952 0 12 Oct 2020
Conformer: Convolution-augmented Transformer for Speech Recognition Anmol Gulati James Qin Chung-Cheng Chiu Niki Parmar Yu Zhang ... Wei Han Shibo Wang Zhengdong Zhang Yonghui Wu Ruoming Pang 229 3,164 0 16 May 2020
Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens Rafael Valle Jason Chun Lok Li R. Prenger Bryan Catanzaro 72 149 0 26 Oct 2019
Learning latent representations for style control and transfer in end-to-end speech synthesis Ya-Jie Zhang Shifeng Pan Lei He Zhenhua Ling BDL SSL DRL 71 229 0 11 Dec 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis Wei-Ning Hsu Yu Zhang Ron J. Weiss Heiga Zen Yonghui Wu ... Ye Jia Zhiwen Chen Jonathan Shen Patrick Nguyen Ruoming Pang BDL 77 276 0 16 Oct 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis Ye Jia Yu Zhang Ron J. Weiss Quan Wang Jonathan Shen ... Zhiwen Chen Patrick Nguyen Ruoming Pang Ignacio López Moreno Yonghui Wu 266 837 0 12 Jun 2018
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning Ming-Yu Liu Kainan Peng Andrew Gibiansky Sercan O. Arik Ajay Kannan Sharan Narang Jonathan Raiman John Miller 79 309 0 20 Oct 2017
Tacotron: Towards End-to-End Speech Synthesis Yuxuan Wang RJ Skerry-Ryan Daisy Stanton Yonghui Wu Ron J. Weiss ... Samy Bengio Quoc V. Le Yannis Agiomyrgiannakis R. Clark Rif A. Saurous 173 1,833 0 29 Mar 2017
Instance Normalization: The Missing Ingredient for Fast Stylization Dmitry Ulyanov Andrea Vedaldi Victor Lempitsky OOD 180 3,714 0 27 Jul 2016
Unsupervised Domain Adaptation by Backpropagation Yaroslav Ganin Victor Lempitsky OOD 258 6,043 0 26 Sep 2014