CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech

30 April 2020

Daniel Sáez-Trigueros

Papers citing "CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech"

| Title | Authors | Tag | | | | Date |
|---|---|---|---|---|---|---|
| Creating New Voices using Normalizing Flows | Piotr Bilinski, Thomas Merritt, Abdelhamid Ezzerg, Kamil Pokora, Sebastian Cygert, K. Yanagisawa, Roberto Barra-Chicote, Daniel Korzekwa | | 26 | 17 | 0 | 22 Dec 2023 |
| DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin | Tao Li, Chenxu Hu, Jian Cong, Xinfa Zhu, Jingbei Li, Qiao Tian, Yuping Wang, Linfu Xie | DiffM | 48 | 8 | 0 | 02 Sep 2023 |
| Low-Resource Text-to-Speech Using Specific Data and Noise Augmentation | K. Lakshminarayana, C. Dittmar, N. Pia, Emanuel Habets | | 34 | 0 | 0 | 16 Jun 2023 |
| Do Prosody Transfer Models Transfer Prosody? | A. Sigurgeirsson, Simon King | DiffM | 12 | 7 | 0 | 07 Mar 2023 |
| iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre | Guangyan Zhang, Ying Qin, Wenbo Zhang, Jialun Wu, Mei Li, Yu Gai, Feijun Jiang, Tan Lee | | 50 | 26 | 0 | 29 Jun 2022 |
| CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer | S. Karlapati, Penny Karanasou, Mateusz Lajszczak, Ammar Abbas, Alexis Moinet, Peter Makarov, Raymond Li, Arent van Korlaar, Simon Slangen, Thomas Drugman | | 24 | 15 | 0 | 27 Jun 2022 |
| Disentangleing Content and Fine-grained Prosody Information via Hybrid ASR Bottleneck Features for Voice Conversion | Xintao Zhao, Feng Liu, Changhe Song, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Helen Meng | | 26 | 21 | 0 | 24 Mar 2022 |
| Text-free non-parallel many-to-many voice conversion using normalising flows | Thomas Merritt, Abdelhamid Ezzerg, Piotr Bilinski, Magdalena Proszewska, Kamil Pokora, Roberto Barra-Chicote, Daniel Korzekwa | | 36 | 14 | 0 | 15 Mar 2022 |
| Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module | Adam Gabryś, Goeric Huybrechts, M. Ribeiro, C. Chien, Julian Roth, Giulia Comini, Roberto Barra-Chicote, Bartek Perz, Jaime Lorenzo-Trueba | | 41 | 21 | 0 | 16 Feb 2022 |
| Distribution augmentation for low-resource expressive text-to-speech | Mateusz Lajszczak, Animesh Prasad, Arent van Korlaar, Bajibabu Bollepalli, Antonio Bonafonte, ..., M. Nicolis, Alexis Moinet, Thomas Drugman, Trevor Wood, Elena Sokolova | | 33 | 7 | 0 | 13 Feb 2022 |
| Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios | Qicong Xie, Tao Li, Xinsheng Wang, Zhichao Wang, Lei Xie, Guoqiao Yu, Guanglu Wan | | 32 | 11 | 0 | 23 Dec 2021 |
| Emotional Prosody Control for Speech Generation | S. Sivaprasad, Saiteja Kosgi, Vineet Gandhi | | 12 | 17 | 0 | 07 Nov 2021 |
| Cross-speaker Style Transfer with Prosody Bottleneck in Neural Speech Synthesis | Shifeng Pan, Lei He | | 19 | 22 | 0 | 27 Jul 2021 |
| A Survey on Neural Speech Synthesis | Xu Tan, Tao Qin, Frank Soong, Tie-Yan Liu | AI4TS | 18 | 353 | 0 | 29 Jun 2021 |
| Low-resource expressive text-to-speech using data augmentation | Goeric Huybrechts, Thomas Merritt, Giulia Comini, Bartek Perz, Raahil Shah, Jaime Lorenzo-Trueba | | 26 | 50 | 0 | 11 Nov 2020 |
| Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement | Daxin Tan, Tan Lee | | 29 | 21 | 0 | 08 Nov 2020 |
| Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech | S. Karlapati, Ammar Abbas, Zack Hodari, Alexis Moinet, Arnaud Joly, Panagiota Karanasou, Thomas Drugman | | 25 | 19 | 0 | 04 Nov 2020 |