Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody

29 June 2022

Papers citing "Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody"

Title
CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer S. Karlapati Penny Karanasou Mateusz Lajszczak Ammar Abbas Alexis Moinet Peter Makarov Raymond Li Arent van Korlaar Simon Slangen Thomas Drugman 53 15 0 27 Jun 2022
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS Ye Jia Heiga Zen Jonathan Shen Yu Zhang Yonghui Wu SSL 85 84 0 28 Mar 2021
Universal Neural Vocoding with Parallel WaveNet Yunlong Jiao Adam Gabryś Georgi Tinchev Bartosz Putrycz Daniel Korzekwa V. Klimkov 73 42 0 01 Feb 2021
Using previous acoustic context to improve Text-to-Speech synthesis Pilar Oplustil Gallegos Simon King 62 11 0 07 Dec 2020
s-Transformer: Segment-Transformer for Robust Neural Speech Synthesis Xi Wang Huaiping Ming Lei He Frank Soong 33 5 0 17 Nov 2020
Improving Prosody Modelling with Cross-Utterance BERT Embeddings for End-to-end Speech Synthesis Guanghui Xu Wei Song Zhengchen Zhang Chao Zhang Xiaodong He Bowen Zhou 47 50 0 06 Nov 2020
MultiSpeech: Multi-Speaker Text to Speech with Transformer Mingjian Chen Xu Tan Yi Ren Jin Xu Hao Sun Sheng Zhao Tao Qin Tie-Yan Liu 65 110 0 08 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech Yi Ren Chenxu Hu Xu Tan Tao Qin Sheng Zhao Zhou Zhao Tie-Yan Liu 105 1,396 0 08 Jun 2020
Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs R. Clark Hanna Silén Tom Kenter Ralph Leith ELM 53 45 0 09 Sep 2019
DurIAN: Duration Informed Attention Network For Multimodal Synthesis Chengzhu Yu Heng Lu Na Hu Meng Yu Chao Weng ... Deyi Tuo Shiyin Kang Guangzhi Lei Dan Su Dong Yu CVBM 48 118 0 04 Sep 2019
Towards Transfer Learning for End-to-End Speech Synthesis from Deep Pre-Trained Language Models Wei Fang Yu-An Chung James R. Glass 51 27 0 17 Jun 2019
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data N. Prateek Mateusz Lajszczak Roberto Barra-Chicote Thomas Drugman Jaime Lorenzo-Trueba Thomas Merritt S. Ronanki Trevor Wood 41 30 0 04 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.8K 94,891 0 11 Oct 2018
An Empirical Analysis of the Correlation of Syntax and Prosody Arne Köhn Timo Baumann Oskar Dörfler 42 11 0 15 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis Ye Jia Yu Zhang Ron J. Weiss Quan Wang Jonathan Shen ... Zhiwen Chen Patrick Nguyen Ruoming Pang Ignacio López Moreno Yonghui Wu 256 830 0 12 Jun 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions Jonathan Shen Ruoming Pang Ron J. Weiss M. Schuster Navdeep Jaitly ... Yuxuan Wang RJ Skerry-Ryan Rif A. Saurous Yannis Agiomyrgiannakis Yonghui Wu 79 2,698 0 16 Dec 2017
Generalized End-to-End Loss for Speaker Verification Li Wan Quan Wang Alan Papir Ignacio López Moreno VLM 68 927 0 28 Oct 2017
Attention Is All You Need Ashish Vaswani Noam M. Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan Gomez Lukasz Kaiser Illia Polosukhin 3DV 701 131,652 0 12 Jun 2017
Deep Voice 2: Multi-Speaker Neural Text-to-Speech Sercan O. Arik G. Diamos Andrew Gibiansky John Miller Kainan Peng Ming-Yu Liu Jonathan Raiman Yanqi Zhou 72 496 0 24 May 2017
Tacotron: Towards End-to-End Speech Synthesis Yuxuan Wang RJ Skerry-Ryan Daisy Stanton Yonghui Wu Ron J. Weiss ... Samy Bengio Quoc V. Le Yannis Agiomyrgiannakis R. Clark Rif A. Saurous 160 1,825 0 29 Mar 2017