Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder



7 November 2022

J. Melechovský

Dorien Herremans


Papers citing "Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder"


Title | Authors | Tags | Counts | Date
Controllable Accented Text-to-Speech Synthesis | Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li | - | 74 6 0 | 22 Sep 2022
Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data | Alena Aksenova, Zhehuai Chen, Chung-Cheng Chiu, D. Esch, Pavel Golik, ..., Levi King, Bhuvana Ramabhadran, Andrew Rosenberg, Suzan Schwartz, Gary Wang | - | 100 23 0 | 16 May 2022
Layer-wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition | Xun Gong, Y. Qian, Houjun Huang, Yanmin Qian | - | 69 46 0 | 21 Apr 2022
Cross-speaker style transfer for text-to-speech using data augmentation | M. Ribeiro, Julian Roth, Giulia Comini, Goeric Huybrechts, Adam Gabryś, Jaime Lorenzo-Trueba | - | 64 21 0 | 10 Feb 2022
One TTS Alignment To Rule Them All | Rohan Badlani, A. Lancucki, Kevin J. Shih, Rafael Valle, Ming-Yu Liu, Bryan Catanzaro | - | 71 84 0 | 23 Aug 2021
Accent and Speaker Disentanglement in Many-to-many Voice Conversion | Zhichao Wang, Wenshuo Ge, Xiong Wang, Shan Yang, Wendong Gan, Haitao Chen, Hai Li, Lei Xie, Xiulin Li | CVBM | 78 33 0 | 17 Nov 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning | Berrak Sisman, Junichi Yamagishi, Simon King, Haizhou Li | BDL | 109 323 0 | 09 Aug 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu | - | 105 1,401 0 | 08 Jun 2020
Learning latent representations for style control and transfer in end-to-end speech synthesis | Ya-Jie Zhang, Shifeng Pan, Lei He, Zhenhua Ling | BDL SSL DRL | 53 229 0 | 11 Dec 2018
Hierarchical Generative Modeling for Controllable Speech Synthesis | Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, ..., Ye Jia, Zhiwen Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang | BDL | 72 275 0 | 16 Oct 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis | Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Y. Xiao, Fei Ren, Ye Jia, Rif A. Saurous | - | 66 826 0 | 23 Mar 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions | Jonathan Shen, Ruoming Pang, Ron J. Weiss, M. Schuster, Navdeep Jaitly, ..., Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu | - | 79 2,701 0 | 16 Dec 2017
Tacotron: Towards End-to-End Speech Synthesis | Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, ..., Samy Bengio, Quoc V. Le, Yannis Agiomyrgiannakis, R. Clark, Rif A. Saurous | - | 160 1,826 0 | 29 Mar 2017
WaveNet: A Generative Model for Raw Audio | Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, A. Senior, Koray Kavukcuoglu | DiffM | 406 7,405 0 | 12 Sep 2016
Neural Machine Translation by Jointly Learning to Align and Translate | Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio | AIMat | 573 27,311 0 | 01 Sep 2014