DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level
and Utterance-Level Acoustic Representation Learning

v1v2 (latest)

DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning

29 March 2022

Kentaro Tachibana

Ryuichi Yamamoto

ArXiv (abs)PDF HTML

Papers citing "DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning"

11 / 11 papers shown

Title
JTubeSpeech: corpus of Japanese speech collected from YouTube for speech recognition and speaker verification Shinnosuke Takamichi Ludwig Kurzinger Takaaki Saeki Sayaka Shiota Shinji Watanabe 38 23 0 17 Dec 2021
Environment Aware Text-to-Speech Synthesis Daxin Tan Guangyan Zhang Tan Lee 45 4 0 08 Oct 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech Jaehyeon Kim Jungil Kong Juhee Son DRL 122 884 0 11 Jun 2021
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech Yi Ren Chenxu Hu Xu Tan Tao Qin Sheng Zhao Zhou Zhao Tie-Yan Liu 105 1,396 0 08 Jun 2020
WHAM!: Extending Speech Separation to Noisy Environments Gordon Wichern J. Antognini Michael Flynn Licheng Richard Zhu E. McQuinn Dwight Crow Ethan Manilow Jonathan Le Roux 82 345 0 02 Jul 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech Heiga Zen Viet Dang R. Clark Yu Zhang Ron J. Weiss Ye Jia Zhiwen Chen Yonghui Wu 104 954 0 05 Apr 2019
Hierarchical Generative Modeling for Controllable Speech Synthesis Wei-Ning Hsu Yu Zhang Ron J. Weiss Heiga Zen Yonghui Wu ... Ye Jia Zhiwen Chen Jonathan Shen Patrick Nguyen Ruoming Pang BDL 72 275 0 16 Oct 2018
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation Yi Luo N. Mesgarani 159 1,787 0 20 Sep 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions Jonathan Shen Ruoming Pang Ron J. Weiss M. Schuster Navdeep Jaitly ... Yuxuan Wang RJ Skerry-Ryan Rif A. Saurous Yannis Agiomyrgiannakis Yonghui Wu 79 2,698 0 16 Dec 2017
Pyroomacoustics: A Python package for audio room simulations and array processing algorithms Robin Scheibler Eric Bezzam Ivan Dokmanić 59 517 0 11 Oct 2017
Tacotron: Towards End-to-End Speech Synthesis Yuxuan Wang RJ Skerry-Ryan Daisy Stanton Yonghui Wu Ron J. Weiss ... Samy Bengio Quoc V. Le Yannis Agiomyrgiannakis R. Clark Rif A. Saurous 160 1,825 0 29 Mar 2017