ResearchTrend.AI
DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning

29 March 2022
Takaaki Saeki
Kentaro Tachibana
Ryuichi Yamamoto
arXiv: 2203.15683

Papers citing "DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning"

11 / 11 papers shown
Title
JTubeSpeech: corpus of Japanese speech collected from YouTube for speech recognition and speaker verification
Shinnosuke Takamichi
Ludwig Kürzinger
Takaaki Saeki
Sayaka Shiota
Shinji Watanabe
38
23
0
17 Dec 2021
Environment Aware Text-to-Speech Synthesis
Daxin Tan
Guangyan Zhang
Tan Lee
45
4
0
08 Oct 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Jaehyeon Kim
Jungil Kong
Juhee Son
DRL
122
884
0
11 Jun 2021
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
105
1,396
0
08 Jun 2020
WHAM!: Extending Speech Separation to Noisy Environments
Gordon Wichern
J. Antognini
Michael Flynn
Licheng Richard Zhu
E. McQuinn
Dwight Crow
Ethan Manilow
Jonathan Le Roux
82
345
0
02 Jul 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
104
954
0
05 Apr 2019
Hierarchical Generative Modeling for Controllable Speech Synthesis
Wei-Ning Hsu
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
...
Ye Jia
Zhiwen Chen
Jonathan Shen
Patrick Nguyen
Ruoming Pang
BDL
72
275
0
16 Oct 2018
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
Yi Luo
N. Mesgarani
159
1,787
0
20 Sep 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
79
2,698
0
16 Dec 2017
Pyroomacoustics: A Python package for audio room simulations and array processing algorithms
Robin Scheibler
Eric Bezzam
Ivan Dokmanić
59
517
0
11 Oct 2017
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
...
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
160
1,825
0
29 Mar 2017