ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.01080
  4. Cited By
A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based
  Non-Autoregressive TTS
v1v2 (latest)

A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS

2 March 2022
Haohan Guo
Hui Lu
Xixin Wu
Helen Meng
ArXiv (abs)PDFHTML

Papers citing "A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS"

21 / 21 papers shown
Title
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
Xuyuan Li
Zengqiang Shang
Hua Hua
Peiyang Shi
Chen Yang
Li Wang
Pengyuan Zhang
136
3
0
16 Oct 2024
VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive
  Text-to-Speech Synthesis
VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis
Hui Lu
Zhiyong Wu
Xixin Wu
Xu Li
Shiyin Kang
Xunying Liu
Helen Meng
58
12
0
07 Jul 2021
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech
  Synthesis
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis
Jinhyeok Yang
Jaesung Bae
Taejun Bak
Young-Ik Kim
Hoon-Young Cho
125
37
0
29 Jun 2021
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram
  Discriminators for High-Fidelity Waveform Generation
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
Won Jang
D. Lim
Jaesam Yoon
Bongwan Kim
Juntae Kim
97
132
0
15 Jun 2021
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech
Myeonghun Jeong
Hyeongju Kim
Sung Jun Cheon
Byoung Jin Choi
N. Kim
DiffM
65
197
0
03 Apr 2021
Parallel Tacotron: Non-Autoregressive and Controllable TTS
Parallel Tacotron: Non-Autoregressive and Controllable TTS
Isaac Elias
Heiga Zen
Jonathan Shen
Yu Zhang
Ye Jia
Ron J. Weiss
Yonghui Wu
DRL
68
103
0
22 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High
  Fidelity Speech Synthesis
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
179
1,952
0
12 Oct 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
105
1,410
0
08 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment
  Search
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
105
497
0
22 May 2020
A U-Net Based Discriminator for Generative Adversarial Networks
A U-Net Based Discriminator for Generative Adversarial Networks
Edgar Schönfeld
Bernt Schiele
Anna Khoreva
GAN
83
296
0
28 Feb 2020
High-quality Speech Synthesis Using Super-resolution Mel-Spectrogram
High-quality Speech Synthesis Using Super-resolution Mel-Spectrogram
Leyuan Sheng
Dong-Yan Huang
Evgeny Nikolaevich Pavlovskiy
74
15
0
03 Dec 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform
  Synthesis
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
165
958
0
08 Oct 2019
On the Variance of the Adaptive Learning Rate and Beyond
On the Variance of the Adaptive Learning Rate and Beyond
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
ODL
294
1,908
0
08 Aug 2019
Lookahead Optimizer: k steps forward, 1 step back
Lookahead Optimizer: k steps forward, 1 step back
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
157
732
0
19 Jul 2019
A New GAN-based End-to-End TTS Training Algorithm
A New GAN-based End-to-End TTS Training Algorithm
Haohan Guo
Frank Soong
Lei He
Lei Xie
67
47
0
09 Apr 2019
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram
  Predictions
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
85
2,704
0
16 Dec 2017
Statistical Parametric Speech Synthesis Incorporating Generative
  Adversarial Networks
Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks
Yuki Saito
Shinnosuke Takamichi
Hiroshi Saruwatari
71
199
0
23 Sep 2017
Statistical Parametric Speech Synthesis Using Generative Adversarial
  Networks Under A Multi-task Learning Framework
Statistical Parametric Speech Synthesis Using Generative Adversarial Networks Under A Multi-task Learning Framework
Shan Yang
Lei Xie
Xiao Chen
Xiaoyan Lou
Xuan Zhu
Dongyan Huang
Haizhou Li
GAN
64
45
0
06 Jul 2017
Tacotron: Towards End-to-End Speech Synthesis
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
...
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
166
1,831
0
29 Mar 2017
Weight Normalization: A Simple Reparameterization to Accelerate Training
  of Deep Neural Networks
Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
Tim Salimans
Diederik P. Kingma
ODL
196
1,945
0
25 Feb 2016
U-Net: Convolutional Networks for Biomedical Image Segmentation
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg3DV
1.9K
77,441
0
18 May 2015
1