Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech

11 May 2020
Geng Yang
Shan Yang
Kai-Chun Liu
Peng Fang
Wei Chen
Lei Xie

Papers citing "Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech"

50 / 107 papers shown
Title
A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion
Xu Li
Shansong Liu
Ying Shan
32
13
0
28 Jun 2022
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Taejun Bak
Junmo Lee
Hanbin Bae
Jinhyeok Yang
Jaesung Bae
Young-Sun Joo
23
27
0
27 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Sang-gil Lee
Ming-Yu Liu
Boris Ginsburg
Bryan Catanzaro
Sung-Hoon Yoon
17
225
0
09 Jun 2022
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation
Kun Song
Heyang Xue
Xinsheng Wang
Jian Cong
Yongmao Zhang
Linfu Xie
Bing Yang
Xiong Zhang
Dan Su
19
5
0
01 Jun 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
29
24
0
20 May 2022
Macedonian Speech Synthesis for Assistive Technology Applications
B. Sofronievski
Elena Velovska
Martin Velichkovski
Violeta Argirova
Tea Veljkovikj
...
Kristijan Lazarev
Toni Bachvarovski
Z. Ivanovski
Dimitar Tashkovski
B. Gerazov
8
0
0
18 May 2022
Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis
Jiatong Shi
Shuai Guo
Tao Qian
Nan Huo
Tomoki Hayashi
...
Xuankai Chang
Hua-Wei Li
Peter Wu
Shinji Watanabe
Qin Jin
VLM
17
26
0
09 May 2022
SVTS: Scalable Video-to-Speech Synthesis
Rodrigo Mira
A. Haliassos
Stavros Petridis
Björn W. Schuller
M. Pantic
14
32
0
04 May 2022
Parallel Synthesis for Autoregressive Speech Generation
Po-Chun Hsu
Da-Rong Liu
Andy T. Liu
Hung-yi Lee
34
5
0
25 Apr 2022
LibriS2S: A German-English Speech-to-Speech Translation Corpus
Pedro Jeuris
J. Niehues
AuLLM
17
3
0
22 Apr 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping
Yuma Koizumi
Heiga Zen
Kohei Yatabe
Nanxin Chen
M. Bacchiani
DiffM
30
45
0
31 Mar 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
DiffM
34
92
0
25 Mar 2022
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
Takuhiro Kaneko
Kou Tanaka
Hirokazu Kameoka
Shogo Seki
17
60
0
04 Mar 2022
I'm Hearing (Different) Voices: Anonymous Voices to Protect User Privacy
H.C.M. Turner
Giulio Lovisotto
Simon Eberz
Ivan Martinovic
11
1
0
13 Feb 2022
PostGAN: A GAN-Based Post-Processor to Enhance the Quality of Coded Speech
Srikanth Korse
N. Pia
Kishan Gupta
Guillaume Fuchs
44
14
0
31 Jan 2022
Improving Adversarial Waveform Generation based Singing Voice Conversion with Harmonic Signals
Haohan Guo
Zhiping Zhou
Fanbo Meng
Kai-Chun Liu
50
16
0
25 Jan 2022
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus
Rongjie Huang
Feiyang Chen
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
28
98
0
20 Dec 2021
VocBench: A Neural Vocoder Benchmark for Speech Synthesis
Ehab A. AlBadawy
Andrew Gibiansky
Qing He
Jilong Wu
Ming-Ching Chang
Siwei Lyu
20
12
0
06 Dec 2021
Improving Prosody for Unseen Texts in Speech Synthesis by Utilizing Linguistic Information and Noisy Data
Zhu Li
Yuqing Zhang
Mengxi Nie
Ming Yan
Mengnan He
Ruixiong Zhang
Caixia Gong
13
3
0
15 Nov 2021
RAVE: A variational autoencoder for fast and high-quality neural audio synthesis
Antoine Caillon
P. Esling
DRL
19
109
0
09 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schönherr
DiffM
129
123
0
04 Nov 2021
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
50
60
0
15 Oct 2021
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation
Rongjie Huang
Chenye Cui
Feiyang Chen
Yi Ren
Jinglin Liu
Zhou Zhao
Baoxing Huai
N. Yuan
GAN
105
62
0
14 Oct 2021
Pitch Preservation In Singing Voice Synthesis
Shujun Liu
Hai Zhu
Kun Wang
Huajun Wang
23
0
0
11 Oct 2021
Towards Universal Neural Vocoding with a Multi-band Excited WaveNet
Axel Roebel
F. Bous
24
2
0
07 Oct 2021
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis
Manh Luong
Viet-Anh Tran
6
2
0
27 Sep 2021
Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning
Rui Li
Dong Pu
Minnie Huang
Bill Huang
50
14
0
23 Sep 2021
Bilateral Denoising Diffusion Models
Max W. Y. Lam
Jun Wang
Rongjie Huang
Dan Su
Dong Yu
DiffM
27
42
0
26 Aug 2021
AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person
Xinsheng Wang
Qicong Xie
Jihua Zhu
Lei Xie
O. Scharenborg
31
16
0
09 Aug 2021
A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate
Ahmed Mustafa
Jan Büthe
Srikanth Korse
Kishan Gupta
Guillaume Fuchs
N. Pia
15
18
0
09 Aug 2021
Adversarial Auto-Encoding for Packet Loss Concealment
Santiago Pascual
Joan Serrà
Jordi Pons
31
27
0
07 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
AI based Presentation Creator With Customized Audio Content Delivery
Muvazima Mansoor
Srikanth Chandar
Ramamoorthy Srinath
18
0
0
27 Jun 2021
Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition
Zhengxi Liu
Y. Qian
DRL
14
10
0
25 Jun 2021
Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis
Jian Cong
Shan Yang
Lei Xie
Dan Su
DRL
18
29
0
21 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
Najim Dehak
William Chan
DiffM
21
88
0
17 Jun 2021
DCCRN+: Channel-wise Subband DCCRN with SNR Estimation for Speech Enhancement
Shubo Lv
Yanxin Hu
Shimin Zhang
Lei Xie
24
93
0
16 Jun 2021
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
Won Jang
D. Lim
Jaesam Yoon
Bongwan Kim
Juntae Kim
32
125
0
15 Jun 2021
HUI-Audio-Corpus-German: A high quality TTS dataset
Pascal Puchtler
Johannes Wirth
René Peinl
9
21
0
11 Jun 2021
Sprachsynthese -- State-of-the-Art in englischer und deutscher Sprache
René Peinl
19
0
0
11 Jun 2021
DPT-FSNet: Dual-path Transformer Based Full-band and Sub-band Fusion Network for Speech Enhancement
Feng Dang
Hangting Chen
Pengyuan Zhang
76
96
0
27 Apr 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
26
24
0
20 Apr 2021
Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN
Reo Yoneyama
Yi-Chiao Wu
T. Toda
14
12
0
10 Apr 2021
Improve GAN-based Neural Vocoder using Pointwise Relativistic LeastSquare GAN
Cong Wang
Yu Chen
Bin Wang
Yi Shi
27
1
0
26 Mar 2021
GAN Vocoder: Multi-Resolution Discriminator Is All You Need
J. You
Dalhyun Kim
Gyuhyeon Nam
Geumbyeol Hwang
Gyeongsu Chae
13
27
0
09 Mar 2021
Efficient neural networks for real-time modeling of analog dynamic range compression
C. Steinmetz
Joshua D. Reiss
57
27
0
11 Feb 2021
Improved parallel WaveGAN vocoder with perceptually weighted spectrogram loss
Eunwoo Song
Ryuichi Yamamoto
Min-Jae Hwang
Jin-Seob Kim
Ohsung Kwon
Jae-Min Kim
11
14
0
19 Jan 2021
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans
Shinji Watanabe
Florian Boyer
Xuankai Chang
Pengcheng Guo
Tomoki Hayashi
...
Shigeki Karita
Chenda Li
Jing Shi
Aswin Shanmugam Subramanian
Wangyou Zhang
VLM
41
38
0
23 Dec 2020
I'm Sorry for Your Loss: Spectrally-Based Audio Distances Are Bad at Pitch
Joseph P. Turian
Max Henry
24
29
0
08 Dec 2020
Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training
Haohan Guo
Heng Lu
Na Hu
Chunlei Zhang
Shan Yang
Lei Xie
Dan Su
Dong Yu
AAML
19
12
0
03 Dec 2020