Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram

25 October 2019
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim

Papers citing "Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram"

50 / 464 papers shown
Title
Speech Enhancement for Wake-Up-Word detection in Voice Assistants
David Bonet
Guillermo Cámbara
Fernando López
Pablo Gómez
Carlos Segura
Jordi Luque
62
11
0
29 Jan 2021
Improved parallel WaveGAN vocoder with perceptually weighted spectrogram loss
Eunwoo Song
Ryuichi Yamamoto
Min-Jae Hwang
Jin-Seob Kim
Ohsung Kwon
Jae-Min Kim
61
14
0
19 Jan 2021
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans
Shinji Watanabe
Florian Boyer
Xuankai Chang
Pengcheng Guo
Tomoki Hayashi
...
Shigeki Karita
Chenda Li
Jing Shi
Aswin Shanmugam Subramanian
Wangyou Zhang
VLM
108
38
0
23 Dec 2020
DenoiSpeech: Denoising Text to Speech with Frame-Level Noise Modeling
Chen Zhang
Yi Ren
Xu Tan
Jinglin Liu
Ke-jun Zhang
Tao Qin
Sheng Zhao
Tie-Yan Liu
DiffM
94
38
0
17 Dec 2020
I'm Sorry for Your Loss: Spectrally-Based Audio Distances Are Bad at Pitch
Joseph P. Turian
Max Henry
49
31
0
08 Dec 2020
Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training
Haohan Guo
Heng Lu
Na Hu
Chunlei Zhang
Shan Yang
Lei Xie
Jane Polak Scowcroft
Dong Yu
AAML
68
12
0
03 Dec 2020
MelGlow: Efficient Waveform Generative Network Based on Location-Variable Convolution
Zhen Zeng
Jianzong Wang
Ning Cheng
Jing Xiao
44
8
0
03 Dec 2020
Universal MelGAN: A Robust Neural Vocoder for High-Fidelity Waveform Generation in Multiple Domains
Won Jang
D. Lim
Jaesam Yoon
60
34
0
19 Nov 2020
Single channel voice separation for unknown number of speakers under reverberant and noisy settings
Shlomo E. Chazan
Lior Wolf
Eliya Nachmani
Yossi Adi
78
29
0
04 Nov 2020
Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion
Disong Wang
Songxiang Liu
Lifa Sun
Xixin Wu
Xunying Liu
Helen Meng
30
9
0
03 Nov 2020
StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization
Ahmed Mustafa
N. Pia
Guillaume Fuchs
91
73
0
03 Nov 2020
Learning to Maximize Speech Quality Directly Using MOS Prediction for Neural Text-to-Speech
Yeunju Choi
Youngmoon Jung
Youngjoo Suh
Hoirin Kim
125
6
0
02 Nov 2020
CVC: Contrastive Learning for Non-parallel Voice Conversion
Tingle Li
Yichen Liu
Chenxu Hu
Hang Zhao
DRL
100
13
0
02 Nov 2020
Speech Synthesis and Control Using Differentiable DSP
Giorgio Fabbro
Vladimir Golkov
Thomas Kemp
Zorah Lähner
78
12
0
28 Oct 2020
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators
Ryuichi Yamamoto
Eunwoo Song
Min-Jae Hwang
Jae-Min Kim
74
18
0
27 Oct 2020
Recent Developments on ESPnet Toolkit Boosted by Conformer
Pengcheng Guo
Florian Boyer
Xuankai Chang
Tomoki Hayashi
Yosuke Higuchi
...
Jing Shi
Shinji Watanabe
Kun Wei
Wangyou Zhang
Yuekai Zhang
89
263
0
26 Oct 2020
TTS-by-TTS: TTS-driven Data Augmentation for Fast and High-Quality Speech Synthesis
Min-Jae Hwang
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
44
32
0
26 Oct 2020
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations
Wen-Chin Huang
Yi-Chiao Wu
Tomoki Hayashi
Tomoki Toda
BDL
111
38
0
23 Oct 2020
NU-GAN: High resolution neural upsampling with GAN
Rithesh Kumar
Kundan Kumar
Vicki Anand
Yoshua Bengio
Aaron Courville
65
26
0
22 Oct 2020
BERT for Joint Multichannel Speech Dereverberation with Spatial-aware Tasks
Yang Jiao
29
0
0
21 Oct 2020
Automatic multitrack mixing with a differentiable mixing console of neural audio effects
C. Steinmetz
Jordi Pons
Santiago Pascual
Joan Serrà
113
51
0
20 Oct 2020
Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training
Renjie Zheng
Mingbo Ma
Baigong Zheng
Kaibo Liu
Jiahong Yuan
Kenneth Church
Liang Huang
49
14
0
20 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
181
1,954
0
12 Oct 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders
Wen-Chin Huang
Patrick Lumban Tobing
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Toda
79
8
0
09 Oct 2020
Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN
Patrick Lumban Tobing
Yi-Chiao Wu
Tomoki Toda
DRL
55
14
0
09 Oct 2020
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
Shogo Seki
DiffM
124
21
0
06 Oct 2020
The Academia Sinica Systems of Voice Conversion for VCC2020
Yu-Huai Peng
Cheng-Hung Hu
A. Kang
Hung-Shin Lee
Pin-Yuan Chen
Yu Tsao
Hsin-Min Wang
46
2
0
06 Oct 2020
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS
Wen-Chin Huang
Tomoki Hayashi
Shinji Watanabe
Tomoki Toda
DRL
81
40
0
06 Oct 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong
Ming-Yu Liu
Jiaji Huang
Kexin Zhao
Bryan Catanzaro
DiffM, BDL
216
1,471
0
21 Sep 2020
HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis
Jiawei Chen
Xu Tan
Jian Luan
Tao Qin
Tie-Yan Liu
VLM
102
93
0
03 Sep 2020
WaveGrad: Estimating Gradients for Waveform Generation
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
William Chan
DiffM, BDL
149
795
0
02 Sep 2020
Hierarchical Timbre-Painting and Articulation Generation
Michael Michelashvili
Lior Wolf
71
12
0
30 Aug 2020
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion
Yi Zhao
Wen-Chin Huang
Xiaohai Tian
Junichi Yamagishi
Rohan Kumar Das
Tomi Kinnunen
Zhenhua Ling
Tomoki Toda
88
211
0
28 Aug 2020
Nonparallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
99
20
0
27 Aug 2020
Audio Dequantization for High Fidelity Audio Generation in Flow-based Neural Vocoder
Hyun-Wook Yoon
Sang-Hoon Lee
Hyeong-Rae Noh
Seong-Whan Lee
108
11
0
16 Aug 2020
Bunched LPCNet : Vocoder for Low-cost Neural Text-To-Speech Systems
Ravichander Vipperla
Sangjun Park
Kihyun Choo
Samin S. Ishtiaq
Kyoungbo Min
S. Bhattacharya
Abhinav Mehrotra
Alberto Gil C. P. Ramos
Nicholas D. Lane
72
26
0
11 Aug 2020
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition
Jin Xu
Xu Tan
Yi Ren
Tao Qin
Jian Li
Sheng Zhao
Tie-Yan Liu
VLM
70
91
0
09 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
139
329
0
09 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
Tomoki Toda
118
40
0
07 Aug 2020
Unsupervised Cross-Domain Singing Voice Conversion
Adam Polyak
Lior Wolf
Yossi Adi
Yaniv Taigman
58
44
0
06 Aug 2020
HooliGAN: Robust, High Quality Neural Vocoding
Ollie McCarthy
Zo Ahmed
95
14
0
06 Aug 2020
A Spectral Energy Distance for Parallel Speech Synthesis
A. Gritsenko
Tim Salimans
Rianne van den Berg
Jasper Snoek
Nal Kalchbrenner
71
70
0
03 Aug 2020
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Jinhyeok Yang
Junmo Lee
Young-Ik Kim
Hoonyoung Cho
Injung Kim
82
73
0
30 Jul 2020
Translate Reverberated Speech to Anechoic Ones: Speech Dereverberation with BERT
Yang Jiao
38
1
0
16 Jul 2020
Real Time Speech Enhancement in the Waveform Domain
Alexandre Défossez
Gabriel Synnaeve
Yossi Adi
109
466
0
23 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
155
1,414
0
08 Jun 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
85
187
0
05 Jun 2020
An ASR Guided Speech Intelligibility Measure for TTS Model Selection
Arun Baby
Saranya Vinnaitherthan
Nagaraj Adiga
Pranav Jawale
Sumukh Badam
Sharath Adavanne
Srikanth Konjeti
38
7
0
02 Jun 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation
Yi-Chiao Wu
Tomoki Hayashi
T. Okamoto
Hisashi Kawai
Tomoki Toda
73
4
0
18 May 2020
MoBoAligner: a Neural Alignment Model for Non-autoregressive TTS with Monotonic Boundary Search
Naihan Li
Shujie Liu
Yanqing Liu
Sheng Zhao
Ming-Yuan Liu
Ming Zhou
50
6
0
18 May 2020