ResearchTrend.AI

Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram

25 October 2019
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim

Papers citing "Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram"

50 / 464 papers shown
Title
N-Singer: A Non-Autoregressive Korean Singing Voice Synthesis System for Pronunciation Enhancement
Gyeong-Hoon Lee
Tae-Woo Kim
Hanbin Bae
Min-Ji Lee
Young-Ik Kim
Hoon-Young Cho
VLM
79
20
0
29 Jun 2021
GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis
Jinhyeok Yang
Jaesung Bae
Taejun Bak
Young-Ik Kim
Hoon-Young Cho
134
37
0
29 Jun 2021
Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition
Zhengxi Liu
Y. Qian
DRL
49
10
0
25 Jun 2021
Distilling the Knowledge from Conditional Normalizing Flows
Dmitry Baranchuk
Vladimir Aliev
Artem Babenko
BDL
85
2
0
24 Jun 2021
UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control
M. Kang
Sungjae Kim
Injung Kim
77
3
0
21 Jun 2021
Non-native English lexicon creation for bilingual speech synthesis
Arun Baby
Pranav Jawale
Saranya Vinnaitherthan
Sumukh Badam
Nagaraj Adiga
Sharath Adavanne
44
8
0
21 Jun 2021
Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis
Jian Cong
Shan Yang
Lei Xie
Dan Su
DRL
107
29
0
21 Jun 2021
Improving robustness of one-shot voice conversion with deep discriminative speaker encoder
Hongqiang Du
Lei Xie
64
6
0
19 Jun 2021
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
Disong Wang
Liqun Deng
Y. Yeung
Xiao Chen
Xunying Liu
Helen Meng
DRL
84
141
0
18 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
Najim Dehak
William Chan
DiffM
97
88
0
17 Jun 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model
Chenye Cui
Yi Ren
Jinglin Liu
Feiyang Chen
Rongjie Huang
Ming Lei
Zhou Zhao
66
35
0
17 Jun 2021
Improving the expressiveness of neural vocoding with non-affine Normalizing Flows
Adam Gabryś
Yunlong Jiao
V. Klimkov
Daniel Korzekwa
Roberto Barra-Chicote
45
1
0
16 Jun 2021
RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis
Rohola Zandie
Mohammad H. Mahoor
Julia Madsen
Eshrat S. Emamian
63
25
0
15 Jun 2021
Pathological voice adaptation with autoencoder-based voice conversion
M. Illa
B. Halpern
Rob van Son
Laureano Moro-Velazquez
O. Scharenborg
40
13
0
15 Jun 2021
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
Won Jang
D. Lim
Jaesam Yoon
Bongwan Kim
Juntae Kim
116
132
0
15 Jun 2021
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior
Sang-gil Lee
Heeseung Kim
Chaehun Shin
Xu Tan
Chang-Shu Liu
Qi Meng
Tao Qin
Wei Chen
Sung-Hoon Yoon
Tie-Yan Liu
DiffM
85
89
0
11 Jun 2021
Sprachsynthese -- State-of-the-Art in englischer und deutscher Sprache
René Peinl
45
0
0
11 Jun 2021
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
Ji-Hoon Kim
Sang-Hoon Lee
Ji-Hyun Lee
Seong-Whan Lee
104
54
0
04 Jun 2021
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion
Wen-Chin Huang
Kazuhiro Kobayashi
Yu-Huai Peng
Ching-Feng Liu
Yu Tsao
Hsin-Min Wang
Tomoki Toda
65
11
0
02 Jun 2021
NVC-Net: End-to-End Adversarial Voice Conversion
Bac Nguyen Cong
Fabien Cardinaux
AAML
124
42
0
02 Jun 2021
High-Fidelity and Low-Latency Universal Neural Vocoder based on Multiband WaveRNN with Data-Driven Linear Prediction for Discrete Waveform Modeling
Patrick Lumban Tobing
Tomoki Toda
62
8
0
20 May 2021
ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation
Shoule Wu
Ziqiang Shi
DiffM
157
11
0
17 May 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
117
544
0
13 May 2021
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Jinglin Liu
Chengxi Li
Yi Ren
Feiyang Chen
Zhou Zhao
DiffM
183
271
0
06 May 2021
End-to-End Video-To-Speech Synthesis using Generative Adversarial Networks
Rodrigo Mira
Konstantinos Vougioukas
Pingchuan Ma
Stavros Petridis
Björn W. Schuller
Maja Pantic
112
47
0
27 Apr 2021
One Billion Audio Sounds from GPU-enabled Modular Synthesis
Joseph P. Turian
Jordie Shier
George Tzanetakis
K. McNally
Max Henry
103
22
0
27 Apr 2021
Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis
Kosuke Futamata
Byeong-Cheol Park
Ryuichi Yamamoto
Kentaro Tachibana
35
14
0
26 Apr 2021
An Adaptive Learning based Generative Adversarial Network for One-To-One Voice Conversion
Sandipan Dhar
N. D. Jana
Swagatam Das
54
18
0
25 Apr 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM, ALM
94
25
0
20 Apr 2021
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset
Saida Mussakhojayeva
Aigerim Janaliyeva
A. Mirzakhmetov
Yerbolat Khassanov
H. A. Varol
61
14
0
17 Apr 2021
FastS2S-VC: Streaming Non-Autoregressive Sequence-to-Sequence Voice Conversion
Hirokazu Kameoka
Kou Tanaka
Takuhiro Kaneko
81
21
0
14 Apr 2021
Non-autoregressive sequence-to-sequence voice conversion
Tomoki Hayashi
Wen-Chin Huang
Kazuhiro Kobayashi
Tomoki Toda
41
24
0
14 Apr 2021
NoiseVC: Towards High Quality Zero-Shot Voice Conversion
Shijun Wang
Damian Borth
DRL
75
6
0
13 Apr 2021
Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN
Reo Yoneyama
Yi-Chiao Wu
Tomoki Toda
73
12
0
10 Apr 2021
Flavored Tacotron: Conditional Learning for Prosodic-linguistic Features
Mahsa Elyasi
Gaurav Bharaj
44
2
0
08 Apr 2021
The AS-NU System for the M2VoC Challenge
Cheng-Hung Hu
Yi-Chiao Wu
Wen-Chin Huang
Yu-Huai Peng
Yu-Wen Chen
Pin-Jui Ku
Tomoki Toda
Yu Tsao
Hsin-Min Wang
54
1
0
07 Apr 2021
Fast DCTTS: Efficient Deep Convolutional Text-to-Speech
M. Kang
Jihyun Lee
Simin Kim
Injung Kim
54
6
0
01 Apr 2021
Adversarial Attacks and Defenses for Speech Recognition Systems
Piotr Żelasko
Sonal Joshi
Yiwen Shao
Jesus Villalba
J. Trmal
Najim Dehak
Sanjeev Khudanpur
AAML
60
29
0
31 Mar 2021
Improve GAN-based Neural Vocoder using Pointwise Relativistic LeastSquare GAN
Cong Wang
Yu Chen
Bin Wang
Yi Shi
146
1
0
26 Mar 2021
GAN Vocoder: Multi-Resolution Discriminator Is All You Need
J. You
Dalhyun Kim
Gyuhyeon Nam
Geumbyeol Hwang
Gyeongsu Chae
68
27
0
09 Mar 2021
crank: An Open-Source Software for Nonparallel Voice Conversion Based on Vector-Quantized Variational Autoencoder
Kazuhiro Kobayashi
Wen-Chin Huang
Yi-Chiao Wu
Patrick Lumban Tobing
Tomoki Hayashi
Tomoki Toda
BDL, DRL
65
19
0
04 Mar 2021
Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward
Momina Masood
M. Nawaz
K. Malik
A. Javed
Aun Irtaza
AAML
202
323
0
25 Feb 2021
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
73
60
0
25 Feb 2021
Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input
Brooke Stephenson
Thomas Hueber
Laurent Girin
Laurent Besacier
89
10
0
19 Feb 2021
PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components
Yukiya Hono
Shinji Takaki
Kei Hashimoto
Keiichiro Oura
Yoshihiko Nankaku
K. Tokuda
69
16
0
15 Feb 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention
Peng Liu
Yuewen Cao
Songxiang Liu
Na Hu
Guangzhi Li
Chao Weng
Dan Su
95
22
0
12 Feb 2021
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
Renqian Luo
Xu Tan
Rui Wang
Tao Qin
Jinzhu Li
Sheng Zhao
Enhong Chen
Tie-Yan Liu
64
61
0
08 Feb 2021
EMA2S: An End-to-End Multimodal Articulatory-to-Speech System
Yu-Wen Chen
Kuo-Hsuan Hung
Shang-Yi Chuang
Jonathan Sherman
Wen-Chin Huang
Xugang Lu
Yu Tsao
48
16
0
07 Feb 2021
Universal Neural Vocoding with Parallel WaveNet
Yunlong Jiao
Adam Gabryś
Georgi Tinchev
Bartosz Putrycz
Daniel Korzekwa
V. Klimkov
81
42
0
01 Feb 2021
High Fidelity Speech Regeneration with Application to Speech Enhancement
Adam Polyak
Lior Wolf
Yossi Adi
Ori Kabeli
Yaniv Taigman
55
19
0
31 Jan 2021