1910.11480
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
25 October 2019
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
Papers citing "Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram"
50 / 464 papers shown
VocBench: A Neural Vocoder Benchmark for Speech Synthesis
Ehab A. AlBadawy
Andrew Gibiansky
Qing He
Jilong Wu
Ming-Ching Chang
Siwei Lyu
58
12
0
06 Dec 2021
Steerable discovery of neural audio effects
C. Steinmetz
Joshua D. Reiss
52
6
0
06 Dec 2021
High Quality Streaming Speech Synthesis with Low, Sentence-Length-Independent Latency
Nikolaos Ellinas
G. Vamvoukakis
K. Markopoulos
Aimilios Chalamandaris
Georgia Maniati
Panos Kakoulidis
S. Raptis
June Sig Sung
Hyoungmin Park
Pirros Tsiakoulis
139
37
0
17 Nov 2021
AC-VC: Non-parallel Low Latency Phonetic Posteriorgrams Based Voice Conversion
Damien Ronssin
Milos Cernak
78
11
0
12 Nov 2021
RAVE: A variational autoencoder for fast and high-quality neural audio synthesis
Antoine Caillon
P. Esling
DRL
68
112
0
09 Nov 2021
Speaker Generation
Daisy Stanton
Matt Shannon
Soroosh Mariooryad
RJ Skerry-Ryan
Eric Battenberg
Tom Bagby
David Kao
77
30
0
07 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schonherr
DiffM
204
131
0
04 Nov 2021
RefineGAN: Universally Generating Waveform Better than Ground Truth with Highly Accurate Pitch and Intensity Responses
Shengyuan Xu
Wenxiao Zhao
Jing Guo
63
12
0
01 Nov 2021
Learning Continuous Representation of Audio for Arbitrary Scale Super Resolution
Jaechang Kim
Yunjoo Lee
Seunghoon Hong
Jungseul Ok
SupR
CLL
70
13
0
30 Oct 2021
Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning
Shijun Wang
Dimche Kostadinov
Damian Borth
83
11
0
27 Oct 2021
TUNet: A Block-online Bandwidth Extension Model based on Transformers and Self-supervised Pretraining
Viet-Anh Nguyen
Anh H. T. Nguyen
Andy W. H. Khong
60
22
0
26 Oct 2021
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion
Zongyang Du
Berrak Sisman
Kun Zhou
Haizhou Li
91
24
0
20 Oct 2021
Speech Enhancement-assisted Voice Conversion in Noisy Environments
Yun-Ju Chan
Chiang-Jen Peng
Syu-Siang Wang
Hsin-Min Wang
Yu Tsao
T. Chi
46
2
0
19 Oct 2021
Neural Synthesis of Footsteps Sound Effects with Generative Adversarial Networks
Marco Comunità
Huy Phan
Joshua D. Reiss
GAN
49
11
0
18 Oct 2021
KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke
Xiaobin Zhuang
Huiran Yu
Weifeng Zhao
Tao Jiang
Peng Hu
90
6
0
18 Oct 2021
Neural Dubber: Dubbing for Videos According to Scripts
Chenxu Hu
Qiao Tian
Tingle Li
Yuping Wang
Yuxuan Wang
Hang Zhao
DiffM
VGen
99
43
0
15 Oct 2021
Towards Identity Preserving Normal to Dysarthric Voice Conversion
Wen-Chin Huang
B. Halpern
Lester Phillip Violeta
O. Scharenborg
Tomoki Toda
106
23
0
15 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
85
63
0
15 Oct 2021
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation
Rongjie Huang
Chenye Cui
Feiyang Chen
Yi Ren
Jinglin Liu
Zhou Zhao
Baoxing Huai
N. Yuan
GAN
203
63
0
14 Oct 2021
FedSpeech: Federated Text-to-Speech with Continual Learning
Ziyue Jiang
Yi Ren
Ming Lei
Zhou Zhao
FedML
166
28
0
14 Oct 2021
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
162
202
0
14 Oct 2021
A Melody-Unsupervision Model for Singing Voice Synthesis
Soonbeom Choi
Juhan Nam
67
14
0
13 Oct 2021
Source Mixing and Separation Robust Audio Steganography
Naoya Takahashi
M. Singh
Yuki Mitsufuji
60
6
0
11 Oct 2021
Towards Universal Neural Vocoding with a Multi-band Excited WaveNet
Axel Roebel
F. Bous
56
2
0
07 Oct 2021
GANtron: Emotional Speech Synthesis with Generative Adversarial Networks
E. Hortal
Rodrigo Brechard Alarcia
GAN
46
2
0
06 Oct 2021
Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet
Max Morrison
Zeyu Jin
Nicholas J. Bryan
Juan-Pablo Caceres
Bryan Pardo
73
14
0
05 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis
Cheng-I Jeff Lai
Erica Cooper
Yang Zhang
Shiyu Chang
Kaizhi Qian
...
Yung-Sung Chuang
Alexander H. Liu
Junichi Yamagishi
David D. Cox
James R. Glass
69
6
0
04 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech
Yi Ren
Jinglin Liu
Zhou Zhao
122
79
0
30 Sep 2021
VoiceFixer: Toward General Speech Restoration with Neural Vocoder
Haohe Liu
Qiuqiang Kong
Qiao Tian
Yan Zhao
DeLiang Wang
Chuanzeng Huang
Yuxuan Wang
87
58
0
28 Sep 2021
MSR-NV: Neural Vocoder Using Multiple Sampling Rates
Kentaro Mitsui
Kei Sawada
109
0
0
28 Sep 2021
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis
Manh Luong
Viet-Anh Tran
24
2
0
27 Sep 2021
Time Alignment using Lip Images for Frame-based Electrolaryngeal Voice Conversion
Yi-Syuan Liou
Wen-Chin Huang
Ming-Chi Yen
S. Tsai
Yu-Huai Peng
Tomoki Toda
Yu Tsao
Hsin-Min Wang
66
1
0
08 Sep 2021
Bilateral Denoising Diffusion Models
Max W. Y. Lam
Jun Wang
Rongjie Huang
Jane Polak Scowcroft
Dong Yu
DiffM
83
43
0
26 Aug 2021
GC-TTS: Few-shot Speaker Adaptation with Geometric Constraints
Ji-Hoon Kim
Sang-Hoon Lee
Ji-Hyun Lee
Hong G Jung
Seong-Whan Lee
162
6
0
16 Aug 2021
Masked Acoustic Unit for Mispronunciation Detection and Correction
Zhan Zhang
Yuehai Wang
Jianyi Yang
116
3
0
12 Aug 2021
A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate
Ahmed Mustafa
Jan Büthe
Srikanth Korse
Kishan Gupta
Guillaume Fuchs
N. Pia
129
19
0
09 Aug 2021
Applying the Information Bottleneck Principle to Prosodic Representation Learning
Guangyan Zhang
Ying Qin
Daxin Tan
Tan Lee
77
4
0
05 Aug 2021
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis with GANs
J. Nistal
Stefan Lattner
G. Richard
74
9
0
03 Aug 2021
Creation and Detection of German Voice Deepfakes
Vanessa Barnekow
Dominik Binder
Niclas Kromrey
Pascal Munaretto
A. Schaad
Felix Schmieder
23
3
0
02 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing
Zhaofeng Shi
57
7
0
01 Aug 2021
Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal Language
Huiyan Li
Haohong Lin
You Wang
Hengyang Wang
Ming Zhang
Han Gao
Qing Ai
Zhiyuan Luo
Guang Li
63
14
0
31 Jul 2021
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
Yinghao Aaron Li
A. Zare
N. Mesgarani
97
101
0
21 Jul 2021
Digital Einstein Experience: Fast Text-to-Speech for Conversational AI
Joanna Rownicka
Kilian Sprenkamp
A. Tripiana
Volodymyr Gromoglasov
Timo P. Kunz
26
0
0
21 Jul 2021
On Prosody Modeling for ASR+TTS based Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Xinjian Li
Shinji Watanabe
Tomoki Toda
73
9
0
20 Jul 2021
SVSNet: An End-to-end Speaker Voice Similarity Assessment Model
Cheng-Hung Hu
Yu-Huai Peng
Junichi Yamagishi
Yu Tsao
Hsin-Min Wang
48
5
0
20 Jul 2021
Filtered Noise Shaping for Time Domain Room Impulse Response Estimation From Reverberant Speech
C. Steinmetz
V. Ithapu
P. Calamia
83
40
0
15 Jul 2021
Neural Waveshaping Synthesis
B. Hayes
C. Saitis
Gyorgy Fazekas
85
28
0
11 Jul 2021
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion
Daxin Tan
Liqun Deng
Y. Yeung
Xin Jiang
Xiao Chen
Tan Lee
91
41
0
04 Jul 2021
Adversarial Sample Detection for Speaker Verification by Neural Vocoders
Haibin Wu
Po-Chun Hsu
Ji Gao
Shanshan Zhang
Shen Huang
Jian Kang
Zhiyong Wu
Helen Meng
Hung-yi Lee
AAML
93
21
0
01 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
133
359
0
29 Jun 2021