1910.11480
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
25 October 2019
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
Papers citing
"Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram"
50 / 464 papers shown
Title
BEHM-GAN: Bandwidth Extension of Historical Music using Generative Adversarial Networks
Eloi Moliner
Vesa Valimaki
59
19
0
13 Apr 2022
A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture
Zhe-ming Lu
Mengnan He
Ruixiong Zhang
Caixia Gong
GAN
23
2
0
12 Apr 2022
The Sillwood Technologies System for the VoiceMOS Challenge 2022
Jiameng Gao
56
0
0
08 Apr 2022
AdvEst: Adversarial Perturbation Estimation to Classify and Detect Adversarial Attacks against Speaker Identification
Sonal Joshi
Saurabh Kataria
Jesus Villalba
Najim Dehak
AAML
86
7
0
08 Apr 2022
Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech
Hyungchan Yoon
Seyun Um
Changwhan Kim
Hong-Goo Kang
45
0
0
05 Apr 2022
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis
Fan Wang
Po-Chun Hsu
Da-Rong Liu
Hung-yi Lee
53
0
0
01 Apr 2022
Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech
Guangyan Zhang
Kaitao Song
Xu Tan
Daxin Tan
Yuzi Yan
...
G. Wang
Wei Zhou
Tao Qin
Tan Lee
Sheng Zhao
SSL
84
21
0
31 Mar 2022
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech
D. Lim
Sunghee Jung
Eesung Kim
93
53
0
31 Mar 2022
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction
Zexu Pan
Meng Ge
Haizhou Li
72
20
0
31 Mar 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping
Yuma Koizumi
Heiga Zen
Kohei Yatabe
Nanxin Chen
M. Bacchiani
DiffM
99
49
0
31 Mar 2022
An Overview & Analysis of Sequence-to-Sequence Emotional Voice Conversion
Zijiang Yang
Xin Jing
Andreas Triantafyllopoulos
Meishu Song
Ilhan Aslan
Björn W. Schuller
64
14
0
29 Mar 2022
Mel Frequency Spectral Domain Defenses against Adversarial Attacks on Speech Recognition Systems
Nicholas Mehlman
Anirudh Sreeram
Raghuveer Peri
Shrikanth Narayanan
AAML
165
4
0
29 Mar 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Max W. Y. Lam
Jun Wang
Jane Polak Scowcroft
Dong Yu
DiffM
98
97
0
25 Mar 2022
Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data
Gašper Beguš
Alan Zhou
SSL
115
5
0
22 Mar 2022
AutoTTS: End-to-End Text-to-Speech Synthesis through Differentiable Duration Modeling
Bac Nguyen
Fabien Cardinaux
Stefan Uhlich
23
2
0
21 Mar 2022
AdaVocoder: Adaptive Vocoder for Custom Voice
Xin Yuan
Yongbin Feng
Mingming Ye
Cheng Tuo
Minghang Zhang
117
3
0
18 Mar 2022
A³T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
Richard He Bai
Renjie Zheng
Junkun Chen
Xintong Li
Mingbo Ma
Liang Huang
119
53
0
18 Mar 2022
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features
Florian Lux
Ngoc Thang Vu
99
29
0
07 Mar 2022
NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation
Tao Wang
Ruibo Fu
Jiangyan Yi
J. Tao
Zhengqi Wen
21
2
0
05 Mar 2022
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
Takuhiro Kaneko
Kou Tanaka
Hirokazu Kameoka
Shogo Seki
89
62
0
04 Mar 2022
MANNER: Multi-view Attention Network for Noise Erasure
Hyun Joon Park
Byung Ha Kang
Wooseok Shin
Jin Sob Kim
S. W. Han
92
50
0
04 Mar 2022
Speaker Adaption with Intuitive Prosodic Features for Statistical Parametric Speech Synthesis
Pengyu Cheng
Zhenhua Ling
72
3
0
02 Mar 2022
Real time spectrogram inversion on mobile phone
Oleg Rybakov
Marco Tagliasacchi
Yunpeng Li
Liyang Jiang
Xia Zhang
Fadi Biadsy
131
4
0
01 Mar 2022
Revisiting Over-Smoothness in Text to Speech
Yi Ren
Xu Tan
Tao Qin
Zhou Zhao
Tie-Yan Liu
146
64
0
26 Feb 2022
Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph
Dacheng Yin
Xuanchi Ren
Chong Luo
Yuwang Wang
Zhiwei Xiong
Wenjun Zeng
114
13
0
24 Feb 2022
Phase Continuity: Learning Derivatives of Phase Spectrum for Speech Enhancement
Doyeon Kim
Hyewon Han
Hyeon-Kyeong Shin
Soo-Whan Chung
Hong-Goo Kang
23
5
0
24 Feb 2022
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing
Tao Wang
Jiangyan Yi
Ruibo Fu
J. Tao
Zhengqi Wen
KELM
69
20
0
21 Feb 2022
It's Raw! Audio Generation with State-Space Models
Karan Goel
Albert Gu
Chris Donahue
Christopher Ré
89
195
0
20 Feb 2022
Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation
Disong Wang
Songxiang Liu
Xixin Wu
Hui Lu
Lifa Sun
Xunying Liu
Helen Meng
54
5
0
18 Feb 2022
VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion
Disong Wang
Shan Yang
Jane Polak Scowcroft
Xunying Liu
Dong Yu
Helen Meng
60
11
0
18 Feb 2022
On loss functions and evaluation metrics for music source separation
Enric Gusó
Jordi Pons
Santiago Pascual
Joan Serrà
132
21
0
16 Feb 2022
Speech Denoising in the Waveform Domain with Self-Attention
Zhifeng Kong
Ming-Yu Liu
Ambrish Dantrey
Bryan Catanzaro
89
63
0
15 Feb 2022
textless-lib: a Library for Textless Spoken Language Processing
Eugene Kharitonov
Jade Copet
Kushal Lakhotia
Tu Nguyen
Paden Tomasello
...
A. Elkahky
Wei-Ning Hsu
Abdel-rahman Mohamed
Emmanuel Dupoux
Yossi Adi
121
34
0
15 Feb 2022
Visual Acoustic Matching
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
77
58
0
14 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training
Zehua Chen
Xu Tan
Ke Wang
Shifeng Pan
Danilo Mandic
Lei He
Sheng Zhao
DiffM
69
31
0
08 Feb 2022
PostGAN: A GAN-Based Post-Processor to Enhance the Quality of Coded Speech
Srikanth Korse
N. Pia
Kishan Gupta
Guillaume Fuchs
91
15
0
31 Jan 2022
ItôWave: Itô Stochastic Differential Equation Is All You Need For Wave Generation
Shoule Wu
Ziqiang Shi
DiffM
451
9
0
29 Jan 2022
Noise-robust voice conversion with domain adversarial training
Hongqiang Du
Lei Xie
Haizhou Li
66
12
0
26 Jan 2022
Improving Adversarial Waveform Generation based Singing Voice Conversion with Harmonic Signals
Haohan Guo
Zhiping Zhou
Fanbo Meng
Kai-Chun Liu
97
16
0
25 Jan 2022
Polyphone disambiguation and accent prediction using pre-trained language models in Japanese TTS front-end
Rem Hida
Masaki Hamada
Chie Kamada
E. Tsunoo
Toshiyuki Sekiya
Toshiyuki Kumakura
34
7
0
24 Jan 2022
KazakhTTS2: Extending the Open-Source Kazakh TTS Corpus With More Data, Speakers, and Topics
Saida Mussakhojayeva
Yerbolat Khassanov
H. A. Varol
81
13
0
15 Jan 2022
MR-SVS: Singing Voice Synthesis with Multi-Reference Encoder
Shoutong Wang
Jinglin Liu
Yi Ren
Zhen Wang
Changliang Xu
Zhou Zhao
40
7
0
11 Jan 2022
Emotion Intensity and its Control for Emotional Voice Conversion
Kun Zhou
Berrak Sisman
R. Rana
Björn W. Schuller
Haizhou Li
172
58
0
10 Jan 2022
Improved Input Reprogramming for GAN Conditioning
Tuan Dinh
Daewon Seo
Zhixu Du
Liang Shang
Kangwook Lee
AI4CE
103
8
0
07 Jan 2022
Audio representations for deep learning in sound synthesis: A review
Anastasia Natsiou
Seán O'Leary
AI4TS
65
18
0
07 Jan 2022
IQDUBBING: Prosody modeling based on discrete self-supervised speech representation for expressive voice conversion
Wendong Gan
Bolong Wen
Yin Yan
Haitao Chen
Zhichao Wang
Hongqiang Du
Lei Xie
Kaixuan Guo
Hai Li
85
14
0
02 Jan 2022
Self-Supervised Learning based Monaural Speech Enhancement with Complex-Cycle-Consistent
Yi Li
Yang Sun
S. M. Naqvi
62
1
0
21 Dec 2021
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus
Rongjie Huang
Feiyang Chen
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
94
104
0
20 Dec 2021
Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features
Trung D. Q. Dang
Dung T. Tran
Peter Chin
K. Koishida
SSL
69
15
0
08 Dec 2021
Dilated convolution with learnable spacings
Ismail Khalfaoui-Hassani
Thomas Pellegrini
T. Masquelier
123
32
0
07 Dec 2021