1910.06711
Cited By
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
8 October 2019
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
Papers citing
"MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis"
50 / 222 papers shown
Title
cMelGAN: An Efficient Conditional Generative Model Based on Mel Spectrograms
Tracy Qian
Jackson Kaunismaa
Tony Chung
MGen
GAN
MedIm
21
5
0
15 May 2022
The ICML 2022 Expressive Vocalizations Workshop and Competition: Recognizing, Generating, and Personalizing Vocal Bursts
Alice Baird
Panagiotis Tzirakis
Gauthier Gidel
Marco Jiralerspong
Eilif B. Muller
Kory W. Mathewson
Björn Schuller
Min Zhang
D. Keltner
Alan S. Cowen
VLM
33
30
0
03 May 2022
How does a spontaneously speaking conversational agent affect user behavior?
Takahisa Iizuka
H. Mori
13
2
0
02 May 2022
Time Domain Adversarial Voice Conversion for ADD 2022
Cheng Wen
Tingwei Guo
Xi Tan
Rui Yan
Shuran Zhou
Chuandong Xie
Wei Zou
Xiangang Li
18
4
0
19 Apr 2022
VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration
Haohe Liu
Xubo Liu
Qiuqiang Kong
Qiao Tian
Yan Zhao
DeLiang Wang
Chuanzeng Huang
Yuxuan Wang
18
51
0
12 Apr 2022
The Sillwood Technologies System for the VoiceMOS Challenge 2022
Jiameng Gao
30
0
0
08 Apr 2022
FFC-SE: Fast Fourier Convolution for Speech Enhancement
Ivan Shchekotov
Pavel Andreev
Oleg Ivanov
Aibek Alanov
Dmitry Vetrov
37
23
0
06 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
16
32
0
31 Mar 2022
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech
D. Lim
Sunghee Jung
Eesung Kim
19
51
0
31 Mar 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping
Yuma Koizumi
Heiga Zen
Kohei Yatabe
Nanxin Chen
M. Bacchiani
DiffM
33
45
0
31 Mar 2022
Bunched LPCNet2: Efficient Neural Vocoders Covering Devices from Cloud to Edge
Sangjun Park
Kihyun Choo
Joohyung Lee
A. Porov
Konstantin Osipov
June Sig Sung
16
6
0
27 Mar 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
DiffM
39
92
0
25 Mar 2022
HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement
Pavel Andreev
Aibek Alanov
Oleg Ivanov
Dmitry Vetrov
38
38
0
24 Mar 2022
Practical cognitive speech compression
Reza Lotfidereshgi
P. Gournay
32
2
0
08 Mar 2022
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features
Florian Lux
Ngoc Thang Vu
25
29
0
07 Mar 2022
Real time spectrogram inversion on mobile phone
Oleg Rybakov
Marco Tagliasacchi
Yunpeng Li
Liyang Jiang
Xia Zhang
Fadi Biadsy
23
4
0
01 Mar 2022
SpeechPainter: Text-conditioned Speech Inpainting
Zalan Borsos
Matthew Sharifi
Marco Tagliasacchi
16
26
0
15 Feb 2022
Unsupervised word-level prosody tagging for controllable speech synthesis
Yiwei Guo
Chenpeng Du
Kai Yu
26
15
0
15 Feb 2022
Visual Acoustic Matching
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
21
56
0
14 Feb 2022
Deep Performer: Score-to-Audio Music Performance Synthesis
Hao-Wen Dong
Cong Zhou
Taylor Berg-Kirkpatrick
Julian McAuley
27
17
0
12 Feb 2022
Learning Fast Samplers for Diffusion Models by Differentiating Through Sample Quality
Daniel Watson
William Chan
Jonathan Ho
Mohammad Norouzi
DiffM
BDL
36
179
0
11 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training
Zehua Chen
Xu Tan
Ke Wang
Shifeng Pan
Danilo Mandic
Lei He
Sheng Zhao
DiffM
31
28
0
08 Feb 2022
PostGAN: A GAN-Based Post-Processor to Enhance the Quality of Coded Speech
Srikanth Korse
N. Pia
Kishan Gupta
Guillaume Fuchs
52
14
0
31 Jan 2022
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Songxiang Liu
Dan Su
Dong Yu
DiffM
75
65
0
28 Jan 2022
Improving Adversarial Waveform Generation based Singing Voice Conversion with Harmonic Signals
Haohan Guo
Zhiping Zhou
Fanbo Meng
Kai-Chun Liu
52
16
0
25 Jan 2022
A sinusoidal signal reconstruction method for the inversion of the mel-spectrogram
Anastasia Natsiou
Seán O'Leary
22
3
0
07 Jan 2022
Speech-to-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question
Yuanfeng Song
Raymond Chi-Wing Wong
Xuefang Zhao
Di Jiang
39
13
0
04 Jan 2022
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus
Rongjie Huang
Feiyang Chen
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
33
100
0
20 Dec 2021
Soundify: Matching Sound Effects to Video
David Chuan-En Lin
Anastasis Germanidis
Cristobal Valenzuela
Yining Shi
Nikolas Martelaro
30
16
0
17 Dec 2021
VocBench: A Neural Vocoder Benchmark for Speech Synthesis
Ehab A. AlBadawy
Andrew Gibiansky
Qing He
Jilong Wu
Ming-Ching Chang
Siwei Lyu
27
12
0
06 Dec 2021
How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey
Zahra Khanjani
Gabrielle Watson
V. P. Janeja
25
25
0
28 Nov 2021
V2C: Visual Voice Cloning
Qi Chen
Yuanqing Li
Yuankai Qi
Jiaqiu Zhou
Mingkui Tan
Qi Wu
VGen
33
23
0
25 Nov 2021
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech
Sung-Feng Huang
Chyi-Jiunn Lin
Da-Rong Liu
Yi-Chen Chen
Hung-yi Lee
20
56
0
07 Nov 2021
Emotional Prosody Control for Speech Generation
S. Sivaprasad
Saiteja Kosgi
Vineet Gandhi
12
17
0
07 Nov 2021
SIG-VC: A Speaker Information Guided Zero-shot Voice Conversion System for Both Human Beings and Machines
Haozhe Zhang
Zexin Cai
Xiaoyi Qin
Ming Li
54
15
0
06 Nov 2021
Hybrid Spectrogram and Waveform Source Separation
Alexandre Défossez
24
162
0
05 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schonherr
DiffM
129
125
0
04 Nov 2021
Taming Visually Guided Sound Generation
Vladimir E. Iashin
Esa Rahtu
VLM
32
122
0
17 Oct 2021
Neural Dubber: Dubbing for Videos According to Scripts
Chenxu Hu
Qiao Tian
Tingle Li
Yuping Wang
Yuxuan Wang
Hang Zhao
DiffM
VGen
36
39
0
15 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
50
60
0
15 Oct 2021
Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data
Haitong Zhang
Yue Lin
26
0
0
14 Oct 2021
Discovery of Single Independent Latent Variable
Uri Shaham
Jonathan Svirsky
Ori Katz
Ronen Talmon
CML
28
2
0
12 Oct 2021
Source Mixing and Separation Robust Audio Steganography
Naoya Takahashi
M. Singh
Yuki Mitsufuji
34
6
0
11 Oct 2021
KaraSinger: Score-Free Singing Voice Synthesis with VQ-VAE using Mel-spectrograms
Chien-Feng Liao
Jen-Yu Liu
Yi-Hsuan Yang
27
5
0
08 Oct 2021
Cloning one's voice using very limited data in the wild
Dongyang Dai
Yuan-Jui Chen
Li Chen
Ming Tu
Lu Liu
Rui Xia
Qiao Tian
Yuping Wang
Yuxuan Wang
SyDa
27
9
0
07 Oct 2021
Towards Universal Neural Vocoding with a Multi-band Excited WaveNet
Axel Roebel
F. Bous
29
2
0
07 Oct 2021
GANtron: Emotional Speech Synthesis with Generative Adversarial Networks
E. Hortal
Rodrigo Brechard Alarcia
GAN
26
2
0
06 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis
Cheng-I Jeff Lai
Erica Cooper
Yang Zhang
Shiyu Chang
Kaizhi Qian
...
Yung-Sung Chuang
Alexander H. Liu
Junichi Yamagishi
David D. Cox
James R. Glass
26
6
0
04 Oct 2021
Bilateral Denoising Diffusion Models
Max W. Y. Lam
Jun Wang
Rongjie Huang
Dan Su
Dong Yu
DiffM
29
42
0
26 Aug 2021
StarGAN-VC+ASR: StarGAN-based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition
Shoki Sakamoto
Akira Taniguchi
T. Taniguchi
Hirokazu Kameoka
BDL
31
5
0
10 Aug 2021