1910.06711
Cited By
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
8 October 2019
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
Papers citing
"MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis"
50 / 225 papers shown
Title
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech
Ze Chen
Yihan Wu
Yichong Leng
Jiawei Chen
Haohe Liu
...
Ke Wang
Lei He
Sheng Zhao
Jiang Bian
Danilo Mandic
DiffM
37
22
0
30 Dec 2022
MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset
Kailin Liang
Bin Liu
Yifan Hu
Rui Liu
F. Bao
Guanglai Gao
28
1
0
11 Dec 2022
Learning to Dub Movies via Hierarchical Prosody Models
Gaoxiang Cong
Liang Li
Yuankai Qi
Zhengjun Zha
Qi Wu
Wen-yu Wang
Bin Jiang
Ming Yang
Qin Huang
75
25
0
08 Dec 2022
Generative Models for Improved Naturalness, Intelligibility, and Voicing of Whispered Speech
Dominik Wagner
Sebastian P. Bayerl
H. A. C. Maruri
Tobias Bocklet
24
7
0
04 Dec 2022
AccEar: Accelerometer Acoustic Eavesdropping with Unconstrained Vocabulary
Pengfei Hu
Zhuang Hui
P. Santhalingam
Riccardo Spolaor
Parth H. Pathak
Guoming Zhang
Xiuzhen Cheng
AAML
23
38
0
02 Dec 2022
Deep Fake Detection, Deterrence and Response: Challenges and Opportunities
Amin Azmoodeh
Ali Dehghantanha
45
2
0
26 Nov 2022
Puffin: pitch-synchronous neural waveform generation for fullband speech on modest devices
O. Watts
Lovisa Wihlborg
Cassia Valentini-Botinhao
38
3
0
25 Nov 2022
Efficient Incremental Text-to-Speech on GPUs
Muyang Du
Chuan Liu
Jiaxing Qi
Junjie Lai
24
1
0
25 Nov 2022
Can Knowledge of End-to-End Text-to-Speech Models Improve Neural MIDI-to-Audio Synthesis Systems?
Xuan Shi
Erica Cooper
Xin Wang
Junichi Yamagishi
Shrikanth Narayanan
27
1
0
25 Nov 2022
VarietySound: Timbre-Controllable Video to Sound Generation via Unsupervised Information Disentanglement
Chenye Cui
Yi Ren
Jinglin Liu
Rongjie Huang
Zhou Zhao
VGen
38
14
0
19 Nov 2022
Towards Building Text-To-Speech Systems for the Next Billion Users
Gokul Karthik Kumar
V. PraveenS.
Pratyush Kumar
Mitesh M. Khapra
Karthik Nandakumar
43
18
0
17 Nov 2022
A Two-Stage Deep Representation Learning-Based Speech Enhancement Method Using Variational Autoencoder and Adversarial Training
Yang Xiang
Jesper Lisby Højvang
M. Rasmussen
M. G. Christensen
DRL
23
5
0
16 Nov 2022
GANStrument: Adversarial Instrument Sound Synthesis with Pitch-invariant Instance Conditioning
Gaku Narita
Junichi Shimizu
Taketo Akama
GAN
29
11
0
10 Nov 2022
I Hear Your True Colors: Image Guided Audio Generation
Roy Sheffer
Yossi Adi
VLM
18
74
0
06 Nov 2022
HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks
Filip Szatkowski
Karol J. Piczak
Przemysław Spurek
Jacek Tabor
Tomasz Trzciński
23
12
0
03 Nov 2022
Autoregressive GAN for Semantic Unconditional Head Motion Generation
Louis Airale
Xavier Alameda-Pineda
Stéphane Lathuilière
Dominique Vaufreydaz
25
3
0
02 Nov 2022
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS
Kun Song
Jian Cong
Xinsheng Wang
Yongmao Zhang
Linfu Xie
Ning Jiang
Haiying Wu
35
0
0
31 Oct 2022
Cover Reproducible Steganography via Deep Generative Models
Kejiang Chen
Hang Zhou
Yaofei Wang
Meng Li
Weiming Zhang
Neng H. Yu
DiffM
31
9
0
26 Oct 2022
HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation
Chunhui Wang
Chang Zeng
Jun Chen
Xingji He
54
7
0
23 Oct 2022
Adversarial Permutation Invariant Training for Universal Sound Separation
Emilian Postolache
Jordi Pons
Santiago Pascual
Joan Serrà
VLM
28
6
0
21 Oct 2022
Robust One-Shot Singing Voice Conversion
Naoya Takahashi
M. Singh
Yuki Mitsufuji
DiffM
30
8
0
20 Oct 2022
Modeling Animal Vocalizations through Synthesizers
Masato Hagiwara
M. Cusimano
Jen-Yu Liu
41
4
0
19 Oct 2022
SpecRNet: Towards Faster and More Accessible Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
37
14
0
12 Oct 2022
Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech
Byoung Jin Choi
Myeonghun Jeong
Minchan Kim
Sung Hwan Mun
N. Kim
DiffM
27
5
0
12 Oct 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Andreas Triantafyllopoulos
Björn W. Schuller
Gökçe İymen
M. Sezgin
Xiangheng He
...
Shuo Liu
Silvan Mertes
Elisabeth André
Ruibo Fu
Jianhua Tao
20
53
0
06 Oct 2022
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration
Yuma Koizumi
Kohei Yatabe
Heiga Zen
M. Bacchiani
DiffM
49
29
0
03 Oct 2022
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
27
289
0
30 Sep 2022
Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-to-Speech
Yusuke Nakai
Yuki Saito
K. Udagawa
Hiroshi Saruwatari
AAML
25
1
0
26 Sep 2022
MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline
Yifan Hu
Pengkai Yin
Rui Liu
F. Bao
Guanglai Gao
18
5
0
22 Sep 2022
ParaTTS: Learning Linguistic and Prosodic Cross-sentence Information in Paragraph-based TTS
Liumeng Xue
Frank Soong
Shaofei Zhang
Linfu Xie
27
23
0
14 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
73
573
0
07 Sep 2022
Mel Spectrogram Inversion with Stable Pitch
Bruno Di Giorgi
M. Levy
Richard Sharp
28
6
0
26 Aug 2022
Music Separation Enhancement with Generative Modeling
N. Schaffer
Boaz Cogan
Ethan Manilow
Max Morrison
Prem Seetharaman
Bryan Pardo
34
9
0
26 Aug 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
34
0
0
18 Aug 2022
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation
Da-Yi Wu
Wen-Yi Hsiao
Fu-Rong Yang
Oscar D. Friedman
Warren Jackson
Scott Bruzenak
Yi-Wen Liu
Yi-Hsuan Yang
DiffM
34
24
0
09 Aug 2022
AdaCat: Adaptive Categorical Discretization for Autoregressive Models
Qiyang Li
Ajay Jain
Pieter Abbeel
OffRL
45
4
0
03 Aug 2022
Speaker consistency loss and step-wise optimization for semi-supervised joint training of TTS and ASR using unpaired text data
Naoki Makishima
Satoshi Suzuki
Atsushi Ando
Ryo Masumura
146
4
0
11 Jul 2022
DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders
Yanqing Liu
Rui Xue
Lei He
Xu Tan
Sheng Zhao
28
24
0
11 Jul 2022
End-to-End Binaural Speech Synthesis
Wen-Chin Huang
Dejan Marković
Alexander Richard
I. D. Gebru
Anjali Menon
29
8
0
08 Jul 2022
NESC: Robust Neural End-2-End Speech Coding with GANs
N. Pia
Kishan Gupta
Srikanth Korse
M. Multrus
Guillaume Fuchs
33
15
0
07 Jul 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
Yinjiao Lei
Shan Yang
Jian Cong
Linfu Xie
Dan Su
DiffM
55
12
0
05 Jul 2022
TMGAN-PLC: Audio Packet Loss Concealment using Temporal Memory Generative Adversarial Network
Yuansheng Guan
Guochen Yu
Andong Li
C. Zheng
Jie Wang
59
9
0
04 Jul 2022
Generating gender-ambiguous voices for privacy-preserving speech recognition
Dimitrios Stoidis
Andrea Cavallaro
36
14
0
03 Jul 2022
A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion
Xu Li
Shansong Liu
Ying Shan
35
13
0
28 Jun 2022
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Taejun Bak
Junmo Lee
Hanbin Bae
Jinhyeok Yang
Jaesung Bae
Young-Sun Joo
25
28
0
27 Jun 2022
Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
AAML
49
23
0
27 Jun 2022
Generating Diverse Vocal Bursts with StyleGAN2 and MEL-Spectrograms
Marco Jiralerspong
Gauthier Gidel
VLM
27
3
0
25 Jun 2022
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis
Yihan Wu
Xi Wang
S. Zhang
Lei He
Ruihua Song
J. Nie
42
15
0
25 Jun 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion
Curtis Hawthorne
Ian Simon
Adam Roberts
Neil Zeghidour
Josh Gardner
Ethan Manilow
Jesse Engel
DiffM
23
49
0
11 Jun 2022
AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation
Kun Song
Heyang Xue
Xinsheng Wang
Jian Cong
Yongmao Zhang
Linfu Xie
Bing Yang
Xiong Zhang
Dan Su
19
5
0
01 Jun 2022