Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1909.11646
Cited By
High Fidelity Speech Synthesis with Adversarial Networks
25 September 2019
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
Re-assign community
ArXiv
PDF
HTML
Papers citing
"High Fidelity Speech Synthesis with Adversarial Networks"
50 / 149 papers shown
Title
Towards Error-Resilient Neural Speech Coding
Huaying Xue
Xiulian Peng
Xue Jiang
Yan Lu
21
7
0
03 Jul 2022
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers
Liumeng Xue
Shan Yang
Na Hu
Dan Su
Linfu Xie
21
2
0
02 Jul 2022
Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models
Fan Bao
Chongxuan Li
Jiacheng Sun
Jun Zhu
Bo Zhang
DiffM
36
72
0
15 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Sang-gil Lee
Ming-Yu Liu
Boris Ginsburg
Bryan Catanzaro
Sung-Hoon Yoon
17
225
0
09 Jun 2022
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis
Zhenzi Weng
Zhijin Qin
Xiaoming Tao
Chengkang Pan
Guangyi Liu
Geoffrey Ye Li
35
132
0
09 May 2022
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech
Zhenhui Ye
Zhou Zhao
Yi Ren
Fei Wu
29
27
0
25 Apr 2022
The Sillwood Technologies System for the VoiceMOS Challenge 2022
Jiameng Gao
23
0
0
08 Apr 2022
Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech
Hyungchan Yoon
Seyun Um
Changwhan Kim
Hong-Goo Kang
20
0
0
05 Apr 2022
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis
Hubert Siuzdak
Piotr Dura
Pol van Rijn
Nori Jacoby
AI4TS
18
30
0
31 Mar 2022
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
DiffM
34
92
0
25 Mar 2022
Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data
Gašper Beguš
Alan Zhou
SSL
24
4
0
22 Mar 2022
Reproducible Subjective Evaluation
Max Morrison
Brian Tang
Gefei Tan
Bryan Pardo
20
6
0
08 Mar 2022
Practical cognitive speech compression
Reza Lotfidereshgi
P. Gournay
32
2
0
08 Mar 2022
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
Takuhiro Kaneko
Kou Tanaka
Hirokazu Kameoka
Shogo Seki
19
60
0
04 Mar 2022
Revisiting Over-Smoothness in Text to Speech
Yi Ren
Xu Tan
Tao Qin
Zhou Zhao
Tie-Yan Liu
70
61
0
26 Feb 2022
It's Raw! Audio Generation with State-Space Models
Karan Goel
Albert Gu
Chris Donahue
Christopher Ré
14
186
0
20 Feb 2022
Attributable-Watermarking of Speech Generative Models
Yongbaek Cho
Changhoon Kim
Yezhou Yang
Yi Ren
14
6
0
17 Feb 2022
Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module
Adam Gabry's
Goeric Huybrechts
M. Ribeiro
C. Chien
Julian Roth
Giulia Comini
Roberto Barra-Chicote
Bartek Perz
Jaime Lorenzo-Trueba
30
21
0
16 Feb 2022
InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training
Zehua Chen
Xu Tan
Ke Wang
Shifeng Pan
Danilo Mandic
Lei He
Sheng Zhao
DiffM
25
28
0
08 Feb 2022
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis
Shinnosuke Takamichi
Wataru Nakata
Naoko Tanji
Hiroshi Saruwatari
AuLLM
25
6
0
26 Jan 2022
Improved Input Reprogramming for GAN Conditioning
Tuan Dinh
Daewon Seo
Zhixu Du
Liang Shang
Kangwook Lee
AI4CE
22
8
0
07 Jan 2022
Audio representations for deep learning in sound synthesis: A review
Anastasia Natsiou
Seán O'Leary
AI4TS
19
18
0
07 Jan 2022
Semantic Communications: Principles and Challenges
Zhijin Qin
Xiaoming Tao
Jianhua Lu
Wen Tong
Geoffrey Ye Li
31
338
0
30 Dec 2021
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus
Rongjie Huang
Feiyang Chen
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
28
98
0
20 Dec 2021
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
185
378
0
04 Dec 2021
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
Heeseung Kim
Sungwon Kim
Sungroh Yoon
DiffM
BDL
19
107
0
23 Nov 2021
High Quality Streaming Speech Synthesis with Low, Sentence-Length-Independent Latency
Nikolaos Ellinas
G. Vamvoukakis
K. Markopoulos
Aimilios Chalamandaris
Georgia Maniati
Panos Kakoulidis
S. Raptis
June Sig Sung
Hyoungmin Park
Pirros Tsiakoulis
6
36
0
17 Nov 2021
Generating Diverse Realistic Laughter for Interactive Art
Mehdi Park Eric Paquette Étienne Gidel Gauthier Mathewso Afsar
Eric Park
Étienne Paquette
Gauthier Gidel
Kory W. Mathewson
Eilif B. Muller
20
7
0
04 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schonherr
DiffM
129
123
0
04 Nov 2021
Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units
Anurag Katakkar
A. Black
AuLLM
27
1
0
31 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis
Max Morrison
Rithesh Kumar
Kundan Kumar
Prem Seetharaman
Aaron Courville
Yoshua Bengio
GAN
36
68
0
19 Oct 2021
FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection
Zhenyu Zhang
Yewei Gu
Xiaowei Yi
Xianfeng Zhao
27
24
0
18 Oct 2021
Taming Visually Guided Sound Generation
Vladimir E. Iashin
Esa Rahtu
VLM
28
121
0
17 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
50
60
0
15 Oct 2021
LaughNet: synthesizing laughter utterances from waveform silhouettes and a single laughter example
Hieu-Thi Luong
Junichi Yamagishi
44
9
0
11 Oct 2021
Denoising Diffusion Gamma Models
Eliya Nachmani
S. Robin
Lior Wolf
DiffM
VLM
20
30
0
10 Oct 2021
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis
Manh Luong
Viet-Anh Tran
6
2
0
27 Sep 2021
Bilateral Denoising Diffusion Models
Max W. Y. Lam
Jun Wang
Rongjie Huang
Dan Su
Dong Yu
DiffM
27
42
0
26 Aug 2021
A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate
Ahmed Mustafa
Jan Büthe
Srikanth Korse
Kishan Gupta
Guillaume Fuchs
N. Pia
18
18
0
09 Aug 2021
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis with GANs
J. Nistal
Stefan Lattner
G. Richard
21
8
0
03 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing
Zhaofeng Shi
24
7
0
01 Aug 2021
Generative Models for Security: Attacks, Defenses, and Opportunities
L. A. Bauer
Vincent Bindschaedler
25
4
0
21 Jul 2021
Filtered Noise Shaping for Time Domain Room Impulse Response Estimation From Reverberant Speech
C. Steinmetz
V. Ithapu
P. Calamia
46
39
0
15 Jul 2021
Adversarial Auto-Encoding for Packet Loss Concealment
Santiago Pascual
Joan Serrà
Jordi Pons
31
27
0
07 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
AI based Presentation Creator With Customized Audio Content Delivery
Muvazima Mansoor
Srikanth Chandar
Ramamoorthy Srinath
18
0
0
27 Jun 2021
Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis
Jian Cong
Shan Yang
Lei Xie
Dan Su
DRL
18
29
0
21 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Nanxin Chen
Yu Zhang
Heiga Zen
Ron J. Weiss
Mohammad Norouzi
Najim Dehak
William Chan
DiffM
21
88
0
17 Jun 2021
Non Gaussian Denoising Diffusion Models
Eliya Nachmani
Robin San Roman
Lior Wolf
VLM
DiffM
32
48
0
14 Jun 2021
Catch-A-Waveform: Learning to Generate Audio from a Single Short Example
Gal Greshler
Tamar Rott Shaham
T. Michaeli
18
25
0
11 Jun 2021
Previous
1
2
3
Next