2010.05646
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
12 October 2020
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
Papers citing "HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis"
50 / 1,154 papers shown
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS
K. Udagawa
Yuki Saito
Hiroshi Saruwatari
28
6
0
21 Jun 2022
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis
Yi Wang
Yi Si
44
0
0
20 Jun 2022
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems
Danwei Cai
Zexin Cai
Ming Li
95
10
0
18 Jun 2022
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
Yuto Nishimura
Yuki Saito
Shinnosuke Takamichi
Kentaro Tachibana
Hiroshi Saruwatari
AI4TS
75
8
0
16 Jun 2022
Automatic Prosody Annotation with Pre-Trained Text-Speech Model
Ziqian Dai
Jianwei Yu
Yan Wang
Nuo Chen
Yanyao Bian
Guangzhi Li
Deng Cai
Dong Yu
421
8
0
16 Jun 2022
End-to-End Voice Conversion with Information Perturbation
Qicong Xie
Shan Yang
Yinjiao Lei
Linfu Xie
Dan Su
70
7
0
15 Jun 2022
Streaming non-autoregressive model for any-to-many voice conversion
Ziyi Chen
Haoran Miao
Pengyuan Zhang
78
9
0
15 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Sang-gil Lee
Wei Ping
Boris Ginsburg
Bryan Catanzaro
Sungroh Yoon
171
255
0
09 Jun 2022
Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos
Alexander Waibel
M. Behr
Fevziye Irem Eyiokur
Dogucan Yaman
Tuan-Nam Nguyen
Carlos Mullov
Mehmet Arif Demirtas
Alperen Kantarci
Stefan Constantin
H. K. Ekenel
CVBM
69
16
0
09 Jun 2022
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech
Ziyue Jiang
Zhe Su
Zhou Zhao
Qian Yang
Yi Ren
Jinglin Liu
Zhe Ye
73
5
0
05 Jun 2022
Pronunciation Dictionary-Free Multilingual Speech Synthesis by Combining Unsupervised and Supervised Phonetic Representations
Chang Liu
Zhenhua Ling
Linghui Chen
75
3
0
02 Jun 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Yinghao Aaron Li
Cong Han
N. Mesgarani
114
40
0
30 May 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim
Heeseung Kim
Sungroh Yoon
DiffM
255
53
0
30 May 2022
SUSing: SU-net for Singing Voice Synthesis
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
84
13
0
24 May 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
80
28
0
20 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
88
8
0
19 May 2022
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation
Qianqian Dong
Fengpeng Yue
Tom Ko
Mingxuan Wang
Qibing Bai
Yu Zhang
101
16
0
18 May 2022
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech
Rongjie Huang
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
OODD
VLM
197
34
0
15 May 2022
Talking Face Generation with Multilingual TTS
Hyoung-Kyu Song
Sanghyun Woo
Junhyeok Lee
S. Yang
Hyunjae Cho
Youseong Lee
Dongho Choi
Kang-Wook Kim
CVBM
93
22
0
13 May 2022
Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation
Reo Yoneyama
Yi-Chiao Wu
Tomoki Toda
78
14
0
12 May 2022
Towards Improved Zero-shot Voice Conversion with Conditional DSVAE
Jiachen Lian
Chunlei Zhang
Gopala Krishna Anumanchipalli
Dong Yu
58
23
0
11 May 2022
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Xu Tan
Jiawei Chen
Haohe Liu
Jian Cong
Chen Zhang
...
Lei He
Frank Soong
Tao Qin
Sheng Zhao
Tie-Yan Liu
159
221
0
09 May 2022
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech
Yongqian Li
Cheng Yu
Guangzhi Sun
Hua Jiang
Fanglei Sun
Weiqin Zu
Ying Wen
Yang Yang
Jun Wang
71
7
0
09 May 2022
Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis
Jiatong Shi
Shuai Guo
Tao Qian
Nan Huo
Tomoki Hayashi
...
Xuankai Chang
Hua-Wei Li
Peter Wu
Shinji Watanabe
Qin Jin
VLM
118
27
0
09 May 2022
Parallel Synthesis for Autoregressive Speech Generation
Po-Chun Hsu
Da-Rong Liu
Andy T. Liu
Hung-yi Lee
82
5
0
25 Apr 2022
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech
Zhenhui Ye
Zhou Zhao
Yi Ren
Leilei Gan
90
28
0
25 Apr 2022
Improving Self-Supervised Learning-based MOS Prediction Networks
Bálint Gyires-Tóth
Csaba Zainkó
SSL
47
1
0
23 Apr 2022
Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation
Detai Xin
Shinnosuke Takamichi
T. Okamoto
Hisashi Kawai
Hiroshi Saruwatari
36
0
0
22 Apr 2022
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
Rongjie Huang
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
Yi Ren
Zhou Zhao
DiffM
78
172
0
21 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
77
113
0
20 Apr 2022
Audio Deep Fake Detection System with Neural Stitching for ADD 2022
Rui Yan
Cheng Wen
Shuran Zhou
Tingwei Guo
Wei Zou
Xiangang Li
49
24
0
19 Apr 2022
Time Domain Adversarial Voice Conversion for ADD 2022
Cheng Wen
Tingwei Guo
Xi Tan
Rui Yan
Shuran Zhou
Chuandong Xie
Wei Zou
Xiangang Li
85
4
0
19 Apr 2022
A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture
Zhe-ming Lu
Mengnan He
Ruixiong Zhang
Caixia Gong
GAN
34
2
0
12 Apr 2022
VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration
Haohe Liu
Xubo Liu
Qiuqiang Kong
Qiao Tian
Yan Zhao
DeLiang Wang
Chuanzeng Huang
Yuxuan Wang
76
59
0
12 Apr 2022
INTERSPEECH 2022 Audio Deep Packet Loss Concealment Challenge
Lorenz Diener
Sten Sootla
Solomiya Branets
Ando Saabas
R. Aichner
Ross Cutler
75
43
0
11 Apr 2022
Correcting Mispronunciations in Speech using Spectrogram Inpainting
Talia Ben Simon
Felix Kreuk
Faten Awwad
Jacob T. Cohen
Joseph Keshet
80
2
0
07 Apr 2022
Expressive Singing Synthesis Using Local Style Token and Dual-path Pitch Encoder
Juheon Lee
Hyeong-Seok Choi
Kyogu Lee
45
7
0
07 Apr 2022
FFC-SE: Fast Fourier Convolution for Speech Enhancement
Ivan Shchekotov
Pavel Andreev
Oleg Ivanov
Aibek Alanov
Dmitry Vetrov
70
24
0
06 Apr 2022
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis
Shunwei Lei
Yixuan Zhou
Liyang Chen
Jiankun Hu
Zhiyong Wu
Shiyin Kang
Helen Meng
81
10
0
06 Apr 2022
Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech
Hyungchan Yoon
Seyun Um
Changwhan Kim
Hong-Goo Kang
47
0
0
05 Apr 2022
Lip to Speech Synthesis with Visual Context Attentional GAN
Minsu Kim
Joanna Hong
Y. Ro
124
54
0
04 Apr 2022
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis
Yixuan Zhou
Changhe Song
Xiang Li
Lu Zhang
Zhiyong Wu
Yanyao Bian
Dan Su
Helen Meng
146
23
0
03 Apr 2022
Quantized GAN for Complex Music Generation from Dance Videos
Ye Zhu
Kyle Olszewski
Yuehua Wu
Panos Achlioptas
Menglei Chai
Yan Yan
Sergey Tulyakov
MGen
118
46
0
01 Apr 2022
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios
Yihan Wu
Xu Tan
Bohan Li
Lei He
Sheng Zhao
Ruihua Song
Tao Qin
Tie-Yan Liu
VLM
DiffM
91
69
0
01 Apr 2022
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis
Fan Wang
Po-Chun Hsu
Da-Rong Liu
Hung-yi Lee
69
0
0
01 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
89
33
0
31 Mar 2022
SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy
Shuai Guo
Jiatong Shi
Tao Qian
Shinji Watanabe
Qin Jin
137
13
0
31 Mar 2022
HiFi-VC: High Quality ASR-Based Voice Conversion
A. Kashkin
I. Karpukhin
S. Shishkin
75
6
0
31 Mar 2022
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis
Hubert Siuzdak
Piotr Dura
Pol van Rijn
Nori Jacoby
AI4TS
142
30
0
31 Mar 2022
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech
D. Lim
Sunghee Jung
Eesung Kim
95
53
0
31 Mar 2022