ResearchTrend.AI
  • Papers
  • Communities
  • Organizations
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2010.05646
  4. Cited By
HiFi-GAN: Generative Adversarial Networks for Efficient and High
  Fidelity Speech Synthesis
v1v2 (latest)

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

12 October 2020
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
ArXiv (abs)PDFHTML

Papers citing "HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis"

50 / 1,154 papers shown
Title
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising
  Diffusion GANs
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Songxiang Liu
Jane Polak Scowcroft
Dong Yu
DiffM
162
67
0
28 Jan 2022
The MSXF TTS System for ICASSP 2022 ADD Challenge
The MSXF TTS System for ICASSP 2022 ADD Challenge
Chunyong Yang
Pengfei Liu
Yanli Chen
Hongbin Wang
Min Liu
46
0
0
27 Jan 2022
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis
Shinnosuke Takamichi
Wataru Nakata
Naoko Tanji
Hiroshi Saruwatari
AuLLM
79
7
0
26 Jan 2022
Improving Adversarial Waveform Generation based Singing Voice Conversion
  with Harmonic Signals
Improving Adversarial Waveform Generation based Singing Voice Conversion with Harmonic Signals
Haohan Guo
Zhiping Zhou
Fanbo Meng
Kai-Chun Liu
111
16
0
25 Jan 2022
MHTTS: Fast multi-head text-to-speech for spontaneous speech with
  imperfect transcription
MHTTS: Fast multi-head text-to-speech for spontaneous speech with imperfect transcription
Dabiao Ma
Yitong Zhang
Meng Li
Feng Ye
41
1
0
19 Jan 2022
Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for
  Singing Voice Synthesis
Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis
Yu Wang
Xinsheng Wang
Pengcheng Zhu
Jie Wu
Hanzhao Li
Heyang Xue
Yongmao Zhang
Lei Xie
Mengxiao Bi
122
103
0
19 Jan 2022
Improved Input Reprogramming for GAN Conditioning
Improved Input Reprogramming for GAN Conditioning
Tuan Dinh
Daewon Seo
Zhixu Du
Liang Shang
Kangwook Lee
AI4CE
105
8
0
07 Jan 2022
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale
  Corpus
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus
Rongjie Huang
Feiyang Chen
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
103
104
0
20 Dec 2021
Discretization and Re-synthesis: an alternative method to solve the
  Cocktail Party Problem
Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem
Jing Shi
Xuankai Chang
Tomoki Hayashi
Yen-Ju Lu
Shinji Watanabe
Bo Xu
112
19
0
17 Dec 2021
Textless Speech-to-Speech Translation on Real Data
Textless Speech-to-Speech Translation on Real Data
Ann Lee
Hongyu Gong
Paul-Ambroise Duquenne
Holger Schwenk
Peng-Jen Chen
...
Sravya Popuri
Yossi Adi
J. Pino
Jiatao Gu
Wei-Ning Hsu
122
150
0
15 Dec 2021
Audio Deepfake Perceptions in College Going Populations
Audio Deepfake Perceptions in College Going Populations
Gabrielle Watson
Zahra Khanjani
V. P Janeja
81
7
0
06 Dec 2021
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice
  Conversion for everyone
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
281
415
0
04 Dec 2021
How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey
How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey
Zahra Khanjani
Gabrielle Watson
V. P Janeja
61
27
0
28 Nov 2021
V2C: Visual Voice Cloning
V2C: Visual Voice Cloning
Qi Chen
Yuanqing Li
Yuankai Qi
Jiaqiu Zhou
Mingkui Tan
Qi Wu
VGen
81
27
0
25 Nov 2021
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance
Heeseung Kim
Sungwon Kim
Sungroh Yoon
DiffMBDL
139
112
0
23 Nov 2021
Textless Speech Emotion Conversion using Discrete and Decomposed
  Representations
Textless Speech Emotion Conversion using Discrete and Decomposed Representations
Felix Kreuk
Adam Polyak
Jade Copet
Eugene Kharitonov
Tu Nguyen
M. Rivière
Wei-Ning Hsu
Abdel-rahman Mohamed
Emmanuel Dupoux
Yossi Adi
119
34
0
14 Nov 2021
Meta-Voice: Fast few-shot style transfer for expressive voice cloning
  using meta learning
Meta-Voice: Fast few-shot style transfer for expressive voice cloning using meta learning
Songxiang Liu
Jane Polak Scowcroft
Dong Yu
64
10
0
14 Nov 2021
Speaker Generation
Speaker Generation
Daisy Stanton
Matt Shannon
Soroosh Mariooryad
RJ Skerry-Ryan
Eric Battenberg
Tom Bagby
David Kao
96
30
0
07 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schonherr
DiffM
204
131
0
04 Nov 2021
Voice Conversion Can Improve ASR in Very Low-Resource Settings
Voice Conversion Can Improve ASR in Very Low-Resource Settings
Matthew Baas
Herman Kamper
103
17
0
04 Nov 2021
A Comparison of Discrete and Soft Speech Units for Improved Voice
  Conversion
A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
Benjamin van Niekerk
M. Carbonneau
Julian Zaïdi
Matthew Baas
Hugo Seuté
Herman Kamper
DRL
122
123
0
03 Nov 2021
RefineGAN: Universally Generating Waveform Better than Ground Truth with
  Highly Accurate Pitch and Intensity Responses
RefineGAN: Universally Generating Waveform Better than Ground Truth with Highly Accurate Pitch and Intensity Responses
Shengyuan Xu
Wenxiao Zhao
Jing Guo
66
12
0
01 Nov 2021
Neural Analysis and Synthesis: Reconstructing Speech from
  Self-Supervised Representations
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
Hyeong-Seok Choi
Juheon Lee
W. Kim
Jie Hwan Lee
Hoon Heo
Kyogu Lee
116
158
0
27 Oct 2021
Controllable and Interpretable Singing Voice Decomposition via Assem-VC
Controllable and Interpretable Singing Voice Decomposition via Assem-VC
Kang-Wook Kim
Junhyeok Lee
54
0
0
25 Oct 2021
DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard
  Challenge 2021
DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021
Yanqing Liu
Rui Shao
G. Wang
Kuan Chen
Bohan Li
Pong C. Yuen
Jinzhu Li
Lei He
Sheng Zhao
98
55
0
25 Oct 2021
Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs
  Theory
Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory
T. Chen
Guan-Horng Liu
Evangelos A. Theodorou
DiffMOT
304
181
0
21 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis
Chunked Autoregressive GAN for Conditional Waveform Synthesis
Max Morrison
Rithesh Kumar
Kundan Kumar
Prem Seetharaman
Aaron Courville
Yoshua Bengio
GAN
135
72
0
19 Oct 2021
Neural Synthesis of Footsteps Sound Effects with Generative Adversarial
  Networks
Neural Synthesis of Footsteps Sound Effects with Generative Adversarial Networks
Marco Comunità
Huy Phan
Joshua D. Reiss
GAN
81
11
0
18 Oct 2021
KaraTuner: Towards end to end natural pitch correction for singing voice
  in karaoke
KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke
Xiaobin Zhuang
Huiran Yu
Weifeng Zhao
Tao Jiang
Peng Hu
90
6
0
18 Oct 2021
VISinger: Variational Inference with Adversarial Learning for End-to-End
  Singing Voice Synthesis
VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis
Yongmao Zhang
Jian Cong
Heyang Xue
Lei Xie
Pengcheng Zhu
Mengxiao Bi
108
77
0
17 Oct 2021
Direct Simultaneous Speech-to-Speech Translation with Variational
  Monotonic Multihead Attention
Direct Simultaneous Speech-to-Speech Translation with Variational Monotonic Multihead Attention
Xutai Ma
Hongyu Gong
Danni Liu
Ann Lee
Yun Tang
Peng-Jen Chen
Wei-Ning Hsu
P. Koehn
J. Pino
106
9
0
15 Oct 2021
From Start to Finish: Latency Reduction Strategies for Incremental
  Speech Synthesis in Simultaneous Speech-to-Speech Translation
From Start to Finish: Latency Reduction Strategies for Incremental Speech Synthesis in Simultaneous Speech-to-Speech Translation
Danni Liu
Changhan Wang
Hongyu Gong
Xutai Ma
Yun Tang
J. Pino
107
4
0
15 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research
ESPnet2-TTS: Extending the Edge of TTS Research
Tomoki Hayashi
Ryuichi Yamamoto
Takenori Yoshimura
Peter Wu
Jiatong Shi
Takaaki Saeki
Yooncheol Ju
Yusuke Yasuda
Shinnosuke Takamichi
Shinji Watanabe
VLM
85
63
0
15 Oct 2021
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice
  Generation
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation
Rongjie Huang
Chenye Cui
Feiyang Chen
Yi Ren
Jinglin Liu
Zhou Zhao
Baoxing Huai
N. Yuan
GAN
203
63
0
14 Oct 2021
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language
  Processing
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
183
203
0
14 Oct 2021
Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual
  Text-to-Speech
Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech
Haoyue Zhan
Xinyuan Yu
Haitong Zhang
Yang Zhang
Yue Lin
69
5
0
14 Oct 2021
Revisiting IPA-based Cross-lingual Text-to-speech
Revisiting IPA-based Cross-lingual Text-to-speech
Haitong Zhang
Haoyue Zhan
Yang Zhang
Xinyuan Yu
Yue Lin
68
7
0
14 Oct 2021
A Melody-Unsupervision Model for Singing Voice Synthesis
A Melody-Unsupervision Model for Singing Voice Synthesis
Soonbeom Choi
Juhan Nam
67
14
0
13 Oct 2021
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised
  Speech Representations
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
Hung-yi Lee
Shinji Watanabe
Tomoki Toda
78
40
0
12 Oct 2021
Adapting TTS models For New Speakers using Transfer Learning
Adapting TTS models For New Speakers using Transfer Learning
Paarth Neekhara
Jason Chun Lok Li
Boris Ginsburg
144
15
0
12 Oct 2021
LaughNet: synthesizing laughter utterances from waveform silhouettes and
  a single laughter example
LaughNet: synthesizing laughter utterances from waveform silhouettes and a single laughter example
Hieu-Thi Luong
Junichi Yamagishi
72
9
0
11 Oct 2021
Towards High-fidelity Singing Voice Conversion with Acoustic Reference
  and Contrastive Predictive Coding
Towards High-fidelity Singing Voice Conversion with Acoustic Reference and Contrastive Predictive Coding
Chao Wang
Zhonghao Li
Benlai Tang
Xiang Yin
Yuan Wan
Yibiao Yu
Zejun Ma
71
18
0
10 Oct 2021
Environment Aware Text-to-Speech Synthesis
Environment Aware Text-to-Speech Synthesis
Daxin Tan
Guangyan Zhang
Tan Lee
86
4
0
08 Oct 2021
EdiTTS: Score-based Editing for Controllable Text-to-Speech
EdiTTS: Score-based Editing for Controllable Text-to-Speech
Jaesung Tae
Hyeongju Kim
Taesu Kim
DiffM
279
40
0
06 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and
  Prosody in Speech Synthesis
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis
Cheng-I Jeff Lai
Erica Cooper
Yang Zhang
Shiyu Chang
Kaizhi Qian
...
Yung-Sung Chuang
Alexander H. Liu
Junichi Yamagishi
David D. Cox
James R. Glass
71
6
0
04 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech
PortaSpeech: Portable and High-Quality Generative Text-to-Speech
Yi Ren
Jinglin Liu
Zhou Zhao
139
79
0
30 Sep 2021
Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling
  Scheme
Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
Jiansheng Wei
DiffMBDL
162
136
0
28 Sep 2021
VoiceFixer: Toward General Speech Restoration with Neural Vocoder
VoiceFixer: Toward General Speech Restoration with Neural Vocoder
Haohe Liu
Qiuqiang Kong
Qiao Tian
Yan Zhao
DeLiang Wang
Chuanzeng Huang
Yuxuan Wang
100
58
0
28 Sep 2021
MSR-NV: Neural Vocoder Using Multiple Sampling Rates
MSR-NV: Neural Vocoder Using Multiple Sampling Rates
Kentaro Mitsui
Kei Sawada
111
0
0
28 Sep 2021
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for
  Speech Synthesis
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis
Manh Luong
Viet-Anh Tran
29
2
0
27 Sep 2021
Previous
123...21222324
Next