ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2010.05646
  4. Cited By
HiFi-GAN: Generative Adversarial Networks for Efficient and High
  Fidelity Speech Synthesis

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

12 October 2020
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
ArXivPDFHTML

Papers citing "HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis"

50 / 1,107 papers shown
Title
Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech
Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech
Florian Lux
Julia Koch
Ngoc Thang Vu
40
19
0
24 Jun 2022
A Study on the Evaluation of Generative Models
A Study on the Evaluation of Generative Models
Eyal Betzalel
Coby Penso
Aviv Navon
Ethan Fetaya
EGVM
33
48
0
22 Jun 2022
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS
K. Udagawa
Yuki Saito
Hiroshi Saruwatari
6
5
0
21 Jun 2022
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis
Yi Wang
Yi Si
28
0
0
20 Jun 2022
Identifying Source Speakers for Voice Conversion based Spoofing Attacks
  on Speaker Verification Systems
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems
Danwei Cai
Zexin Cai
Ming Li
23
10
0
18 Jun 2022
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis
  Using Linguistic and Prosodic Contexts of Dialogue History
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
Yuto Nishimura
Yuki Saito
Shinnosuke Takamichi
Kentaro Tachibana
Hiroshi Saruwatari
AI4TS
30
7
0
16 Jun 2022
Automatic Prosody Annotation with Pre-Trained Text-Speech Model
Automatic Prosody Annotation with Pre-Trained Text-Speech Model
Ziqian Dai
Jianwei Yu
Yan Wang
Nuo Chen
Yanyao Bian
Guangzhi Li
Deng Cai
Dong Yu
223
7
0
16 Jun 2022
End-to-End Voice Conversion with Information Perturbation
End-to-End Voice Conversion with Information Perturbation
Qicong Xie
Shan Yang
Yinjiao Lei
Linfu Xie
Dan Su
43
7
0
15 Jun 2022
Streaming non-autoregressive model for any-to-many voice conversion
Streaming non-autoregressive model for any-to-many voice conversion
Ziyi Chen
Haoran Miao
Pengyuan Zhang
27
8
0
15 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Sang-gil Lee
Ming-Yu Liu
Boris Ginsburg
Bryan Catanzaro
Sung-Hoon Yoon
33
232
0
09 Jun 2022
Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos
Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos
Alexander Waibel
M. Behr
Fevziye Irem Eyiokur
Dogucan Yaman
Tuan-Nam Nguyen
Carlos Mullov
Mehmet Arif Demirtas
Alperen Kantarci
Stefan Constantin
H. K. Ekenel
CVBM
15
14
0
09 Jun 2022
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for
  Text-to-Speech
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech
Ziyue Jiang
Zhe Su
Zhou Zhao
Qian Yang
Yi Ren
Jinglin Liu
Zhe Ye
26
4
0
05 Jun 2022
Pronunciation Dictionary-Free Multilingual Speech Synthesis by Combining
  Unsupervised and Supervised Phonetic Representations
Pronunciation Dictionary-Free Multilingual Speech Synthesis by Combining Unsupervised and Supervised Phonetic Representations
Chang Liu
Zhenhua Ling
Linghui Chen
31
3
0
02 Jun 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse
  Text-to-Speech Synthesis
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis
Yinghao Aaron Li
Cong Han
N. Mesgarani
51
38
0
30 May 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech
  with Untranscribed Data
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data
Sungwon Kim
Heeseung Kim
Sung-Hoon Yoon
DiffM
204
52
0
30 May 2022
SUSing: SU-net for Singing Voice Synthesis
SUSing: SU-net for Singing Voice Synthesis
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
37
13
0
24 May 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
38
24
0
20 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable
  Convolutions
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
37
8
0
19 May 2022
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech
  Translation
Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation
Qianqian Dong
Fengpeng Yue
Tom Ko
Mingxuan Wang
Qibing Bai
Yu Zhang
51
16
0
18 May 2022
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain
  Text-to-Speech
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech
Rongjie Huang
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
OODD
VLM
117
34
0
15 May 2022
Talking Face Generation with Multilingual TTS
Talking Face Generation with Multilingual TTS
Hyoung-Kyu Song
Sanghyun Woo
Junhyeok Lee
S. Yang
Hyunjae Cho
Youseong Lee
Dongho Choi
Kang-Wook Kim
CVBM
42
21
0
13 May 2022
Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation
  Generation
Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation
Reo Yoneyama
Yi-Chiao Wu
Tomoki Toda
30
14
0
12 May 2022
Towards Improved Zero-shot Voice Conversion with Conditional DSVAE
Towards Improved Zero-shot Voice Conversion with Conditional DSVAE
Jiachen Lian
Chunlei Zhang
Gopala Krishna Anumanchipalli
Dong Yu
26
23
0
11 May 2022
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level
  Quality
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Xu Tan
Jiawei Chen
Haohe Liu
Jian Cong
Chen Zhang
...
Lei He
Frank Soong
Tao Qin
Sheng Zhao
Tie-Yan Liu
49
213
0
09 May 2022
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech
Yongqian Li
Cheng Yu
Guangzhi Sun
Hua Jiang
Fanglei Sun
Weiqin Zu
Ying Wen
Yang Yang
Jun Wang
31
7
0
09 May 2022
Muskits: an End-to-End Music Processing Toolkit for Singing Voice
  Synthesis
Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis
Jiatong Shi
Shuai Guo
Tao Qian
Nan Huo
Tomoki Hayashi
...
Xuankai Chang
Hua-Wei Li
Peter Wu
Shinji Watanabe
Qin Jin
VLM
25
27
0
09 May 2022
Parallel Synthesis for Autoregressive Speech Generation
Parallel Synthesis for Autoregressive Speech Generation
Po-Chun Hsu
Da-Rong Liu
Andy T. Liu
Hung-yi Lee
42
5
0
25 Apr 2022
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech
Zhenhui Ye
Zhou Zhao
Yi Ren
Fei Wu
46
27
0
25 Apr 2022
Improving Self-Supervised Learning-based MOS Prediction Networks
Improving Self-Supervised Learning-based MOS Prediction Networks
Bálint Gyires-Tóth
Csaba Zainkó
SSL
16
1
0
23 Apr 2022
Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation
Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation
Detai Xin
Shinnosuke Takamichi
T. Okamoto
Hisashi Kawai
Hiroshi Saruwatari
32
0
0
22 Apr 2022
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech
  Synthesis
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
Rongjie Huang
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
Yi Ren
Zhou Zhao
DiffM
28
166
0
21 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by
  Disentangling Speakers
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
30
110
0
20 Apr 2022
Audio Deep Fake Detection System with Neural Stitching for ADD 2022
Audio Deep Fake Detection System with Neural Stitching for ADD 2022
Rui Yan
Cheng Wen
Shuran Zhou
Tingwei Guo
Wei Zou
Xiangang Li
15
22
0
19 Apr 2022
Time Domain Adversarial Voice Conversion for ADD 2022
Time Domain Adversarial Voice Conversion for ADD 2022
Cheng Wen
Tingwei Guo
Xi Tan
Rui Yan
Shuran Zhou
Chuandong Xie
Wei Zou
Xiangang Li
23
4
0
19 Apr 2022
A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture
Zhe-ming Lu
Mengnan He
Ruixiong Zhang
Caixia Gong
GAN
19
2
0
12 Apr 2022
VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration
VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration
Haohe Liu
Xubo Liu
Qiuqiang Kong
Qiao Tian
Yan Zhao
DeLiang Wang
Chuanzeng Huang
Yuxuan Wang
21
51
0
12 Apr 2022
INTERSPEECH 2022 Audio Deep Packet Loss Concealment Challenge
INTERSPEECH 2022 Audio Deep Packet Loss Concealment Challenge
Lorenz Diener
Sten Sootla
Solomiya Branets
Ando Saabas
R. Aichner
Ross Cutler
23
40
0
11 Apr 2022
Correcting Mispronunciations in Speech using Spectrogram Inpainting
Correcting Mispronunciations in Speech using Spectrogram Inpainting
Talia Ben Simon
Felix Kreuk
Faten Awwad
Jacob T. Cohen
Joseph Keshet
28
2
0
07 Apr 2022
Expressive Singing Synthesis Using Local Style Token and Dual-path Pitch
  Encoder
Expressive Singing Synthesis Using Local Style Token and Dual-path Pitch Encoder
Juheon Lee
Hyeong-Seok Choi
Kyogu Lee
14
5
0
07 Apr 2022
FFC-SE: Fast Fourier Convolution for Speech Enhancement
FFC-SE: Fast Fourier Convolution for Speech Enhancement
Ivan Shchekotov
Pavel Andreev
Oleg Ivanov
Aibek Alanov
Dmitry Vetrov
40
23
0
06 Apr 2022
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context
  Information for Mandarin Speech Synthesis
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis
Shunwei Lei
Yixuan Zhou
Liyang Chen
Jiankun Hu
Zhiyong Wu
Shiyin Kang
Helen Meng
27
10
0
06 Apr 2022
Adversarial Learning of Intermediate Acoustic Feature for End-to-End
  Lightweight Text-to-Speech
Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech
Hyungchan Yoon
Seyun Um
Changwhan Kim
Hong-Goo Kang
28
0
0
05 Apr 2022
Lip to Speech Synthesis with Visual Context Attentional GAN
Lip to Speech Synthesis with Visual Context Attentional GAN
Minsu Kim
Joanna Hong
Y. Ro
33
51
0
04 Apr 2022
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker
  Adaptation in Text-to-Speech Synthesis
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis
Yixuan Zhou
Changhe Song
Xiang Li
Lu Zhang
Zhiyong Wu
Yanyao Bian
Dan Su
Helen Meng
34
22
0
03 Apr 2022
Quantized GAN for Complex Music Generation from Dance Videos
Quantized GAN for Complex Music Generation from Dance Videos
Ye Zhu
Kyle Olszewski
Yuehua Wu
Panos Achlioptas
Menglei Chai
Yan Yan
Sergey Tulyakov
MGen
33
44
0
01 Apr 2022
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios
Yihan Wu
Xu Tan
Bohan Li
Lei He
Sheng Zhao
Ruihua Song
Tao Qin
Tie-Yan Liu
VLM
DiffM
19
67
0
01 Apr 2022
Universal Adaptor: Converting Mel-Spectrograms Between Different
  Configurations for Speech Synthesis
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis
Fan Wang
Po-Chun Hsu
Da-Rong Liu
Hung-yi Lee
18
0
0
01 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement
  by Re-Synthesis
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
20
32
0
31 Mar 2022
SingAug: Data Augmentation for Singing Voice Synthesis with
  Cycle-consistent Training Strategy
SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy
Shuai Guo
Jiatong Shi
Tao Qian
Shinji Watanabe
Qin Jin
36
13
0
31 Mar 2022
HiFi-VC: High Quality ASR-Based Voice Conversion
HiFi-VC: High Quality ASR-Based Voice Conversion
A. Kashkin
I. Karpukhin
S. Shishkin
34
5
0
31 Mar 2022
Previous
123...181920212223
Next