HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

12 October 2020
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae

Papers citing "HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis"

50 / 1,107 papers shown
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration
Yuma Koizumi
Kohei Yatabe
Heiga Zen
M. Bacchiani
DiffM
53
29
0
03 Oct 2022
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
27
290
0
30 Sep 2022
Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-to-Speech
Yusuke Nakai
Yuki Saito
K. Udagawa
Hiroshi Saruwatari
AAML
30
1
0
26 Sep 2022
NWPU-ASLP System for the VoicePrivacy 2022 Challenge
Jixun Yao
Qing Wang
Li Zhang
Pengcheng Guo
Yuhao Liang
Linfu Xie
PICV
31
17
0
24 Sep 2022
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
Mei-Shuo Chen
Z. Duan
30
10
0
23 Sep 2022
A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS
Haohan Guo
Fenglong Xie
Frank Soong
Xixin Wu
Helen M. Meng
42
11
0
22 Sep 2022
MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline
Yifan Hu
Pengkai Yin
Rui Liu
F. Bao
Guanglai Gao
23
5
0
22 Sep 2022
Controllable Accented Text-to-Speech Synthesis
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
47
6
0
22 Sep 2022
Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN
Yin-Ping Cho
Yu Tsao
Hsin-Min Wang
Yi-Wen Liu
DiffM
40
9
0
21 Sep 2022
MVNet: Memory Assistance and Vocal Reinforcement Network for Speech Enhancement
Jianrong Wang
Xiaomin Li
Xuewei Li
Mei Yu
Qiang Fang
Li Liu
44
0
0
15 Sep 2022
Open Challenges in Synthetic Speech Detection
Luca Cuccovillo
Christoforos Papastergiopoulos
Anastasios Vafeiadis
Artem Yaroshchuk
P. Aichroth
K. Votis
Dimitrios Tzovaras
46
28
0
15 Sep 2022
ParaTTS: Learning Linguistic and Prosodic Cross-sentence Information in Paragraph-based TTS
Liumeng Xue
Frank Soong
Shaofei Zhang
Linfu Xie
32
23
0
14 Sep 2022
Deep Speech Synthesis from Articulatory Representations
Peter Wu
Shinji Watanabe
Louis Goldstein
A. Black
Gopala K. Anumanchipalli
39
25
0
13 Sep 2022
DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion
Ruibin Yuan
Yuxuan Wu
Jacob Li
Jaxter Kim
37
5
0
09 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
73
575
0
07 Sep 2022
Exploiting Pre-trained Feature Networks for Generative Adversarial Networks in Audio-domain Loop Generation
Yen-Tung Yeh
Bo-Yu Chen
Yi-Hsuan Yang
47
6
0
05 Sep 2022
Mel Spectrogram Inversion with Stable Pitch
Bruno Di Giorgi
M. Levy
Richard Sharp
28
6
0
26 Aug 2022
Music Separation Enhancement with Generative Modeling
N. Schaffer
Boaz Cogan
Ethan Manilow
Max Morrison
Prem Seetharaman
Bryan Pardo
34
9
0
26 Aug 2022
Are disentangled representations all you need to build speaker anonymization systems?
Pierre Champion
D. Jouvet
Anthony Larcher
40
20
0
22 Aug 2022
An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio
Xin Yan
Jiangyan Yi
J. Tao
Chenglong Wang
Haoxin Ma
Tao Wang
Shiming Wang
Ruibo Fu
30
26
0
20 Aug 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
39
0
0
18 Aug 2022
Musika! Fast Infinite Waveform Music Generation
Marco Pasini
Jan Schluter
MGen
20
29
0
18 Aug 2022
Towards Cross-speaker Reading Style Transfer on Audiobook Dataset
Xiang Li
Changhe Song
X. Wei
Zhiyong Wu
Jia Jia
Helen Meng
29
4
0
10 Aug 2022
DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation
Da-Yi Wu
Wen-Yi Hsiao
Fu-Rong Yang
Oscar D. Friedman
Warren Jackson
Scott Bruzenak
Yi-Wen Liu
Yi-Hsuan Yang
DiffM
39
24
0
09 Aug 2022
BSDGAN: Balancing Sensor Data Generative Adversarial Networks for Human Activity Recognition
Yifan Hu
Yu Wang
36
7
0
07 Aug 2022
Customs Import Declaration Datasets
Chae-Seong Jeong
Sundong Kim
Jaewoo Park
Yeonsoo Choi
31
3
0
04 Aug 2022
Diffsound: Discrete Diffusion Model for Text-to-sound Generation
Dongchao Yang
Jianwei Yu
Helin Wang
Wen Wang
Chao Weng
Yuexian Zou
Dong Yu
DiffM
36
297
0
20 Jul 2022
Latent-Domain Predictive Neural Speech Coding
Xue Jiang
Xiulian Peng
Huaying Xue
Yuan Zhang
Yan Lu
41
17
0
18 Jul 2022
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
Rongjie Huang
Zhou Zhao
Huadai Liu
Jinglin Liu
Chenye Cui
Yi Ren
DiffM
44
195
0
13 Jul 2022
Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech
Zhengxi Liu
Qiao Tian
Chenxu Hu
Xudong Liu
Meng-Che Wu
Yuping Wang
Hang Zhao
Yuxuan Wang
36
10
0
13 Jul 2022
SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate
Nabarun Goswami
Tatsuya Harada
26
5
0
13 Jul 2022
Text-driven Emotional Style Control and Cross-speaker Style Transfer in Neural TTS
Yookyung Shin
Younggun Lee
Suhee Jo
Yeongtae Hwang
Taesu Kim
25
14
0
13 Jul 2022
CFAD: A Chinese Dataset for Fake Audio Detection
Haoxin Ma
Jiangyan Yi
Chenglong Wang
Xin Yan
J. Tao
Tao Wang
Shiming Wang
Ruibo Fu
24
26
0
12 Jul 2022
End-to-end speech recognition modeling from de-identified data
M. Flechl
Shou-Chun Yin
Junho Park
Peter Skala
17
4
0
12 Jul 2022
PoeticTTS -- Controllable Poetry Reading for Literary Studies
Julia Koch
Florian Lux
Nadja Schauffler
T. Bernhart
Felix Dieterle
Jonas Kuhn
Sandra Richter
Gabriel Viehhauser
Ngoc Thang Vu
24
5
0
11 Jul 2022
Speaker Anonymization with Phonetic Intermediate Representations
Sarina Meyer
Florian Lux
Pavel Denisov
Julia Koch
Pascal Tilli
Ngoc Thang Vu
34
27
0
11 Jul 2022
DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders
Yanqing Liu
Rui Xue
Lei He
Xu Tan
Sheng Zhao
28
24
0
11 Jul 2022
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
Tomoki Toda
21
15
0
10 Jul 2022
FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis
Yongqiang Wang
Zhou Zhao
19
10
0
08 Jul 2022
End-to-End Binaural Speech Synthesis
Wen-Chin Huang
Dejan Marković
Alexander Richard
I. D. Gebru
Anjali Menon
32
8
0
08 Jul 2022
Ultra-Low-Bitrate Speech Coding with Pretrained Transformers
Ali Siahkoohi
Michael Chinen
Tom Denton
W. Kleijn
Jan Skoglund
27
8
0
05 Jul 2022
WeSinger 2: Fully Parallel Singing Voice Synthesis via Multi-Singer Conditional Adversarial Training
Zewang Zhang
Yibin Zheng
Xinhui Li
Li Lu
DiffM
31
11
0
05 Jul 2022
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion
Yinjiao Lei
Shan Yang
Jian Cong
Linfu Xie
Dan Su
DiffM
64
12
0
05 Jul 2022
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers
Liumeng Xue
Shan Yang
Na Hu
Dan Su
Linfu Xie
37
2
0
02 Jul 2022
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS
Kyle Kastner
Aaron Courville
38
0
0
30 Jun 2022
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre
Guangyan Zhang
Ying Qin
Wenbo Zhang
Jialun Wu
Mei Li
Yu Gai
Feijun Jiang
Tan Lee
55
26
0
29 Jun 2022
Data Redaction from Pre-trained GANs
Zhifeng Kong
Kamalika Chaudhuri
65
16
0
29 Jun 2022
RetrieverTTS: Modeling Decomposed Factors for Text-Based Speech Insertion
Dacheng Yin
Chuanxin Tang
Yanqing Liu
Xiaoqiang Wang
Zhiyuan Zhao
Yucheng Zhao
Zhiwei Xiong
Sheng Zhao
Chong Luo
31
12
0
28 Jun 2022
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Taejun Bak
Junmo Lee
Hanbin Bae
Jinhyeok Yang
Jaesung Bae
Young-Sun Joo
27
28
0
27 Jun 2022
Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection
Piotr Kawa
Marcin Plata
P. Syga
AAML
49
23
0
27 Jun 2022