
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

12 October 2020

Papers citing "HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis"

50 / 1,154 papers shown

DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs Songxiang Liu Dan Su Dong Yu DiffM 162 67 0 28 Jan 2022
The MSXF TTS System for ICASSP 2022 ADD Challenge Chunyong Yang Pengfei Liu Yanli Chen Hongbin Wang Min Liu 46 0 0 27 Jan 2022
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis Shinnosuke Takamichi Wataru Nakata Naoko Tanji Hiroshi Saruwatari AuLLM 79 7 0 26 Jan 2022
Improving Adversarial Waveform Generation based Singing Voice Conversion with Harmonic Signals Haohan Guo Zhiping Zhou Fanbo Meng Kai-Chun Liu 111 16 0 25 Jan 2022
MHTTS: Fast multi-head text-to-speech for spontaneous speech with imperfect transcription Dabiao Ma Yitong Zhang Meng Li Feng Ye 41 1 0 19 Jan 2022
Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis Yu Wang Xinsheng Wang Pengcheng Zhu Jie Wu Hanzhao Li Heyang Xue Yongmao Zhang Lei Xie Mengxiao Bi 122 103 0 19 Jan 2022
Improved Input Reprogramming for GAN Conditioning Tuan Dinh Daewon Seo Zhixu Du Liang Shang Kangwook Lee AI4CE 105 8 0 07 Jan 2022
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus Rongjie Huang Feiyang Chen Yi Ren Jinglin Liu Chenye Cui Zhou Zhao 103 104 0 20 Dec 2021
Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem Jing Shi Xuankai Chang Tomoki Hayashi Yen-Ju Lu Shinji Watanabe Bo Xu 112 19 0 17 Dec 2021
Textless Speech-to-Speech Translation on Real Data Ann Lee Hongyu Gong Paul-Ambroise Duquenne Holger Schwenk Peng-Jen Chen ... Sravya Popuri Yossi Adi J. Pino Jiatao Gu Wei-Ning Hsu 122 150 0 15 Dec 2021
Audio Deepfake Perceptions in College Going Populations Gabrielle Watson Zahra Khanjani V. P Janeja 81 7 0 06 Dec 2021
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone Edresson Casanova Julian Weber C. Shulby Arnaldo Cândido Júnior Eren Golge M. Ponti 281 415 0 04 Dec 2021
How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey Zahra Khanjani Gabrielle Watson V. P Janeja 61 27 0 28 Nov 2021
V2C: Visual Voice Cloning Qi Chen Yuanqing Li Yuankai Qi Jiaqiu Zhou Mingkui Tan Qi Wu VGen 81 27 0 25 Nov 2021
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance Heeseung Kim Sungwon Kim Sungroh Yoon DiffM BDL 139 112 0 23 Nov 2021
Textless Speech Emotion Conversion using Discrete and Decomposed Representations Felix Kreuk Adam Polyak Jade Copet Eugene Kharitonov Tu Nguyen M. Rivière Wei-Ning Hsu Abdel-rahman Mohamed Emmanuel Dupoux Yossi Adi 119 34 0 14 Nov 2021
Meta-Voice: Fast few-shot style transfer for expressive voice cloning using meta learning Songxiang Liu Dan Su Dong Yu 64 10 0 14 Nov 2021
Speaker Generation Daisy Stanton Matt Shannon Soroosh Mariooryad RJ Skerry-Ryan Eric Battenberg Tom Bagby David Kao 96 30 0 07 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection Joel Frank Lea Schonherr DiffM 204 131 0 04 Nov 2021
Voice Conversion Can Improve ASR in Very Low-Resource Settings Matthew Baas Herman Kamper 103 17 0 04 Nov 2021
A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion Benjamin van Niekerk M. Carbonneau Julian Zaïdi Matthew Baas Hugo Seuté Herman Kamper DRL 122 123 0 03 Nov 2021
RefineGAN: Universally Generating Waveform Better than Ground Truth with Highly Accurate Pitch and Intensity Responses Shengyuan Xu Wenxiao Zhao Jing Guo 66 12 0 01 Nov 2021
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations Hyeong-Seok Choi Juheon Lee W. Kim Jie Hwan Lee Hoon Heo Kyogu Lee 116 158 0 27 Oct 2021
Controllable and Interpretable Singing Voice Decomposition via Assem-VC Kang-Wook Kim Junhyeok Lee 54 0 0 25 Oct 2021
DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021 Yanqing Liu Rui Shao G. Wang Kuan Chen Bohan Li Pong C. Yuen Jinzhu Li Lei He Sheng Zhao 98 55 0 25 Oct 2021
Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory T. Chen Guan-Horng Liu Evangelos A. Theodorou DiffM OT 304 181 0 21 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis Max Morrison Rithesh Kumar Kundan Kumar Prem Seetharaman Aaron Courville Yoshua Bengio GAN 135 72 0 19 Oct 2021
Neural Synthesis of Footsteps Sound Effects with Generative Adversarial Networks Marco Comunità Huy Phan Joshua D. Reiss GAN 81 11 0 18 Oct 2021
KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke Xiaobin Zhuang Huiran Yu Weifeng Zhao Tao Jiang Peng Hu 90 6 0 18 Oct 2021
VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis Yongmao Zhang Jian Cong Heyang Xue Lei Xie Pengcheng Zhu Mengxiao Bi 108 77 0 17 Oct 2021
Direct Simultaneous Speech-to-Speech Translation with Variational Monotonic Multihead Attention Xutai Ma Hongyu Gong Danni Liu Ann Lee Yun Tang Peng-Jen Chen Wei-Ning Hsu P. Koehn J. Pino 106 9 0 15 Oct 2021
From Start to Finish: Latency Reduction Strategies for Incremental Speech Synthesis in Simultaneous Speech-to-Speech Translation Danni Liu Changhan Wang Hongyu Gong Xutai Ma Yun Tang J. Pino 107 4 0 15 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research Tomoki Hayashi Ryuichi Yamamoto Takenori Yoshimura Peter Wu Jiatong Shi Takaaki Saeki Yooncheol Ju Yusuke Yasuda Shinnosuke Takamichi Shinji Watanabe VLM 85 63 0 15 Oct 2021
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation Rongjie Huang Chenye Cui Feiyang Chen Yi Ren Jinglin Liu Zhou Zhao Baoxing Huai N. Yuan GAN 203 63 0 14 Oct 2021
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing Junyi Ao Rui Wang Long Zhou Chengyi Wang Shuo Ren ... Yu Zhang Zhihua Wei Yao Qian Jinyu Li Furu Wei 183 203 0 14 Oct 2021
Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech Haoyue Zhan Xinyuan Yu Haitong Zhang Yang Zhang Yue Lin 69 5 0 14 Oct 2021
Revisiting IPA-based Cross-lingual Text-to-speech Haitong Zhang Haoyue Zhan Yang Zhang Xinyuan Yu Yue Lin 68 7 0 14 Oct 2021
A Melody-Unsupervision Model for Singing Voice Synthesis Soonbeom Choi Juhan Nam 67 14 0 13 Oct 2021
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations Wen-Chin Huang Shu-Wen Yang Tomoki Hayashi Hung-yi Lee Shinji Watanabe Tomoki Toda 78 40 0 12 Oct 2021
Adapting TTS models For New Speakers using Transfer Learning Paarth Neekhara Jason Chun Lok Li Boris Ginsburg 144 15 0 12 Oct 2021
LaughNet: synthesizing laughter utterances from waveform silhouettes and a single laughter example Hieu-Thi Luong Junichi Yamagishi 72 9 0 11 Oct 2021
Towards High-fidelity Singing Voice Conversion with Acoustic Reference and Contrastive Predictive Coding Chao Wang Zhonghao Li Benlai Tang Xiang Yin Yuan Wan Yibiao Yu Zejun Ma 71 18 0 10 Oct 2021
Environment Aware Text-to-Speech Synthesis Daxin Tan Guangyan Zhang Tan Lee 86 4 0 08 Oct 2021
EdiTTS: Score-based Editing for Controllable Text-to-Speech Jaesung Tae Hyeongju Kim Taesu Kim DiffM 279 40 0 06 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis Cheng-I Jeff Lai Erica Cooper Yang Zhang Shiyu Chang Kaizhi Qian ... Yung-Sung Chuang Alexander H. Liu Junichi Yamagishi David D. Cox James R. Glass 71 6 0 04 Oct 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech Yi Ren Jinglin Liu Zhou Zhao 139 79 0 30 Sep 2021
Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme Vadim Popov Ivan Vovk Vladimir Gogoryan Tasnima Sadekova Mikhail Kudinov Jiansheng Wei DiffM BDL 162 136 0 28 Sep 2021
VoiceFixer: Toward General Speech Restoration with Neural Vocoder Haohe Liu Qiuqiang Kong Qiao Tian Yan Zhao DeLiang Wang Chuanzeng Huang Yuxuan Wang 100 58 0 28 Sep 2021
MSR-NV: Neural Vocoder Using Multiple Sampling Rates Kentaro Mitsui Kei Sawada 111 0 0 28 Sep 2021
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis Manh Luong Viet-Anh Tran 29 2 0 27 Sep 2021