End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions

19 May 2022

Papers citing "End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions"

29 / 29 papers shown

Title
HiFi-VC: High Quality ASR-Based Voice Conversion A. Kashkin I. Karpukhin S. Shishkin 61 6 0 31 Mar 2022
ADD 2022: the First Audio Deep Synthesis Detection Challenge Jiangyan Yi Ruibo Fu J. Tao Shuai Nie Haoxin Ma ... Le Xu Zhengqi Wen Haizhou Li Zheng Lian Bin Liu 55 183 0 17 Feb 2022
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations Hyeong-Seok Choi Juheon Lee W. Kim Jie Hwan Lee Hoon Heo Kyogu Lee 81 158 0 27 Oct 2021
ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection Junichi Yamagishi Xin Wang Massimiliano Todisco Md. Sahidullah J. Patino ... Xuechen Liu Kong Aik Lee Tomi Kinnunen Nicholas W. D. Evans Héctor Delgado 65 347 0 01 Sep 2021
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation Won Jang D. Lim Jaesam Yoon Bongwan Kim Juntae Kim 91 131 0 15 Jun 2021
NVC-Net: End-to-End Adversarial Voice Conversion Bac Nguyen Cong Fabien Cardinaux AAML 105 42 0 02 Jun 2021
Improving Zero-shot Voice Style Transfer via Disentangled Representation Learning Siyang Yuan Pengyu Cheng Ruiyi Zhang Weituo Hao Zhe Gan Lawrence Carin DRL 59 61 0 17 Mar 2021
AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization Yen-Hao Chen Da-Yi Wu Tsung-Han Wu Hung-yi Lee 80 108 0 31 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong Jaehyeon Kim Jaekyoung Bae 177 1,936 0 12 Oct 2020
Clova Baseline System for the VoxCeleb Speaker Recognition Challenge 2020 Hee-Soo Heo Bong-Jin Lee Jaesung Huh Joon Son Chung 47 133 0 29 Sep 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning Berrak Sisman Junichi Yamagishi Simon King Haizhou Li BDL 104 322 0 09 Aug 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition Alexis Conneau Alexei Baevski R. Collobert Abdel-rahman Mohamed Michael Auli SSL 149 781 0 24 Jun 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations Alexei Baevski Henry Zhou Abdel-rahman Mohamed Michael Auli SSL 282 5,801 0 20 Jun 2020
Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge Benjamin van Niekerk Leanne Nortje Herman Kamper 76 116 0 19 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition Anmol Gulati James Qin Chung-Cheng Chiu Niki Parmar Yu Zhang ... Wei Han Shibo Wang Zhengdong Zhang Yonghui Wu Ruoming Pang 223 3,139 0 16 May 2020
F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder Kaizhi Qian Zeyu Jin M. Hasegawa-Johnson G. J. Mysore 59 107 0 15 Apr 2020
In defence of metric learning for speaker recognition Joon Son Chung Jaesung Huh Seongkyu Mun Minjae Lee Hee-Soo Heo Soyeon Choe Chiheon Ham Sung-Ye Jung Bong-Jin Lee Icksang Han 64 436 0 26 Mar 2020
Emotions Don't Lie: An Audio-Visual Deepfake Detection Method Using Affective Cues Trisha Mittal Uttaran Bhattacharya Rohan Chandra Aniket Bera Tianyi Zhou 71 261 0 14 Mar 2020
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram Ryuichi Yamamoto Eunwoo Song Jae-Min Kim 56 818 0 25 Oct 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis Kundan Kumar Rithesh Kumar T. Boissière L. Gestin Wei Zhen Teoh Jose M. R. Sotelo A. D. Brébisson Yoshua Bengio Aaron Courville GAN 159 953 0 08 Oct 2019
Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion Joan Serrà Santiago Pascual Carlos Segura CVBM 65 84 0 03 Jun 2019
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss Kaizhi Qian Yang Zhang Shiyu Chang Xuesong Yang M. Hasegawa-Johnson 81 465 0 14 May 2019
ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection Massimiliano Todisco Xin Wang Ville Vestman Md. Sahidullah Héctor Delgado A. Nautsch Junichi Yamagishi Nicholas W. D. Evans Tomi Kinnunen Kong Aik Lee 66 609 0 09 Apr 2019
CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka Nobukatsu Hojo 63 260 0 09 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech Heiga Zen Viet Dang R. Clark Yu Zhang Ron J. Weiss Ye Jia Zhiwen Chen Yonghui Wu 104 954 0 05 Apr 2019
VoxCeleb2: Deep Speaker Recognition Joon Son Chung Arsha Nagrani Andrew Zisserman 353 2,279 0 14 Jun 2018
Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks Chin-Cheng Hsu Hsin-Te Hwang Yi-Chiao Wu Yu Tsao H. Wang DRL 85 314 0 04 Apr 2017
Least Squares Generative Adversarial Networks Xudong Mao Qing Li Haoran Xie Raymond Y. K. Lau Zhen Wang Stephen Paul Smolley GAN 329 4,574 0 13 Nov 2016
WaveNet: A Generative Model for Raw Audio Aaron van den Oord Sander Dieleman Heiga Zen Karen Simonyan Oriol Vinyals Alex Graves Nal Kalchbrenner A. Senior Koray Kavukcuoglu DiffM 406 7,399 0 12 Sep 2016