Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge

19 May 2020

Papers citing "Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge"

Showing 50 of 71 citing papers

Mitigating Timbre Leakage with Universal Semantic Mapping Residual Block for Voice Conversion Na Li Chuke Wang Yu Gu Zhifeng Li 59 0 0 11 Apr 2025
Textless NLP -- Zero Resource Challenge with Low Resource Compute Krithiga Ramadass Abrit Pal Singh Srihari J Sheetal Kalyani VLM 31 0 0 24 Sep 2024
Discrete Unit based Masking for Improving Disentanglement in Voice Conversion Philip H. Lee Ismail Rasim Ulgen Berrak Sisman 35 0 0 17 Sep 2024
Improved Visually Prompted Keyword Localisation in Real Low-Resource Settings Leanne Nortje Dan Oneaţă Herman Kamper VLM 43 0 0 09 Sep 2024
Visually Grounded Speech Models have a Mutual Exclusivity Bias Leanne Nortje Dan Oneaţă Yevgen Matusevych Herman Kamper SSL 47 0 0 20 Mar 2024
SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERT Cheol Jun Cho Abdelrahman Mohamed Shang-Wen Li Alan W. Black Gopala K. Anumanchipalli 39 8 0 16 Oct 2023
Face-Driven Zero-Shot Voice Conversion with Memory-based Face-Voice Alignment Zheng-Yan Sheng Yang Ai Yan-Nian Chen Zhenhua Ling CVBM 19 4 0 18 Sep 2023
From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion Robin San Roman Yossi Adi Antoine Deleforge Romain Serizel Gabriel Synnaeve Alexandre Défossez DiffM 27 21 0 02 Aug 2023
Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications Varun Krishna T. Sai Sriram Ganapathy SSL 32 2 0 14 Jul 2023
Rhythm Modeling for Voice Conversion Benjamin van Niekerk M. Carbonneau Herman Kamper 40 5 0 12 Jul 2023
Visually grounded few-shot word learning in low-resource settings Leanne Nortje Dan Oneaţă Herman Kamper VLM 17 4 0 20 Jun 2023
Privacy in Speech Technology Tomas Bäckström 29 4 0 09 May 2023
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models Yinghao Aaron Li Cong Han N. Mesgarani 24 18 0 29 Dec 2022
Learning Dependencies of Discrete Speech Representations with Neural Hidden Markov Models Sung-Lin Yeh Hao Tang SSL BDL 35 1 0 29 Oct 2022
Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge Ewan Dunbar Nicolas Hamilakis Emmanuel Dupoux SSL 34 30 0 27 Oct 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings Jian Zhu Zuoyu Tian Yadong Liu Cong Zhang Chia-wen Lo SSL 32 2 0 23 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning Tzu-hsun Feng Annie Dong Ching-Feng Yeh Shu-Wen Yang Tzu-Quan Lin ... Xuankai Chang Shinji Watanabe Abdel-rahman Mohamed Shang-Wen Li Hung-yi Lee ELM SSL 36 33 0 16 Oct 2022
Towards visually prompted keyword localisation for zero-resource spoken languages Leanne Nortje Herman Kamper 29 6 0 12 Oct 2022
Non-Parallel Voice Conversion for ASR Augmentation Gary Wang Andrew Rosenberg Bhuvana Ramabhadran Fadi Biadsy Yinghui Huang Jesse Emond P. M. Mengibar 26 2 0 15 Sep 2022
An Evaluation of Three-Stage Voice Conversion Framework for Noisy and Reverberant Conditions Yeonjong Choi Chao Xie T. Toda DiffM 33 2 0 30 Jun 2022
A Temporal Extension of Latent Dirichlet Allocation for Unsupervised Acoustic Unit Discovery W. V. D. Merwe Herman Kamper J. D. Preez 22 2 0 23 Jun 2022
Self-supervised speech unit discovery from articulatory and acoustic features using VQ-VAE Marc-Antoine Georges J. Schwartz Thomas Hueber SSL 14 5 0 17 Jun 2022
Self-Supervised Speech Representation Learning: A Review Abdel-rahman Mohamed Hung-yi Lee Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin ... Shang-Wen Li Karen Livescu Lars Maaløe Tara N. Sainath Shinji Watanabe SSL AI4TS 137 352 0 21 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions Wonjune Kang M. Hasegawa-Johnson D. Roy 32 8 0 19 May 2022
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization Yuhta Takida Takashi Shibuya Wei-Hsiang Liao Chieh-Hsin Lai Junki Ohmura Toshimitsu Uesaka Naoki Murata Shusuke Takahashi Toshiyuki Kumakura Yuki Mitsufuji BDL 23 61 0 16 May 2022
Autoregressive Co-Training for Learning Discrete Speech Representations Sung-Lin Yeh Hao Tang SSL 27 6 0 29 Mar 2022
Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data Gašper Beguš Alan Zhou SSL 27 5 0 22 Mar 2022
Modelling word learning and recognition using visually grounded speech Danny Merkx Sebastiaan Scholten S. Frank M. Ernestus O. Scharenborg SSL 37 0 0 14 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin Lars Maaløe Christian Igel BDL AI4TS SSL 19 11 0 01 Mar 2022
Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring Herman Kamper 34 25 0 24 Feb 2022
AVQVC: One-shot Voice Conversion by Vector Quantization with applying contrastive learning Huaizhen Tang Xulong Zhang Jianzong Wang Ning Cheng Jing Xiao 12 54 0 21 Feb 2022
VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion Disong Wang Shan Yang Dan Su Xunying Liu Dong Yu Helen Meng 15 11 0 18 Feb 2022
Robust Vector Quantized-Variational Autoencoder Chieh-Hsin Lai Dongmian Zou Gilad Lerman DRL 32 5 0 04 Feb 2022
Unsupervised Multimodal Word Discovery based on Double Articulation Analysis with Co-occurrence cues Akira Taniguchi Hiroaki Murakami Ryo Ozaki T. Taniguchi 23 2 0 18 Jan 2022
Non-Intrusive Binaural Speech Intelligibility Prediction from Discrete Latent Representations Alex F. McKinney Benjamin Cauchi 20 3 0 24 Nov 2021
Direct Noisy Speech Modeling for Noisy-to-Noisy Voice Conversion Chao Xie Yi-Chiao Wu Patrick Lumban Tobing Wen-Chin Huang T. Toda 21 9 0 13 Nov 2021
A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion Benjamin van Niekerk M. Carbonneau Julian Zaïdi Matthew Baas Hugo Seuté Herman Kamper DRL 27 111 0 03 Nov 2021
Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning Shijun Wang Dimche Kostadinov Damian Borth 29 11 0 27 Oct 2021
Interpreting intermediate convolutional layers in unsupervised acoustic word classification Gašper Beguš Alan Zhou FAtt SSL 33 5 0 05 Oct 2021
Unsupervised Speech Segmentation and Variable Rate Representation Learning using Segmental Contrastive Predictive Coding Saurabhchand Bhati Jesús Villalba Piotr Żelasko Laureano Moro Velázquez Najim Dehak SSL 53 22 0 05 Oct 2021
Noisy-to-Noisy Voice Conversion Framework with Denoising Model Chao Xie Yi-Chiao Wu Patrick Lumban Tobing Wen-Chin Huang T. Toda 23 7 0 22 Sep 2021
Masked Acoustic Unit for Mispronunciation Detection and Correction Zhan Zhang Yuehai Wang Jianyi Yang 30 3 0 12 Aug 2021
Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing Benjamin van Niekerk Leanne Nortje Matthew Baas Herman Kamper SSL 33 31 0 02 Aug 2021
Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer Zongyang Du Berrak Sisman Kun Zhou Haizhou Li 32 20 0 08 Jul 2021
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion Disong Wang Liqun Deng Y. Yeung Xiao Chen Xunying Liu Helen Meng DRL 22 136 0 18 Jun 2021
Unsupervised Automatic Speech Recognition: A Review Hanan Aldarmaki Asad Ullah Nazar Zaki VLM SSL 39 57 0 09 Jun 2021
Segmental Contrastive Predictive Coding for Unsupervised Word Segmentation Saurabhchand Bhati Jesús Villalba Piotr Żelasko Laureano Moro Velázquez Najim Dehak SSL 19 37 0 03 Jun 2021
Unsupervised Speech Recognition Alexei Baevski Wei-Ning Hsu Alexis Conneau Michael Auli SSL 26 270 0 24 May 2021
Discrete representations in neural models of spoken language Bertrand Higy Lieke Gelderloos A. Alishahi Grzegorz Chrupała 21 6 0 12 May 2021
VQCPC-GAN: Variable-Length Adversarial Audio Synthesis Using Vector-Quantized Contrastive Predictive Coding J. Nistal Cyran Aouameur Stefan Lattner G. Richard 19 7 0 04 May 2021