Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction

28 January 2020

Papers citing "Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction"


Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers Hosein Mohebbi Grzegorz Chrupała Willem H. Zuidema A. Alishahi 36 12 0 15 Oct 2023
On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation Gene-Ping Yang Yue Gu Qingming Tang Dongsu Du Yuzong Liu 22 5 0 06 Jul 2023
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach Xulong Zhang Jianzong Wang Ning Cheng Kexin Zhu Jing Xiao 21 0 0 25 Oct 2022
Self-Supervised Speech Representation Learning: A Review Abdel-rahman Mohamed Hung-yi Lee Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin ... Shang-Wen Li Karen Livescu Lars Maaløe Tara N. Sainath Shinji Watanabe SSL AI4TS 137 350 0 21 May 2022
On-demand compute reduction with stochastic wav2vec 2.0 Apoorv Vyas Wei-Ning Hsu Michael Auli Alexei Baevski 32 13 0 25 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers Kaizhi Qian Yang Zhang Heting Gao Junrui Ni Cheng-I Jeff Lai David D. Cox M. Hasegawa-Johnson Shiyu Chang DRL 30 110 0 20 Apr 2022
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data Junyi Ao Zi-Hua Zhang Long Zhou Shujie Liu Haizhou Li Tom Ko Lirong Dai Jinyu Li Yao Qian Furu Wei SSL 25 19 0 31 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks Yizhou Lu Mingkun Huang Xinghua Qu Pengfei Wei Zejun Ma 27 19 0 09 Mar 2022
Compressed Predictive Information Coding Rui Meng Tianyi Luo K. Bouchard 24 1 0 03 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin Lars Maaløe Christian Igel BDL AI4TS SSL 19 11 0 01 Mar 2022
Assessing the State of Self-Supervised Human Activity Recognition using Wearables H. Haresamudram Irfan Essa Thomas Plötz SSL 42 86 0 22 Feb 2022
SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training Wenyong Huang Zhenhe Zhang Y. Yeung Xin Jiang Qun Liu 35 23 0 25 Jan 2022
Self-Supervised Learning for speech recognition with Intermediate layer supervision Chengyi Wang Yu-Huan Wu Sanyuan Chen Shujie Liu Jinyu Li Yao Qian Zhenglu Yang SSL 26 28 0 16 Dec 2021
Lacuna Reconstruction: Self-supervised Pre-training for Low-Resource Historical Document Transcription Nikolai Vogler J. Allen M. Miller Taylor Berg-Kirkpatrick 32 5 0 16 Dec 2021
Textless Speech Emotion Conversion using Discrete and Decomposed Representations Felix Kreuk Adam Polyak Jade Copet Eugene Kharitonov Tu Nguyen M. Rivière Wei-Ning Hsu Abdel-rahman Mohamed Emmanuel Dupoux Yossi Adi 25 29 0 14 Nov 2021
TUNet: A Block-online Bandwidth Extension Model based on Transformers and Self-supervised Pretraining Viet-Anh Nguyen Anh H. T. Nguyen Andy W. H. Khong 27 22 0 26 Oct 2021
Contrastively Disentangled Sequential Variational Autoencoder Junwen Bai Weiran Wang Carla Gomes CoGe DRL 27 40 0 22 Oct 2021
Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing Benjamin van Niekerk Leanne Nortje Matthew Baas Herman Kamper SSL 33 31 0 02 Aug 2021
Layer-wise Analysis of a Self-supervised Speech Representation Model Ankita Pasad Ju-Chieh Chou Karen Livescu SSL 26 288 0 10 Jul 2021
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis Shammur A. Chowdhury Nadir Durrani Ahmed M. Ali 41 12 0 01 Jul 2021
Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model Apoorv Vyas S. Madikeri H. Bourlard 19 15 0 06 Apr 2021
Improving speech recognition models with small samples for air traffic control systems Yi Lin Qin Li Bo Yang Zhen Yan Huachun Tan Zhengmao Chen 26 32 0 16 Feb 2021
Bi-APC: Bidirectional Autoregressive Predictive Coding for Unsupervised Pre-training and Its Application to Children's ASR Ruchao Fan Amber Afshan Abeer Alwan 32 14 0 12 Feb 2021
Contrastive Predictive Coding for Human Activity Recognition H. Haresamudram Irfan Essa Thomas Ploetz 32 118 0 09 Dec 2020
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies Alexander H. Liu Yu-An Chung James R. Glass SSL 27 87 0 01 Nov 2020
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning Dongwei Jiang Wubo Li Miao Cao Wei Zou Xiangang Li SSL 21 65 0 27 Oct 2020
Similarity Analysis of Self-Supervised Speech Representations Yu-An Chung Yonatan Belinkov James R. Glass SSL 36 36 0 22 Oct 2020
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components Junwen Bai Weiran Wang Yingbo Zhou Caiming Xiong SSL AI4TS 27 12 0 07 Oct 2020
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech Andy T. Liu Shang-Wen Li Hung-yi Lee SSL 62 356 0 12 Jul 2020
Data Augmenting Contrastive Learning of Speech Representations in the Time Domain Eugene Kharitonov M. Rivière Gabriel Synnaeve Lior Wolf Pierre-Emmanuel Mazaré Matthijs Douze Emmanuel Dupoux 23 117 0 02 Jul 2020
Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge Benjamin van Niekerk Leanne Nortje Herman Kamper 13 115 0 19 May 2020
Generative Pre-Training for Speech with Autoregressive Predictive Coding Yu-An Chung James R. Glass SSL 29 173 0 23 Oct 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Alex Wang Amanpreet Singh Julian Michael Felix Hill Omer Levy Samuel R. Bowman ELM 299 6,984 0 20 Apr 2018