An Unsupervised Autoregressive Model for Speech Representation Learning

5 April 2019

Hao Tang

Papers citing "An Unsupervised Autoregressive Model for Speech Representation Learning"

50 / 148 papers shown

Title
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information Chiyu Feng Po-Chun Hsu Hung-yi Lee SSL 36 8 0 08 May 2022
Sound Localization by Self-Supervised Time Delay Estimation Ziyang Chen David Fouhey Andrew Owens SSL 32 19 0 26 Apr 2022
On-demand compute reduction with stochastic wav2vec 2.0 Apoorv Vyas Wei-Ning Hsu Michael Auli Alexei Baevski 32 13 0 25 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers Kaizhi Qian Yang Zhang Heting Gao Junrui Ni Cheng-I Jeff Lai David D. Cox M. Hasegawa-Johnson Shiyu Chang DRL 30 110 0 20 Apr 2022
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores Wei-Cheng Tseng Wei-Tsung Kao Hung-yi Lee 24 21 0 07 Apr 2022
User-Level Differential Privacy against Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Tiantian Feng Raghuveer Peri Shrikanth Narayanan FedML 20 28 0 05 Apr 2022
Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation Marc-Antoine Georges Julien Diard Laurent Girin J. Schwartz Thomas Hueber 25 7 0 05 Apr 2022
Autoregressive Co-Training for Learning Discrete Speech Representations Sung-Lin Yeh Hao Tang SSL 29 6 0 29 Mar 2022
Federated Self-Supervised Learning for Acoustic Event Classification Meng Feng Chieh-Chi Kao Qingming Tang Ming Sun Viktor Rozgic Spyros Matsoukas Chao Wang 44 11 0 22 Mar 2022
Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling Tiantian Feng Shrikanth Narayanan 38 17 0 15 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities Hsiang-Sheng Tsai Heng-Jui Chang Wen-Chin Huang Zili Huang Kushal Lakhotia ... Hsuan-Jui Chen Shang-Wen Li Shinji Watanabe Abdel-rahman Mohamed Hung-yi Lee 31 109 0 14 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks Yizhou Lu Mingkun Huang Xinghua Qu Pengfei Wei Zejun Ma 32 19 0 09 Mar 2022
Audio Self-supervised Learning: A Survey Shuo Liu Adria Mallol-Ragolta Emilia Parada-Cabeleiro Kun Qian Xingshuo Jing Alexander Kathan Bin Hu Bjoern W. Schuller SSL 45 106 0 02 Mar 2022
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training Ramon Sanabria Wei-Ning Hsu Alexei Baevski Michael Auli 27 7 0 01 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning Lasse Borgholt Jakob Drachmann Havtorn Joakim Edin Lars Maaløe Christian Igel BDL AI4TS SSL 21 11 0 01 Mar 2022
Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring Herman Kamper 34 25 0 24 Feb 2022
Assessing the State of Self-Supervised Human Activity Recognition using Wearables H. Haresamudram Irfan Essa Thomas Plötz SSL 47 86 0 22 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding Peter Sullivan Toshiko Shibano Muhammad Abdul-Mageed 49 11 0 10 Feb 2022
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling Puyuan Peng David Harwath SSL 43 26 0 07 Feb 2022
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition Bethan Thomas Samuel Kessler S. Karout 28 71 0 07 Feb 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition Itai Gat Hagai Aronowitz Weizhong Zhu E. Morais R. Hoory 47 51 0 02 Feb 2022
Supervised and Self-supervised Pretraining Based COVID-19 Detection Using Acoustic Breathing/Cough/Speech Signals Xing-Yu Chen Qiu-shi Zhu Jie Zhang Lirong Dai 31 14 0 22 Jan 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction Bowen Shi Wei-Ning Hsu Kushal Lakhotia Abdel-rahman Mohamed SSL 60 306 0 05 Jan 2022
Discrete and continuous representations and processing in deep learning: Looking forward Ruben Cartuyvels Graham Spinks Marie-Francine Moens OCL 38 20 0 04 Jan 2022
Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings Tiantian Feng H. Hashemi Rajat Hebbar M. Annavaram Shrikanth S. Narayanan 26 25 0 26 Dec 2021
Self-Supervised Learning for speech recognition with Intermediate layer supervision Chengyi Wang Yu-Huan Wu Sanyuan Chen Shujie Liu Jinyu Li Yao Qian Zhenglu Yang SSL 26 28 0 16 Dec 2021
Recent Advances in End-to-End Automatic Speech Recognition Jinyu Li VLM 40 363 0 02 Nov 2021
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction Heming Wang Yao Qian Xiaofei Wang Yiming Wang Chengyi Wang Shujie Liu Takuya Yoshioka Jinyu Li DeLiang Wang 26 29 0 28 Oct 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing Sanyuan Chen Chengyi Wang Zhengyang Chen Yu-Huan Wu Shujie Liu ... Yao Qian Jian Wu Micheal Zeng Xiangzhan Yu Furu Wei SSL 141 1,721 0 26 Oct 2021
SSAST: Self-Supervised Audio Spectrogram Transformer Yuan Gong Cheng-I Jeff Lai Yu-An Chung James R. Glass ViT 38 268 0 19 Oct 2021
Self-Supervised Representation Learning: Introduction, Advances and Challenges Linus Ericsson Henry Gouk Chen Change Loy Timothy M. Hospedales SSL OOD AI4TS 37 274 0 18 Oct 2021
Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning Yi-Chen Chen Shu-Wen Yang Cheng-Kuang Lee Simon See Hung-yi Lee SSL 19 12 0 18 Oct 2021
DECAR: Deep Clustering for learning general-purpose Audio Representations Sreyan Ghosh Sandesh V Katta Ashish Seth S. Umesh SSL 36 12 0 17 Oct 2021
Word Order Does Not Matter For Speech Recognition Vineel Pratap Qiantong Xu Tatiana Likhomanenko Gabriel Synnaeve R. Collobert 43 4 0 12 Oct 2021
UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training Sanyuan Chen Yu Wu Chengyi Wang Zhengyang Chen Zhuo Chen ... Jian Wu Yao Qian Furu Wei Jinyu Li Xiangzhan Yu SSL 35 85 0 12 Oct 2021
Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models Sanjeev Khudanpur Desh Raj Sanjeev Khudanpur 61 6 0 10 Oct 2021
Universal Paralinguistic Speech Representations Using Self-Supervised Conformers Joel Shor A. Jansen Wei Han Daniel S. Park Yu Zhang SSL AI4TS 48 54 0 09 Oct 2021
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition Xuankai Chang Takashi Maekaku Pengcheng Guo Jing Shi Yen-Ju Lu ... Tianzi Wang Shu-Wen Yang Yu Tsao Hung-yi Lee Shinji Watanabe SSL AI4TS 24 81 0 09 Oct 2021
Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models Liang-Hsuan Tseng Yu-Kuan Fu Heng-Jui Chang Hung-yi Lee SSL 28 14 0 07 Oct 2021
DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT Heng-Jui Chang Shu-Wen Yang Hung-yi Lee SSL 43 165 0 05 Oct 2021
Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch Jakob Poncelet Hugo Van hamme SSL 28 1 0 29 Sep 2021
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition Yu Zhang Daniel S. Park Wei Han James Qin Anmol Gulati ... Zhifeng Chen Quoc V. Le Chung-Cheng Chiu Ruoming Pang Yonghui Wu SSL 34 175 0 27 Sep 2021
Simple and Effective Zero-shot Cross-lingual Phoneme Recognition Qiantong Xu Alexei Baevski Michael Auli VLM 34 78 0 23 Sep 2021
Self-supervised Contrastive Cross-Modality Representation Learning for Spoken Question Answering Chenyu You Nuo Chen Yuexian Zou SSL 27 63 0 08 Sep 2021
Text-Free Prosody-Aware Generative Spoken Language Modeling Eugene Kharitonov Ann Lee Adam Polyak Yossi Adi Jade Copet ... Tu Nguyen M. Rivière Abdel-rahman Mohamed Emmanuel Dupoux Wei-Ning Hsu 35 117 0 07 Sep 2021
Speech Representations and Phoneme Classification for Preserving the Endangered Language of Ladin Zane Durante Leena Mathur Eric Ye Sichong Zhao Tejas Ramdas Khalil Iskarous 29 0 0 27 Aug 2021
Using Large Pre-Trained Models with Cross-Modal Attention for Multi-Modal Emotion Recognition Krishna D N Freshworks 32 11 0 22 Aug 2021
Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing Benjamin van Niekerk Leanne Nortje Matthew Baas Herman Kamper SSL 38 31 0 02 Aug 2021
An Adapter Based Pre-Training for Efficient and Scalable Self-Supervised Speech Representation Learning Samuel Kessler Bethan Thomas S. Karout SSL 27 29 0 26 Jul 2021
Unsupervised Automatic Speech Recognition: A Review Hanan Aldarmaki Asad Ullah Nazar Zaki VLM SSL 39 57 0 09 Jun 2021