Self-training and Pre-training are Complementary for Speech Recognition

22 October 2020

Papers citing "Self-training and Pre-training are Complementary for Speech Recognition"

44 / 44 papers shown

Soft-Weighted CrossEntropy Loss for Continous Alzheimer's Disease Detection Xiaohui Zhang Wenjie Fu Mangui Liang 43 1 0 19 Feb 2024
Cross-Domain HAR: Few Shot Transfer Learning for Human Activity Recognition Megha Thukral H. Haresamudram Thomas Ploetz 37 4 0 22 Oct 2023
Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and Text Chanho Park Chengsong Lu Mingjie Chen Thomas Hain 28 3 0 12 Oct 2023
Some voices are too common: Building fair speech recognition systems using the Common Voice dataset Lucas Maison Yannick Esteve 26 3 0 01 Jun 2023
Rethinking Semi-supervised Learning with Language Models Zhengxiang Shi Francesco Tonolini Nikolaos Aletras Emine Yilmaz G. Kazai Yunlong Jiao 32 18 0 22 May 2023
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization Hamza Kheddar Yassine Himeur S. Al-Maadeed Abbes Amira F. Bensaali 47 76 0 27 Apr 2023
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR Paul Hongsuck Seo Arsha Nagrani Cordelia Schmid 29 15 0 29 Mar 2023
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement Wei-Ning Hsu Tal Remez Bowen Shi Jacob Donley Yossi Adi DiffM 27 12 0 21 Dec 2022
Jointly Learning Visual and Auditory Speech Representations from Raw Data A. Haliassos Pingchuan Ma Rodrigo Mira Stavros Petridis M. Pantic SSL 45 48 0 12 Dec 2022
TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR Lixin Cao Jun Wang Ben Yang Dan Su Dong Yu 18 4 0 12 Dec 2022
Self-Transriber: Few-shot Lyrics Transcription with Self-training Xiaoxue Gao Xianghu Yue Haizhou Li 30 7 0 18 Nov 2022
The Far Side of Failure: Investigating the Impact of Speech Recognition Errors on Subsequent Dementia Classification Changye Li T. Cohen Serguei V. S. Pakhomov 17 3 0 11 Nov 2022
Continuous Soft Pseudo-Labeling in ASR Tatiana Likhomanenko R. Collobert Navdeep Jaitly Samy Bengio VLM 24 3 0 11 Nov 2022
Make More of Your Data: Minimal Effort Data Augmentation for Automatic Speech Recognition and Translation Tsz Kin Lam Shigehiko Schamoni Stefan Riezler VLM 42 8 0 27 Oct 2022
I see what you hear: a vision-inspired method to localize words Mohammad Samragh Arnav Kundu Ting-Yao Hu Minsik Cho Aman Chadha A. Shrivastava Oncel Tuzel Devang Naik ObjD 31 1 0 24 Oct 2022
Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR DongSeon Hwang K. Sim Yu Zhang Trevor Strohman 14 10 0 11 Oct 2022
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training Zi-Hua Zhang Long Zhou Junyi Ao Shujie Liu Lirong Dai Jinyu Li Furu Wei 61 57 0 07 Oct 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription Longshen Ou Xiangming Gu Ye Wang 30 21 0 20 Jul 2022
Improving Low-Resource Speech Recognition with Pretrained Speech Models: Continued Pretraining vs. Semi-Supervised Training Mitchell DeHaven J. Billa VLM AI4TS 15 8 0 01 Jul 2022
Do self-supervised speech models develop human-like perception biases? Juliette Millet Ewan Dunbar SSL 24 20 0 31 May 2022
A Deep Reinforcement Learning Blind AI in DareFightingICE Thai Van Nguyen Xincheng Dai Ibrahim Khan R. Thawonmas H. V. Pham VLM 23 7 0 16 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages Felix Wu Kwangyoun Kim Shinji Watanabe Kyu Jeong Han Ryan T. McDonald Kilian Q. Weinberger Yoav Artzi SyDa 48 37 0 02 May 2022
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition Ye Du Jie Zhang Qiu-shi Zhu Lirong Dai Ming Wu Xin Fang Zhouwang Yang 34 2 0 05 Apr 2022
Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification Yen-Lun Liao Xuan-Bo Chen Chung-Che Wang J. Jang AAML 41 8 0 31 Mar 2022
Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment Mu Yang K. Hirschi S. Looney Okim Kang John H. L. Hansen 40 15 0 29 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation Hemlata Tak Massimiliano Todisco Xin Wang Jee-weon Jung Junichi Yamagishi Nicholas W. D. Evans 34 151 0 24 Feb 2022
Self-Training: A Survey Massih-Reza Amini Vasilii Feofanov Loïc Pauletto Lies Hadjadj Emilie Devijver Yury Maximov SSL 31 102 0 24 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding Peter Sullivan Toshiko Shibano Muhammad Abdul-Mageed 38 11 0 10 Feb 2022
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition Bethan Thomas Samuel Kessler S. Karout 16 70 0 07 Feb 2022
mSLAM: Massively multilingual joint pre-training for speech and text Ankur Bapna Colin Cherry Yu Zhang Ye Jia Melvin Johnson Yong Cheng Simran Khanuja Jason Riesa Alexis Conneau VLM 27 111 0 03 Feb 2022
SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training Wenyong Huang Zhenhe Zhang Y. Yeung Xin Jiang Qun Liu 35 23 0 25 Jan 2022
Sign Language Video Retrieval with Free-Form Textual Queries A. Duarte Samuel Albanie Xavier Giró-i-Nieto Gül Varol SLR 50 29 0 07 Jan 2022
Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset Tiezheng Yu Rita Frieske Peng Xu Samuel Cahyawijaya Cheuk Tung Shadow Yiu ... Elham J. Barezi Qifeng Chen Xiaojuan Ma Bertram E. Shi Pascale Fung RALM 44 9 0 07 Jan 2022
On the Use of External Data for Spoken Named Entity Recognition Ankita Pasad Felix Wu Suwon Shon Karen Livescu Kyu Jeong Han 40 16 0 14 Dec 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale Arun Babu Changhan Wang Andros Tjandra Kushal Lakhotia Qiantong Xu ... Yatharth Saraf J. Pino Alexei Baevski Alexis Conneau Michael Auli SSL 32 657 0 17 Nov 2021
Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch Jakob Poncelet Hugo Van hamme SSL 28 1 0 29 Sep 2021
Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding Shiyang Li Semih Yavuz Wenhu Chen Xifeng Yan 22 12 0 14 Sep 2021
Remember the context! ASR slot error correction through memorization Dhanush Bekal Ashish Shenoy Monica Sunkara S. Bodapati Katrin Kirchhoff KELM 23 12 0 10 Sep 2021
Multi-Task Self-Training for Learning General Representations Golnaz Ghiasi Barret Zoph E. D. Cubuk Quoc V. Le Nayeon Lee SSL 24 100 0 25 Aug 2021
Direct speech-to-speech translation with discrete units Ann Lee Peng-Jen Chen Changhan Wang Jiatao Gu Sravya Popuri ... Yossi Adi Qing He Yun Tang J. Pino Wei-Ning Hsu 35 180 0 12 Jul 2021
Scaling Laws for Acoustic Models J. Droppo Oguz H. Elibol 15 22 0 11 Jun 2021
Large-Scale Self- and Semi-Supervised Learning for Speech Translation Changhan Wang Anne Wu J. Pino Alexei Baevski Michael Auli Alexis Conneau SSL 31 44 0 14 Apr 2021
Going deeper with Image Transformers Hugo Touvron Matthieu Cord Alexandre Sablayrolles Gabriel Synnaeve Hervé Jégou ViT 27 986 0 31 Mar 2021
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation Changhan Wang M. Rivière Ann Lee Anne Wu Chaitanya Talnikar Daniel Haziza Mary Williamson J. Pino Emmanuel Dupoux SSL 25 460 0 02 Jan 2021