Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2010.11430
Cited By
Self-training and Pre-training are Complementary for Speech Recognition
22 October 2020
Qiantong Xu
Alexei Baevski
Tatiana Likhomanenko
Paden Tomasello
Alexis Conneau
R. Collobert
Gabriel Synnaeve
Michael Auli
SSL
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Self-training and Pre-training are Complementary for Speech Recognition"
44 / 44 papers shown
Title
Soft-Weighted CrossEntropy Loss for Continous Alzheimer's Disease Detection
Xiaohui Zhang
Wenjie Fu
Mangui Liang
43
1
0
19 Feb 2024
Cross-Domain HAR: Few Shot Transfer Learning for Human Activity Recognition
Megha Thukral
H. Haresamudram
Thomas Ploetz
37
4
0
22 Oct 2023
Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and Text
Chanho Park
Chengsong Lu
Mingjie Chen
Thomas Hain
28
3
0
12 Oct 2023
Some voices are too common: Building fair speech recognition systems using the Common Voice dataset
Lucas Maison
Yannick Esteve
26
3
0
01 Jun 2023
Rethinking Semi-supervised Learning with Language Models
Zhengxiang Shi
Francesco Tonolini
Nikolaos Aletras
Emine Yilmaz
G. Kazai
Yunlong Jiao
32
18
0
22 May 2023
Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization
Hamza Kheddar
Yassine Himeur
S. Al-Maadeed
Abbes Amira
F. Bensaali
47
76
0
27 Apr 2023
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
29
15
0
29 Mar 2023
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement
Wei-Ning Hsu
Tal Remez
Bowen Shi
Jacob Donley
Yossi Adi
DiffM
27
12
0
21 Dec 2022
Jointly Learning Visual and Auditory Speech Representations from Raw Data
A. Haliassos
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
M. Pantic
SSL
45
48
0
12 Dec 2022
TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR
Lixin Cao
Jun Wang
Ben Yang
Dan Su
Dong Yu
18
4
0
12 Dec 2022
Self-Transriber: Few-shot Lyrics Transcription with Self-training
Xiaoxue Gao
Xianghu Yue
Haizhou Li
30
7
0
18 Nov 2022
The Far Side of Failure: Investigating the Impact of Speech Recognition Errors on Subsequent Dementia Classification
Changye Li
T. Cohen
Serguei V. S. Pakhomov
17
3
0
11 Nov 2022
Continuous Soft Pseudo-Labeling in ASR
Tatiana Likhomanenko
R. Collobert
Navdeep Jaitly
Samy Bengio
VLM
24
3
0
11 Nov 2022
Make More of Your Data: Minimal Effort Data Augmentation for Automatic Speech Recognition and Translation
Tsz Kin Lam
Shigehiko Schamoni
Stefan Riezler
VLM
42
8
0
27 Oct 2022
I see what you hear: a vision-inspired method to localize words
Mohammad Samragh
Arnav Kundu
Ting-Yao Hu
Minsik Cho
Aman Chadha
A. Shrivastava
Oncel Tuzel
Devang Naik
ObjD
31
1
0
24 Oct 2022
Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR
DongSeon Hwang
K. Sim
Yu Zhang
Trevor Strohman
14
10
0
11 Oct 2022
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training
Zi-Hua Zhang
Long Zhou
Junyi Ao
Shujie Liu
Lirong Dai
Jinyu Li
Furu Wei
61
57
0
07 Oct 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
Longshen Ou
Xiangming Gu
Ye Wang
30
21
0
20 Jul 2022
Improving Low-Resource Speech Recognition with Pretrained Speech Models: Continued Pretraining vs. Semi-Supervised Training
Mitchell DeHaven
J. Billa
VLM
AI4TS
15
8
0
01 Jul 2022
Do self-supervised speech models develop human-like perception biases?
Juliette Millet
Ewan Dunbar
SSL
24
20
0
31 May 2022
A Deep Reinforcement Learning Blind AI in DareFightingICE
Thai Van Nguyen
Xincheng Dai
Ibrahim Khan
R. Thawonmas
H. V. Pham
VLM
23
7
0
16 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
48
37
0
02 May 2022
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition
Ye Du
Jie Zhang
Qiu-shi Zhu
Lirong Dai
Ming Wu
Xin Fang
Zhouwang Yang
34
2
0
05 Apr 2022
Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification
Yen-Lun Liao
Xuan-Bo Chen
Chung-Che Wang
J. Jang
AAML
41
8
0
31 Mar 2022
Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment
Mu Yang
K. Hirschi
S. Looney
Okim Kang
John H. L. Hansen
40
15
0
29 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
34
151
0
24 Feb 2022
Self-Training: A Survey
Massih-Reza Amini
Vasilii Feofanov
Loïc Pauletto
Lies Hadjadj
Emilie Devijver
Yury Maximov
SSL
31
102
0
24 Feb 2022
Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding
Peter Sullivan
Toshiko Shibano
Muhammad Abdul-Mageed
38
11
0
10 Feb 2022
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition
Bethan Thomas
Samuel Kessler
S. Karout
16
70
0
07 Feb 2022
mSLAM: Massively multilingual joint pre-training for speech and text
Ankur Bapna
Colin Cherry
Yu Zhang
Ye Jia
Melvin Johnson
Yong Cheng
Simran Khanuja
Jason Riesa
Alexis Conneau
VLM
27
111
0
03 Feb 2022
SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training
Wenyong Huang
Zhenhe Zhang
Y. Yeung
Xin Jiang
Qun Liu
35
23
0
25 Jan 2022
Sign Language Video Retrieval with Free-Form Textual Queries
A. Duarte
Samuel Albanie
Xavier Giró-i-Nieto
Gül Varol
SLR
50
29
0
07 Jan 2022
Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset
Tiezheng Yu
Rita Frieske
Peng Xu
Samuel Cahyawijaya
Cheuk Tung Shadow Yiu
...
Elham J. Barezi
Qifeng Chen
Xiaojuan Ma
Bertram E. Shi
Pascale Fung
RALM
44
9
0
07 Jan 2022
On the Use of External Data for Spoken Named Entity Recognition
Ankita Pasad
Felix Wu
Suwon Shon
Karen Livescu
Kyu Jeong Han
40
16
0
14 Dec 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
32
657
0
17 Nov 2021
Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch
Jakob Poncelet
Hugo Van hamme
SSL
28
1
0
29 Sep 2021
Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding
Shiyang Li
Semih Yavuz
Wenhu Chen
Xifeng Yan
22
12
0
14 Sep 2021
Remember the context! ASR slot error correction through memorization
Dhanush Bekal
Ashish Shenoy
Monica Sunkara
S. Bodapati
Katrin Kirchhoff
KELM
23
12
0
10 Sep 2021
Multi-Task Self-Training for Learning General Representations
Golnaz Ghiasi
Barret Zoph
E. D. Cubuk
Quoc V. Le
Nayeon Lee
SSL
24
100
0
25 Aug 2021
Direct speech-to-speech translation with discrete units
Ann Lee
Peng-Jen Chen
Changhan Wang
Jiatao Gu
Sravya Popuri
...
Yossi Adi
Qing He
Yun Tang
J. Pino
Wei-Ning Hsu
35
180
0
12 Jul 2021
Scaling Laws for Acoustic Models
J. Droppo
Oguz H. Elibol
15
22
0
11 Jun 2021
Large-Scale Self- and Semi-Supervised Learning for Speech Translation
Changhan Wang
Anne Wu
J. Pino
Alexei Baevski
Michael Auli
Alexis Conneau
SSL
31
44
0
14 Apr 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
27
986
0
31 Mar 2021
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation
Changhan Wang
M. Rivière
Ann Lee
Anne Wu
Chaitanya Talnikar
Daniel Haziza
Mary Williamson
J. Pino
Emmanuel Dupoux
SSL
25
460
0
02 Jan 2021
1