Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2104.01027
Cited By
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training
2 April 2021
Wei-Ning Hsu
Anuroop Sriram
Alexei Baevski
Tatiana Likhomanenko
Qiantong Xu
Vineel Pratap
Jacob Kahn
Ann Lee
R. Collobert
Gabriel Synnaeve
Michael Auli
SSL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training"
50 / 54 papers shown
Title
Masked Image Modeling: A Survey
Vlad Hondru
Florinel-Alin Croitoru
Shervin Minaee
Radu Tudor Ionescu
N. Sebe
72
6
0
13 Aug 2024
Optimizing Automatic Speech Assessment: W-RankSim Regularization and Hybrid Feature Fusion Strategies
Chung-Wen Wu
Berlin Chen
45
0
0
16 Jun 2024
INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition
Andreas Triantafyllopoulos
A. Batliner
Simon Rampp
M. Milling
Björn Schuller
VLM
28
0
0
10 Jun 2024
Speech foundation models on intelligibility prediction for hearing-impaired listeners
Santiago Cuervo
R. Marxer
35
6
0
24 Jan 2024
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
Heng-Jui Chang
James R. Glass
33
3
0
15 Nov 2023
Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System
Khazar Khorrami
María Andrea Cruz Blandón
Tuomas Virtanen
Okko Rasanen
SSL
27
1
0
05 Jun 2023
Some voices are too common: Building fair speech recognition systems using the Common Voice dataset
Lucas Maison
Yannick Esteve
26
3
0
01 Jun 2023
Investigating Pre-trained Audio Encoders in the Low-Resource Condition
Haomiao Yang
Jinming Zhao
Gholamreza Haffari
Ehsan Shareghi
21
6
0
28 May 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Jiatong Shi
Dan Berrebbi
William Chen
Ho-Lam Chung
En-Pei Hu
...
Xuankai Chang
Shang-Wen Li
Abdel-rahman Mohamed
Hung-yi Lee
Shinji Watanabe
ELM
55
58
0
18 May 2023
Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models
Takanori Ashihara
Takafumi Moriya
Kohei Matsuura
Tomohiro Tanaka
25
3
0
09 May 2023
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages
Chris C. Emezue
Sanchit Gandhi
Lewis Tunstall
Abubakar Abid
Josh Meyer
...
Douwe Kiela
Yacine Jernite
Julien Chaumond
Merve Noyan
Omar Sanseviero
30
2
0
22 Mar 2023
Improving Accented Speech Recognition with Multi-Domain Training
Lucas Maison
Yannick Esteve
23
7
0
14 Mar 2023
Towards domain generalisation in ASR with elitist sampling and ensemble knowledge distillation
Rehan Ahmad
Md. Asif Jalal
Muhammad Umar Farooq
A. Ollerenshaw
Thomas Hain
18
2
0
01 Mar 2023
Speech Corpora Divergence Based Unsupervised Data Selection for ASR
Changfeng Gao
Gaofeng Cheng
Pengyuan Zhang
Yonghong Yan
16
0
0
26 Feb 2023
Adapting Multilingual Speech Representation Model for a New, Underresourced Language through Multilingual Fine-tuning and Continued Pretraining
Karol Nowakowski
M. Ptaszynski
Kyoko Murasaki
Jagna Nieuwazny
20
23
0
18 Jan 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems A case study for Modern Greek
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
V. Katsouros
Alexandros Potamianos
VLM
25
7
0
31 Dec 2022
Channel-Aware Pretraining of Joint Encoder-Decoder Self-Supervised Model for Telephonic-Speech ASR
Vrunda N. Sukhadia
Anjana Arunkumar
S. Umesh
23
1
0
03 Nov 2022
data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup
Vasista Sai Lodagala
Sreyan Ghosh
S. Umesh
SSL
35
5
0
02 Nov 2022
Intermediate Fine-Tuning Using Imperfect Synthetic Speech for Improving Electrolaryngeal Speech Recognition
Lester Phillip Violeta
D. Ma
Wen-Chin Huang
T. Toda
28
7
0
02 Nov 2022
Avoid Overthinking in Self-Supervised Models for Speech Recognition
Dan Berrebbi
Brian Yan
Shinji Watanabe
LRM
23
4
0
01 Nov 2022
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning
Qiu-shi Zhu
Long Zhou
Jie Zhang
Shujie Liu
Yu-Chen Hu
Lirong Dai
VLM
SSL
60
37
0
27 Oct 2022
Training Autoregressive Speech Recognition Models with Limited in-domain Supervision
Chak-Fai Li
Francis Keith
William Hartmann
M. Snover
19
0
0
27 Oct 2022
Efficient Utilization of Large Pre-Trained Models for Low Resource ASR
Peter Vieting
Christoph Luscher
Julian Dierkes
Ralf Schluter
Hermann Ney
38
5
0
26 Oct 2022
Self-supervised Rewiring of Pre-trained Speech Encoders: Towards Faster Fine-tuning with Less Labels in Speech Processing
Haomiao Yang
Jinming Zhao
Gholamreza Haffari
Ehsan Shareghi
30
2
0
24 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
31
33
0
16 Oct 2022
Improving generalizability of distilled self-supervised speech processing models under distorted settings
Kuan-Po Huang
Yu-Kuan Fu
Tsung-Yuan Hsu
Fabian Ritter Gutierrez
Fan Wang
Liang-Hsuan Tseng
Yu Zhang
Hung-yi Lee
32
13
0
14 Oct 2022
Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization
Xiaokang Zhao
Qiu-shi Zhu
Jie Zhang
39
4
0
28 Sep 2022
Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge
A. I. S. Ferreira
Gustavo dos Reis Oliveira
27
3
0
29 Jul 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
27
41
0
14 Jul 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
19
13
0
20 Jun 2022
Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR
Qiu-shi Zhu
Jie Zhang
Zitian Zhang
Lirong Dai
43
15
0
26 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
131
349
0
21 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
48
37
0
02 May 2022
Automated speech tools for helping communities process restricted-access corpora for language revival efforts
Nay San
Martijn Bartelds
Tolúldopdé Ogúnrdemí
Alison Mount
R. Thompson
Mike Higgins
Roy Barker
Jane Simpson
Dan Jurafsky
27
6
0
15 Apr 2022
Automatic Pronunciation Assessment using Self-Supervised Speech Representation Learning
Eesung Kim
J. Jeon
Hyeji Seo
Ho-Young Kim
SSL
23
37
0
08 Apr 2022
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations
L. D. Prasad
Sreyan Ghosh
S. Umesh
25
12
0
31 Mar 2022
How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications
Juan Pablo Zuluaga
Amrutha Prasad
Iuliia Nigmatulina
Seyyed Saeed Sarfjoo
P. Motlícek
Matthias Kleinert
H. Helmke
Oliver Ohneiser
Qingran Zhan
19
43
0
31 Mar 2022
Generative Spoken Dialogue Language Modeling
Tu Nguyen
Eugene Kharitonov
Jade Copet
Yossi Adi
Wei-Ning Hsu
...
Paden Tomasello
Robin Algayres
Benoît Sagot
Abdel-rahman Mohamed
Emmanuel Dupoux
AuLLM
35
80
0
30 Mar 2022
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation
Kuan Po Huang
Yuanbin Fu
Yu Zhang
Hung-yi Lee
19
28
0
30 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
35
106
0
02 Mar 2022
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training
Ramon Sanabria
Wei-Ning Hsu
Alexei Baevski
Michael Auli
19
7
0
01 Mar 2022
Domain Adaptation of low-resource Target-Domain models using well-trained ASR Conformer Models
Vrunda N. Sukhadia
S. Umesh
33
8
0
18 Feb 2022
On the Use of External Data for Spoken Named Entity Recognition
Ankita Pasad
Felix Wu
Suwon Shon
Karen Livescu
Kyu Jeong Han
40
16
0
14 Dec 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
32
657
0
17 Nov 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Micheal Zeng
Xiangzhan Yu
Furu Wei
SSL
110
1,704
0
26 Oct 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training
Ankur Bapna
Yu-An Chung
Na Wu
Anmol Gulati
Ye Jia
J. Clark
Melvin Johnson
Jason Riesa
Alexis Conneau
Yu Zhang
VLM
61
94
0
20 Oct 2021
Don't speak too fast: The impact of data bias on self-supervised speech models
Yen Meng
Yi-Hui Chou
Andy T. Liu
Hung-yi Lee
34
25
0
15 Oct 2021
Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition
Li-Wei Chen
Alexander I. Rudnicky
VLM
24
121
0
12 Oct 2021
Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models
Matthew Wiesner
Desh Raj
Sanjeev Khudanpur
59
6
0
10 Oct 2021
Comparison of Self-Supervised Speech Pre-Training Methods on Flemish Dutch
Jakob Poncelet
Hugo Van hamme
SSL
28
1
0
29 Sep 2021
1
2
Next