Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1912.06670
Cited By
Common Voice: A Massively-Multilingual Speech Corpus
13 December 2019
Rosana Ardila
Megan Branson
Kelly Davis
Michael Henretty
M. Kohler
Josh Meyer
Reuben Morais
Lindsay Saunders
Francis M. Tyers
Gregor Weber
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Common Voice: A Massively-Multilingual Speech Corpus"
50 / 315 papers shown
Title
Weight Averaging: A Simple Yet Effective Method to Overcome Catastrophic Forgetting in Automatic Speech Recognition
Steven Vander Eeckt
Hugo Van hamme
CLL
MoMe
67
14
0
27 Oct 2022
Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation
F. López
Jordi Luque
14
6
0
27 Oct 2022
Training Autoregressive Speech Recognition Models with Limited in-domain Supervision
Chak-Fai Li
Francis Keith
William Hartmann
M. Snover
19
0
0
27 Oct 2022
There is more than one kind of robustness: Fooling Whisper with adversarial examples
R. Olivier
Bhiksha Raj
AAML
40
12
0
26 Oct 2022
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks
Colin Leong
Joshua Nemecek
Jacob Mansdorfer
Anna Filighera
A. Owodunni
Daniel Whitenack
VLM
AI4CE
51
24
0
26 Oct 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings
Jian Zhu
Zuoyu Tian
Yadong Liu
Cong Zhang
Chia-wen Lo
SSL
32
2
0
23 Oct 2022
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR
Gary Wang
Ekin D.Cubuk
Andrew Rosenberg
Shuyang Cheng
Ron J. Weiss
Bhuvana Ramabhadran
Pedro J. Moreno
Quoc V. Le
Daniel S. Park
30
1
0
19 Oct 2022
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR
Zhehuai Chen
Ankur Bapna
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Pedro J. Moreno
Nanxin Chen
51
17
0
18 Oct 2022
Learning Invariant Representation and Risk Minimized for Unsupervised Accent Domain Adaptation
Chendong Zhao
Jianzong Wang
Xiaoyang Qu
Haoqian Wang
Jing Xiao
SSL
38
1
0
15 Oct 2022
An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition
Chao-Han Huck Yang
Jun Qi
Sabato Marco Siniscalchi
Chin-Hui Lee
26
4
0
12 Oct 2022
Fine-tuning Wav2vec for Vocal-burst Emotion Recognition
Dang-Khanh Nguyen
Sudarshan Pant
Ngoc-Huynh Ho
Gueesang Lee
Soo-Huyng Kim
Hyung-Jeong Yang
24
3
0
01 Oct 2022
Direct Speech Translation for Automatic Subtitling
Sara Papi
Marco Gaido
Alina Karakanta
Mauro Cettolo
Matteo Negri
Marco Turchi
54
11
0
27 Sep 2022
Goodness of Pronunciation Pipelines for OOV Problem
Ankit Grover
22
0
0
08 Sep 2022
External Knowledge Selection with Weighted Negative Sampling in Knowledge-grounded Task-oriented Dialogue Systems
Janghoon Han
Joongbo Shin
Hosung Song
Hyunjik Jo
Gyeonghun Kim
Yireun Kim
Stanley Jungkyu Choi
21
4
0
06 Sep 2022
Effectiveness of Mining Audio and Text Pairs from Public Data for Improving ASR Systems for Low-Resource Languages
Kaushal Bhogale
A. Raman
Tahir Javed
Sumanth Doddapaneni
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
36
22
0
26 Aug 2022
IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages
Tahir Javed
Kaushal Bhogale
A. Raman
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
ELM
30
20
0
24 Aug 2022
Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge
A. I. S. Ferreira
Gustavo dos Reis Oliveira
27
3
0
29 Jul 2022
Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models
Daniel Bermuth
Alexander Poeppel
W. Reif
26
7
0
29 Jun 2022
Distilling a Pretrained Language Model to a Multilingual ASR Model
Kwanghee Choi
Hyung-Min Park
VLM
31
11
0
25 Jun 2022
The MuSe 2022 Multimodal Sentiment Analysis Challenge: Humor, Emotional Reactions, and Stress
Lukas Christ
Shahin Amiriparian
Alice Baird
Panagiotis Tzirakis
Alexander Kathan
...
Eva-Maria Messner
Andreas Konig
Alan S. Cowen
Min Zhang
Björn W. Schuller
39
60
0
23 Jun 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
19
13
0
20 Jun 2022
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
Yuto Nishimura
Yuki Saito
Shinnosuke Takamichi
Kentaro Tachibana
Hiroshi Saruwatari
AI4TS
19
7
0
16 Jun 2022
Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech
Jan Lehecka
J. Svec
A. Pražák
J. Psutka
22
12
0
15 Jun 2022
Do self-supervised speech models develop human-like perception biases?
Juliette Millet
Ewan Dunbar
SSL
24
20
0
31 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
137
352
0
21 May 2022
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation
Sameer Khurana
Antoine Laurent
James R. Glass
25
36
0
17 May 2022
Hearing voices at the National Library -- a speech corpus and acoustic model for the Swedish language
Martin Malmsten
Chris Haffenden
Love Borjeson
21
9
0
06 May 2022
Quantifying Language Variation Acoustically with Few Resources
Martijn Bartelds
Martijn B. Wieling
27
11
0
05 May 2022
Automated speech tools for helping communities process restricted-access corpora for language revival efforts
Nay San
Martijn Bartelds
Tolúldopdé Ogúnrdemí
Alison Mount
R. Thompson
Mike Higgins
Roy Barker
Jane Simpson
Dan Jurafsky
30
6
0
15 Apr 2022
ASR in German: A Detailed Error Analysis
John M. Wirth
René Peinl
26
5
0
12 Apr 2022
The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance
Lin Zhang
Xin Wang
Erica Cooper
Nicholas W. D. Evans
Junichi Yamagishi
27
56
0
11 Apr 2022
Disentangled Latent Speech Representation for Automatic Pathological Intelligibility Assessment
Tobias Weise
P. Klumpp
Kubilay Can Demir
Andreas Maier
E. Noeth
B.J. Heismann
Maria Schuster
S. Yang
11
3
0
08 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
28
56
0
06 Apr 2022
Federated Self-supervised Speech Representations: Are We There Yet?
Yan Gao
Javier Fernandez-Marques
Titouan Parcollet
Abhinav Mehrotra
Nicholas D. Lane
35
13
0
06 Apr 2022
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation
Dan Berrebbi
Jiatong Shi
Brian Yan
Osbel López-Francisco
Jonathan D. Amith
Shinji Watanabe
10
26
0
05 Apr 2022
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022
Takaaki Saeki
Detai Xin
Wataru Nakata
Tomoki Koriyama
Shinnosuke Takamichi
Hiroshi Saruwatari
39
180
0
05 Apr 2022
Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation for Automatic Speech Recognition
Guan-Ting Lin
Shang-Wen Li
Hung-yi Lee
TTA
VLM
21
10
0
27 Mar 2022
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation
Ye Jia
Yifan Ding
Ankur Bapna
Colin Cherry
Yu Zhang
Alexis Conneau
Nobuyuki Morioka
47
20
0
24 Mar 2022
Automatic Speech Recognition for Speech Assessment of Persian Preschool Children
Amirhossein Abaskohi
Fatemeh Mortazavi
Hadi Moradi
34
6
0
24 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
26
109
0
14 Mar 2022
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks
Yizhou Lu
Mingkun Huang
Xinghua Qu
Pengfei Wei
Zejun Ma
27
19
0
09 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
34
154
0
24 Feb 2022
Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Imran Razzak
Kevin Lee
Chetan Arora
Ali Hassani
A. Zaslavsky
AAML
34
6
0
22 Feb 2022
Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines
Alexander Isenko
R. Mayer
Jeffrey Jedele
Hans-Arno Jacobsen
19
23
0
17 Feb 2022
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition
Bethan Thomas
Samuel Kessler
S. Karout
28
71
0
07 Feb 2022
mSLAM: Massively multilingual joint pre-training for speech and text
Ankur Bapna
Colin Cherry
Yu Zhang
Ye Jia
Melvin Johnson
Yong Cheng
Simran Khanuja
Jason Riesa
Alexis Conneau
VLM
30
111
0
03 Feb 2022
BEA-Base: A Benchmark for ASR of Spontaneous Hungarian
P. Mihajlik
A. Balog
T. E. Gráczi
A. Kohári
Balázs Tarján
K. Mády
25
8
0
01 Feb 2022
NAS-VAD: Neural Architecture Search for Voice Activity Detection
Daniel Rho
Jinhyeok Park
J. Ko
46
6
0
22 Jan 2022
Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset
Tiezheng Yu
Rita Frieske
Peng Xu
Samuel Cahyawijaya
Cheuk Tung Shadow Yiu
...
Elham J. Barezi
Qifeng Chen
Xiaojuan Ma
Bertram E. Shi
Pascale Fung
RALM
47
9
0
07 Jan 2022
Textless Speech-to-Speech Translation on Real Data
Ann Lee
Hongyu Gong
Paul-Ambroise Duquenne
Holger Schwenk
Peng-Jen Chen
...
Sravya Popuri
Yossi Adi
J. Pino
Jiatao Gu
Wei-Ning Hsu
31
143
0
15 Dec 2021
Previous
1
2
3
4
5
6
7
Next