2101.00390
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation
2 January 2021
Changhan Wang
M. Rivière
Ann Lee
Anne Wu
Chaitanya Talnikar
Daniel Haziza
Mary Williamson
J. Pino
Emmanuel Dupoux
SSL
Papers citing
"VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation"
50 / 311 papers shown
SQuId: Measuring Speech Naturalness in Many Languages
Thibault Sellam
Ankur Bapna
Joshua Camp
Diana Mackinnon
Ankur P. Parikh
Jason Riesa
83
18
0
12 Oct 2022
Direct Speech Translation for Automatic Subtitling
Sara Papi
Marco Gaido
Alina Karakanta
Mauro Cettolo
Matteo Negri
Marco Turchi
102
11
0
27 Sep 2022
Bangla-Wave: Improving Bangla Automatic Speech Recognition Utilizing N-gram Language Models
Mohammed Rakib
Md. Ismail Hossain
Nabeel Mohammed
Fuad Rahman
VLM
85
7
0
13 Sep 2022
Are disentangled representations all you need to build speaker anonymization systems?
Pierre Champion
D. Jouvet
Anthony Larcher
113
20
0
22 Aug 2022
TEVR: Improving Speech Recognition by Token Entropy Variance Reduction
Hajo N. Krabbenhöft
Erhardt Barth
72
3
0
25 Jun 2022
BYOL-S: Learning Self-supervised Speech Representations by Bootstrapping
Gasser Elbanna
Neil Scheidwasser
M. Kegler
P. Beckmann
Karl El Hajal
Milos Cernak
SSL
92
23
0
24 Jun 2022
Transformer-based Automatic Speech Recognition of Formal and Colloquial Czech in MALACH Project
Jan Lehecka
J. Psutka
Josef Psutka
51
4
0
15 Jun 2022
Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech
Jan Lehecka
J. Svec
A. Pražák
J. Psutka
52
13
0
15 Jun 2022
The YiTrans End-to-End Speech Translation System for IWSLT 2022 Offline Shared Task
Ziqiang Zhang
Junyi Ao
Long Zhou
Shujie Liu
Furu Wei
Jinyu Li
36
9
0
12 Jun 2022
Toward a realistic model of speech processing in the brain with self-supervised learning
Juliette Millet
Charlotte Caucheteux
Pierre Orhan
Yves Boubenec
Alexandre Gramfort
Ewan Dunbar
Christophe Pallier
J. King
112
99
0
03 Jun 2022
Predicting non-native speech perception using the Perceptual Assimilation Model and state-of-the-art acoustic models
Juliette Millet
I. Chitoran
Ewan Dunbar
59
6
0
31 May 2022
Do self-supervised speech models develop human-like perception biases?
Juliette Millet
Ewan Dunbar
SSL
68
23
0
31 May 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
155
332
0
25 May 2022
T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation
Paul-Ambroise Duquenne
Hongyu Gong
Benoît Sagot
Holger Schwenk
85
20
0
24 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
285
368
0
21 May 2022
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation
Sameer Khurana
Antoine Laurent
James R. Glass
65
37
0
17 May 2022
Hearing voices at the National Library -- a speech corpus and acoustic model for the Swedish language
Martin Malmsten
Chris Haffenden
Love Borjeson
66
10
0
06 May 2022
Efficient yet Competitive Speech Translation: FBK@IWSLT2022
Marco Gaido
Sara Papi
Dennis Fucci
G. Fiameni
Matteo Negri
Marco Turchi
69
20
0
05 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
105
39
0
02 May 2022
Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?
Sanyuan Chen
Yu Wu
Chengyi Wang
Shujie Liu
Zhuo Chen
...
Gang Liu
Jinyu Li
Jian Wu
Xiangzhan Yu
Furu Wei
SSL
95
42
0
27 Apr 2022
LibriS2S: A German-English Speech-to-Speech Translation Corpus
Pedro Jeuris
Jan Niehues
AuLLM
29
3
0
22 Apr 2022
ASR in German: A Detailed Error Analysis
John M. Wirth
René Peinl
55
6
0
12 Apr 2022
The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance
Lin Zhang
Xin Wang
Erica Cooper
Nicholas W. D. Evans
Junichi Yamagishi
109
60
0
11 Apr 2022
MAESTRO: Matched Speech Text Representations through Modality Matching
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Pedro J. Moreno
Ankur Bapna
Heiga Zen
94
108
0
07 Apr 2022
Speech Pre-training with Acoustic Piece
Shuo Ren
Shujie Liu
Yu Wu
Long Zhou
Furu Wei
SSL
65
17
0
07 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
142
58
0
06 Apr 2022
End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation
Xuankai Chang
Takashi Maekaku
Yuya Fujita
Shinji Watanabe
VLM
111
46
0
01 Apr 2022
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion
Edresson Casanova
C. Shulby
Alexander Korolev
Arnaldo Cândido Júnior
A. S. Soares
S. Aluísio
M. Ponti
117
14
0
29 Mar 2022
Finnish Parliament ASR corpus - Analysis, benchmarks and statistics
A. Virkkunen
Aku Rouhe
Nhan Phan
M. Kurimo
95
4
0
28 Mar 2022
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation
Ye Jia
Yifan Ding
Ankur Bapna
Colin Cherry
Yu Zhang
Alexis Conneau
Nobuyuki Morioka
94
21
0
24 Mar 2022
Lahjoita puhetta -- a large-scale corpus of spoken Finnish with some benchmarks
Anssi Moisio
Dejan Porjazovski
Aku Rouhe
Yaroslav Getman
A. Virkkunen
Tamás Grósz
Krister Lindén
M. Kurimo
88
23
0
24 Mar 2022
The VoicePrivacy 2022 Challenge Evaluation Plan
N. Tomashenko
Xin Wang
Xiaoxiao Miao
Hubert Nourtel
Pierre Champion
Massimiliano Todisco
Emmanuel Vincent
Nicholas W. D. Evans
Junichi Yamagishi
J. Bonastre
117
63
0
23 Mar 2022
XTREME-S: Evaluating Cross-lingual Speech Representations
Alexis Conneau
Ankur Bapna
Yu Zhang
Min Ma
Patrick von Platen
...
Orhan Firat
Michael Auli
Sebastian Ruder
Jason Riesa
Melvin Johnson
VLM
AILaw
ELM
155
22
0
21 Mar 2022
Dawn of the transformer era in speech emotion recognition: closing the valence gap
Johannes Wagner
Andreas Triantafyllopoulos
H. Wierstorf
Maximilian Schmitt
Felix Burkhardt
F. Eyben
Björn W. Schuller
96
306
0
14 Mar 2022
Building and curating conversational corpora for diversity-aware language science and technology
Andreas Liesenfeld
Mark Dingemanse
50
4
0
07 Mar 2022
HEAR: Holistic Evaluation of Audio Representations
Joseph P. Turian
Jordie Shier
H. Khan
Bhiksha Raj
Björn W. Schuller
...
P. Esling
Pranay Manocha
Shinji Watanabe
Zeyu Jin
Yonatan Bisk
135
108
0
06 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
129
168
0
24 Feb 2022
mSLAM: Massively multilingual joint pre-training for speech and text
Ankur Bapna
Colin Cherry
Yu Zhang
Ye Jia
Melvin Johnson
Yong Cheng
Simran Khanuja
Jason Riesa
Alexis Conneau
VLM
73
114
0
03 Feb 2022
BEA-Base: A Benchmark for ASR of Spontaneous Hungarian
P. Mihajlik
A. Balog
T. E. Gráczi
A. Kohári
Balázs Tarján
K. Mády
44
8
0
01 Feb 2022
CVSS Corpus and Massively Multilingual Speech-to-Speech Translation
Yeting Jia
Michelle Tadmor Ramanovich
Quan Wang
Heiga Zen
SLR
94
70
0
11 Jan 2022
Textless Speech-to-Speech Translation on Real Data
Ann Lee
Hongyu Gong
Paul-Ambroise Duquenne
Holger Schwenk
Peng-Jen Chen
...
Sravya Popuri
Yossi Adi
J. Pino
Jiatao Gu
Wei-Ning Hsu
94
150
0
15 Dec 2021
On the Use of External Data for Spoken Named Entity Recognition
Ankita Pasad
Felix Wu
Suwon Shon
Karen Livescu
Kyu Jeong Han
95
16
0
14 Dec 2021
Human-Machine Interaction Speech Corpus from the ROBIN project
V. Pais
Radu Ion
Andrei-Marius Avram
Elena Irimia
V. Mititelu
Maria Mitrofan
58
6
0
22 Nov 2021
SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech
Suwon Shon
Ankita Pasad
Felix Wu
Pablo Brusco
Yoav Artzi
Karen Livescu
Kyu Jeong Han
AuLLM
ELM
106
76
0
19 Nov 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
114
712
0
17 Nov 2021
Towards Building ASR Systems for the Next Billion Users
Tahir Javed
Sumanth Doddapaneni
A. Raman
Kaushal Bhogale
Gowtham Ramesh
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
84
55
0
06 Nov 2021
Pseudo-Labeling for Massively Multilingual Speech Recognition
Loren Lugosch
Tatiana Likhomanenko
Gabriel Synnaeve
R. Collobert
VLM
77
30
0
30 Oct 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
294
1,911
0
26 Oct 2021
ASR4REAL: An extended benchmark for speech models
M. Rivière
Jade Copet
Gabriel Synnaeve
AuLLM
78
15
0
16 Oct 2021
Scribosermo: Fast Speech-to-Text models for German and other Languages
Daniel Bermuth
Alexander Poeppel
Wolfgang Reif
54
9
0
15 Oct 2021