2101.00390
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation
2 January 2021
Changhan Wang
M. Rivière
Ann Lee
Anne Wu
Chaitanya Talnikar
Daniel Haziza
Mary Williamson
J. Pino
Emmanuel Dupoux
SSL
ArXiv (abs)
PDF
HTML
Github (536★)
Papers citing
"VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation"
50 / 311 papers shown
Title
Unifying Robustness and Fidelity: A Comprehensive Study of Pretrained Generative Methods for Speech Enhancement in Adverse Conditions
Heming Wang
Meng Yu
Huatian Zhang
Chunlei Zhang
Zhongweiyang Xu
Muqiao Yang
Yixuan Zhang
Dong Yu
83
3
0
16 Sep 2023
Direct Text to Speech Translation System using Acoustic Units
Victoria Mingote
Pablo Gimeno
Luis Vicente
Sameer Khurana
Antoine Laurent
J. Duret
55
4
0
14 Sep 2023
Open-vocabulary Keyword-spotting with Adaptive Instance Normalization
Aviv Navon
Aviv Shamsian
Neta Glazer
Gill Hetz
Joseph Keshet
74
13
0
13 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Titouan Parcollet
H. Nguyen
Solène Evain
Marcely Zanon Boito
Adrien Pupier
...
François Portet
Solange Rossato
Fabien Ringeval
D. Schwab
Laurent Besacier
91
17
0
11 Sep 2023
Speech Wikimedia: A 77 Language Multilingual Speech Dataset
Rafael Mosquera Gómez
Julián Eusse
Juan Ciro
Daniel Galvez
Ryan Hileman
K. Bollacker
David Kanter
23
4
0
30 Aug 2023
Sparks of Large Audio Models: A Survey and Outlook
S. Latif
Moazzam Shoukat
Fahad Shamshad
Muhammad Usama
Yi Ren
...
Wenwu Wang
Xulong Zhang
Roberto Togneri
Min Zhang
Björn W. Schuller
LM&MA
AuLLM
188
39
0
24 Aug 2023
SONAR: Sentence-Level Multimodal and Language-Agnostic Representations
Paul-Ambroise Duquenne
Holger Schwenk
Benoît Sagot
AI4TS
VLM
116
71
0
22 Aug 2023
Deep learning-based denoising streamed from mobile phones improves speech-in-noise understanding for hearing aid users
P. U. Diehl
Hannes Zilly
Felix Sattler
Y. Singer
Kevin Kepp
...
Paul Meyer-Rachner
A. Pudszuhn
V. Hofmann
M. Vormann
Elias Sprengel
66
3
0
22 Aug 2023
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
Minsu Kim
Jeong Hun Yeo
J. Choi
Y. Ro
73
17
0
18 Aug 2023
EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis
Tu Nguyen
Wei-Ning Hsu
Antony D'Avirro
Bowen Shi
Itai Gat
...
Gabriel Synnaeve
Michael Hassid
Felix Kreuk
Yossi Adi
Emmanuel Dupoux
75
62
0
10 Aug 2023
Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation
Minsu Kim
J. Choi
Dahun Kim
Y. Ro
96
10
0
03 Aug 2023
The Effect of Spoken Language on Speech Enhancement using Self-Supervised Speech Representation Loss Functions
George Close
Thomas Hain
Stefan Goetze
63
8
0
27 Jul 2023
Robust Automatic Speech Recognition via WavAugment Guided Phoneme Adversarial Training
Gege Qi
YueFeng Chen
Xiaofeng Mao
Xiaojun Jia
Ranjie Duan
Rong Zhang
Hui Xue
VLM
AAML
92
0
0
24 Jul 2023
MASR: Multi-label Aware Speech Representation
Anjali Raj
Shikhar Bharadwaj
Sriram Ganapathy
Min Ma
Shikhar Vashishth
SSL
43
0
0
20 Jul 2023
Multilingual Speech-to-Speech Translation into Multiple Target Languages
Hongyu Gong
Ning Dong
Sravya Popuri
Vedanuj Goswami
Ann Lee
J. Pino
82
5
0
17 Jul 2023
ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development
Yanir Marmor
Kinneret Misgav
Y. Lifshitz
VLM
95
3
0
17 Jul 2023
Towards cross-language prosody transfer for dialog
Jonathan Avila
Nigel G. Ward
72
7
0
09 Jul 2023
Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework
Eliya Segev
Maya Alroy
Ronen Katsir
Noam Wies
Ayana Shenhav
...
D. Zar
Oren Tadmor
Jacob Bitterman
Amnon Shashua
Tal Rosenwein
82
2
0
04 Jul 2023
What Do Self-Supervised Speech Models Know About Words?
Ankita Pasad
C. Chien
Shane Settle
Karen Livescu
SSL
158
36
0
30 Jun 2023
Towards Improving the Performance of Pre-Trained Speech Models for Low-Resource Languages Through Lateral Inhibition
Andrei-Marius Avram
Răzvan-Alexandru Smădu
Vasile Păiș
Dumitru-Clementin Cercel
Radu Ion
Dan Tufiș
VLM
57
0
0
30 Jun 2023
Confidence-based Ensembles of End-to-End Speech Recognition Models
Igor Gitman
Vitaly Lavrukhin
A. Laptev
Boris Ginsburg
UQCV
87
9
0
27 Jun 2023
Large-scale unsupervised audio pre-training for video-to-speech synthesis
Triantafyllos Kefalas
Yannis Panagakis
Maja Pantic
VGen
69
4
0
27 Jun 2023
AudioPaLM: A Large Language Model That Can Speak and Listen
Paul Kishan Rubenstein
Chulayuth Asawaroengchai
D. Nguyen
Ankur Bapna
Zalan Borsos
...
Neil Zeghidour
Yu Zhang
Zhishuai Zhang
Lukás Zilka
Christian Frank
LM&MA
AuLLM
VLM
138
295
0
22 Jun 2023
Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies
Yuya Yamamoto
45
2
0
22 Jun 2023
Unified model for code-switching speech recognition and language identification based on a concatenated tokenizer
Kunal Dhawan
Dima Rekesh
Boris Ginsburg
69
12
0
14 Jun 2023
Investigating the dynamics of hand and lips in French Cued Speech using attention mechanisms and CTC-based decoding
Sanjana Sankar
D. Beautemps
F. Elisei
Olivier Perrotin
Thomas Hueber
49
0
0
14 Jun 2023
DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer ASR
Goeric Huybrechts
S. Ronanki
Xilai Li
H. Nosrati
S. Bodapati
Katrin Kirchhoff
51
1
0
13 Jun 2023
KIT's Multilingual Speech Translation System for IWSLT 2023
Danni Liu
Thai-Binh Nguyen
Sai Koneru
Enes Yavuz Ugan
Ngoc-Quan Pham
Tuan-Nam Nguyen
Tu Anh Dinh
Carlos Mullov
A. Waibel
Jan Niehues
86
7
0
08 Jun 2023
Transfer Learning of Transformer-based Speech Recognition Models from Czech to Slovak
Jan Lehecka
J. Psutka
Josef Psutka
48
1
0
07 Jun 2023
Label Aware Speech Representation Learning For Language Identification
Shikhar Vashishth
Shikhar Bharadwaj
Sriram Ganapathy
Ankur Bapna
Min Ma
Wei Han
Vera Axelrod
Partha P. Talukdar
SSL
61
4
0
07 Jun 2023
Improved Cross-Lingual Transfer Learning For Automatic Speech Translation
Sameer Khurana
Nauman Dawalatabad
Antoine Laurent
Luis Vicente
Pablo Gimeno
Victoria Mingote
James R. Glass
VLM
113
1
0
01 Jun 2023
Some voices are too common: Building fair speech recognition systems using the Common Voice dataset
Lucas Maison
Yannick Esteve
97
3
0
01 Jun 2023
MiniSUPERB: Lightweight Benchmark for Self-supervised Speech Models
Yu-Hsiang Wang
Huan Chen
Kai-Wei Chang
Winston H. Hsu
Hung-yi Lee
115
7
0
30 May 2023
BIG-C: a Multimodal Multi-Purpose Dataset for Bemba
Claytone Sikasote
Eunice Mukonde
Md Mahfuz Ibn Alam
Antonios Anastasopoulos
58
8
0
26 May 2023
Robustness of Multi-Source MT to Transcription Errors
Dominik Macháček
Peter Polák
Ondrej Bojar
Raj Dabre
72
4
0
26 May 2023
Svarah: Evaluating English ASR Systems on Indian Accents
Tahir Javed
Sakshi Joshi
Vignesh Nagarajan
Sairam Sundaresan
J. Nawale
A. Raman
Kaushal Bhogale
Pratyush Kumar
Mitesh M. Khapra
45
8
0
25 May 2023
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model
Aoi Ito
Shota Horiguchi
SSL
51
3
0
24 May 2023
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Rongjie Huang
Huadai Liu
Xize Cheng
Yi Ren
Lin Li
...
Jinzheng He
Lichao Zhang
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
128
8
0
24 May 2023
Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM
Eliya Nachmani
Alon Levkovitch
Roy Hirsch
Julián Salazar
Chulayuth Asawaroengchai
Soroosh Mariooryad
Ehud Rivlin
RJ Skerry-Ryan
Michelle Tadmor Ramanovich
AuLLM
112
45
0
24 May 2023
On the Transferability of Whisper-based Representations for "In-the-Wild" Cross-Task Downstream Speech Applications
Vamsikrishna Chemudupati
Marzieh S. Tahaei
Heitor R. Guimarães
Arthur Pimentel
Anderson R. Avila
Mehdi Rezagholizadeh
Boxing Chen
Tiago H. Falk
SSL
128
7
0
23 May 2023
Scaling Speech Technology to 1,000+ Languages
Vineel Pratap
Andros Tjandra
Bowen Shi
Paden Tomasello
Arun Babu
...
Yossi Adi
Xiaohui Zhang
Wei-Ning Hsu
Alexis Conneau
Michael Auli
VLM
166
361
0
22 May 2023
Textually Pretrained Speech Language Models
Michael Hassid
Tal Remez
Tu Nguyen
Itai Gat
Alexis Conneau
...
Alexandre Défossez
Gabriel Synnaeve
Emmanuel Dupoux
Roy Schwartz
Yossi Adi
VLM
SyDa
129
61
0
22 May 2023
Zero-Shot End-to-End Spoken Language Understanding via Cross-Modal Selective Self-Training
Jianfeng He
Julian Salazar
Kaisheng Yao
Haoqi Li
Jason (Jinglun) Cai
VLM
70
7
0
22 May 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages
Andrew Rouditchenko
Sameer Khurana
Samuel Thomas
Rogerio Feris
Leonid Karlinsky
Hilde Kuehne
David Harwath
Brian Kingsbury
James R. Glass
VLM
106
22
0
21 May 2023
MParrotTTS: Multilingual Multi-speaker Text to Speech Synthesis in Low Resource Setting
Neil Shah
Vishal Tambrahalli
Saiteja Kosgi
N. Pedanekar
Vineet Gandhi
65
0
0
19 May 2023
DUB: Discrete Unit Back-translation for Speech Translation
Dong Zhang
Rong Ye
Tom Ko
Mingxuan Wang
Yaqian Zhou
90
27
0
19 May 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Jiatong Shi
Dan Berrebbi
William Chen
Ho-Lam Chung
En-Pei Hu
...
Xuankai Chang
Shang-Wen Li
Abdel-rahman Mohamed
Hung-yi Lee
Shinji Watanabe
ELM
126
70
0
18 May 2023
Understanding and Bridging the Modality Gap for Speech Translation
Qingkai Fang
Yang Feng
75
26
0
15 May 2023
Masked Audio Text Encoders are Effective Multi-Modal Rescorers
Jason (Jinglun) Cai
Monica Sunkara
Xilai Li
Anshu Bhatia
Xiao Pan
S. Bodapati
129
3
0
11 May 2023
FedSOV: Federated Model Secure Ownership Verification with Unforgeable Signature
Wenyuan Yang
Gongxi Zhu
Yuguo Yin
Hanlin Gu
Lixin Fan
Qiang Yang
Xiaochun Cao
FedML
63
6
0
10 May 2023