ResearchTrend.AI
Common Voice: A Massively-Multilingual Speech Corpus


13 December 2019
Rosana Ardila
Megan Branson
Kelly Davis
Michael Henretty
M. Kohler
Josh Meyer
Reuben Morais
Lindsay Saunders
Francis M. Tyers
Gregor Weber
VLM

Papers citing "Common Voice: A Massively-Multilingual Speech Corpus"

50 / 315 papers shown
Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition
Vahid Noroozi
Somshubra Majumdar
Ankur Kumar
Jagadeesh Balam
Boris Ginsburg
36
10
0
27 Dec 2023
PyThaiNLP: Thai Natural Language Processing in Python
Wannaphong Phatthiyaphaibun
Korakot Chaovavanich
Charin Polpanumas
Arthit Suriyawongkul
Lalita Lowphansirikul
Pattarawat Chormai
Peerat Limkonchotiwat
Thanathip Suntorntip
Can Udomcharoenchaikit
32
89
0
07 Dec 2023
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation
Juan Pablo Zuluaga
Zhaocheng Huang
Xing Niu
Rohit Paturi
S. Srinivasan
Prashant Mathur
Brian Thompson
Marcello Federico
BDL
35
2
0
01 Nov 2023
CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition
K. A. Noriy
Xiaosong Yang
Marcin Budka
Jian Jun Zhang
VLM
26
3
0
18 Oct 2023
Optimized Tokenization for Transcribed Error Correction
Tomer Wullach
Shlomo E. Chazan
32
0
0
16 Oct 2023
Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers
Hosein Mohebbi
Grzegorz Chrupała
Willem H. Zuidema
A. Alishahi
36
12
0
15 Oct 2023
From Words and Exercises to Wellness: Farsi Chatbot for Self-Attachment Technique
Sina Elahimanesh
Shayan Salehi
Sara Zahedi Movahed
Lisa Alazraki
Ruoyu Hu
Abbas Edalat
24
0
0
13 Oct 2023
Toward Joint Language Modeling for Speech Units and Text
Ju-Chieh Chou
Chung-Ming Chien
Wei-Ning Hsu
Karen Livescu
Arun Babu
Alexis Conneau
Alexei Baevski
Michael Auli
VLM
28
20
0
12 Oct 2023
OmniLingo: Listening- and speaking-based language learning
F. M. Tyers
Nicholas Howell
8
0
0
10 Oct 2023
Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding
Pavel Denisov
Ngoc Thang Vu
29
1
0
09 Oct 2023
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
Jiatong Shi
William Chen
Dan Berrebbi
Hsiu-Hsuan Wang
Wei-Ping Huang
...
Yuxun Tang
Shang-Wen Li
Abdelrahman Mohamed
Hung-yi Lee
Shinji Watanabe
LRM
ELM
42
15
0
09 Oct 2023
DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers
Anna Langedijk
Hosein Mohebbi
Gabriele Sarti
Willem H. Zuidema
Jaap Jumelet
32
10
0
05 Oct 2023
LRPD: Large Replay Parallel Dataset
Ivan Yakovlev
Mikhail P. Melnikov
Nikita Bukhal
Rostislav Makarov
Alexander Alenin
Juncheng Billy Li
A. Okhotnikov
34
1
0
29 Sep 2023
Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification
Hee-Soo Heo
Ki-hyun Nam
Bong-Jin Lee
Youngki Kwon
Min-Ji Lee
You Jin Kim
Joon Son Chung
32
1
0
26 Sep 2023
CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss
R. S. Srinivasa
Jaejin Cho
Chouchang Yang
Yashas Malur Saidutta
Ching Hua Lee
Yilin Shen
Hongxia Jin
VLM
36
8
0
26 Sep 2023
Human Transcription Quality Improvement
Jian Gao
Hanbo Sun
Cheng Cao
Zheng Du
43
2
0
24 Sep 2023
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition
Krishna C. Puvvada
Nithin Rao Koluguri
Kunal Dhawan
Jagadeesh Balam
Boris Ginsburg
34
13
0
19 Sep 2023
Using fine-tuning and min lookahead beam search to improve Whisper
Andrea Do
Oscar Brown
Zhengjie Wang
Nikhil Mathew
Zixin Liu
Jawwad Ahmed
Cheng Yu
35
1
0
19 Sep 2023
ASTER: Automatic Speech Recognition System Accessibility Testing for Stutterers
Yi Liu
Yuekang Li
Gelei Deng
Felix Juefei Xu
Yao Du
Cen Zhang
Chengwei Liu
Yeting Li
Lei Ma
Yang Liu
24
3
0
30 Aug 2023
Bornil: An open-source sign language data crowdsourcing platform for AI enabled dialect-agnostic communication
Shahriar Elahi Dhruvo
Mohammad Akhlaqur Rahman
M. Mandal
Md. Istiak Hossain Shihab
A. A. N. Ansary
...
Sejuti Rahman
Sayma Sultana Chowdhury
Sabbir Ahmed Chowdhury
Farig Sadeque
Asif Sushmit
23
1
0
29 Aug 2023
Improving Continuous Sign Language Recognition with Cross-Lingual Signs
Fangyun Wei
Yutong Chen
SLR
28
28
0
21 Aug 2023
Indonesian Automatic Speech Recognition with XLSR-53
Panji Arisaputra
Amalia Zahra
21
6
0
20 Aug 2023
ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus
Tolulope Ogunremi
Kólá Túbosún
Aremu Anuoluwapo
Iroro Orife
David Ifeoluwa Adelani
42
6
0
29 Jul 2023
Replay to Remember: Continual Layer-Specific Fine-tuning for German Speech Recognition
Theresa Pekarek-Rosin
S. Wermter
VLM
CLL
32
2
0
14 Jul 2023
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Titouan Parcollet
Rogier van Dalen
Shucong Zhang
S. Bhattacharya
26
6
0
12 Jul 2023
Leveraging multilingual transfer for unsupervised semantic acoustic word embeddings
C. Jacobs
Herman Kamper
32
1
0
05 Jul 2023
Speech-based Age and Gender Prediction with Transformers
Felix Burkhardt
Johannes Wagner
H. Wierstorf
F. Eyben
Björn Schuller
14
15
0
29 Jun 2023
NoRefER: a Referenceless Quality Metric for Automatic Speech Recognition via Semi-Supervised Language Model Fine-Tuning with Contrastive Learning
K. Yuksel
Thiago Castro Ferreira
Golara Javadi
Mohamed El-Badrashiny
Ahmet Gunduz
26
4
0
21 Jun 2023
Multi-pass Training and Cross-information Fusion for Low-resource End-to-end Accented Speech Recognition
Xuefei Wang
Yanhua Long
Yijie Li
Haoran Wei
37
4
0
20 Jun 2023
HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation
Cihan Xiao
Henry Li Xinyuan
Jinyi Yang
Dongji Gao
Matthew Wiesner
Kevin Duh
Sanjeev Khudanpur
37
1
0
20 Jun 2023
Rehearsal-Free Online Continual Learning for Automatic Speech Recognition
Steven Vander Eeckt
Hugo Van hamme
CLL
43
3
0
19 Jun 2023
KIT's Multilingual Speech Translation System for IWSLT 2023
Danni Liu
Thai-Binh Nguyen
Sai Koneru
Enes Yavuz Ugan
Ngoc-Quan Pham
Tuan-Nam Nguyen
Tu Anh Dinh
Carlos Mullov
A. Waibel
Jan Niehues
28
7
0
08 Jun 2023
Arabic Dysarthric Speech Recognition Using Adversarial and Signal-Based Augmentation
Massa Baali
Ibrahim Almakky
Shady Shehata
Fakhri Karray
45
1
0
07 Jun 2023
Allophant: Cross-lingual Phoneme Recognition with Articulatory Attributes
Kevin Glocker
Aaricia Herygers
Munir Georges
29
4
0
07 Jun 2023
Some voices are too common: Building fair speech recognition systems using the Common Voice dataset
Lucas Maison
Yannick Esteve
28
3
0
01 Jun 2023
The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech
P. Do
Matt Coler
J. Dijkstra
E. Klabbers
OffRL
29
0
0
01 Jun 2023
Findings of the VarDial Evaluation Campaign 2023
Noëmi Aepli
Çagri Çöltekin
Rob van der Goot
T. Jauhiainen
Mourhaf Kazzaz
Nikola Ljubesic
Kai North
Barbara Plank
Yves Scherrer
Marcos Zampieri
19
29
0
31 May 2023
Can We Trust Explainable AI Methods on ASR? An Evaluation on Phoneme Recognition
Xiao-lan Wu
P. Bell
A. Rajan
19
5
0
29 May 2023
Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS
Sewade Ogun
Vincent Colotte
Emmanuel Vincent
DiffM
35
4
0
28 May 2023
DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution
Matías P. Pizarro
D. Kolossa
Asja Fischer
AAML
43
1
0
26 May 2023
DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction
Vineet Bhat
P. Jyothi
P. Bhattacharyya
24
0
0
26 May 2023
Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis
Seong-Hyun Park
Bohyung Kim
Tae-Hyun Oh
50
1
0
26 May 2023
Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM
Eliya Nachmani
Alon Levkovitch
Roy Hirsch
Julián Salazar
Chulayutsh Asawaroengchai
Soroosh Mariooryad
Ehud Rivlin
RJ Skerry-Ryan
Michelle Tadmor Ramanovich
AuLLM
34
34
0
24 May 2023
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation
Chenyang Le
Yao Qian
Long Zhou
Shujie Liu
Yanmin Qian
Michael Zeng
Xuedong Huang
24
13
0
24 May 2023
i-Code Studio: A Configurable and Composable Framework for Integrative AI
Yuwei Fang
Mahmoud Khademi
Chenguang Zhu
Ziyi Yang
Reid Pryzant
...
Yao Qian
Takuya Yoshioka
Lu Yuan
Michael Zeng
Xuedong Huang
38
2
0
23 May 2023
Textually Pretrained Speech Language Models
Michael Hassid
Tal Remez
Tu Nguyen
Itai Gat
Alexis Conneau
...
Alexandre Défossez
Gabriel Synnaeve
Emmanuel Dupoux
Roy Schwartz
Yossi Adi
VLM
SyDa
44
54
0
22 May 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages
Andrew Rouditchenko
Sameer Khurana
Samuel Thomas
Rogerio Feris
Leonid Karlinsky
Hilde Kuehne
David Harwath
Brian Kingsbury
James R. Glass
VLM
39
22
0
21 May 2023
Hystoc: Obtaining word confidences for fusion of end-to-end ASR systems
Karel Beneš
M. Kocour
L. Burget
37
2
0
21 May 2023
DUB: Discrete Unit Back-translation for Speech Translation
Dong Zhang
Rong Ye
Tom Ko
Mingxuan Wang
Yaqian Zhou
29
23
0
19 May 2023
Parameter-Efficient Learning for Text-to-Speech Accent Adaptation
Lijie Yang
Chao-Han Huck Yang
Jen-Tzung Chien
22
11
0
18 May 2023