arXiv: 2205.12446
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech

25 May 2022
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
    VLM

Papers citing "FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech"

50 / 52 papers shown
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration
Shigeki Karita
Yuma Koizumi
Heiga Zen
Haruko Ishikawa
Robin Scheibler
M. Bacchiani
VLM
172
1
0
07 May 2025
Kimi-Audio Technical Report
KimiTeam
Ding Ding
Zeqian Ju
Yichong Leng
S. Liu
...
Z. Yang
Aoxiong Yin
Ruibin Yuan
Yuhang Zhang
Zaida Zhou
AuLLM
VLM
110
5
0
25 Apr 2025
SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning
Prabhat Pandey
R. Swaminathan
K V Vijay Girish
Arunasish Sen
Jian Xie
Grant P. Strimel
Andreas Schwarz
140
0
0
12 Apr 2025
Whispering in Amharic: Fine-tuning Whisper for Low-resource Language
Dawit Ketema Gete
Bedru Yimam Ahamed
Tadesse Destaw Belay
Yohannes Ayana Ejigu
Sukairaj Hafiz Imam
...
Umma Aliyu Musa
Martin Semmann
Shamsuddeen Hassan Muhammad
Henning Schreiber
Seid Muhie Yimam
43
0
0
24 Mar 2025
Training and Inference Efficiency of Encoder-Decoder Speech Models
Piotr Żelasko
Kunal Dhawan
Daniel Galvez
Krishna C. Puvvada
Ankita Pasad
Nithin Rao Koluguri
Ke Hu
Vitaly Lavrukhin
Jagadeesh Balam
Boris Ginsburg
43
0
0
07 Mar 2025
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
Abdelrahman Abouelenin
Atabak Ashfaq
Adam Atkinson
Hany Awadalla
Nguyen Bach
...
Ishmam Zabir
Yunan Zhang
Li Zhang
Yuhang Zhang
Xiren Zhou
MoE
SyDa
73
24
0
03 Mar 2025
Nexus-O: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision
Che Liu
Yingji Zhang
D. Zhang
Weijie Zhang
Chenggong Gong
...
André Freitas
Qifan Wang
Z. Xu
Rongjuncheng Zhang
Yong Dai
AuLLM
76
0
0
26 Feb 2025
How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations
Hyunji Lee
Danni Liu
Supriti Sinhamahapatra
Jan Niehues
106
0
0
21 Feb 2025
HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding
Jiaxing Zhao
Q. Yang
Yixing Peng
Detao Bai
Shimin Yao
...
Xiang Chen
Shenghao Fu
Weixuan Chen
Xihan Wei
Liefeng Bo
VGen
AuLLM
50
5
0
28 Jan 2025
Methods to Increase the Amount of Data for Speech Recognition for Low Resource Languages
Alexan Ayrapetyan
Sofia Kostandian
Ara Yeroyan
Mher Yerznkanyan
Nikolay Karpov
Nune Tadevosyan
Vitaly Lavrukhin
Boris Ginsburg
66
0
0
08 Jan 2025
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning
Yifan Peng
Krishna C. Puvvada
Zhehuai Chen
Piotr Żelasko
He Huang
Kunal Dhawan
Ke Hu
Shinji Watanabe
Jagadeesh Balam
Boris Ginsburg
58
2
0
23 Oct 2024
Efficiently Identifying Low-Quality Language Subsets in Multilingual Datasets: A Case Study on a Large-Scale Multilingual Audio Dataset
Farhan Samir
Emily P. Ahn
Shreya Prakash
Márton Soskuthy
Vered Shwartz
Jian Zhu
26
0
0
05 Oct 2024
Characterizing and Efficiently Accelerating Multimodal Generation Model Inference
Yejin Lee
Anna Y. Sun
Basil Hosmer
Bilge Acun
Can Balioglu
...
Ram Pasunuru
Scott Yih
Sravya Popuri
Xing Liu
Carole-Jean Wu
52
2
0
30 Sep 2024
EMMeTT: Efficient Multimodal Machine Translation Training
Piotr Żelasko
Zhehuai Chen
Mengru Wang
Daniel Galvez
Oleksii Hrinchuk
Shuoyang Ding
Ke Hu
Jagadeesh Balam
Vitaly Lavrukhin
Boris Ginsburg
35
1
0
20 Sep 2024
ASR Benchmarking: Need for a More Representative Conversational Dataset
Gaurav Maheshwari
Dmitry Ivanov
Théo Johannet
Kevin El Haddad
28
0
0
18 Sep 2024
WER We Stand: Benchmarking Urdu ASR Models
Samee Arif
Aamina Jamal Khan
Mustafa Abbas
Agha Ali Raza
Awais Athar
24
3
0
17 Sep 2024
Chain-of-Thought Prompting for Speech Translation
Ke Hu
Zhehuai Chen
Chao-Han Huck Yang
Piotr Żelasko
Oleksii Hrinchuk
Vitaly Lavrukhin
Jagadeesh Balam
Boris Ginsburg
LRM
39
2
0
17 Sep 2024
WhisperNER: Unified Open Named Entity and Speech Recognition
Gil Ayache
Menachem Pirchi
Aviv Navon
Aviv Shamsian
Gill Hetz
Joseph Keshet
30
0
0
12 Sep 2024
What is lost in Normalization? Exploring Pitfalls in Multilingual ASR Model Evaluations
Kavya Manohar
Leena G Pillai
29
3
0
04 Sep 2024
FLEURS-R: A Restored Multilingual Speech Corpus for Generation Tasks
Min Ma
Yuma Koizumi
Shigeki Karita
Heiga Zen
Jason Riesa
Haruko Ishikawa
M. Bacchiani
VLM
32
4
0
12 Aug 2024
Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond
Beomseok Lee
Ioan Calapodescu
Marco Gaido
Matteo Negri
Laurent Besacier
AuLLM
39
3
0
07 Aug 2024
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition
Ye Bai
Jingping Chen
Jitong Chen
Wei Chen
Zhuo Chen
...
Wanyi Zhang
Yang Zhang
Yawei Zhang
Yijie Zheng
Ming Zou
AuLLM
46
19
0
05 Jul 2024
Cross-Lingual Transfer Learning for Speech Translation
Rao Ma
Yassir Fathullah
Mengjie Qian
Siyuan Tang
Mark J. F. Gales
Kate Knill
25
1
0
01 Jul 2024
Outlier Reduction with Gated Attention for Improved Post-training Quantization in Large Sequence-to-sequence Speech Foundation Models
Dominik Wagner
Ilja Baumann
K. Riedhammer
Tobias Bocklet
MQ
30
1
0
16 Jun 2024
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding
Suwon Shon
Kwangyoun Kim
Yi-Te Hsu
Prashant Sridhar
Shinji Watanabe
Karen Livescu
AuLLM
46
2
0
13 Jun 2024
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets
Jiatong Shi
Shih-Heng Wang
William Chen
Martijn Bartelds
Vanya Bannihatti Kumar
...
Xuankai Chang
Dan Jurafsky
Karen Livescu
Hung-yi Lee
Shinji Watanabe
AuLLM
77
5
0
12 Jun 2024
Dual-Pipeline with Low-Rank Adaptation for New Language Integration in Multilingual ASR
Yerbolat Khassanov
Zhipeng Chen
Tianfeng Chen
Tze Yuang Chong
Wei Li
Jun Zhang
Lu Lu
Yuxuan Wang
AI4CE
16
0
0
12 Jun 2024
PolySpeech: Exploring Unified Multitask Speech Models for Competitiveness with Single-task Models
Runyan Yang
Huibao Yang
Xiqing Zhang
Tiantian Ye
Ying Liu
Yingying Gao
Shilei Zhang
Chao Deng
Junlan Feng
34
0
0
12 Jun 2024
LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR
Zheshu Song
Jianheng Zhuo
Yifan Yang
Ziyang Ma
Shixiong Zhang
Xie Chen
36
9
0
07 Jun 2024
Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models
Vyas Raina
Rao Ma
Charles G McGhee
Kate Knill
Mark J. F. Gales
AAML
33
4
0
09 May 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data
Takaaki Saeki
Gary Wang
Nobuyuki Morioka
Isaac Elias
Kyle Kastner
...
Andrew Rosenberg
Bhuvana Ramabhadran
Heiga Zen
Francoise Beaufays
Hadar Shemtov
38
13
0
29 Feb 2024
Direct Punjabi to English speech translation using discrete units
Prabhjot Kaur
L. A. M. Bush
Weisong Shi
31
0
0
25 Feb 2024
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
Yifan Peng
Yui Sudo
Muhammad Shakeel
Shinji Watanabe
VLM
37
17
0
20 Feb 2024
The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language
Jian Zhu
Changbing Yang
Farhan Samir
Jahurul Islam
32
4
0
14 Nov 2023
Multimodal Modeling For Spoken Language Identification
Shikhar Bharadwaj
Min Ma
Shikhar Vashishth
Ankur Bapna
Sriram Ganapathy
...
Yu Zhang
D. Esch
Sandy Ritchie
Partha P. Talukdar
Jason Riesa
30
0
0
19 Sep 2023
Using fine-tuning and min lookahead beam search to improve Whisper
Andrea Do
Oscar Brown
Zhengjie Wang
Nikhil Mathew
Zixin Liu
Jawwad Ahmed
Cheng Yu
27
1
0
19 Sep 2023
ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus
Tolulope Ogunremi
Kólá Túbosún
Aremu Anuoluwapo
Iroro Orife
David Ifeoluwa Adelani
36
6
0
29 Jul 2023
Boosting Norwegian Automatic Speech Recognition
Javier de la Rosa
Rolv-Arild Braaten
P. Kummervold
Freddy Wetjen
Svein Arne Brygfjeld
33
7
0
04 Jul 2023
HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation
Cihan Xiao
Henry Li Xinyuan
Jinyi Yang
Dongji Gao
Matthew Wiesner
Kevin Duh
Sanjeev Khudanpur
34
1
0
20 Jun 2023
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models
Peiqin Lin
Chengzhi Hu
Zheyu Zhang
André F. T. Martins
Hinrich Schütze
29
1
0
23 May 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages
Andrew Rouditchenko
Sameer Khurana
Samuel Thomas
Rogerio Feris
Leonid Karlinsky
Hilde Kuehne
David Harwath
Brian Kingsbury
James R. Glass
VLM
37
22
0
21 May 2023
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages
Sebastian Ruder
J. Clark
Alexander Gutkin
Mihir Kale
Min Ma
...
Dan Garrette
R. Ingle
Melvin Johnson
Dmitry Panteleev
Partha P. Talukdar
ELM
22
38
0
19 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
24
17
0
18 May 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Jiatong Shi
Dan Berrebbi
William Chen
Ho-Lam Chung
En-Pei Hu
...
Xuankai Chang
Shang-Wen Li
Abdel-rahman Mohamed
Hung-yi Lee
Shinji Watanabe
ELM
55
58
0
18 May 2023
Spaiche: Extending State-of-the-Art ASR Models to Swiss German Dialects
Clément Sicard
Kajetan Pyszkowski
Victor Gillioz
24
7
0
20 Apr 2023
Hallucinations in Large Multilingual Translation Models
Nuno M. Guerreiro
Duarte M. Alves
Jonas Waldendorf
Barry Haddow
Alexandra Birch
Pierre Colombo
André F.T. Martins
VLM
HILM
LRM
35
140
0
28 Mar 2023
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding
Yingting Li
Ambuj Mehrish
Shuaijiang Zhao
Rishabh Bhardwaj
Amir Zadeh
Navonil Majumder
Rada Mihalcea
Soujanya Poria
AAML
26
16
0
02 Mar 2023
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
Paul-Ambroise Duquenne
Hongyu Gong
Ning Dong
Jingfei Du
Ann Lee
Vedanuj Goswami
Changhan Wang
J. Pino
Benoît Sagot
Holger Schwenk
39
34
0
08 Nov 2022
A Compact End-to-End Model with Local and Global Context for Spoken Language Identification
Fei Jia
Nithin Rao Koluguri
Jagadeesh Balam
Boris Ginsburg
30
3
0
27 Oct 2022