ResearchTrend.AI
arXiv: 2203.10752

XTREME-S: Evaluating Cross-lingual Speech Representations

21 March 2022
Alexis Conneau
Ankur Bapna
Yu Zhang
Min Ma
Patrick von Platen
Anton Lozhkov
Colin Cherry
Ye Jia
Clara E. Rivera
Mihir Kale
D. Esch
Vera Axelrod
Simran Khanuja
J. Clark
Orhan Firat
Michael Auli
Sebastian Ruder
Jason Riesa
Melvin Johnson
Abstract

We introduce XTREME-S, a new benchmark to evaluate universal cross-lingual speech representations in many languages. XTREME-S covers four task families: speech recognition, classification, speech-to-text translation and retrieval. Covering 102 languages from 10+ language families, 3 different domains and 4 task families, XTREME-S aims to simplify multilingual speech representation evaluation, as well as catalyze research in "universal" speech representation learning. This paper describes the new benchmark and establishes the first speech-only and speech-text baselines using XLS-R and mSLAM on all downstream tasks. We motivate the design choices and detail how to use the benchmark. Datasets and fine-tuning scripts are made easily accessible at https://hf.co/datasets/google/xtreme_s.
