ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2105.01051
  4. Cited By
SUPERB: Speech processing Universal PERformance Benchmark

SUPERB: Speech processing Universal PERformance Benchmark

3 May 2021
Shu-Wen Yang
Po-Han Chi
Yung-Sung Chuang
Cheng-I Jeff Lai
Kushal Lakhotia
Yist Y. Lin
Andy T. Liu
Jiatong Shi
Xuankai Chang
Guan-Ting Lin
Tzu-hsien Huang
Wei-Cheng Tseng
Ko-tik Lee
Da-Rong Liu
Zili Huang
Shuyan Dong
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
    SSL
ArXivPDFHTML

Papers citing "SUPERB: Speech processing Universal PERformance Benchmark"

50 / 209 papers shown
Title
Unveiling the Best Practices for Applying Speech Foundation Models to Speech Intelligibility Prediction for Hearing-Impaired People
Unveiling the Best Practices for Applying Speech Foundation Models to Speech Intelligibility Prediction for Hearing-Impaired People
Haoshuai Zhou
Boxuan Cao
Changgeng Mo
Linkai Li
Shan Xiang Wang
AI4CE
31
0
0
13 May 2025
TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models
TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models
Junyi Peng
Takanori Ashihara
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Shoko Araki
J. Černocký
ELM
29
0
0
10 May 2025
Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks
Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks
Christos Plachouras
Julien Guinot
George Fazekas
Elio Quinton
Emmanouil Benetos
Johan Pauwels
131
0
0
09 May 2025
Domain Adversarial Training for Mitigating Gender Bias in Speech-based Mental Health Detection
Domain Adversarial Training for Mitigating Gender Bias in Speech-based Mental Health Detection
June-Woo Kim
Haram Yoon
Wonkyo Oh
Dawoon Jung
Sung-Hoon Yoon
Dae-Jin Kim
Dong-Ho Lee
Sang-Yeol Lee
Chan-Mo Yang
36
0
0
06 May 2025
fastabx: A library for efficient computation of ABX discriminability
fastabx: A library for efficient computation of ABX discriminability
Maxime Poli
Emmanuel Chemla
Emmanuel Dupoux
34
0
0
05 May 2025
BLAB: Brutally Long Audio Bench
BLAB: Brutally Long Audio Bench
Orevaoghene Ahia
Martijn Bartelds
Kabir Ahuja
Hila Gonen
Valentin Hofmann
...
Noah Bennett
Shinji Watanabe
Noah A. Smith
Yulia Tsvetkov
Sachin Kumar
AuLLM
LM&MA
VLM
60
0
0
05 May 2025
BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition
BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition
Paige Tuttosi
Mantaj Dhillon
Luna Sang
Shane Eastwood
Poorvi Bhatia
Quang Minh Dinh
Avni Kapoor
Yewon Jin
Angelica Lim
26
0
0
30 Apr 2025
StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models
StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models
Yeona Hong
Hyewon Han
Woo-Jin Chung
Hong-Goo Kang
MQ
28
0
0
21 Apr 2025
BrainWavLM: Fine-tuning Speech Representations with Brain Responses to Language
BrainWavLM: Fine-tuning Speech Representations with Brain Responses to Language
Nishitha Vattikonda
A. Vaidya
Richard Antonello
Alexander G. Huth
103
0
0
13 Feb 2025
Evaluation of Deep Audio Representations for Hearables
Evaluation of Deep Audio Representations for Hearables
Fabian Gröger
Pascal Baumann
Ludovic Amruthalingam
Laurent Simon
Ruksana Giurda
Simone Lionetti
88
0
0
10 Feb 2025
Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling
Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling
Jakob Poncelet
Hugo Van hamme
69
0
0
05 Feb 2025
Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection
Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection
Yassine El Kheir
Youness Samih
Suraj Maharjan
Tim Polzehl
Sebastian Möller
73
1
0
05 Feb 2025
Safe Gradient Flow for Bilevel Optimization
Safe Gradient Flow for Bilevel Optimization
Sina Sharifi
Nazanin Abolfazli
E. Y. Hamedani
Mahyar Fazlyab
36
1
0
27 Jan 2025
Noise-Agnostic Multitask Whisper Training for Reducing False Alarm Errors in Call-for-Help Detection
Noise-Agnostic Multitask Whisper Training for Reducing False Alarm Errors in Call-for-Help Detection
Myeonghoon Ryu
June-Woo Kim
Minseok Oh
Suji Lee
Han Park
41
0
0
20 Jan 2025
LLM supervised Pre-training for Multimodal Emotion Recognition in Conversations
LLM supervised Pre-training for Multimodal Emotion Recognition in Conversations
Soumya Dutta
Sriram Ganapathy
36
2
0
20 Jan 2025
How Redundant Is the Transformer Stack in Speech Representation Models?
How Redundant Is the Transformer Stack in Speech Representation Models?
Teresa Dorszewski
Albert Kjøller Jacobsen
Lenka Tětková
Lars Kai Hansen
107
0
0
20 Jan 2025
USED: Universal Speaker Extraction and Diarization
USED: Universal Speaker Extraction and Diarization
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
38
5
0
17 Jan 2025
Discrete Speech Unit Extraction via Independent Component Analysis
Discrete Speech Unit Extraction via Independent Component Analysis
Tomohiko Nakamura
Kwanghee Choi
Keigo Hojo
Yoshiaki Bando
Satoru Fukayama
Shinji Watanabe
43
0
0
11 Jan 2025
How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario
How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario
Shih-Heng Wang
Zih-Ching Chen
Jiatong Shi
Ming To Chuang
Guan-Ting Lin
Kuan Po Huang
David Harwath
Shang-Wen Li
Hung-yi Lee
81
1
0
27 Nov 2024
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He
Philip N. Garner
82
0
0
09 Oct 2024
Sylber: Syllabic Embedding Representation of Speech from Raw Audio
Sylber: Syllabic Embedding Representation of Speech from Raw Audio
Cheol Jun Cho
Nicholas Lee
Akshat Gupta
Dhruv Agarwal
Ethan Chen
Alan W Black
Gopala K. Anumanchipalli
34
0
0
09 Oct 2024
Recent Advances in Speech Language Models: A Survey
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
59
14
0
01 Oct 2024
MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
Xiaoyu Yang
Qiujia Li
Chao Zhang
P. Woodland
24
0
0
25 Sep 2024
M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
Yufeng Yang
Desh Raj
Ju Lin
Niko Moritz
J. Jia
...
Egor Lakomkin
Yiteng Huang
Jacob Donley
Jay Mahadeokar
Ozlem Kalinli
26
2
0
17 Sep 2024
Self-supervised Speech Models for Word-Level Stuttered Speech Detection
Self-supervised Speech Models for Word-Level Stuttered Speech Detection
Yi-Jen Shih
Zoi Gkalitsiou
A. Dimakis
David Harwath
42
1
0
16 Sep 2024
Self-supervised Learning for Acoustic Few-Shot Classification
Self-supervised Learning for Acoustic Few-Shot Classification
Jingyong Liang
Bernd Meyer
Isaac Ning Lee
Thanh-Toan Do
SSL
52
0
0
15 Sep 2024
Universal Pooling Method of Multi-layer Features from Pretrained Models
  for Speaker Verification
Universal Pooling Method of Multi-layer Features from Pretrained Models for Speaker Verification
Jin Sob Kim
Hyun Joon Park
Wooseok Shin
Sung Won Han
SLR
50
0
0
12 Sep 2024
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget
Andy T. Liu
Yi-Cheng Lin
Haibin Wu
Stefan Winkler
Hung-yi Lee
31
1
0
09 Sep 2024
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Shengpeng Ji
Ziyue Jiang
Xize Cheng
Yifu Chen
Minghui Fang
...
Rongjie Huang
Yidi Jiang
Qian Chen
Zhou Zhao
Zhou Zhao
VLM
57
33
0
29 Aug 2024
Audio xLSTMs: Learning Self-Supervised Audio Representations with xLSTMs
Audio xLSTMs: Learning Self-Supervised Audio Representations with xLSTMs
Sarthak Yadav
Sergios Theodoridis
Zheng-Hua Tan
45
2
0
29 Aug 2024
Speech Representation Learning Revisited: The Necessity of Separate Learnable Parameters and Robust Data Augmentation
Speech Representation Learning Revisited: The Necessity of Separate Learnable Parameters and Robust Data Augmentation
Hemant Yadav
Sunayana Sitaram
R. Shah
SSL
49
0
0
20 Aug 2024
Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation
Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation
Xiaoxiao Miao
Yuxiang Zhang
Xin Wang
N. Tomashenko
D. Soh
Ian Mcloughlin
42
1
0
12 Aug 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep
  Speaker Representation Learning
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
39
4
0
21 Jul 2024
Optimizing Automatic Speech Assessment: W-RankSim Regularization and
  Hybrid Feature Fusion Strategies
Optimizing Automatic Speech Assessment: W-RankSim Regularization and Hybrid Feature Fusion Strategies
Chung-Wen Wu
Berlin Chen
40
0
0
16 Jun 2024
How Should We Extract Discrete Audio Tokens from Self-Supervised Models?
How Should We Extract Discrete Audio Tokens from Self-Supervised Models?
Pooneh Mousavi
J. Duret
Salah Zaiem
Luca Della Libera
Artem Ploujnikov
Cem Subakan
Mirco Ravanelli
42
9
0
15 Jun 2024
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling
  Constraints, Languages, and Datasets
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets
Jiatong Shi
Shih-Heng Wang
William Chen
Martijn Bartelds
Vanya Bannihatti Kumar
...
Xuankai Chang
Dan Jurafsky
Karen Livescu
Hung-yi Lee
Shinji Watanabe
AuLLM
77
5
0
12 Jun 2024
Self-Supervised Speech Representations are More Phonetic than Semantic
Self-Supervised Speech Representations are More Phonetic than Semantic
Kwanghee Choi
Ankita Pasad
Tomohiko Nakamura
Satoru Fukayama
Karen Livescu
Shinji Watanabe
29
14
0
12 Jun 2024
TokSing: Singing Voice Synthesis based on Discrete Tokens
TokSing: Singing Voice Synthesis based on Discrete Tokens
Yuning Wu
Chunlei Zhang
Jiatong Shi
Yuxun Tang
Shan Yang
Qin Jin
39
6
0
12 Jun 2024
Refining Self-Supervised Learnt Speech Representation using Brain
  Activations
Refining Self-Supervised Learnt Speech Representation using Brain Activations
Hengyu Li
Kangdi Mei
Zhaoci Liu
Yang Ai
Liping Chen
Jie Zhang
Zhenhua Ling
SSL
21
1
0
12 Jun 2024
Predicting Heart Activity from Speech using Data-driven and
  Knowledge-based features
Predicting Heart Activity from Speech using Data-driven and Knowledge-based features
Gasser Elbanna
Z. Mostaani
Mathew Magimai.-Doss
SSL
47
0
0
10 Jun 2024
Learning Fine-Grained Controllability on Speech Generation via Efficient
  Fine-Tuning
Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning
Chung-Ming Chien
Andros Tjandra
Apoorv Vyas
Matt Le
Bowen Shi
Wei-Ning Hsu
32
0
0
10 Jun 2024
Emotion-Aware Speech Self-Supervised Representation Learning with
  Intensity Knowledge
Emotion-Aware Speech Self-Supervised Representation Learning with Intensity Knowledge
Rui Liu
Zening Ma
SSL
42
1
0
10 Jun 2024
DAISY: Data Adaptive Self-Supervised Early Exit for Speech
  Representation Models
DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models
T. Lin
Hung-yi Lee
Hao Tang
40
1
0
08 Jun 2024
Dataset-Distillation Generative Model for Speech Emotion Recognition
Dataset-Distillation Generative Model for Speech Emotion Recognition
Fabian Ritter Gutierrez
Kuan Po Huang
Jeremy H. M Wong
Dianwen Ng
Hung-yi Lee
Nancy F. Chen
Eng Siong Chng
DD
37
0
0
05 Jun 2024
Fill in the Gap! Combining Self-supervised Representation Learning with
  Neural Audio Synthesis for Speech Inpainting
Fill in the Gap! Combining Self-supervised Representation Learning with Neural Audio Synthesis for Speech Inpainting
Ihab Asaad
Maxime Jacquelin
Olivier Perrotin
Laurent Girin
Thomas Hueber
33
0
0
30 May 2024
Investigating the Áutoencoder Behavior' in Speech Self-Supervised
  Models: a focus on HuBERT's Pretraining
Investigating the Áutoencoder Behavior' in Speech Self-Supervised Models: a focus on HuBERT's Pretraining
Valentin Vielzeuf
SSL
44
0
0
14 May 2024
RepAugment: Input-Agnostic Representation-Level Augmentation for
  Respiratory Sound Classification
RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification
June-Woo Kim
Miika Toikkanen
Sangmin Bae
Minseok Kim
Ho-Young Jung
32
5
0
05 May 2024
A Large-Scale Evaluation of Speech Foundation Models
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Yash Jain
David M. Chan
Pranav Dheram
Aparna Khare
Olabanji Shonibare
Venkatesh Ravichandran
Shalini Ghosh
40
2
0
28 Mar 2024
Beyond the Labels: Unveiling Text-Dependency in Paralinguistic Speech
  Recognition Datasets
Beyond the Labels: Unveiling Text-Dependency in Paralinguistic Speech Recognition Datasets
Jan Pevsán
Santosh Kesiraju
Lukávs Burget
JanHonza'' vCernocký
19
0
0
12 Mar 2024
12345
Next