SUPERB: Speech processing Universal PERformance Benchmark


3 May 2021
Shu-Wen Yang
Po-Han Chi
Yung-Sung Chuang
Cheng-I Jeff Lai
Kushal Lakhotia
Yist Y. Lin
Andy T. Liu
Jiatong Shi
Xuankai Chang
Guan-Ting Lin
Tzu-hsien Huang
Wei-Cheng Tseng
Ko-tik Lee
Da-Rong Liu
Zili Huang
Shuyan Dong
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
    SSL

Papers citing "SUPERB: Speech processing Universal PERformance Benchmark"

50 / 212 papers shown
Predicting within and across language phoneme recognition performance of self-supervised learning speech pre-trained models
Han Ji
T. Patel
O. Scharenborg
39
7
0
24 Jun 2022
Comparing supervised and self-supervised embedding for ExVo Multi-Task learning track
Tilak Purohit
Imen Ben Mahmoud
Bogdan Vlasenko
Mathew Magimai.-Doss
SSL
17
8
0
23 Jun 2022
Boosting Cross-Domain Speech Recognition with Self-Supervision
Hanjing Zhu
Gaofeng Cheng
Jindong Wang
Wenxin Hou
Pengyuan Zhang
Yonghong Yan
19
13
0
20 Jun 2022
Investigation of Ensemble features of Self-Supervised Pretrained Models for Automatic Speech Recognition
Anjana Arunkumar
Vrunda N. Sukhadia
S. Umesh
25
10
0
11 Jun 2022
Self-supervised models of audio effectively explain human cortical responses to speech
Aditya R. Vaidya
Shailee Jain
Alexander G. Huth
30
42
0
27 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
131
349
0
21 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Chiyu Feng
Po-Chun Hsu
Hung-yi Lee
SSL
28
8
0
08 May 2022
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Felix Wu
Kwangyoun Kim
Shinji Watanabe
Kyu Jeong Han
Ryan T. McDonald
Kilian Q. Weinberger
Yoav Artzi
SyDa
45
37
0
02 May 2022
Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?
Sanyuan Chen
Yu Wu
Chengyi Wang
Shujie Liu
Zhuo Chen
...
Gang Liu
Jinyu Li
Jian Wu
Xiangzhan Yu
Furu Wei
SSL
18
39
0
27 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
30
110
0
20 Apr 2022
BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
SSL
36
53
0
15 Apr 2022
The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance
Lin Zhang
Xin Wang
Erica Cooper
Nicholas W. D. Evans
Junichi Yamagishi
19
56
0
11 Apr 2022
GigaST: A 10,000-hour Pseudo Speech Translation Corpus
Rong Ye
Chengqi Zhao
Tom Ko
Chutong Meng
Tao Wang
Mingxuan Wang
Jun Cao
9
23
0
08 Apr 2022
Automatic Pronunciation Assessment using Self-Supervised Speech Representation Learning
Eesung Kim
J. Jeon
Hyeji Seo
Ho-Young Kim
SSL
23
37
0
08 Apr 2022
MTI-Net: A Multi-Target Speech Intelligibility Prediction Model
Ryandhimas E. Zezario
Szu-Wei Fu
Fei Chen
C. Fuh
Hsin-Min Wang
Yu Tsao
24
13
0
07 Apr 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
28
56
0
06 Apr 2022
User-Level Differential Privacy against Attribute Inference Attack of Speech Emotion Recognition in Federated Learning
Tiantian Feng
Raghuveer Peri
Shrikanth Narayanan
FedML
18
28
0
05 Apr 2022
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation
Dan Berrebbi
Jiatong Shi
Brian Yan
Osbel López-Francisco
Jonathan D. Amith
Shinji Watanabe
10
26
0
05 Apr 2022
EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers
Soumi Maiti
Yushi Ueda
Shinji Watanabe
Chunlei Zhang
Meng Yu
Shi-Xiong Zhang
Yong-mei Xu
34
32
0
31 Mar 2022
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations
L. D. Prasad
Sreyan Ghosh
S. Umesh
25
12
0
31 Mar 2022
SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks
Kai-Wei Chang
Wei-Cheng Tseng
Shang-Wen Li
Hung-yi Lee
24
22
0
31 Mar 2022
Generative Spoken Dialogue Language Modeling
Tu Nguyen
Eugene Kharitonov
Jade Copet
Yossi Adi
Wei-Ning Hsu
...
Paden Tomasello
Robin Algayres
Benoît Sagot
Abdel-rahman Mohamed
Emmanuel Dupoux
AuLLM
32
80
0
30 Mar 2022
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation
Kuan Po Huang
Yuanbin Fu
Yu Zhang
Hung-yi Lee
19
28
0
30 Mar 2022
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT
Rui Wang
Qibing Bai
Junyi Ao
Long Zhou
Zhixiang Xiong
Zhihua Wei
Yu Zhang
Tom Ko
Haizhou Li
34
61
0
29 Mar 2022
A Speech Representation Anonymization Framework via Selective Noise Perturbation
Minh Tran
M. Soleymani
27
4
0
26 Mar 2022
Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling
Tiantian Feng
Shrikanth Narayanan
35
17
0
15 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
35
106
0
02 Mar 2022
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training
Ramon Sanabria
Wei-Ning Hsu
Alexei Baevski
Michael Auli
19
7
0
01 Mar 2022
Towards a Common Speech Analysis Engine
Hagai Aronowitz
Itai Gat
E. Morais
Weizhong Zhu
R. Hoory
20
3
0
01 Mar 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL
AI4TS
SSL
19
11
0
01 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
34
151
0
24 Feb 2022
Domain Adaptation of low-resource Target-Domain models using well-trained ASR Conformer Models
Vrunda N. Sukhadia
S. Umesh
30
8
0
18 Feb 2022
Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models
Sarala Padi
S. O. Sadjadi
Tianyi Zhou
Ram D. Sriram
28
36
0
16 Feb 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition
Itai Gat
Hagai Aronowitz
Weizhong Zhu
E. Morais
R. Hoory
37
50
0
02 Feb 2022
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages
Emanuele Bugliarello
Fangyu Liu
Jonas Pfeiffer
Siva Reddy
Desmond Elliott
E. Ponti
Ivan Vulić
MLLM
VLM
ELM
48
62
0
27 Jan 2022
Bias in Automated Speaker Recognition
Wiebke Toussaint
Aaron Yi Ding
CVBM
32
44
0
24 Jan 2022
Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings
Tiantian Feng
H. Hashemi
Rajat Hebbar
M. Annavaram
Shrikanth S. Narayanan
23
25
0
26 Dec 2021
Self-Supervised Learning for speech recognition with Intermediate layer supervision
Chengyi Wang
Yu-Huan Wu
Sanyuan Chen
Shujie Liu
Jinyu Li
Yao Qian
Zhenglu Yang
SSL
21
28
0
16 Dec 2021
On the Use of External Data for Spoken Named Entity Recognition
Ankita Pasad
Felix Wu
Suwon Shon
Karen Livescu
Kyu Jeong Han
37
16
0
14 Dec 2021
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
Siddhant Arora
Siddharth Dalmia
Pavel Denisov
Xuankai Chang
Yushi Ueda
...
Karthik Ganesan
Brian Yan
Ngoc Thang Vu
A. Black
Shinji Watanabe
VLM
33
74
0
29 Nov 2021
Speech Tasks Relevant to Sleepiness Determined with Deep Transfer Learning
Bang Tran
Youxiang Zhu
Xiaohui Liang
J. Schwoebel
L. Warrenburg
18
7
0
29 Nov 2021
Towards Learning Universal Audio Representations
Luyu Wang
Pauline Luc
Yan Wu
Adrià Recasens
Lucas Smaira
...
Andrew Jaegle
Jean-Baptiste Alayrac
Sander Dieleman
João Carreira
Aaron van den Oord
SSL
29
68
0
23 Nov 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
32
657
0
17 Nov 2021
A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
Benjamin van Niekerk
M. Carbonneau
Julian Zaïdi
Matthew Baas
Hugo Seuté
Herman Kamper
DRL
22
111
0
03 Nov 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
110
1,704
0
26 Oct 2021
SSAST: Self-Supervised Audio Spectrogram Transformer
Yuan Gong
Cheng-I Jeff Lai
Yu-An Chung
James R. Glass
ViT
38
268
0
19 Oct 2021
Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning
Yi-Chen Chen
Shu-Wen Yang
Cheng-Kuang Lee
Simon See
Hung-yi Lee
SSL
19
12
0
18 Oct 2021
DECAR: Deep Clustering for learning general-purpose Audio Representations
Sreyan Ghosh
Sandesh V Katta
Ashish Seth
S. Umesh
SSL
36
12
0
17 Oct 2021
Don't speak too fast: The impact of data bias on self-supervised speech models
Yen Meng
Yi-Hui Chou
Andy T. Liu
Hung-yi Lee
34
25
0
15 Oct 2021
Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification
Zhengyang Chen
Sanyuan Chen
Yu-Huan Wu
Yao Qian
Chengyi Wang
Shujie Liu
Y. Qian
Michael Zeng
SSL
26
124
0
12 Oct 2021