2106.04624
SpeechBrain: A General-Purpose Speech Toolkit
8 June 2021
Mirco Ravanelli
Titouan Parcollet
Peter William VanHarn Plantinga
Aku Rouhe
Samuele Cornell
Loren Lugosch
Cem Subakan
Nauman Dawalatabad
A. Heba
Jianyuan Zhong
Ju-Chieh Chou
Sung-Lin Yeh
Szu-Wei Fu
Chien-Feng Liao
E. Rastorgueva
François Grondin
William Aris
Hwidong Na
Yan Gao
R. Mori
Yoshua Bengio
Papers citing
"SpeechBrain: A General-Purpose Speech Toolkit"
50 / 148 papers shown
Title
Automatic Proficiency Assessment in L2 English Learners
Armita Mohammadi
Alessandro Lameiras Koerich
Laureano Moro-Velazquez
P. Cardinal
37
0
0
05 May 2025
BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition
Paige Tuttosi
Mantaj Dhillon
Luna Sang
Shane Eastwood
Poorvi Bhatia
Quang Minh Dinh
Avni Kapoor
Yewon Jin
Angelica Lim
28
0
0
30 Apr 2025
SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development
Minghan Wang
Ye Bai
Yalin Wang
Thuy-Trang Vu
Ehsan Shareghi
Gholamreza Haffari
52
0
0
31 Mar 2025
The Role of Prosody in Spoken Question Answering
Jie Chi
Maureen de Seyssel
Natalie Schluter
54
0
0
08 Feb 2025
Generative Data Augmentation Challenge: Zero-Shot Speech Synthesis for Personalized Speech Enhancement
Jae-Sung Bae
Anastasia Kuznetsova
Dinesh Manocha
John Hershey
Trausti Kristjansson
Minje Kim
77
0
0
23 Jan 2025
PASS: Presentation Automation for Slide Generation and Speech
Tushar Aggarwal
Aarohi Bhand
68
1
0
17 Jan 2025
Benchmarking Rotary Position Embeddings for Automatic Speech Recognition
Shucong Zhang
Titouan Parcollet
Rogier van Dalen
Sourav Bhattacharya
51
0
0
10 Jan 2025
Metadata-Enhanced Speech Emotion Recognition: Augmented Residual Integration and Co-Attention in Two-Stage Fine-Tuning
Zixiang Wan
Ziyue Qiu
Yiyang Liu
Wei-Qiang Zhang
31
0
0
31 Dec 2024
autrainer: A Modular and Extensible Deep Learning Toolkit for Computer Audition Tasks
Simon Rampp
Andreas Triantafyllopoulos
M. Milling
Björn Schuller
87
0
0
16 Dec 2024
The First VoicePrivacy Attacker Challenge Evaluation Plan
N. Tomashenko
Xiaoxiao Miao
Emmanuel Vincent
Junichi Yamagishi
128
2
0
09 Oct 2024
HAINAN: Fast and Accurate Transducer for Hybrid-Autoregressive ASR
Hainan Xu
Travis M. Bartley
Vladimir Bataev
Boris Ginsburg
157
0
0
03 Oct 2024
Voice Conversion-based Privacy through Adversarial Information Hiding
J. Webber
O. Watts
G. Henter
Jennifer Williams
Simon King
45
0
0
23 Sep 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Canh Vu
Alexander Polonsky
Canh Vu
56
3
0
23 Sep 2024
Zero Shot Text to Speech Augmentation for Automatic Speech Recognition on Low-Resource Accented Speech Corpora
F. Nespoli
Daniel Barreda
Patrick A. Naylor
28
1
0
17 Sep 2024
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
Bang Zeng
Ming Li
37
2
0
04 Sep 2024
Human Speech Perception in Noise: Can Large Language Models Paraphrase to Improve It?
Anupama Chingacham
Miaoran Zhang
Vera Demberg
Dietrich Klakow
41
0
0
07 Aug 2024
FakingRecipe: Detecting Fake News on Short Video Platforms from the Perspective of Creative Process
Yuyan Bu
Qiang Sheng
Juan Cao
Peng Qi
Danding Wang
Jintao Li
DiffM
41
8
0
23 Jul 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
39
4
0
21 Jul 2024
A Benchmark for Multi-speaker Anonymization
Xiaoxiao Miao
Ruijie Tao
Chang Zeng
Xin Wang
46
1
0
08 Jul 2024
How Should We Extract Discrete Audio Tokens from Self-Supervised Models?
Pooneh Mousavi
J. Duret
Salah Zaiem
Luca Della Libera
Artem Ploujnikov
Cem Subakan
Mirco Ravanelli
42
9
0
15 Jun 2024
Phoneme Discretized Saliency Maps for Explainable Detection of AI-Generated Voice
Shubham Gupta
Mirco Ravanelli
Pascal Germain
Cem Subakan
FAtt
45
3
0
14 Jun 2024
Sustainable self-supervised learning for speech representations
Luis Lugo
Valentin Vielzeuf
31
2
0
11 Jun 2024
Controlling Emotion in Text-to-Speech with Natural Language Prompts
Thomas Bott
Florian Lux
Ngoc Thang Vu
38
6
0
10 Jun 2024
InaGVAD : a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation
D. Doukhan
Christine Maertens
William Le Personnic
Ludovic Speroni
Reda Dehak
38
2
0
06 Jun 2024
Hypernetworks for Personalizing ASR to Atypical Speech
Max Müller-Eberstein
Dianna Yee
Karren D. Yang
G. Mantena
Colin S. Lea
33
0
0
06 Jun 2024
1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem
Mingjie Chen
Hezhao Zhang
Yuanchao Li
Jiachen Luo
Wen Wu
...
Lin Wang
P. Woodland
Xie Chen
Huy P Phan
Thomas Hain
23
0
0
30 May 2024
Listenable Maps for Zero-Shot Audio Classifiers
Francesco Paissan
Luca Della Libera
Mirco Ravanelli
Cem Subakan
40
4
0
27 May 2024
Speaker Embeddings With Weakly Supervised Voice Activity Detection For Efficient Speaker Diarization
Jenthe Thienpondt
Kris Demuynck
41
2
0
15 May 2024
Sonos Voice Control Bias Assessment Dataset: A Methodology for Demographic Bias Assessment in Voice Assistants
Chloe Sekkat
Fanny Leroy
Salima Mdhaffar
Blake Perry Smith
Yannick Esteve
Joseph Dureau
A. Coucke
32
1
0
14 May 2024
Low-resource speech recognition and dialect identification of Irish in a multi-task framework
Liam Lonergan
Mengjie Qian
Neasa Ní Chiaráin
Christer Gobl
A. N. Chasaide
43
2
0
02 May 2024
Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition
Hainan Xu
Zhehuai Chen
Fei Jia
Boris Ginsburg
38
0
0
04 Apr 2024
The VoicePrivacy 2024 Challenge Evaluation Plan
N. Tomashenko
Xiaoxiao Miao
Pierre Champion
Sarina Meyer
Xin Wang
Emmanuel Vincent
Michele Panariello
Nicholas W. D. Evans
Junichi Yamagishi
Massimiliano Todisco
38
21
0
03 Apr 2024
Multi-Level Attention Aggregation for Language-Agnostic Speaker Replication
Yejin Jeon
Gary Geunbae Lee
31
2
0
06 Mar 2024
NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification
Hyunjun Heo
U.H Shin
Ran Lee
YoungJu Cheon
Hyung-Min Park
26
9
0
14 Dec 2023
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation
Xingqun Qi
Jiahao Pan
Peng Li
Ruibin Yuan
Xiaowei Chi
...
Wenhan Luo
Wei Xue
Shanghang Zhang
Qi-fei Liu
Yi-Ting Guo
SLR
31
11
0
29 Nov 2023
Soft Random Sampling: A Theoretical and Empirical Analysis
Xiaodong Cui
Ashish R. Mittal
Songtao Lu
Wei Zhang
G. Saon
Brian Kingsbury
48
1
0
21 Nov 2023
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation
Juan Pablo Zuluaga
Zhaocheng Huang
Xing Niu
Rohit Paturi
S. Srinivasan
Prashant Mathur
Brian Thompson
Marcello Federico
BDL
35
2
0
01 Nov 2023
MUST: A Multilingual Student-Teacher Learning approach for low-resource speech recognition
Muhammad Umar Farooq
Rehan Ahmad
Thomas Hain
25
0
0
29 Oct 2023
Cascaded Multi-task Adaptive Learning Based on Neural Architecture Search
Yingying Gao
Shilei Zhang
Zihao Cui
Chao Deng
Junlan Feng
20
0
0
23 Oct 2023
Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference
Dejan Porjazovski
Yaroslav Getman
Tamás Grósz
M. Kurimo
30
3
0
16 Oct 2023
Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration
Piyush Singh Pasi
Karthikeya Battepati
P. Jyothi
Ganesh Ramakrishnan
T. Mahapatra
Manoj Singh
51
0
0
10 Oct 2023
On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments
William Ravenscroft
Stefan Goetze
Thomas Hain
28
7
0
09 Oct 2023
Multimodal Modeling For Spoken Language Identification
Shikhar Bharadwaj
Min Ma
Shikhar Vashishth
Ankur Bapna
Sriram Ganapathy
...
Yu Zhang
D. Esch
Sandy Ritchie
Partha P. Talukdar
Jason Riesa
30
0
0
19 Sep 2023
Voice Morphing: Two Identities in One Voice
Sushant Pani
Anurag Chowdhury
Morgan Sandler
Arun Ross
19
1
0
05 Sep 2023
Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech
Hyungchan Yoon
Changhwan Kim
Eunwoo Song
Hyun-Wook Yoon
Hong-Goo Kang
37
1
0
28 Aug 2023
Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations
Wen Wu
C. Zhang
P. Woodland
31
3
0
14 Aug 2023
On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer
Md. Asif Jalal
Pablo Peso Parada
Jisi Zhang
Karthikeyan P. Saravanan
Mete Ozay
Myoungji Han
Jung In Lee
Seokyeong Jung
28
1
0
25 Jul 2023
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Titouan Parcollet
Rogier van Dalen
Shucong Zhang
S. Bhattacharya
26
6
0
12 Jul 2023
Implicit spoken language diarization
Jagabandhu Mishra
Amartya Roy Chowdhury
S. M. I. S. R. Mahadeva Prasanna
22
0
0
22 Jun 2023
Low-Resource Text-to-Speech Using Specific Data and Noise Augmentation
K. Lakshminarayana
C. Dittmar
N. Pia
Emanuel Habets
28
0
0
16 Jun 2023