2104.03502
Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings
8 April 2021
L. Pepino
Pablo Riera
Luciana Ferrer
Papers citing
"Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings"
50 / 57 papers shown
Title
Multimodal Assessment of Classroom Discourse Quality: A Text-Centered Attention-Based Multi-Task Learning Approach
Ruikun Hou
B. Bühler
Tim Fütterer
Efe Bozkir
Peter Gerjets
Ulrich Trautwein
Enkelejda Kasneci
31
0
0
12 May 2025
Exploring Local Interpretable Model-Agnostic Explanations for Speech Emotion Recognition with Distribution-Shift
Maja J. Hjuler
Line H. Clemmensen
Sneha Das
FAtt
49
1
0
07 Apr 2025
Heterogeneous bimodal attention fusion for speech emotion recognition
Jiachen Luo
Huy Phan
Lin Wang
Joshua Reiss
44
0
0
09 Mar 2025
Personalized Speech Emotion Recognition in Human-Robot Interaction using Vision Transformers
Ruchik Mishra
Andrew Frye
M. M. Rayguru
Dan O. Popa
39
1
0
16 Sep 2024
The Unreliability of Acoustic Systems in Alzheimer's Speech Datasets with Heterogeneous Recording Conditions
L. Gauder
Pablo Riera
A. Slachevsky
G. Forno
Adolfo M. Garcia
Luciana Ferrer
38
1
0
11 Sep 2024
Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations
Bulat Khaertdinov
Pedro Jeuris
Annanda Sousa
Enrique Hortal
38
1
0
12 Jun 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
Beyond the Labels: Unveiling Text-Dependency in Paralinguistic Speech Recognition Datasets
Jan Pešán
Santosh Kesiraju
Lukáš Burget
Jan "Honza" Černocký
24
0
0
12 Mar 2024
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Ziyang Ma
Zhisheng Zheng
Jiaxin Ye
Jinchao Li
Zhifu Gao
Shiliang Zhang
Xie Chen
MDE
SLR
SSL
25
88
0
23 Dec 2023
Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference
Dejan Porjazovski
Yaroslav Getman
Tamás Grósz
M. Kurimo
30
3
0
16 Oct 2023
Test-Time Training for Speech
Sri Harsha Dumpala
Chandramouli Shama Sastry
Sageev Oore
39
1
0
19 Sep 2023
Speech Emotion Recognition with Distilled Prosodic and Linguistic Affect Representations
Debaditya Shome
Ali Etemad
35
5
0
09 Sep 2023
Leveraging Label Information for Multimodal Emotion Recognition
Pei-Hsin Wang
Sunlu Zeng
Junqing Chen
Lu Fan
Meng Chen
Youzheng Wu
Xiaodong He
29
4
0
05 Sep 2023
MFSN: Multi-perspective Fusion Search Network For Pre-training Knowledge in Speech Emotion Recognition
Haiyang Sun
Fulin Zhang
Yingying Gao
Zheng Lian
Shilei Zhang
Junlan Feng
30
4
0
12 Jun 2023
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model
Aoi Ito
Shota Horiguchi
SSL
27
2
0
24 May 2023
Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation
Kangwook Jang
Sungnyun Kim
Se-Young Yun
Hoi-Rim Kim
32
5
0
19 May 2023
A multimodal dynamical variational autoencoder for audiovisual speech representation learning
Samir Sadok
Simon Leglaive
Laurent Girin
Xavier Alameda-Pineda
Renaud Séguier
33
11
0
05 May 2023
A vector quantized masked autoencoder for audiovisual speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
SSL
81
6
0
05 May 2023
A Comparative Study of Pre-trained Speech and Audio Embeddings for Speech Emotion Recognition
Orchid Chetia Phukan
Arun Balaji Buduru
Rajesh Sharma
28
6
0
22 Apr 2023
A vector quantized masked autoencoder for speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
34
20
0
21 Apr 2023
Designing and Evaluating Speech Emotion Recognition Systems: A reality check case study with IEMOCAP
Nikolaos Antoniou
Athanasios Katsamanis
Theodoros Giannakopoulos
Shrikanth Narayanan
29
17
0
03 Apr 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
42
47
0
21 Mar 2023
The Graph feature fusion technique for speaker recognition based on wav2vec2.0 framework
Zirui Ge
Haiyan Guo
Zhen Yang
32
1
0
19 Mar 2023
Leveraging TCN and Transformer for effective visual-audio fusion in continuous emotion recognition
Weiwei Zhou
Jiada Lu
Zhaolong Xiong
Weifeng Wang
27
28
0
15 Mar 2023
Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition
Shujie Hu
Xurong Xie
Zengrui Jin
Mengzhe Geng
Yi Wang
Mingyu Cui
Jiajun Deng
Xunying Liu
Helen M. Meng
19
30
0
28 Feb 2023
Using Auxiliary Tasks In Multimodal Fusion Of Wav2vec 2.0 And BERT For Multimodal Emotion Recognition
Dekai Sun
Yancheng He
Jiqing Han
20
19
0
27 Feb 2023
Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition
Zihan Zhao
Yu Wang
Yanfeng Wang
20
18
0
20 Feb 2023
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Shinta Otake
Rei Kawakami
Nakamasa Inoue
24
16
0
06 Dec 2022
Generative Models for Improved Naturalness, Intelligibility, and Voicing of Whispered Speech
Dominik Wagner
Sebastian P. Bayerl
H. A. C. Maruri
Tobias Bocklet
24
7
0
04 Dec 2022
Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition
Zili Huang
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yiming Wang
Jinyu Li
Takuya Yoshioka
Xiaofei Wang
Peidong Wang
25
3
0
10 Nov 2022
Hi,KIA: A Speech Emotion Recognition Dataset for Wake-Up Words
Taesu Kim
Seungheon Doh
G. Lee
Hyungseok Jeon
Juhan Nam
Hyeon‐Jeong Suk
24
2
0
07 Nov 2022
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation
Liyong Guo
Xiaoyu Yang
Quandong Wang
Yuxiang Kong
Zengwei Yao
...
Wei Kang
Long Lin
Mingshuang Luo
Piotr Żelasko
Daniel Povey
VLM
31
7
0
31 Oct 2022
Knowledge Transfer For On-Device Speech Emotion Recognition with Neural Structured Learning
Yi Chang
Zhao Ren
Thanh Tam Nguyen
Kun Qian
Björn W. Schuller
33
5
0
26 Oct 2022
Fast Yet Effective Speech Emotion Recognition with Self-distillation
Zhao Ren
Thanh Tam Nguyen
Yi Chang
Björn W. Schuller
23
11
0
26 Oct 2022
Multilevel Transformer For Multimodal Emotion Recognition
Junyi He
Meimei Wu
Meng Li
Xiaobo Zhu
Feng Ye
15
6
0
26 Oct 2022
Training speech emotion classifier without categorical annotations
Meysam Shamsi
Marie Tahon
18
2
0
14 Oct 2022
Self-Supervised Attention Networks and Uncertainty Loss Weighting for Multi-Task Emotion Recognition on Vocal Bursts
Vincent Karas
Andreas Triantafyllopoulos
Meishu Song
Björn W. Schuller
38
4
0
15 Sep 2022
Generating Coherent Drum Accompaniment With Fills And Improvisations
Rishabh A. Dahale
Vaibhav Talwadker
Preeti Rao
Prateek Verma
22
3
0
01 Sep 2022
DualVoice: Speech Interaction that Discriminates between Normal and Whispered Voice Input
Jun Rekimoto
24
6
0
22 Aug 2022
Fully Automated End-to-End Fake Audio Detection
Chenglong Wang
Jiangyan Yi
J. Tao
Haiyang Sun
Xun Chen
Zhengkun Tian
Haoxin Ma
Cunhang Fan
Ruibo Fu
26
28
0
20 Aug 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
Longshen Ou
Xiangming Gu
Ye Wang
30
21
0
20 Jul 2022
Burst2Vec: An Adversarial Multi-Task Approach for Predicting Emotion, Age, and Origin from Vocal Bursts
Atijit Anuchitanukul
Lucia Specia
VLM
30
6
0
24 Jun 2022
The Influence of Dataset Partitioning on Dysfluency Detection Systems
Sebastian P. Bayerl
Dominik Wagner
Elmar Nöth
Tobias Bocklet
Korbinian Riedhammer
44
20
0
07 Jun 2022
Learning Speech Emotion Representations in the Quaternion Domain
E. Guizzo
Tillman Weyde
Simone Scardapane
Danilo Comminiello
24
18
0
05 Apr 2022
Introducing ECAPA-TDNN and Wav2Vec2.0 Embeddings to Stuttering Detection
S. A. Sheikh
Md. Sahidullah
F. Hirsch
Slim Ouni
22
17
0
04 Apr 2022
Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck
Youngsik Eom
Yeonghyeon Lee
Ji Sub Um
Hoi-Rim Kim
35
25
0
04 Apr 2022
Visualizations of Complex Sequences of Family-Infant Vocalizations Using Bag-of-Audio-Words Approach Based on Wav2vec 2.0 Features
Jialu Li
M. Hasegawa-Johnson
Nancy L. McElwain
24
0
0
29 Mar 2022
M-SENA: An Integrated Platform for Multimodal Sentiment Analysis
Huisheng Mao
Ziqi Yuan
Hua Xu
Wenmeng Yu
Yihe Liu
Kai Gao
22
41
0
23 Mar 2022
KSoF: The Kassel State of Fluency Dataset -- A Therapy Centered Dataset of Stuttering
Sebastian P. Bayerl
A. Gudenberg
Florian Hönig
Elmar Nöth
Korbinian Riedhammer
29
35
0
10 Mar 2022
The Vicomtech Audio Deepfake Detection System based on Wav2Vec2 for the 2022 ADD Challenge
Juan M. Martín-Doñas
Aitor Álvarez
35
98
0
03 Mar 2022