ResearchTrend.AI
Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings

8 April 2021
L. Pepino
Pablo Riera
Luciana Ferrer

Papers citing "Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings"

50 / 59 papers shown
Multimodal Assessment of Classroom Discourse Quality: A Text-Centered Attention-Based Multi-Task Learning Approach
Ruikun Hou
B. Bühler
Tim Fütterer
Efe Bozkir
Peter Gerjets
Ulrich Trautwein
Enkelejda Kasneci
31
0
0
12 May 2025
Respiratory Inhaler Sound Event Classification Using Self-Supervised Learning
Davoud Shariat Panah
Alessandro N Franciosi
Cormac McCarthy
Andrew Hines
26
0
0
15 Apr 2025
Exploring Local Interpretable Model-Agnostic Explanations for Speech Emotion Recognition with Distribution-Shift
Maja J. Hjuler
Line H. Clemmensen
Sneha Das
FAtt
52
1
0
07 Apr 2025
Heterogeneous bimodal attention fusion for speech emotion recognition
Jiachen Luo
Huy Phan
Lin Wang
Joshua Reiss
44
0
0
09 Mar 2025
Personalized Speech Emotion Recognition in Human-Robot Interaction using Vision Transformers
Ruchik Mishra
Andrew Frye
M. M. Rayguru
Dan O. Popa
39
1
0
16 Sep 2024
The Unreliability of Acoustic Systems in Alzheimer's Speech Datasets with Heterogeneous Recording Conditions
L. Gauder
Pablo Riera
A. Slachevsky
G. Forno
Adolfo M. Garcia
Luciana Ferrer
38
1
0
11 Sep 2024
Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations
Bulat Khaertdinov
Pedro Jeuris
Annanda Sousa
Enrique Hortal
38
1
0
12 Jun 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
Beyond the Labels: Unveiling Text-Dependency in Paralinguistic Speech Recognition Datasets
Jan Pešán
Santosh Kesiraju
Lukáš Burget
Jan "Honza" Černocký
24
0
0
12 Mar 2024
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Ziyang Ma
Zhisheng Zheng
Jiaxin Ye
Jinchao Li
Zhifu Gao
Shiliang Zhang
Xie Chen
MDE
SLR
SSL
25
88
0
23 Dec 2023
Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference
Dejan Porjazovski
Yaroslav Getman
Tamás Grósz
M. Kurimo
30
3
0
16 Oct 2023
Test-Time Training for Speech
Sri Harsha Dumpala
Chandramouli Shama Sastry
Sageev Oore
39
1
0
19 Sep 2023
Speech Emotion Recognition with Distilled Prosodic and Linguistic Affect Representations
Debaditya Shome
Ali Etemad
35
5
0
09 Sep 2023
Leveraging Label Information for Multimodal Emotion Recognition
Pei-Hsin Wang
Sunlu Zeng
Junqing Chen
Lu Fan
Meng Chen
Youzheng Wu
Xiaodong He
29
4
0
05 Sep 2023
MFSN: Multi-perspective Fusion Search Network For Pre-training Knowledge in Speech Emotion Recognition
Haiyang Sun
Fulin Zhang
Yingying Gao
Zheng Lian
Shilei Zhang
Junlan Feng
30
4
0
12 Jun 2023
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model
Aoi Ito
Shota Horiguchi
SSL
27
2
0
24 May 2023
Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation
Kangwook Jang
Sungnyun Kim
Se-Young Yun
Hoi-Rim Kim
32
5
0
19 May 2023
A multimodal dynamical variational autoencoder for audiovisual speech representation learning
Samir Sadok
Simon Leglaive
Laurent Girin
Xavier Alameda-Pineda
Renaud Séguier
33
11
0
05 May 2023
A vector quantized masked autoencoder for audiovisual speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
SSL
81
6
0
05 May 2023
A Comparative Study of Pre-trained Speech and Audio Embeddings for Speech Emotion Recognition
Orchid Chetia Phukan
Arun Balaji Buduru
Rajesh Sharma
28
6
0
22 Apr 2023
A vector quantized masked autoencoder for speech emotion recognition
Samir Sadok
Simon Leglaive
Renaud Séguier
40
20
0
21 Apr 2023
Designing and Evaluating Speech Emotion Recognition Systems: A reality check case study with IEMOCAP
Nikolaos Antoniou
Athanasios Katsamanis
Theodoros Giannakopoulos
Shrikanth Narayanan
34
17
0
03 Apr 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
46
47
0
21 Mar 2023
The Graph feature fusion technique for speaker recognition based on wav2vec2.0 framework
Zirui Ge
Haiyan Guo
Zhen Yang
32
1
0
19 Mar 2023
Leveraging TCN and Transformer for effective visual-audio fusion in continuous emotion recognition
Weiwei Zhou
Jiada Lu
Zhaolong Xiong
Weifeng Wang
27
28
0
15 Mar 2023
Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition
Shujie Hu
Xurong Xie
Zengrui Jin
Mengzhe Geng
Yi Wang
Mingyu Cui
Jiajun Deng
Xunying Liu
Helen M. Meng
24
30
0
28 Feb 2023
Using Auxiliary Tasks In Multimodal Fusion Of Wav2vec 2.0 And BERT For Multimodal Emotion Recognition
Dekai Sun
Yancheng He
Jiqing Han
22
19
0
27 Feb 2023
Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition
Zihan Zhao
Yu Wang
Yanfeng Wang
22
18
0
20 Feb 2023
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Shinta Otake
Rei Kawakami
Nakamasa Inoue
24
16
0
06 Dec 2022
Generative Models for Improved Naturalness, Intelligibility, and Voicing of Whispered Speech
Dominik Wagner
Sebastian P. Bayerl
H. A. C. Maruri
Tobias Bocklet
24
7
0
04 Dec 2022
Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition
Zili Huang
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yiming Wang
Jinyu Li
Takuya Yoshioka
Xiaofei Wang
Peidong Wang
28
3
0
10 Nov 2022
Hi,KIA: A Speech Emotion Recognition Dataset for Wake-Up Words
Taesu Kim
Seungheon Doh
G. Lee
Hyungseok Jeon
Juhan Nam
Hyeon‐Jeong Suk
24
2
0
07 Nov 2022
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation
Liyong Guo
Xiaoyu Yang
Quandong Wang
Yuxiang Kong
Zengwei Yao
...
Wei Kang
Long Lin
Mingshuang Luo
Piotr Żelasko
Daniel Povey
VLM
36
7
0
31 Oct 2022
Knowledge Transfer For On-Device Speech Emotion Recognition with Neural Structured Learning
Yi Chang
Zhao Ren
Thanh Tam Nguyen
Kun Qian
Björn W. Schuller
33
5
0
26 Oct 2022
Fast Yet Effective Speech Emotion Recognition with Self-distillation
Zhao Ren
Thanh Tam Nguyen
Yi Chang
Björn W. Schuller
23
11
0
26 Oct 2022
Multilevel Transformer For Multimodal Emotion Recognition
Junyi He
Meimei Wu
Meng Li
Xiaobo Zhu
Feng Ye
15
6
0
26 Oct 2022
Training speech emotion classifier without categorical annotations
Meysam Shamsi
Marie Tahon
18
2
0
14 Oct 2022
Self-Supervised Attention Networks and Uncertainty Loss Weighting for Multi-Task Emotion Recognition on Vocal Bursts
Vincent Karas
Andreas Triantafyllopoulos
Meishu Song
Björn W. Schuller
38
4
0
15 Sep 2022
Generating Coherent Drum Accompaniment With Fills And Improvisations
Rishabh A. Dahale
Vaibhav Talwadker
Preeti Rao
Prateek Verma
24
3
0
01 Sep 2022
DualVoice: Speech Interaction that Discriminates between Normal and Whispered Voice Input
Jun Rekimoto
27
6
0
22 Aug 2022
Fully Automated End-to-End Fake Audio Detection
Chenglong Wang
Jiangyan Yi
J. Tao
Haiyang Sun
Xun Chen
Zhengkun Tian
Haoxin Ma
Cunhang Fan
Ruibo Fu
26
28
0
20 Aug 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
Longshen Ou
Xiangming Gu
Ye Wang
30
21
0
20 Jul 2022
Burst2Vec: An Adversarial Multi-Task Approach for Predicting Emotion, Age, and Origin from Vocal Bursts
Atijit Anuchitanukul
Lucia Specia
VLM
30
6
0
24 Jun 2022
The Influence of Dataset Partitioning on Dysfluency Detection Systems
Sebastian P. Bayerl
Dominik Wagner
Elmar Nöth
Tobias Bocklet
Korbinian Riedhammer
44
20
0
07 Jun 2022
Learning Speech Emotion Representations in the Quaternion Domain
E. Guizzo
Tillman Weyde
Simone Scardapane
Danilo Comminiello
32
18
0
05 Apr 2022
Introducing ECAPA-TDNN and Wav2Vec2.0 Embeddings to Stuttering Detection
S. A. Sheikh
Md. Sahidullah
F. Hirsch
Slim Ouni
22
17
0
04 Apr 2022
Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck
Youngsik Eom
Yeonghyeon Lee
Ji Sub Um
Hoi-Rim Kim
35
25
0
04 Apr 2022
Visualizations of Complex Sequences of Family-Infant Vocalizations Using Bag-of-Audio-Words Approach Based on Wav2vec 2.0 Features
Jialu Li
M. Hasegawa-Johnson
Nancy L. McElwain
24
0
0
29 Mar 2022
M-SENA: An Integrated Platform for Multimodal Sentiment Analysis
Huisheng Mao
Ziqi Yuan
Hua Xu
Wenmeng Yu
Yihe Liu
Kai Gao
22
41
0
23 Mar 2022
KSoF: The Kassel State of Fluency Dataset -- A Therapy Centered Dataset of Stuttering
Sebastian P. Bayerl
A. Gudenberg
Florian Hönig
Elmar Nöth
Korbinian Riedhammer
29
35
0
10 Mar 2022