Multimodal Emotion Recognition with High-level Speech and Text Features

29 September 2021
M. R. Makiuchi
Kuniaki Uto
Koichi Shinoda
arXiv: 2111.10202

Papers citing "Multimodal Emotion Recognition with High-level Speech and Text Features"

24 / 24 papers shown
"Yeah Right!" -- Do LLMs Exhibit Multimodal Feature Transfer?
Benjamin Z. Reichman
Kartik Talamadupula
84
0
0
07 Jan 2025
Fusion approaches for emotion recognition from speech using acoustic and text-based features
L. Pepino
Pablo Riera
Luciana Ferrer
Agustin Gravano
70
49
0
27 Mar 2024
VISTANet: VIsual Spoken Textual Additive Net for Interpretable Multimodal Emotion Recognition
Puneet Kumar
Sarthak Malik
Balasubramanian Raman
Xiaobai Li
113
2
0
24 Aug 2022
Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings
L. Pepino
Pablo Riera
Luciana Ferrer
67
363
0
08 Apr 2021
On the use of Self-supervised Pre-trained Acoustic and Linguistic Features for Continuous Speech Emotion Recognition
Manon Macary
Marie Tahon
Yannick Esteve
Anthony Rousseau
SSL
53
55
0
18 Nov 2020
Emotion recognition by fusing time synchronous and time asynchronous representations
Wen Wu
Chao Zhang
P. Woodland
56
67
0
27 Oct 2020
Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion Recognition
Shamane Siriwardhana
Andrew Reis
Rivindu Weerasekera
Suranga Nanayakkara
63
112
0
15 Aug 2020
Advancing Multiple Instance Learning with Attention Modeling for Categorical Speech Emotion Recognition
Shuiyang Mao
P. Ching
C.-C. Jay Kuo
Tan Lee
32
11
0
15 Aug 2020
Transformer based unsupervised pre-training for acoustic representation learning
Ruixiong Zhang
Haiwei Wu
Wubo Li
Dongwei Jiang
Wei Zou
Xiangang Li
SSL, ViT
56
27
0
29 Jul 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
282
5,801
0
20 Jun 2020
Unsupervised Speech Decomposition via Triple Information Bottleneck
Kaizhi Qian
Yang Zhang
Shiyu Chang
David D. Cox
M. Hasegawa-Johnson
82
184
0
23 Apr 2020
Speaker-invariant Affective Representation Learning via Adversarial Training
Haoqi Li
Ming Tu
Jing-ling Huang
Shrikanth Narayanan
P. Georgiou
66
56
0
04 Nov 2019
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
232
8,433
0
19 Jun 2019
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
Kaizhi Qian
Yang Zhang
Shiyu Chang
Xuesong Yang
M. Hasegawa-Johnson
81
465
0
14 May 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg
1.8K
94,891
0
11 Oct 2018
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
353
2,279
0
14 Jun 2018
Exploring Disentangled Feature Representation Beyond Face Identification
Yu Liu
Fangyin Wei
Jing Shao
Lu Sheng
Junjie Yan
Xiaogang Wang
CoGe, CVBM
53
156
0
10 Apr 2018
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron
RJ Skerry-Ryan
Eric Battenberg
Y. Xiao
Yuxuan Wang
Daisy Stanton
Joel Shor
Ron J. Weiss
R. Clark
Rif A. Saurous
54
554
0
24 Mar 2018
Generalized End-to-End Loss for Speaker Verification
Li Wan
Quan Wang
Alan Papir
Ignacio López Moreno
VLM
68
927
0
28 Oct 2017
VoxCeleb: a large-scale speaker identification dataset
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
125
2,274
0
26 Jun 2017
Unsupervised Learning of Disentangled Representations from Video
Emily L. Denton
Vighnesh Birodkar
DRL, CoGe, OOD
76
552
0
31 May 2017
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord
Sander Dieleman
Heiga Zen
Karen Simonyan
Oriol Vinyals
Alex Graves
Nal Kalchbrenner
A. Senior
Koray Kavukcuoglu
DiffM
406
7,399
0
12 Sep 2016
Listen, Attend and Spell
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
156
2,266
0
05 Aug 2015
Explaining and Harnessing Adversarial Examples
Ian Goodfellow
Jonathon Shlens
Christian Szegedy
AAML, GAN
277
19,066
0
20 Dec 2014