1704.03152
Deep Multimodal Representation Learning from Temporal Data
11 April 2017
Xitong Yang
Palghat Ramesh
Radha Chitta
S. Madhvanath
Edgar A. Bernal
Jiebo Luo
AI4TS
Papers citing
"Deep Multimodal Representation Learning from Temporal Data"
13 / 13 papers shown
Title
Synergy-CLIP: Extending CLIP with Multi-modal Integration for Robust Representation Learning
Sangyeon Cho
Jangyeong Jeon
Mingi Kim
Junyeong Kim
CLIP
VLM
76
0
0
30 Apr 2025
Audio-Visual Speaker Verification via Joint Cross-Attention
R Gnana Praveen
Jahangir Alam
34
6
0
28 Sep 2023
Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations
Yilun Hao
Ruinan Wang
Zhangjie Cao
Zihan Wang
Yuchen Cui
Dorsa Sadigh
29
2
0
16 Sep 2022
COLD Fusion: Calibrated and Ordinal Latent Distribution Fusion for Uncertainty-Aware Multimodal Emotion Recognition
M. Tellamekala
Shahin Amiriparian
Björn W. Schuller
Elisabeth André
T. Giesbrecht
M. Valstar
28
25
0
12 Jun 2022
Temporal Context Matters: Enhancing Single Image Prediction with Disease Progression Representations
Aishik Konwer
Xuan Xu
Joseph Bae
Chaoyu Chen
Prateek Prasanna
MedIm
36
15
0
02 Mar 2022
Cross Attentional Audio-Visual Fusion for Dimensional Emotion Recognition
R Gnana Praveen
Eric Granger
P. Cardinal
CVBM
25
40
0
09 Nov 2021
A Review on Explainability in Multimodal Deep Neural Nets
Gargi Joshi
Rahee Walambe
K. Kotecha
29
139
0
17 May 2021
Detect, Reject, Correct: Crossmodal Compensation of Corrupted Sensors
Michelle A. Lee
Matthew Tan
Yuke Zhu
Jeannette Bohg
46
25
0
01 Dec 2020
Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks
Michelle A. Lee
Yuke Zhu
Peter Zachares
Matthew Tan
K. Srinivasan
Silvio Savarese
Fei-Fei Li
Animesh Garg
Jeannette Bohg
SSL
23
207
0
28 Jul 2019
Dense Multimodal Fusion for Hierarchically Joint Representation
Di Hu
Feiping Nie
Xuelong Li
32
43
0
08 Oct 2018
Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning
Qing Guo
Yuan-fang Wang
William Yang Wang
11
76
0
15 Apr 2018
Audio-Visual Event Localization in Unconstrained Videos
Yapeng Tian
Jing Shi
Bochen Li
Zhiyao Duan
Chenliang Xu
33
425
0
23 Mar 2018
Zero-Shot Deep Domain Adaptation
Kuan-Chuan Peng
Ziyan Wu
Jan Ernst
VLM
19
87
0
06 Jul 2017