Deep Multimodal Representation Learning from Temporal Data

11 April 2017

Edgar A. Bernal

Papers citing "Deep Multimodal Representation Learning from Temporal Data"

13 / 13 papers shown

Title
Synergy-CLIP: Extending CLIP with Multi-modal Integration for Robust Representation Learning | Sangyeon Cho, Jangyeong Jeon, Mingi Kim, Junyeong Kim | CLIP, VLM | 76 0 0 | 30 Apr 2025
Audio-Visual Speaker Verification via Joint Cross-Attention | R Gnana Praveen, Jahangir Alam | - | 34 6 0 | 28 Sep 2023
Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations | Yilun Hao, Ruinan Wang, Zhangjie Cao, Zihan Wang, Yuchen Cui, Dorsa Sadigh | - | 29 2 0 | 16 Sep 2022
COLD Fusion: Calibrated and Ordinal Latent Distribution Fusion for Uncertainty-Aware Multimodal Emotion Recognition | M. Tellamekala, Shahin Amiriparian, Björn W. Schuller, Elisabeth André, T. Giesbrecht, M. Valstar | - | 28 25 0 | 12 Jun 2022
Temporal Context Matters: Enhancing Single Image Prediction with Disease Progression Representations | Aishik Konwer, Xuan Xu, Joseph Bae, Chaoyu Chen, Prateek Prasanna | MedIm | 36 15 0 | 02 Mar 2022
Cross Attentional Audio-Visual Fusion for Dimensional Emotion Recognition | R Gnana Praveen, Eric Granger, P. Cardinal | CVBM | 25 40 0 | 09 Nov 2021
A Review on Explainability in Multimodal Deep Neural Nets | Gargi Joshi, Rahee Walambe, K. Kotecha | - | 29 139 0 | 17 May 2021
Detect, Reject, Correct: Crossmodal Compensation of Corrupted Sensors | Michelle A. Lee, Matthew Tan, Yuke Zhu, Jeannette Bohg | - | 46 25 0 | 01 Dec 2020
Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks | Michelle A. Lee, Yuke Zhu, Peter Zachares, Matthew Tan, K. Srinivasan, Silvio Savarese, Fei-Fei Li, Animesh Garg, Jeannette Bohg | SSL | 23 207 0 | 28 Jul 2019
Dense Multimodal Fusion for Hierarchically Joint Representation | Di Hu, Feiping Nie, Xuelong Li | - | 32 43 0 | 08 Oct 2018
Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning | Qing Guo, Yuan-fang Wang, William Yang Wang | - | 11 76 0 | 15 Apr 2018
Audio-Visual Event Localization in Unconstrained Videos | Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, Chenliang Xu | - | 33 425 0 | 23 Mar 2018
Zero-Shot Deep Domain Adaptation | Kuan-Chuan Peng, Ziyan Wu, Jan Ernst | VLM | 19 87 0 | 06 Jul 2017