ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2106.04538
  4. Cited By
What Makes Multi-modal Learning Better than Single (Provably)

What Makes Multi-modal Learning Better than Single (Provably)

8 June 2021
Yu Huang
Chenzhuang Du
Zihui Xue
Xuanyao Chen
Hang Zhao
Longbo Huang
ArXivPDFHTML

Papers citing "What Makes Multi-modal Learning Better than Single (Provably)"

43 / 43 papers shown
Title
Robust Multimodal Segmentation with Representation Regularization and Hybrid Prototype Distillation
Robust Multimodal Segmentation with Representation Regularization and Hybrid Prototype Distillation
Jiaqi Tan
Xu Zheng
Yuhang Liu
17
0
0
19 May 2025
Multi-modal contrastive learning adapts to intrinsic dimensions of shared latent variables
Multi-modal contrastive learning adapts to intrinsic dimensions of shared latent variables
Yu Gui
Cong Ma
Zongming Ma
SSL
33
0
0
18 May 2025
A Multimodal Multi-Agent Framework for Radiology Report Generation
A Multimodal Multi-Agent Framework for Radiology Report Generation
Ziruo Yi
Ting Xiao
Mark V. Albert
MedIm
29
0
0
14 May 2025
Optimizing Mouse Dynamics for User Authentication by Machine Learning: Addressing Data Sufficiency, Accuracy-Practicality Trade-off, and Model Performance Challenges
Optimizing Mouse Dynamics for User Authentication by Machine Learning: Addressing Data Sufficiency, Accuracy-Practicality Trade-off, and Model Performance Challenges
Yi Wang
Chengyv Wu
Yang Liao
Maowei You
AAML
39
0
0
30 Apr 2025
Synergy-CLIP: Extending CLIP with Multi-modal Integration for Robust Representation Learning
Synergy-CLIP: Extending CLIP with Multi-modal Integration for Robust Representation Learning
Sangyeon Cho
Jangyeong Jeon
Mingi Kim
Junyeong Kim
CLIP
VLM
89
0
0
30 Apr 2025
CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification
CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification
Mingkun Zhang
Keping Bi
Wei Chen
J. Guo
Xueqi Cheng
BDL
VLM
54
1
0
25 Feb 2025
Enhancing Scene Classification in Cloudy Image Scenarios: A Collaborative Transfer Method with Information Regulation Mechanism using Optical Cloud-Covered and SAR Remote Sensing Images
Enhancing Scene Classification in Cloudy Image Scenarios: A Collaborative Transfer Method with Information Regulation Mechanism using Optical Cloud-Covered and SAR Remote Sensing Images
Yuze Wang
Rong Xiao
Haifeng Li
Mariana Belgiu
Chao Tao
36
0
0
08 Jan 2025
GNN-Transformer Cooperative Architecture for Trustworthy Graph Contrastive Learning
GNN-Transformer Cooperative Architecture for Trustworthy Graph Contrastive Learning
Jianqing Liang
Xinkai Wei
Min Chen
Zhiqiang Wang
Jiye Liang
83
0
0
18 Dec 2024
Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding
Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding
Yi Liu
Chengxin Li
Shoukun Xu
Jiawei Han
ViT
47
2
0
19 Oct 2024
CoPRA: Bridging Cross-domain Pretrained Sequence Models with Complex Structures for Protein-RNA Binding Affinity Prediction
CoPRA: Bridging Cross-domain Pretrained Sequence Models with Complex Structures for Protein-RNA Binding Affinity Prediction
Rong Han
Xiaohong Liu
Tong Pan
Jing Xu
Xiaoyu Wang
...
Zhenyu Li
Zixuan Wang
Jiangning Song
Guangyu Wang
Ting Chen
28
0
0
21 Aug 2024
Completed Feature Disentanglement Learning for Multimodal MRIs Analysis
Completed Feature Disentanglement Learning for Multimodal MRIs Analysis
Tianling Liu
Hongying Liu
Fanhua Shang
Lequan Yu
Tong Han
Liang Wan
54
1
0
06 Jul 2024
Predictive Dynamic Fusion
Predictive Dynamic Fusion
Bing Cao
Yinan Xia
Yi Ding
Changqing Zhang
Qinghua Hu
39
9
0
07 Jun 2024
Understanding Retrieval-Augmented Task Adaptation for Vision-Language
  Models
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
Yifei Ming
Yixuan Li
VLM
41
7
0
02 May 2024
Contrastive Learning on Multimodal Analysis of Electronic Health Records
Contrastive Learning on Multimodal Analysis of Electronic Health Records
Tianxi Cai
Feiqing Huang
Ryumei Nakada
Linjun Zhang
Doudou Zhou
61
0
0
22 Mar 2024
Borrowing Treasures from Neighbors: In-Context Learning for Multimodal
  Learning with Missing Modalities and Data Scarcity
Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity
Zhuo Zhi
Ziquan Liu
M. Elbadawi
Adam Daneshmend
Mine Orlu
Abdul Basit
Andreas Demosthenous
Miguel R. D. Rodrigues
36
2
0
14 Mar 2024
BronchoCopilot: Towards Autonomous Robotic Bronchoscopy via Multimodal
  Reinforcement Learning
BronchoCopilot: Towards Autonomous Robotic Bronchoscopy via Multimodal Reinforcement Learning
Jianbo Zhao
Hao Chen
Qingyao Tian
Jian Chen
Bingyu Yang
Hongbin Liu
37
1
0
03 Mar 2024
Beyond DAGs: A Latent Partial Causal Model for Multimodal Learning
Beyond DAGs: A Latent Partial Causal Model for Multimodal Learning
Yuhang Liu
Zhen Zhang
Dong Gong
Erdun Gao
Biwei Huang
Anton Van Den Hengel
Kun Zhang
Javen Qinfeng Shi
Javen Qinfeng Shi
52
5
0
09 Feb 2024
Triple Disentangled Representation Learning for Multimodal Affective
  Analysis
Triple Disentangled Representation Learning for Multimodal Affective Analysis
Ying Zhou
Xuefeng Liang
Han Chen
Yin Zhao
Xin Chen
Lida Yu
52
3
0
29 Jan 2024
Fus-MAE: A cross-attention-based data fusion approach for Masked Autoencoders in remote sensing
Fus-MAE: A cross-attention-based data fusion approach for Masked Autoencoders in remote sensing
Hugo Chan-To-Hing
B. Veeravalli
30
8
0
05 Jan 2024
Improving Discriminative Multi-Modal Learning with Large-Scale
  Pre-Trained Models
Improving Discriminative Multi-Modal Learning with Large-Scale Pre-Trained Models
Chenzhuang Du
Yue Zhao
Chonghua Liao
Jiacheng You
Jie Fu
Hang Zhao
47
2
0
08 Oct 2023
Missing-modality Enabled Multi-modal Fusion Architecture for Medical
  Data
Missing-modality Enabled Multi-modal Fusion Architecture for Medical Data
Muyu Wang
Shiyu Fan
Yichen Li
Hui Chen
MedIm
17
1
0
27 Sep 2023
kTrans: Knowledge-Aware Transformer for Binary Code Embedding
kTrans: Knowledge-Aware Transformer for Binary Code Embedding
Wenyu Zhu
Hao Wang
Yuchen Zhou
Jiaming Wang
Zihan Sha
Zeyu Gao
Chao Zhang
32
10
0
24 Aug 2023
Interpretation on Multi-modal Visual Fusion
Interpretation on Multi-modal Visual Fusion
Hao Chen
Hao Zhou
Yongjian Deng
39
0
0
19 Aug 2023
GiGaMAE: Generalizable Graph Masked Autoencoder via Collaborative Latent
  Space Reconstruction
GiGaMAE: Generalizable Graph Masked Autoencoder via Collaborative Latent Space Reconstruction
Yucheng Shi
Yushun Dong
Qiaoyu Tan
Jundong Li
Ninghao Liu
50
25
0
18 Aug 2023
Divert More Attention to Vision-Language Object Tracking
Divert More Attention to Vision-Language Object Tracking
Mingzhe Guo
Zhipeng Zhang
Li Jing
Haibin Ling
Heng Fan
VLM
42
3
0
19 Jul 2023
Dissecting Multimodality in VideoQA Transformer Models by Impairing
  Modality Fusion
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Isha Rawal
Alexander Matyasko
Shantanu Jaiswal
Basura Fernando
Cheston Tan
26
2
0
15 Jun 2023
Multimodal Data Integration for Oncology in the Era of Deep Neural
  Networks: A Review
Multimodal Data Integration for Oncology in the Era of Deep Neural Networks: A Review
Asim Waqas
Aakash Tripathi
Ravichandran Ramachandran
Paul Stewart
Ghulam Rasool
AI4CE
42
32
0
11 Mar 2023
EgoDistill: Egocentric Head Motion Distillation for Efficient Video
  Understanding
EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding
Shuhan Tan
Tushar Nagarajan
Kristen Grauman
28
21
0
05 Jan 2023
Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition
Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition
Xin Ni
Yong Liu
Hao Wen
Yatai Ji
Jing Xiao
Yujiu Yang
37
9
0
09 Dec 2022
Towards Good Practices for Missing Modality Robust Action Recognition
Towards Good Practices for Missing Modality Robust Action Recognition
Sangmin Woo
Sumin Lee
Yeonju Park
Muhammad Adi Nugroho
Changick Kim
24
44
0
25 Nov 2022
HALSIE: Hybrid Approach to Learning Segmentation by Simultaneously
  Exploiting Image and Event Modalities
HALSIE: Hybrid Approach to Learning Segmentation by Simultaneously Exploiting Image and Event Modalities
Shristi Das Biswas
Adarsh Kosta
C. Liyanagedera
M. Apolinario
Kaushik Roy
34
18
0
19 Nov 2022
Self-supervised remote sensing feature learning: Learning Paradigms,
  Challenges, and Future Works
Self-supervised remote sensing feature learning: Learning Paradigms, Challenges, and Future Works
Chao Tao
Ji Qi
Mingning Guo
Qing Zhu
Haifeng Li
SSL
31
56
0
15 Nov 2022
Greedy Modality Selection via Approximate Submodular Maximization
Greedy Modality Selection via Approximate Submodular Maximization
Runxiang Cheng
Gargi Balasubramaniam
Yifei He
Yao-Hung Hubert Tsai
Han Zhao
24
1
0
22 Oct 2022
MMRNet: Improving Reliability for Multimodal Object Detection and
  Segmentation for Bin Picking via Multimodal Redundancy
MMRNet: Improving Reliability for Multimodal Object Detection and Segmentation for Bin Picking via Multimodal Redundancy
Yuhao Chen
Hayden Gunraj
E. Z. Zeng
Robbie Meyer
Maximilian Gilles
Alexander Wong
37
1
0
19 Oct 2022
Uncertainty Estimation for Multi-view Data: The Power of Seeing the
  Whole Picture
Uncertainty Estimation for Multi-view Data: The Power of Seeing the Whole Picture
M. Jung
He Zhao
Joanna Dipnall
Belinda Gabbe
Lan Du
UQCV
EDL
67
12
0
06 Oct 2022
Visual Grounding of Inter-lingual Word-Embeddings
Visual Grounding of Inter-lingual Word-Embeddings
W. Mohammed
Hassan Shahmohammadi
Hendrik P. A. Lensch
R. Baayen
13
1
0
08 Sep 2022
Modality Mixer for Multi-modal Action Recognition
Modality Mixer for Multi-modal Action Recognition
Sumin Lee
Sangmin Woo
Yeonju Park
Muhammad Adi Nugroho
Changick Kim
26
10
0
24 Aug 2022
Divert More Attention to Vision-Language Tracking
Divert More Attention to Vision-Language Tracking
Mingzhe Guo
Zhipeng Zhang
Heng Fan
Li Jing
29
53
0
03 Jul 2022
More to Less (M2L): Enhanced Health Recognition in the Wild with Reduced
  Modality of Wearable Sensors
More to Less (M2L): Enhanced Health Recognition in the Wild with Reduced Modality of Wearable Sensors
Huiyuan Yang
Han Yu
K. Sridhar
T. Vaessen
I. Myin‐Germeys
Akane Sano
23
7
0
16 Feb 2022
M5Product: Self-harmonized Contrastive Learning for E-commercial
  Multi-modal Pretraining
M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining
Xiao Dong
Xunlin Zhan
Yangxin Wu
Yunchao Wei
Michael C. Kampffmeyer
Xiaoyong Wei
Minlong Lu
Yaowei Wang
Xiaodan Liang
33
37
0
09 Sep 2021
Deep Continuous Fusion for Multi-Sensor 3D Object Detection
Deep Continuous Fusion for Multi-Sensor 3D Object Detection
Ming Liang
Binh Yang
Shenlong Wang
R. Urtasun
3DPC
208
840
0
20 Dec 2020
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing
  Functional Entropies
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
Itai Gat
Idan Schwartz
Alex Schwing
Tamir Hazan
60
90
0
21 Oct 2020
Norm-Based Capacity Control in Neural Networks
Norm-Based Capacity Control in Neural Networks
Behnam Neyshabur
Ryota Tomioka
Nathan Srebro
127
577
0
27 Feb 2015
1