2106.04538
What Makes Multi-modal Learning Better than Single (Provably)
8 June 2021
Yu Huang
Chenzhuang Du
Zihui Xue
Xuanyao Chen
Hang Zhao
Longbo Huang
Papers citing
"What Makes Multi-modal Learning Better than Single (Provably)"
43 / 43 papers shown
Title
Robust Multimodal Segmentation with Representation Regularization and Hybrid Prototype Distillation
Jiaqi Tan
Xu Zheng
Yuhang Liu
17
0
0
19 May 2025
Multi-modal contrastive learning adapts to intrinsic dimensions of shared latent variables
Yu Gui
Cong Ma
Zongming Ma
SSL
26
0
0
18 May 2025
A Multimodal Multi-Agent Framework for Radiology Report Generation
Ziruo Yi
Ting Xiao
Mark V. Albert
MedIm
29
0
0
14 May 2025
Optimizing Mouse Dynamics for User Authentication by Machine Learning: Addressing Data Sufficiency, Accuracy-Practicality Trade-off, and Model Performance Challenges
Yi Wang
Chengyv Wu
Yang Liao
Maowei You
AAML
39
0
0
30 Apr 2025
Synergy-CLIP: Extending CLIP with Multi-modal Integration for Robust Representation Learning
Sangyeon Cho
Jangyeong Jeon
Mingi Kim
Junyeong Kim
CLIP
VLM
87
0
0
30 Apr 2025
CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification
Mingkun Zhang
Keping Bi
Wei Chen
J. Guo
Xueqi Cheng
BDL
VLM
54
1
0
25 Feb 2025
Enhancing Scene Classification in Cloudy Image Scenarios: A Collaborative Transfer Method with Information Regulation Mechanism using Optical Cloud-Covered and SAR Remote Sensing Images
Yuze Wang
Rong Xiao
Haifeng Li
Mariana Belgiu
Chao Tao
36
0
0
08 Jan 2025
GNN-Transformer Cooperative Architecture for Trustworthy Graph Contrastive Learning
Jianqing Liang
Xinkai Wei
Min Chen
Zhiqiang Wang
Jiye Liang
80
0
0
18 Dec 2024
Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding
Yi Liu
Chengxin Li
Shoukun Xu
J. Han
ViT
47
2
0
19 Oct 2024
CoPRA: Bridging Cross-domain Pretrained Sequence Models with Complex Structures for Protein-RNA Binding Affinity Prediction
Rong Han
Xiaohong Liu
Tong Pan
Jing Xu
Xiaoyu Wang
...
Zhenyu Li
Zixuan Wang
Jiangning Song
Guangyu Wang
Ting Chen
26
0
0
21 Aug 2024
Completed Feature Disentanglement Learning for Multimodal MRIs Analysis
Tianling Liu
Hongying Liu
Fanhua Shang
Lequan Yu
Tong Han
Liang Wan
51
1
0
06 Jul 2024
Predictive Dynamic Fusion
Bing Cao
Yinan Xia
Yi Ding
Changqing Zhang
Qinghua Hu
39
9
0
07 Jun 2024
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
Yifei Ming
Yixuan Li
VLM
41
7
0
02 May 2024
Contrastive Learning on Multimodal Analysis of Electronic Health Records
Tianxi Cai
Feiqing Huang
Ryumei Nakada
Linjun Zhang
Doudou Zhou
61
0
0
22 Mar 2024
Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity
Zhuo Zhi
Ziquan Liu
M. Elbadawi
Adam Daneshmend
Mine Orlu
Abdul Basit
Andreas Demosthenous
Miguel R. D. Rodrigues
36
2
0
14 Mar 2024
BronchoCopilot: Towards Autonomous Robotic Bronchoscopy via Multimodal Reinforcement Learning
Jianbo Zhao
Hao Chen
Qingyao Tian
Jian Chen
Bingyu Yang
Hongbin Liu
37
1
0
03 Mar 2024
Beyond DAGs: A Latent Partial Causal Model for Multimodal Learning
Yuhang Liu
Zhen Zhang
Dong Gong
Erdun Gao
Biwei Huang
Anton Van Den Hengel
Kun Zhang
Javen Qinfeng Shi
49
5
0
09 Feb 2024
Triple Disentangled Representation Learning for Multimodal Affective Analysis
Ying Zhou
Xuefeng Liang
Han Chen
Yin Zhao
Xin Chen
Lida Yu
52
3
0
29 Jan 2024
Fus-MAE: A cross-attention-based data fusion approach for Masked Autoencoders in remote sensing
Hugo Chan-To-Hing
B. Veeravalli
30
8
0
05 Jan 2024
Improving Discriminative Multi-Modal Learning with Large-Scale Pre-Trained Models
Chenzhuang Du
Yue Zhao
Chonghua Liao
Jiacheng You
Jie Fu
Hang Zhao
47
2
0
08 Oct 2023
Missing-modality Enabled Multi-modal Fusion Architecture for Medical Data
Muyu Wang
Shiyu Fan
Yichen Li
Hui Chen
MedIm
17
1
0
27 Sep 2023
kTrans: Knowledge-Aware Transformer for Binary Code Embedding
Wenyu Zhu
Hao Wang
Yuchen Zhou
Jiaming Wang
Zihan Sha
Zeyu Gao
Chao Zhang
32
10
0
24 Aug 2023
Interpretation on Multi-modal Visual Fusion
Hao Chen
Hao Zhou
Yongjian Deng
39
0
0
19 Aug 2023
GiGaMAE: Generalizable Graph Masked Autoencoder via Collaborative Latent Space Reconstruction
Yucheng Shi
Yushun Dong
Qiaoyu Tan
Jundong Li
Ninghao Liu
48
25
0
18 Aug 2023
Divert More Attention to Vision-Language Object Tracking
Mingzhe Guo
Zhipeng Zhang
Li Jing
Haibin Ling
Heng Fan
VLM
42
3
0
19 Jul 2023
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Isha Rawal
Alexander Matyasko
Shantanu Jaiswal
Basura Fernando
Cheston Tan
26
2
0
15 Jun 2023
Multimodal Data Integration for Oncology in the Era of Deep Neural Networks: A Review
Asim Waqas
Aakash Tripathi
Ravichandran Ramachandran
Paul Stewart
Ghulam Rasool
AI4CE
39
32
0
11 Mar 2023
EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding
Shuhan Tan
Tushar Nagarajan
Kristen Grauman
26
21
0
05 Jan 2023
Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition
Xin Ni
Yong Liu
Hao Wen
Yatai Ji
Jing Xiao
Yujiu Yang
37
9
0
09 Dec 2022
Towards Good Practices for Missing Modality Robust Action Recognition
Sangmin Woo
Sumin Lee
Yeonju Park
Muhammad Adi Nugroho
Changick Kim
22
44
0
25 Nov 2022
HALSIE: Hybrid Approach to Learning Segmentation by Simultaneously Exploiting Image and Event Modalities
Shristi Das Biswas
Adarsh Kosta
C. Liyanagedera
M. Apolinario
Kaushik Roy
34
18
0
19 Nov 2022
Self-supervised remote sensing feature learning: Learning Paradigms, Challenges, and Future Works
Chao Tao
Ji Qi
Mingning Guo
Qing Zhu
Haifeng Li
SSL
29
56
0
15 Nov 2022
Greedy Modality Selection via Approximate Submodular Maximization
Runxiang Cheng
Gargi Balasubramaniam
Yifei He
Yao-Hung Hubert Tsai
Han Zhao
21
1
0
22 Oct 2022
MMRNet: Improving Reliability for Multimodal Object Detection and Segmentation for Bin Picking via Multimodal Redundancy
Yuhao Chen
Hayden Gunraj
E. Z. Zeng
Robbie Meyer
Maximilian Gilles
Alexander Wong
37
1
0
19 Oct 2022
Uncertainty Estimation for Multi-view Data: The Power of Seeing the Whole Picture
M. Jung
He Zhao
Joanna Dipnall
Belinda Gabbe
Lan Du
UQCV
EDL
67
12
0
06 Oct 2022
Visual Grounding of Inter-lingual Word-Embeddings
W. Mohammed
Hassan Shahmohammadi
Hendrik P. A. Lensch
R. Baayen
13
1
0
08 Sep 2022
Modality Mixer for Multi-modal Action Recognition
Sumin Lee
Sangmin Woo
Yeonju Park
Muhammad Adi Nugroho
Changick Kim
26
10
0
24 Aug 2022
Divert More Attention to Vision-Language Tracking
Mingzhe Guo
Zhipeng Zhang
Heng Fan
Li Jing
29
53
0
03 Jul 2022
More to Less (M2L): Enhanced Health Recognition in the Wild with Reduced Modality of Wearable Sensors
Huiyuan Yang
Han Yu
K. Sridhar
T. Vaessen
I. Myin‐Germeys
Akane Sano
23
7
0
16 Feb 2022
M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining
Xiao Dong
Xunlin Zhan
Yangxin Wu
Yunchao Wei
Michael C. Kampffmeyer
Xiaoyong Wei
Minlong Lu
Yaowei Wang
Xiaodan Liang
33
37
0
09 Sep 2021
Deep Continuous Fusion for Multi-Sensor 3D Object Detection
Ming Liang
Binh Yang
Shenlong Wang
R. Urtasun
3DPC
208
840
0
20 Dec 2020
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
Itai Gat
Idan Schwartz
A. Schwing
Tamir Hazan
60
90
0
21 Oct 2020
Norm-Based Capacity Control in Neural Networks
Behnam Neyshabur
Ryota Tomioka
Nathan Srebro
127
577
0
27 Feb 2015