2303.05952
Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning
10 March 2023
Qian Jiang
Changyou Chen
Han Zhao
Liqun Chen
Q. Ping
S. D. Tran
Yi Xu
Belinda Zeng
Trishul Chilimbi
Papers citing
"Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning"
28 / 28 papers shown
Title
Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment
Pengfei Zhao
Rongbo Luan
Wei Zhang
Peng Wu
Sifeng He
15
0
0
08 Jun 2025
Learning Shared Representations from Unpaired Data
Amitai Yacobi
Nir Ben-Ari
Ronen Talmon
Uri Shaham
SSL
72
0
0
23 May 2025
DPSeg: Dual-Prompt Cost Volume Learning for Open-Vocabulary Semantic Segmentation
Ziyu Zhao
Xiaoguang Li
Linjia Shi
Nasrin Imanpour
Song Wang
VLM
66
0
0
16 May 2025
Synergy-CLIP: Extending CLIP with Multi-modal Integration for Robust Representation Learning
Sangyeon Cho
Jangyeong Jeon
Mingi Kim
Junyeong Kim
CLIP
VLM
239
0
0
30 Apr 2025
Position: Beyond Euclidean -- Foundation Models Should Embrace Non-Euclidean Geometries
Neil He
Jiahong Liu
Buze Zhang
N. Bui
Ali Maatouk
Menglin Yang
Irwin King
Melanie Weber
Rex Ying
79
1
0
11 Apr 2025
Generative Modeling of Class Probability for Multi-Modal Representation Learning
Jungkyoo Shin
Bumsoo Kim
Eunwoo Kim
96
1
0
21 Mar 2025
Enhancing Scene Classification in Cloudy Image Scenarios: A Collaborative Transfer Method with Information Regulation Mechanism using Optical Cloud-Covered and SAR Remote Sensing Images
Yuze Wang
Rong Xiao
Haifeng Li
Mariana Belgiu
Chao Tao
105
0
0
08 Jan 2025
Neural-MCRL: Neural Multimodal Contrastive Representation Learning for EEG-based Visual Decoding
Yueyang Li
Zijian Kang
Shengyu Gong
Wenhao Dong
Weiming Zeng
Hongjie Yan
W. Siok
Nizhuan Wang
119
2
0
23 Dec 2024
I0T: Embedding Standardization Method Towards Zero Modality Gap
Na Min An
Eunki Kim
James Thorne
Hyunjung Shim
VLM
115
1
0
18 Dec 2024
Generalized Multimodal Fusion via Poisson-Nernst-Planck Equation
Jiayu Xiong
Jing Wang
Hengjing Xiang
Jun Xue
Chen Xu
Zhouqiang Jiang
57
0
0
20 Oct 2024
Transforming Game Play: A Comparative Study of DCQN and DTQN Architectures in Reinforcement Learning
William A. Stigall
117
0
0
14 Oct 2024
Ordinal Preference Optimization: Aligning Human Preferences via NDCG
Yang Zhao
Yixin Wang
Mingzhang Yin
87
2
0
06 Oct 2024
Learning Multimodal Latent Generative Models with Energy-Based Prior
Shiyu Yuan
Jiali Cui
Hanao Li
Tian Han
58
1
0
30 Sep 2024
Fusion in Context: A Multimodal Approach to Affective State Recognition
Youssef Mohamed
Séverin Lemaignan
Arzu Guneysu
Patric Jensfelt
Christian Smith
69
0
0
18 Sep 2024
Visual Neural Decoding via Improved Visual-EEG Semantic Consistency
Hongzhou Chen
Lianghua He
Yihang Liu
Longzhen Yang
63
1
0
13 Aug 2024
X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs
S. Swetha
Jinyu Yang
T. Neiman
Mamshad Nayeem Rizve
Son Tran
Benjamin Z. Yao
Trishul Chilimbi
Mubarak Shah
105
2
0
18 Jul 2024
Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP
Sedigheh Eslami
Gerard de Melo
VLM
78
5
0
25 Jun 2024
Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning
Zihua Zhao
Mengxi Chen
Tianjie Dai
Jiangchao Yao
Bo Han
Ya Zhang
Yanfeng Wang
NoLa
94
6
0
27 May 2024
From Orthogonality to Dependency: Learning Disentangled Representation for Multi-Modal Time-Series Sensing Signals
Ruichu Cai
Zhifan Jiang
Zijian Li
Weilin Chen
Xuexin Chen
Zhifeng Hao
Yifan Shen
Guan-Hong Chen
Kun Zhang
116
1
0
25 May 2024
Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation
Xinyao Li
Yuke Li
Zhekai Du
Fengling Li
Ke Lu
Jingjing Li
VLM
86
5
0
11 Mar 2024
Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation
Zhekai Du
Xinyao Li
Fengling Li
Ke Lu
Lei Zhu
Jingjing Li
81
18
0
05 Mar 2024
ArcSin: Adaptive ranged cosine Similarity injected noise for Language-Driven Visual Tasks
Yang Liu
Xiaomin Yu
Gongyu Zhang
Christos Bergeles
Prokar Dasgupta
Alejandro Granados
Sebastien Ourselin
66
2
0
27 Feb 2024
Improving Cross-modal Alignment with Synthetic Pairs for Text-only Image Captioning
Zhiyue Liu
Jinyuan Liu
Fanrong Ma
CLIP
VLM
78
11
0
14 Dec 2023
C3Net: Compound Conditioned ControlNet for Multimodal Content Generation
Juntao Zhang
Yuehuai Liu
Yu-Wing Tai
Chi-Keung Tang
DiffM
76
5
0
29 Nov 2023
SimMMDG: A Simple and Effective Framework for Multi-modal Domain Generalization
Hao Dong
Ismail Nejjar
Han Sun
Eleni Chatzi
Olga Fink
99
25
0
30 Oct 2023
An Empirical Study of Self-supervised Learning with Wasserstein Distance
Makoto Yamada
Yuki Takezawa
Guillaume Houry
Kira Michaela Dusterwald
Deborah Sulem
Han Zhao
Yao-Hung Hubert Tsai
57
1
0
16 Oct 2023
Multi-Modal Representation Learning with Text-Driven Soft Masks
Jaeyoo Park
Bohyung Han
SSL
49
4
0
03 Apr 2023
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLM
CLIP
343
1,058
0
09 Oct 2021