Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching

7 January 2023

Papers citing "Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching"

Title | Authors | Tags | Counts (unlabeled in source) | Date
Continual Learning on CLIP via Incremental Prompt Tuning with Intrinsic Textual Anchors | Haodong Lu, Xinyu Zhang, Kristen Moore, Jason Xue, Lina Yao, Anton van den Hengel, Dong Gong | CLL VLM | 5 0 0 | 27 May 2025
Expanding Event Modality Applications through a Robust CLIP-Based Encoder | SungHeon Jeong, Hanning Chen, Sanggeon Yun, Suhyeon Cho, Wenjun Huang, Xiangjian Liu, Mohsen Imani | - | 117 2 0 | 04 Dec 2024
What to align in multimodal contrastive learning? | Benoit Dufumier, J. Castillo-Navarro, D. Tuia, Jean-Philippe Thiran | - | 41 4 0 | 11 Sep 2024
CLIP-CID: Efficient CLIP Distillation via Cluster-Instance Discrimination | Kaicheng Yang, Tiancheng Gu, Xiang An, Haiqiang Jiang, Xiangzi Dai, Ziyong Feng, Weidong Cai, Jiankang Deng | VLM | 68 8 0 | 18 Aug 2024
Multi-modal Relation Distillation for Unified 3D Representation Learning | Huiqun Wang, Yiping Bao, Panwang Pan, Zeming Li, Xiao Liu, Ruijie Yang, Di Huang | - | 61 0 0 | 19 Jul 2024
Wisdom of Committee: Distilling from Foundation Model to Specialized Application Model | Zichang Liu, Qingyun Liu, Yuening Li, Liang Liu, Anshumali Shrivastava, Shuchao Bi, Lichan Hong, Ed H. Chi, Zhe Zhao | VLM | 49 4 0 | 21 Feb 2024
Learning on Multimodal Graphs: A Survey | Ciyuan Peng, Jiayuan He, Feng Xia | - | 64 7 0 | 07 Feb 2024
Structural Information Guided Multimodal Pre-training for Vehicle-centric Perception | Tianlin Li, Wentao Wu, Chenglong Li, Zhicheng Zhao, Zhe Chen, Yukai Shi, Jin Tang | - | 54 4 0 | 15 Dec 2023
Factorized Contrastive Learning: Going Beyond Multi-view Redundancy | Paul Pu Liang, Zihao Deng, Martin Q. Ma, James Zou, Louis-Philippe Morency, Ruslan Salakhutdinov | SSL | 39 51 0 | 08 Jun 2023
DIME-FM: DIstilling Multimodal and Efficient Foundation Models | Ximeng Sun, Pengchuan Zhang, Peizhao Zhang, Hardik Shah, Kate Saenko, Xide Xia | VLM | 47 20 0 | 31 Mar 2023
ImageNet-21K Pretraining for the Masses | T. Ridnik, Emanuel Ben-Baruch, Asaf Noy, Lihi Zelnik-Manor | SSeg VLM CLIP | 220 691 0 | 22 Apr 2021
The Power of Scale for Parameter-Efficient Prompt Tuning | Brian Lester, Rami Al-Rfou, Noah Constant | VPVLM | 301 3,917 0 | 18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu H. Pham, Quoc V. Le, Yun-hsuan Sung, Zhen Li, Tom Duerig | VLM CLIP | 356 3,760 0 | 11 Feb 2021
SEED: Self-supervised Distillation For Visual Representation | Zhiyuan Fang, Jianfeng Wang, Lijuan Wang, Lei Zhang, Yezhou Yang, Zicheng Liu | SSL | 250 190 0 | 12 Jan 2021
Improved Baselines with Momentum Contrastive Learning | Xinlei Chen, Haoqi Fan, Ross B. Girshick, Kaiming He | SSL | 322 3,389 0 | 09 Mar 2020
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications | Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, M. Andreetto, Hartwig Adam | 3DH | 955 20,660 0 | 17 Apr 2017