ResearchTrend.AI
Contrastive Vision-Language Pre-training with Limited Resources

17 December 2021
Quan Cui
Boyan Zhou
Yu Guo
Weidong Yin
Hao Wu
Osamu Yoshie
Yubo Chen
    VLM
    CLIP

Papers citing "Contrastive Vision-Language Pre-training with Limited Resources"

18 / 18 papers shown
A Survey of Low-shot Vision-Language Model Adaptation via Representer Theorem
Kun Ding
Ying Wang
Gaofeng Meng
Shiming Xiang
VLM
29
0
0
15 Oct 2024
Residual Policy Learning for Perceptive Quadruped Control Using Differentiable Simulation
Jing Yuan Luo
Yunlong Song
Victor Klemm
Fan Shi
Davide Scaramuzza
Marco Hutter
31
1
0
04 Oct 2024
Image Copy Detection for Diffusion Models
Wenhao Wang
Yifan Sun
Zhentao Tan
Yi Yang
30
1
0
30 Sep 2024
Nemesis: Normalizing the Soft-prompt Vectors of Vision-Language Models
Shuai Fu
Xiequn Wang
Qiushi Huang
Yu Zhang
VLM
39
2
0
26 Aug 2024
DPA: Dual Prototypes Alignment for Unsupervised Adaptation of Vision-Language Models
Eman Ali
Sathira Silva
Muhammad Haris Khan
VLM
29
0
0
16 Aug 2024
On the Element-Wise Representation and Reasoning in Zero-Shot Image Recognition: A Systematic Survey
Jingcai Guo
Zhijie Rao
Zhi Chen
Song Guo
Jingren Zhou
Dacheng Tao
33
3
0
09 Aug 2024
Binding Touch to Everything: Learning Unified Multimodal Tactile Representations
Fengyu Yang
Chao Feng
Ziyang Chen
Hyoungseob Park
Daniel Wang
...
Ziyao Zeng
Xien Chen
Rit Gangopadhyay
Andrew Owens
Alex Wong
38
53
0
31 Jan 2024
Domain Prompt Learning with Quaternion Networks
Qinglong Cao
Zhengqin Xu
Yuntian Chen
Chao Ma
Xiaokang Yang
VLM
31
10
0
12 Dec 2023
BDC-Adapter: Brownian Distance Covariance for Better Vision-Language Reasoning
Yi Zhang
Ce Zhang
Zihan Liao
Yushun Tang
Zhihai He
BDL
VLM
18
10
0
03 Sep 2023
Cross-Modal Concept Learning and Inference for Vision-Language Models
Yi Zhang
Ce Zhang
Yushun Tang
Z. He
VLM
MLLM
CLIP
25
15
0
28 Jul 2023
Multi-Modal Representation Learning with Text-Driven Soft Masks
Jaeyoo Park
Bohyung Han
SSL
24
4
0
03 Apr 2023
Vision-Language Models for Vision Tasks: A Survey
Jingyi Zhang
Jiaxing Huang
Sheng Jin
Shijian Lu
VLM
41
479
0
03 Apr 2023
Self-Supervised Multimodal Learning: A Survey
Yongshuo Zong
Oisin Mac Aodha
Timothy M. Hospedales
SSL
19
43
0
31 Mar 2023
Vision Learners Meet Web Image-Text Pairs
Bingchen Zhao
Quan Cui
Hao Wu
Osamu Yoshie
Cheng Yang
Oisin Mac Aodha
VLM
24
5
0
17 Jan 2023
Dynamic Contrastive Distillation for Image-Text Retrieval
Jun Rao
Liang Ding
Shuhan Qi
Meng Fang
Yang Liu
Liqiong Shen
Dacheng Tao
VLM
53
30
0
04 Jul 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,434
0
11 Nov 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
273
1,081
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
298
3,693
0
11 Feb 2021