Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
arXiv: 2110.05208

11 October 2021
Yangguang Li
Feng Liang
Lichen Zhao
Yufeng Cui
Wanli Ouyang
Jing Shao
F. Yu
Junjie Yan
    VLM
    CLIP

Papers citing "Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm"

50 / 324 papers shown
OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data
Giuseppe Cartella
Alberto Baldrati
Davide Morelli
Marcella Cornia
Marco Bertini
Rita Cucchiara
VLM
CLIP
29
7
0
11 Sep 2023
Exploiting CLIP for Zero-shot HOI Detection Requires Knowledge Distillation at Multiple Levels
Bo Wan
Tinne Tuytelaars
VLM
29
3
0
10 Sep 2023
LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models
Cheng Shi
Sibei Yang
VLM
24
21
0
03 Sep 2023
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
Fengxiang Bie
Yibo Yang
Zhongzhu Zhou
Adam Ghanem
Minjia Zhang
...
Pareesa Ameneh Golnari
David A. Clifton
Yuxiong He
Dacheng Tao
Shuaiwen Leon Song
EGVM
33
19
0
02 Sep 2023
Blending-NeRF: Text-Driven Localized Editing in Neural Radiance Fields
H. Song
Seokhun Choi
Hoseok Do
Chul Lee
Taehyeong Kim
DiffM
33
24
0
23 Aug 2023
GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training
Xi Deng
Han Shi
Runhu Huang
Changlin Li
Hang Xu
Jianhua Han
James T. Kwok
Shen Zhao
Wei Zhang
Xiaodan Liang
CLIP
VLM
29
3
0
22 Aug 2023
DPL: Decoupled Prompt Learning for Vision-Language Models
C. Xu
Yuhan Zhu
Guozhen Zhang
Haocheng Shen
Yixuan Liao
Xiaoxin Chen
Gangshan Wu
Limin Wang
VLM
29
4
0
19 Aug 2023
An Empirical Study of CLIP for Text-based Person Search
Min Cao
Yang Bai
Ziyin Zeng
Mang Ye
Min Zhang
VLM
49
36
0
19 Aug 2023
ALIP: Adaptive Language-Image Pre-training with Synthetic Caption
Kaicheng Yang
Jiankang Deng
Xiang An
Jiawei Li
Ziyong Feng
Jia Guo
Jing Yang
Tongliang Liu
VLM
CLIP
48
45
0
16 Aug 2023
Exploring Transfer Learning in Medical Image Segmentation using Vision-Language Models
K. Poudel
Manish Dhakal
Prasiddha Bhandari
Rabin Adhikari
Safal Thapaliya
Bishesh Khanal
VLM
30
17
0
15 Aug 2023
FoodSAM: Any Food Segmentation
Xing Lan
Jiayi Lyu
Han Jiang
Kunkun Dong
Zehai Niu
Yi Zhang
Jian Xue
VLM
26
25
0
11 Aug 2023
ReCLIP: Refine Contrastive Language Image Pre-Training with Source Free Domain Adaptation
Xuefeng Hu
Ke Zhang
Lu Xia
Albert Y. C. Chen
Jiajia Luo
...
Nan Qiao
Xiao Zeng
Min Sun
Cheng-Hao Kuo
Ram Nevatia
VLM
27
25
0
04 Aug 2023
UnIVAL: Unified Model for Image, Video, Audio and Language Tasks
Mustafa Shukor
Corentin Dancette
Alexandre Ramé
Matthieu Cord
MoMe
MLLM
61
42
0
30 Jul 2023
Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures
Kun Yuan
V. Srivastav
Tong Yu
Joël L. Lavanchy
Pietro Mascagni
Nicolas Padoy
35
20
0
27 Jul 2023
CLIP-KD: An Empirical Study of CLIP Model Distillation
Chuanguang Yang
Zhulin An
Libo Huang
Junyu Bi
Xinqiang Yu
Hansheng Yang
Boyu Diao
Yongjun Xu
VLM
29
27
0
24 Jul 2023
GIST: Generating Image-Specific Text for Fine-grained Object Classification
Kathleen M. Lewis
Emily Mu
Adrian V. Dalca
John Guttag
VLM
29
7
0
21 Jul 2023
T-MARS: Improving Visual Representations by Circumventing Text Feature Learning
Pratyush Maini
Sachin Goyal
Zachary Chase Lipton
J. Zico Kolter
Aditi Raghunathan
VLM
45
33
0
06 Jul 2023
CoPL: Contextual Prompt Learning for Vision-Language Understanding
Koustava Goswami
Srikrishna Karanam
Prateksha Udhayanan
K J Joseph
Balaji Vasan Srinivasan
VLM
26
8
0
03 Jul 2023
Integrating Large Pre-trained Models into Multimodal Named Entity Recognition with Evidential Fusion
Weide Liu
Xiaoyang Zhong
Jingwen Hou
Shaohua Li
Haozhe Huang
Yuming Fang
EDL
35
5
0
29 Jun 2023
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
S. Hall
F. G. Abrantes
Hanwen Zhu
Grace A. Sodunke
Aleksandar Shtedritski
Hannah Rose Kirk
CoGe
25
39
0
21 Jun 2023
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing
Zilun Zhang
Tiancheng Zhao
Yulong Guo
Jianwei Yin
DiffM
VLM
32
52
0
20 Jun 2023
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
F. Liu
Delong Chen
Zhan-Rong Guan
Xiaocong Zhou
Jiale Zhu
Qiaolin Ye
Liyong Fu
Jun Zhou
VLM
71
192
0
19 Jun 2023
Generate to Understand for Representation
Changshan Xue
Xiande Zhong
Xiaoqing Liu
VLM
40
0
0
14 Jun 2023
Safeguarding Data in Multimodal AI: A Differentially Private Approach to CLIP Training
Alyssa Huang
Peihan Liu
Ryumei Nakada
Linjun Zhang
Wanrong Zhang
VLM
73
5
0
13 Jun 2023
MOFI: Learning Image Representations from Noisy Entity Annotated Images
Wentao Wu
Aleksei Timofeev
Chen Chen
Bowen Zhang
Kun Duan
...
Yantao Zheng
Jonathon Shlens
Xianzhi Du
Zhe Gan
Yinfei Yang
VLM
26
7
0
13 Jun 2023
Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images
Ming Y. Lu
Bowen Chen
Andrew Zhang
Drew F. K. Williamson
Richard J. Chen
Tong Ding
L. Le
Yung-Sung Chuang
Faisal Mahmood
VLM
MedIm
30
100
0
13 Jun 2023
How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models?
Yifei Ming
Yixuan Li
OODD
VLM
24
38
0
09 Jun 2023
On the Generalization of Multi-modal Contrastive Learning
Qi Zhang
Yifei Wang
Yisen Wang
22
24
0
07 Jun 2023
MolFM: A Multimodal Molecular Foundation Model
Yi Luo
Kai Yang
Massimo Hong
Xingyi Liu
Zaiqing Nie
38
37
0
06 Jun 2023
UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning
Xiao Dong
Runhu Huang
Xiaoyong Wei
Zequn Jie
Jianxing Yu
Jian Yin
Xiaodan Liang
VLM
DiffM
34
1
0
01 Jun 2023
Improving CLIP Training with Language Rewrites
Lijie Fan
Dilip Krishnan
Phillip Isola
Dina Katabi
Yonglong Tian
BDL
VLM
CLIP
33
157
0
31 May 2023
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models
Sivan Doveh
Assaf Arbelle
Sivan Harary
Roei Herzig
Donghyun Kim
...
Yikang Shen
Raja Giryes
Rogerio Feris
S. Ullman
Leonid Karlinsky
VLM
CoGe
47
52
0
31 May 2023
LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections
M. Jehanzeb Mirza
Leonid Karlinsky
Wei Lin
Mateusz Koziński
Horst Possegger
Rogerio Feris
Horst Bischof
VLM
50
30
0
29 May 2023
ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings
William Brannon
Wonjune Kang
S. Fulay
Hang Jiang
Brandon Roy
Dwaipayan Roy
Jad Kabbara
SSL
14
22
0
23 May 2023
S-CLIP: Semi-supervised Vision-Language Learning using Few Specialist Captions
Sangwoo Mo
Minkyu Kim
Kyungmin Lee
Jinwoo Shin
VLM
CLIP
44
21
0
23 May 2023
Weakly Supervised 3D Open-vocabulary Segmentation
Kunhao Liu
Fangneng Zhan
Jiahui Zhang
Muyu Xu
Yingchen Yu
Abdulmotaleb El Saddik
Christian Theobalt
Eric P. Xing
Shijian Lu
25
66
0
23 May 2023
Can Language Models Understand Physical Concepts?
Lei Li
Jingjing Xu
Qingxiu Dong
Ce Zheng
Qi Liu
Lingpeng Kong
Xu Sun
ALM
33
18
0
23 May 2023
CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model
Shuai Zhao
Xiaohan Wang
Linchao Zhu
Yezhou Yang
CLIP
VLM
23
25
0
23 May 2023
Pulling Target to Source: A New Perspective on Domain Adaptive Semantic Segmentation
Haochen Wang
Yujun Shen
Jingjing Fei
Wei Li
Liwei Wu
Yuxi Wang
Zhaoxiang Zhang
OOD
46
7
0
23 May 2023
Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization
Zimeng Qiu
Quanqi Hu
Zhuoning Yuan
Denny Zhou
Lijun Zhang
Tianbao Yang
34
17
0
19 May 2023
TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding
Chenchi Zhang
Jun Xiao
Lei Chen
Jian Shao
Long Chen
VLM
LRM
32
2
0
19 May 2023
MedBLIP: Bootstrapping Language-Image Pre-training from 3D Medical Images and Texts
Qiuhui Chen
Xinyue Hu
Zirui Wang
Yi Hong
LM&MA
MedIm
22
34
0
18 May 2023
CLIP-GCD: Simple Language Guided Generalized Category Discovery
Rabah Ouldnoughi
Chia-Wen Kuo
Z. Kira
VLM
31
14
0
17 May 2023
Improved baselines for vision-language pre-training
Enrico Fini
Pietro Astolfi
Adriana Romero Soriano
Jakob Verbeek
M. Drozdzal
SSL
CLIP
VLM
50
22
0
15 May 2023
Parameter-efficient Tuning of Large-scale Multimodal Foundation Model
Haixin Wang
Xinlong Yang
Jianlong Chang
Di Jin
Jinan Sun
Shikun Zhang
Xiao Luo
Qi Tian
33
23
0
15 May 2023
Continual Vision-Language Representation Learning with Off-Diagonal Information
Zixuan Ni
Longhui Wei
Siliang Tang
Yueting Zhuang
Qi Tian
VLM
CLL
33
25
0
11 May 2023
Vision-Language Models in Remote Sensing: Current Progress and Future Trends
Xiang Li
Congcong Wen
Yuan Hu
Zhenghang Yuan
Xiao Xiang Zhu
VLM
24
71
0
09 May 2023
Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness
Liangliang Cao
Bowen Zhang
Chen Chen
Yinfei Yang
Xianzhi Du
Wen‐Cheng Zhang
Zhiyun Lu
Yantao Zheng
CLIP
VLM
27
15
0
08 May 2023
SATIN: A Multi-Task Metadataset for Classifying Satellite Imagery using Vision-Language Models
Jonathan Roberts
Kai Han
Samuel Albanie
VLM
40
12
0
23 Apr 2023
Hyperbolic Image-Text Representations
Karan Desai
Maximilian Nickel
Tanmay Rajpurohit
Justin Johnson
Ramakrishna Vedantam
VLM
42
57
0
18 Apr 2023