ResearchTrend.AI
arXiv: 2204.03610
Unified Contrastive Learning in Image-Text-Label Space


7 April 2022
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Bin Xiao
Ce Liu
Lu Yuan
Jianfeng Gao
    VLM
    SSL

Papers citing "Unified Contrastive Learning in Image-Text-Label Space"

50 / 165 papers shown
On the Efficacy of Text-Based Input Modalities for Action Anticipation
Apoorva Beedu
Karan Samel
Irfan Essa
55
2
0
23 Jan 2024
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
Ziping Ma
Furong Xu
Jian Liu
Ming Yang
Qingpei Guo
VLM
42
3
0
04 Jan 2024
Modality-Collaborative Transformer with Hybrid Feature Reconstruction for Robust Emotion Recognition
Chengxin Chen
Pengyuan Zhang
38
5
0
26 Dec 2023
CLIM: Contrastive Language-Image Mosaic for Region Representation
Size Wu
Wenwei Zhang
Lumin Xu
Sheng Jin
Wentao Liu
Chen Change Loy
ObjD
VLM
60
15
0
18 Dec 2023
WAVER: Writing-style Agnostic Text-Video Retrieval via Distilling Vision-Language Models Through Open-Vocabulary Knowledge
Huy Le
Tung Kieu
Anh Nguyen
Ngan Le
VGen
32
1
0
15 Dec 2023
Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models
Yubin Wang
Xinyang Jiang
De Cheng
Dongsheng Li
Cairong Zhao
VLM
40
15
0
11 Dec 2023
Bootstrapping SparseFormers from Vision Foundation Models
Ziteng Gao
Zhan Tong
K. Lin
Joya Chen
Mike Zheng Shou
41
0
0
04 Dec 2023
CLAMP: Contrastive LAnguage Model Prompt-tuning
Piotr Teterwak
Ximeng Sun
Bryan A. Plummer
Kate Saenko
Ser-Nam Lim
MLLM
VLM
40
1
0
04 Dec 2023
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
Pavan Kumar Anasosalu Vasu
Hadi Pouransari
Fartash Faghri
Raviteja Vemulapalli
Oncel Tuzel
CLIP
VLM
33
43
0
28 Nov 2023
Choosing Wisely and Learning Deeply: Selective Cross-Modality Distillation via CLIP for Domain Generalization
Jixuan Leng
Yijiang Li
Haohan Wang
VLM
37
0
0
26 Nov 2023
Hardware Resilience Properties of Text-Guided Image Classifiers
Syed Talal Wasim
Kabila Haile Soboka
Abdulrahman Mahmoud
Salman Khan
David Brooks
Gu-Yeon Wei
VLM
27
1
0
23 Nov 2023
ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection
Thinh Phan
Khoa T. Vo
Duy Le
Gianfranco Doretto
Don Adjeroh
Ngan Le
VLM
26
9
0
01 Nov 2023
Foundational Models in Medical Imaging: A Comprehensive Survey and Future Vision
Bobby Azad
Reza Azad
Sania Eskandari
Afshin Bozorgpour
A. Kazerouni
I. Rekik
Dorit Merhof
VLM
MedIm
101
60
0
28 Oct 2023
RAEDiff: Denoising Diffusion Probabilistic Models Based Reversible Adversarial Examples Self-Generation and Self-Recovery
Fan Xing
Xiaoyi Zhou
Xuefeng Fan
Zhuo Tian
Yan Zhao
DiffM
21
0
0
25 Oct 2023
CXR-CLIP: Toward Large Scale Chest X-ray Language-Image Pre-training
Kihyun You
Jawook Gu
Jiyeon Ham
Beomhee Park
Jiho Kim
Eun K. Hong
Woonhyuk Baek
Byungseok Roh
CLIP
VLM
26
59
0
20 Oct 2023
Multi-level Contrastive Learning for Script-based Character Understanding
Dawei Li
Hengyuan Zhang
Yanran Li
Shiping Yang
49
17
0
20 Oct 2023
Analyzing Zero-Shot Abilities of Vision-Language Models on Video Understanding Tasks
Avinash Madasu
Anahita Bhiwandiwalla
Vasudev Lal
VLM
37
0
0
07 Oct 2023
Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation
Kashu Yamazaki
Taisei Hanyu
Khoa T. Vo
Thang M. Pham
Minh-Triet Tran
Gianfranco Doretto
Anh Nguyen
Ngan Le
24
25
0
05 Oct 2023
Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP
Zixiang Chen
Yihe Deng
Yuanzhi Li
Quanquan Gu
VLM
28
11
0
02 Oct 2023
CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss
R. S. Srinivasa
Jaejin Cho
Chouchang Yang
Yashas Malur Saidutta
Ching Hua Lee
Yilin Shen
Hongxia Jin
VLM
36
8
0
26 Sep 2023
A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance
Zeyi Huang
Andy Zhou
Zijian Lin
Mu Cai
Haohan Wang
Yong Jae Lee
VLM
OOD
32
28
0
21 Sep 2023
TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance
Kan Wu
Houwen Peng
Zhenghong Zhou
Bin Xiao
Mengchen Liu
...
Xi Chen
Xinggang Wang
Hongyang Chao
Han Hu
VLM
OODD
29
53
0
21 Sep 2023
Efficiently Robustify Pre-trained Models
Nishant Jain
IIT Roorkee
Harkirat Singh Behl
Vibhav Vineet
OOD
VLM
28
3
0
14 Sep 2023
EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale Visual Localization
Minjung Kim
Junseo Koo
Gunhee Kim
3DPC
20
8
0
14 Sep 2023
Exploiting CLIP for Zero-shot HOI Detection Requires Knowledge Distillation at Multiple Levels
Bo Wan
Tinne Tuytelaars
VLM
32
3
0
10 Sep 2023
InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
Zigang Geng
Binxin Yang
Tiankai Hang
Chen Li
Shuyang Gu
...
Jianmin Bao
Zheng-Wei Zhang
Han Hu
Dongdong Chen
Baining Guo
DiffM
VLM
53
93
0
07 Sep 2023
Multimodal Contrastive Learning and Tabular Attention for Automated Alzheimer's Disease Prediction
Weichen Huang
24
12
0
29 Aug 2023
SAAN: Similarity-aware attention flow network for change detection with VHR remote sensing images
Haonan Guo
Xin Su
Chen Wu
Bo Du
Lefei Zhang
15
17
0
28 Aug 2023
A Foundation Language-Image Model of the Retina (FLAIR): Encoding Expert Knowledge in Text Supervision
Julio Silva-Rodríguez
H. Chakor
Riadh Kobbi
Jose Dolz
Ismail Ben Ayed
VLM
MedIm
74
35
0
15 Aug 2023
Learning Concise and Descriptive Attributes for Visual Recognition
Andy Yan
Yu-Xiang Wang
Yiwu Zhong
Chengyu Dong
Zexue He
Yujie Lu
William Wang
Jingbo Shang
Julian McAuley
VLM
27
60
0
07 Aug 2023
Sat2Cap: Mapping Fine-Grained Textual Descriptions from Satellite Images
Aayush Dhakal
Adeel Ahmad
Subash Khanal
Srikumar Sastry
Hannah Kerner
Nathan Jacobs
25
13
0
29 Jul 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming Yang
Fahad Shahbaz Khan
VLM
38
118
0
25 Jul 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjD
VLM
31
32
0
18 Jul 2023
Can Vision-Language Models be a Good Guesser? Exploring VLMs for Times and Location Reasoning
Gengyuan Zhang
Yurui Zhang
Kerui Zhang
Volker Tresp
LRM
27
10
0
12 Jul 2023
Unified Medical Image-Text-Label Contrastive Learning With Continuous Prompt
Yuhao Wang
VLM
MedIm
30
1
0
12 Jul 2023
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
Shraman Pramanick
Yale Song
Sayan Nag
Kevin Qinghong Lin
Hardik Shah
Mike Zheng Shou
Ramalingam Chellappa
Pengchuan Zhang
VLM
42
89
0
11 Jul 2023
Semantic-SAM: Segment and Recognize Anything at Any Granularity
Feng Li
Hao Zhang
Pei Sun
Xueyan Zou
Siyi Liu
Jianwei Yang
Chun-yue Li
Lei Zhang
Jianfeng Gao
VLM
40
173
0
10 Jul 2023
Vision Language Transformers: A Survey
Clayton Fields
C. Kennington
VLM
28
5
0
06 Jul 2023
Multi-Similarity Contrastive Learning
Emily Mu
John Guttag
Maggie Makar
SSL
32
2
0
06 Jul 2023
What a MESS: Multi-Domain Evaluation of Zero-Shot Semantic Segmentation
Benedikt Blumenstiel
Johannes Jakubik
Hilde Kuhne
Michael Vossing
VLM
32
15
0
27 Jun 2023
Relating tSNE and UMAP to Classical Dimensionality Reduction
Andrew Draganov
Simon Dohn
FAtt
27
4
0
20 Jun 2023
MOFI: Learning Image Representations from Noisy Entity Annotated Images
Wentao Wu
Aleksei Timofeev
Chen Chen
Bowen Zhang
Kun Duan
...
Yantao Zheng
Jonathon Shlens
Xianzhi Du
Zhe Gan
Yinfei Yang
VLM
26
7
0
13 Jun 2023
Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images
Ming Y. Lu
Bowen Chen
Andrew Zhang
Drew F. K. Williamson
Richard J. Chen
Tong Ding
L. Le
Yung-Sung Chuang
Faisal Mahmood
VLM
MedIm
36
100
0
13 Jun 2023
Factorized Contrastive Learning: Going Beyond Multi-view Redundancy
Paul Pu Liang
Zihao Deng
Martin Q. Ma
James Zou
Louis-Philippe Morency
Ruslan Salakhutdinov
SSL
26
49
0
08 Jun 2023
ScaleDet: A Scalable Multi-Dataset Object Detector
Yanbei Chen
Manchen Wang
Abhay Mittal
Zhenlin Xu
Paolo Favaro
Joseph Tighe
Davide Modolo
ObjD
19
19
0
08 Jun 2023
UniBoost: Unsupervised Unimodal Pre-training for Boosting Zero-shot Vision-Language Tasks
Yanan Sun
Zi-Qi Zhong
Qi Fan
Chi-Keung Tang
Yu-Wing Tai
VLM
33
4
0
07 Jun 2023
Z-GMOT: Zero-shot Generic Multiple Object Tracking
Kim Hoang Tran
Anh Duy Le Dinh
Tien-Phat Nguyen
Thinh Phan
Pha Nguyen
Khoa Luu
Don Adjeroh
Gianfranco Doretto
Ngan Hoang Le
VOT
36
5
0
28 May 2023
Three Towers: Flexible Contrastive Learning with Pretrained Image Models
Jannik Kossen
Mark Collier
Basil Mustafa
Tianlin Li
Xiaohua Zhai
Lucas Beyer
Andreas Steiner
Jesse Berent
Rodolphe Jenatton
Efi Kokiopoulou
VLM
45
11
0
26 May 2023
Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for Improved Vision-Language Compositionality
Harman Singh
Pengchuan Zhang
Qifan Wang
Mengjiao MJ Wang
Wenhan Xiong
Jingfei Du
Yu Chen
CoGe
VLM
29
24
0
23 May 2023
VLAB: Enhancing Video Language Pre-training by Feature Adapting and Blending
Xingjian He
Sihan Chen
Fan Ma
Zhicheng Huang
Xiaojie Jin
Zikang Liu
Dongmei Fu
Yi Yang
Jiaheng Liu
Jiashi Feng
VLM
CLIP
23
17
0
22 May 2023