ResearchTrend.AI
Unified Contrastive Learning in Image-Text-Label Space


7 April 2022
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Bin Xiao
Ce Liu
Lu Yuan
Jianfeng Gao
    VLM
    SSL

Papers citing "Unified Contrastive Learning in Image-Text-Label Space"

50 / 165 papers shown
GeoMM: On Geodesic Perspective for Multi-modal Learning
Shibin Mei
Hang Wang
Bingbing Ni
22
0
0
16 May 2025
SECRET: Semi-supervised Clinical Trial Document Similarity Search
Trisha Das
Afrah Shafquat
Beigi Mandis
Jacob Aptekar
Jimeng Sun
12
0
0
16 May 2025
A Vision-Language Foundation Model for Leaf Disease Identification
Khang Nguyen Quoc
Lan Le Thi Thu
Luyl-Da Quach
VLM
31
0
0
11 May 2025
Post-pre-training for Modality Alignment in Vision-Language Foundation Models
Shin'ya Yamaguchi
Dewei Feng
Sekitoshi Kanai
Kazuki Adachi
Daiki Chijiwa
VLM
34
1
0
17 Apr 2025
DiffCLIP: Differential Attention Meets CLIP
Hasan Hammoud
Guohao Li
VLM
44
0
0
09 Mar 2025
LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression
Souvik Kundu
Anahita Bhiwandiwalla
Sungduk Yu
Phillip Howard
Tiep Le
S. N. Sridhar
David Cobbley
Hao Kang
Vasudev Lal
MQ
59
1
0
06 Mar 2025
Boltzmann Attention Sampling for Image Analysis with Small Objects
Theodore Zhao
Sid Kiblawi
Naoto Usuyama
Ho Hin Lee
Sam Preston
Hoifung Poon
Mu-Hsin Wei
MedIm
73
0
0
04 Mar 2025
Making Better Mistakes in CLIP-Based Zero-Shot Classification with Hierarchy-Aware Language Prompts
Tong Liang
Jim Davis
VLM
96
0
0
04 Mar 2025
Prompt-driven Transferable Adversarial Attack on Person Re-Identification with Attribute-aware Textual Inversion
Yuan Bian
Min Liu
Yunqi Yi
Xueping Wang
Yaonan Wang
AAML
45
0
0
27 Feb 2025
Detecting Content Rating Violations in Android Applications: A Vision-Language Approach
Dishanika Denipitiyage
B. Silva
Suranga Seneviratne
A. Seneviratne
Sanjay Chawla
48
0
0
07 Feb 2025
Unified Framework for Open-World Compositional Zero-shot Learning
Hirunima Jayasekara
Khoi Pham
Nirat Saini
Abhinav Shrivastava
64
0
0
05 Dec 2024
ResCLIP: Residual Attention for Training-free Dense Vision-language Inference
Yuhang Yang
Jinhong Deng
Wen Li
Lixin Duan
VLM
81
0
0
24 Nov 2024
Uni-Mlip: Unified Self-supervision for Medical Vision Language Pre-training
Ameera Bawazir
Kebin Wu
Wenbin Li
CLIP
77
1
0
20 Nov 2024
Transmission Line Defect Detection Based on UAV Patrol Images and Vision-language Pretraining
Ke Zhang
Zhaoye Zheng
Yurong Guo
Jiacun Wang
Jiyuan Yang
Yangjie Xiao
VLM
79
0
0
18 Nov 2024
Past, Present, and Future of Sensor-Based Human Activity Recognition Using Wearables: A Surveying Tutorial on a Still Challenging Task
H. Haresamudram
Chi Ian Tang
Sungho Suh
P. Lukowicz
Thomas Ploetz
76
2
0
11 Nov 2024
CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP
Tianyu Yang
Lisen Dai
Zheyuan Liu
Xiangqi Wang
Meng Jiang
Yapeng Tian
Xiangliang Zhang
VLM
MU
37
4
0
30 Oct 2024
MedImageInsight: An Open-Source Embedding Model for General Domain Medical Imaging
Noel C. F. Codella
Ying Jin
Shrey Jain
Yu Gu
Ho Hin Lee
...
Lei Li
Thomas Lin
Ivan Tarapov
M. Lungren
Mu-Hsin Wei
LM&MA
VLM
MedIm
48
8
0
09 Oct 2024
Knowledge-Enhanced Facial Expression Recognition with Emotional-to-Neutral Transformation
Hangyu Li
Yihan Xu
Jiangchao Yao
Nannan Wang
Xinbo Gao
Bo Han
41
0
0
13 Sep 2024
Optimizing CLIP Models for Image Retrieval with Maintained Joint-Embedding Alignment
Konstantin Schall
Kai Uwe Barthel
Nico Hezel
Klaus Jung
VLM
36
3
0
03 Sep 2024
Brant-X: A Unified Physiological Signal Alignment Framework
Daoze Zhang
Zhizhang Yuan
Junru Chen
Kerui Chen
Yang Yang
33
7
0
28 Aug 2024
HPT++: Hierarchically Prompting Vision-Language Models with Multi-Granularity Knowledge Generation and Improved Structure Modeling
Yubin Wang
Xinyang Jiang
De Cheng
Wenli Sun
Dongsheng Li
Cairong Zhao
VLM
48
0
0
27 Aug 2024
Limitations in Employing Natural Language Supervision for Sensor-Based Human Activity Recognition -- And Ways to Overcome Them
H. Haresamudram
Apoorva Beedu
Mashfiqui Rabbi
Sankalita Saha
Irfan Essa
Thomas Ploetz
33
4
0
21 Aug 2024
FMiFood: Multi-modal Contrastive Learning for Food Image Classification
Xinyue Pan
Jiangpeng He
F. Zhu
44
2
0
07 Aug 2024
Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks
Hunmin Yang
Jongoh Jeong
Kuk-Jin Yoon
AAML
VLM
60
4
0
30 Jul 2024
Detached and Interactive Multimodal Learning
Yunfeng Fan
Wenchao Xu
Yining Qi
Junhong Liu
Song Guo
49
3
0
28 Jul 2024
Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification
Wenshuo Peng
Kaipeng Zhang
Yue Yang
Hao Zhang
Ping Luo
VLM
29
2
0
11 Jul 2024
Foundational Models for Pathology and Endoscopy Images: Application for Gastric Inflammation
H. Kerdegari
Kyle Higgins
Dennis Veselkov
I. Laponogov
I. Poļaka
...
Junior Andrea Pescino
M. Leja
M. Dinis-Ribeiro
T. F. Kanonnikoff
Kirill Veselkov
35
3
0
26 Jun 2024
Duoduo CLIP: Efficient 3D Understanding with Multi-View Images
Han-Hung Lee
Yiming Zhang
Angel X. Chang
3DPC
48
3
0
17 Jun 2024
Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
Miaosen Zhang
Yixuan Wei
Zhen Xing
Yifei Ma
Zuxuan Wu
...
Zheng-Wei Zhang
Qi Dai
Chong Luo
Xin Geng
Baining Guo
VLM
51
1
0
13 Jun 2024
Exploring the Spectrum of Visio-Linguistic Compositionality and Recognition
Youngtaek Oh
Pyunghwan Ahn
Jinhyung Kim
Gwangmo Song
Soonyoung Lee
In So Kweon
Junmo Kim
CoGe
48
2
0
13 Jun 2024
Understanding Visual Concepts Across Models
Brandon Trabucco
Max Gurinas
Kyle Doherty
Ruslan Salakhutdinov
VLM
45
0
0
11 Jun 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma
Yash Bhalgat
Brandon Smart
Shuai Chen
Xinghui Li
...
Matthias Nießner
Ian D Reid
Angel X. Chang
Iro Laina
V. Prisacariu
LRM
33
13
0
16 May 2024
UniDEC : Unified Dual Encoder and Classifier Training for Extreme Multi-Label Classification
Siddhant Kharbanda
Devaansh Gupta
K. Gururaj
Pankaj Malhotra
Cho-Jui Hsieh
Rohit Babbar
44
0
0
04 May 2024
Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models
Yifei Ming
Yixuan Li
VLM
39
7
0
02 May 2024
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Sachin Mehta
Maxwell Horton
Fartash Faghri
Mohammad Hossein Sekhavat
Mahyar Najibi
Mehrdad Farajtabar
Oncel Tuzel
Mohammad Rastegari
VLM
CLIP
44
6
0
24 Apr 2024
Adaptive Prompt Learning with Negative Textual Semantics and Uncertainty Modeling for Universal Multi-Source Domain Adaptation
Yuxiang Yang
L. Wen
Yuanyuan Xu
Jiliu Zhou
Yan Wang
VLM
35
1
0
23 Apr 2024
A Progressive Framework of Vision-language Knowledge Distillation and Alignment for Multilingual Scene
Wenbo Zhang
Yifan Zhang
Jianfeng Lin
Binqiang Huang
Jinlu Zhang
Wenhao Yu
VLM
44
2
0
17 Apr 2024
Unifying Global and Local Scene Entities Modelling for Precise Action Spotting
Kim Hoang Tran
Phuc Vuong Do
Ngoc Quoc Ly
Ngan Le
36
4
0
15 Apr 2024
PM2: A New Prompting Multi-modal Model Paradigm for Few-shot Medical Image Classification
Zhenwei Wang
Qiule Sun
Bingbing Zhang
Pengfei Wang
Jianxin Zhang
Qiang Zhang
VLM
38
1
0
13 Apr 2024
Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation
Sina Hajimiri
Ismail Ben Ayed
Jose Dolz
VLM
41
22
0
12 Apr 2024
Hyperbolic Learning with Synthetic Captions for Open-World Detection
Fanjie Kong
Yanbei Chen
Jiarui Cai
Davide Modolo
VLM
ObjD
36
7
0
07 Apr 2024
Heterogeneous Contrastive Learning for Foundation Models and Beyond
Lecheng Zheng
Baoyu Jing
Zihao Li
Hanghang Tong
Jingrui He
VLM
38
19
0
30 Mar 2024
TransFusion: Contrastive Learning with Transformers
Huanran Li
Daniel Pimentel-Alarcón
42
0
0
27 Mar 2024
DreamLIP: Language-Image Pre-training with Long Captions
Kecheng Zheng
Yifei Zhang
Wei Wu
Fan Lu
Shuailei Ma
Xin Jin
Wei Chen
Yujun Shen
VLM
CLIP
47
26
0
25 Mar 2024
UniBind: LLM-Augmented Unified and Balanced Representation Space to Bind Them All
Yuanhuiyi Lyu
Xueye Zheng
Jiazhou Zhou
Lin Wang
32
18
0
19 Mar 2024
Compositional Kronecker Context Optimization for Vision-Language Models
Kun Ding
Xiaohui Li
Qiang Yu
Ying Wang
Haojian Zhang
Shiming Xiang
VLM
44
0
0
18 Mar 2024
Computer User Interface Understanding. A New Dataset and a Learning Framework
Andrés Munoz
Daniel Borrajo
30
0
0
15 Mar 2024
Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework
Vu Minh Hieu Phan
Yutong Xie
Yuankai Qi
Lingqiao Liu
Liyang Liu
Bowen Zhang
Zhibin Liao
Qi Wu
Minh Nguyen Nhat To
Johan W. Verjans
70
11
0
12 Mar 2024
FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks
Muhammad Gul Zain Ali Khan
Muhammad Ferjad Naeem
F. Tombari
Luc Van Gool
Didier Stricker
Muhammad Zeshan Afzal
VLM
CLIP
47
3
0
11 Mar 2024
Binding Touch to Everything: Learning Unified Multimodal Tactile Representations
Fengyu Yang
Chao Feng
Ziyang Chen
Hyoungseob Park
Daniel Wang
...
Ziyao Zeng
Xien Chen
Rit Gangopadhyay
Andrew Owens
Alex Wong
42
59
0
31 Jan 2024