ResearchTrend.AI
Learning Transferable Visual Models From Natural Language Supervision

26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
    CLIP, VLM
ArXiv (abs) | PDF | HTML | Github (29177★)

Papers citing "Learning Transferable Visual Models From Natural Language Supervision"

50 / 1,722 papers shown
Title
Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model
Huan Ma
Yan Zhu
Changqing Zhang
Peilin Zhao
Baoyuan Wu
Long-Kai Huang
Qinghua Hu
Bing Wu
VLM
115
2
0
01 Mar 2024
Large Convolutional Model Tuning via Filter Subspace
Wei Chen
Zichen Miao
Qiang Qiu
221
4
0
01 Mar 2024
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara
Frank Breitinger
Mark Scanlon
133
9
0
29 Feb 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
276
22
0
28 Feb 2024
Balancing Act: Distribution-Guided Debiasing in Diffusion Models
Rishubh Parihar
Abhijnya Bhat
Abhipsa Basu
Saswat Mallick
Jogendra Nath Kundu
R. V. Babu
163
20
0
28 Feb 2024
Learning to Deblur Polarized Images
Chu Zhou
Minggui Teng
Xinyu Zhou
Chao Xu
Boxin Shi
81
2
0
28 Feb 2024
Diffusion Model-Based Image Editing: A Survey
Yi Huang
Jiancheng Huang
Yifan Liu
Mingfu Yan
Jiaxi Lv
Jianzhuang Liu
Wei Xiong
He Zhang
Liangliang Cao
EGVM
220
103
0
27 Feb 2024
LLMs with Industrial Lens: Deciphering the Challenges and Prospects -- A Survey
Ashok Urlana
Charaka Vinayak Kumar
Ajeet Kumar Singh
B. Garlapati
S. Chalamala
Rahul Mishra
103
8
0
22 Feb 2024
Subobject-level Image Tokenization
Delong Chen
Samuel Cahyawijaya
Jianfeng Liu
Baoyuan Wang
Pascale Fung
VLM, OCL
258
9
0
22 Feb 2024
Visual Concept-driven Image Generation with Text-to-Image Diffusion Model
Tanzila Rahman
Shweta Mahajan
Hsin-Ying Lee
Jian Ren
Sergey Tulyakov
Leonid Sigal
134
4
0
18 Feb 2024
ASGEA: Exploiting Logic Rules from Align-Subgraphs for Entity Alignment
Yang Luo
Zhuo Chen
Lingbing Guo
Qian Li
Wenxuan Zeng
Zhixin Cai
Jianxin Li
137
5
0
16 Feb 2024
ProtChatGPT: Towards Understanding Proteins with Large Language Models
Chao Wang
Hehe Fan
Ruijie Quan
Yi Yang
93
15
0
15 Feb 2024
World Model on Million-Length Video And Language With Blockwise RingAttention
Hao Liu
Wilson Yan
Matei A. Zaharia
Pieter Abbeel
VGen
114
85
0
13 Feb 2024
Exploring Saliency Bias in Manipulation Detection
Joshua Krinsky
Alan Bettis
Qiuyu Tang
Daniel Moreira
Aparna Bharati
76
3
0
12 Feb 2024
CIC: A Framework for Culturally-Aware Image Captioning
Youngsik Yun
Jihie Kim
VLM
114
6
0
08 Feb 2024
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
Shoubin Yu
Jaehong Yoon
Mohit Bansal
153
7
0
08 Feb 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
212
116
0
08 Feb 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
79
21
0
08 Feb 2024
Sentiment-enhanced Graph-based Sarcasm Explanation in Dialogue
Kun Ouyang
Liqiang Jing
Xuemeng Song
Meng Liu
Yupeng Hu
Liqiang Nie
178
3
0
06 Feb 2024
Multimodal Rationales for Explainable Visual Question Answering
Kun Li
G. Vosselman
Michael Ying Yang
128
2
0
06 Feb 2024
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey
Yi Xin
Jianjiang Yang
Haodi Zhou
Junlong Du
Yue Fan
Qing Li
Yuntao Du
VLM
146
85
0
03 Feb 2024
Neural Slot Interpreters: Grounding Object Semantics in Emergent Slot Representations
Bhishma Dedhia
N. Jha
OCL
125
1
0
02 Feb 2024
Segment Any Change
Zhuo Zheng
Yanfei Zhong
Liangpei Zhang
Stefano Ermon
VLM
83
13
0
02 Feb 2024
Sample, estimate, aggregate: A recipe for causal discovery foundation models
Menghua Wu
Yujia Bao
Regina Barzilay
Tommi Jaakkola
CML
118
7
0
02 Feb 2024
Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks
Maan Qraitem
Nazia Tasnim
Piotr Teterwak
Kate Saenko
Bryan A. Plummer
AAML, VLM
86
12
0
01 Feb 2024
Multimodal Action Quality Assessment
Ling-an Zeng
Wei-Shi Zheng
106
15
0
31 Jan 2024
CCA: Collaborative Competitive Agents for Image Editing
Tiankai Hang
Shuyang Gu
Dong Chen
Xin Geng
Baining Guo
155
5
0
23 Jan 2024
Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation
Ci-Siang Lin
Chien-Yi Wang
Yu-Chiang Frank Wang
Min-Hung Chen
VLM
240
0
0
22 Jan 2024
Out-of-Distribution Detection & Applications With Ablated Learned Temperature Energy
Will LeVine
Benjamin Pikus
Jacob Phillips
Berk Norman
Fernando Amat Gil
Sean Hendryx
OODD
166
1
0
22 Jan 2024
The Faiss library
Matthijs Douze
Alexandr Guzhva
Chengqi Deng
Jeff Johnson
Gergely Szilvasy
Pierre-Emmanuel Mazaré
Maria Lomeli
Lucas Hosseini
Hervé Jégou
197
183
0
16 Jan 2024
End-to-End Crystal Structure Prediction from Powder X-Ray Diffraction
Qingsi Lai
Lin Yao
Zhifeng Gao
Siyuan Liu
Hongshuai Wang
...
Di He
Liwei Wang
Cheng Wang
Guolin Ke
71
8
0
08 Jan 2024
Fus-MAE: A cross-attention-based data fusion approach for Masked Autoencoders in remote sensing
Hugo Chan-To-Hing
B. Veeravalli
88
9
0
05 Jan 2024
GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse
Hongzhan Lin
Ziyang Luo
Bo Wang
Ruichao Yang
Jing Ma
106
31
0
03 Jan 2024
AliFuse: Aligning and Fusing Multi-modal Medical Data for Computer-Aided Diagnosis
Qiuhui Chen
Yi Hong
MedIm
111
1
0
02 Jan 2024
Morphing Tokens Draw Strong Masked Image Models
Taekyung Kim
Byeongho Heo
Dongyoon Han
169
3
0
30 Dec 2023
Discrete Distribution Networks
Lei Yang
121
1
0
29 Dec 2023
3VL: Using Trees to Improve Vision-Language Models' Interpretability
Nir Yellinek
Leonid Karlinsky
Raja Giryes
CoGe, VLM
279
3
0
28 Dec 2023
Learning from Mistakes: Iterative Prompt Relabeling for Text-to-Image Diffusion Model Training
Xinyan Chen
Jiaxin Ge
Tianjun Zhang
Jiaming Liu
Shanghang Zhang
VLM, EGVM
156
0
0
23 Dec 2023
Leveraging Habitat Information for Fine-grained Bird Identification
Tin Nguyen
Peijie Chen
Anh Totti Nguyen
VLM
101
0
0
22 Dec 2023
RealCraft: Attention Control as A Tool for Zero-Shot Consistent Video Editing
Shutong Jin
Ruiyu Wang
Florian T. Pokorny
DiffM, VGen
193
1
0
19 Dec 2023
Scene-Conditional 3D Object Stylization and Composition
Jinghao Zhou
Tomas Jakab
Philip Torr
Christian Rupprecht
DiffM
135
3
0
19 Dec 2023
Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition
Tianlin Li
Yao Rong
Shiao Wang
Yuan Chen
Zhe Wu
Bowei Jiang
Yonghong Tian
Jin Tang
ViT
142
3
0
18 Dec 2023
Tell Me What You See: Text-Guided Real-World Image Denoising
E. Yosef
Raja Giryes
DiffM
117
2
0
15 Dec 2023
Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment
Xiaoxu Xu
Yitian Yuan
Qiudan Zhang
Wen-Bin Wu
Zequn Jie
Lin Ma
Xu Wang
115
4
0
15 Dec 2023
Individualized Deepfake Detection Exploiting Traces Due to Double Neural-Network Operations
Mushfiqur Rahman
Runze Liu
Chau-Wai Wong
Huaiyu Dai
107
0
0
13 Dec 2023
Optimized View and Geometry Distillation from Multi-view Diffuser
Youjia Zhang
Zikai Song
Junqing Yu
Yawei Luo
Wei Yang
122
0
0
11 Dec 2023
Unsupervised Multi-modal Feature Alignment for Time Series Representation Learning
Cheng Liang
Donghua Yang
Zhiyu Liang
Hongzhi Wang
Zheng Liang
Xiyang Zhang
Jianfeng Huang
AI4TS
455
2
0
09 Dec 2023
MIMIR: Masked Image Modeling for Mutual Information-based Adversarial Robustness
Xiaoyun Xu
Shujian Yu
Jingzheng Wu
S. Picek
AAML
88
0
0
08 Dec 2023
Auto-Vocabulary Semantic Segmentation
Osman Ülger
Maksymilian Kulicki
Yuki M. Asano
Martin R. Oswald
VLM
144
2
0
07 Dec 2023
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Arun V. Reddy
William Paul
Corban Rivera
Ketul Shah
Celso M. de Melo
Rama Chellappa
120
4
0
05 Dec 2023