Learning Transferable Visual Models From Natural Language Supervision

26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
    CLIP
    VLM

Papers citing "Learning Transferable Visual Models From Natural Language Supervision"

50 / 10,408 papers shown
IRFL: Image Recognition of Figurative Language
Ron Yosef
Yonatan Bitton
Dafna Shahaf
48
18
0
27 Mar 2023
On the Stepwise Nature of Self-Supervised Learning
James B. Simon
Maksis Knutins
Liu Ziyin
Daniel Geisz
Abraham J. Fetterman
Joshua Albrecht
SSL
42
30
0
27 Mar 2023
Anti-DreamBooth: Protecting users from personalized text-to-image synthesis
T. Le
Hao Phung
Thuan Hoang Nguyen
Quan Dao
Ngoc N. Tran
Anh Tran
33
92
0
27 Mar 2023
EVA-CLIP: Improved Training Techniques for CLIP at Scale
Quan-Sen Sun
Yuxin Fang
Ledell Yu Wu
Xinlong Wang
Yue Cao
CLIP
VLM
81
473
0
27 Mar 2023
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
44
984
0
27 Mar 2023
Text-to-Image Diffusion Models are Zero-Shot Classifiers
Kevin Clark
P. Jaini
DiffM
VLM
38
108
0
27 Mar 2023
Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning
Siteng Huang
Biao Gong
Yutong Feng
Min Zhang
Yiliang Lv
Donglin Wang
CoGe
37
10
0
27 Mar 2023
Contrastive Learning Is Spectral Clustering On Similarity Graph
Zhi-Hao Tan
Yifan Zhang
Jingqin Yang
Yang Yuan
SSL
59
18
0
27 Mar 2023
Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
Weixia Zhang
Guangtao Zhai
Ying Wei
Xiaokang Yang
Kede Ma
VLM
46
173
0
27 Mar 2023
Seer: Language Instructed Video Prediction with Latent Diffusion Models
Xianfan Gu
Chuan Wen
Weirui Ye
Jiaming Song
Yang Gao
DiffM
VGen
26
40
0
27 Mar 2023
Δ-Patching: A Framework for Rapid Adaptation of Pre-trained Convolutional Networks without Base Performance Loss
Chaitanya Devaguptapu
Samarth Sinha
K. J. Joseph
V. Balasubramanian
Animesh Garg
70
1
0
26 Mar 2023
ZBS: Zero-shot Background Subtraction via Instance-level Background Modeling and Foreground Selection
Yongqi An
Xu Zhao
Tao Yu
Haiyun Guo
Chaoyang Zhao
Ming Tang
Jinqiao Wang
40
20
0
26 Mar 2023
PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
Hanlin Wang
Yilu Wu
Sheng Guo
Limin Wang
VGen
DiffM
78
30
0
26 Mar 2023
POAR: Towards Open Vocabulary Pedestrian Attribute Recognition
Yue Zhang
Suchen Wang
Shichao Kan
Zhenyu Weng
Yigang Cen
Yap-Peng Tan
ViT
42
3
0
26 Mar 2023
An Evaluation of Memory Optimization Methods for Training Neural Networks
Xiaoxuan Liu
Siddharth Jha
Alvin Cheung
34
0
0
26 Mar 2023
Learning video embedding space with Natural Language Supervision
P. Uppala
Abhishek Bamotra
S. Priya
Vaidehi Joshi
CLIP
29
1
0
25 Mar 2023
Freestyle Layout-to-Image Synthesis
Han Xue
Z. Huang
Qianru Sun
Li Song
Wenjun Zhang
DiffM
21
62
0
25 Mar 2023
VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud
Ziqin Wang
Bowen Cheng
Lichen Zhao
Dong Xu
Yang Tang
Lu Sheng
3DPC
32
27
0
25 Mar 2023
Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection
Hwanjun Song
Jihwan Bang
VLM
ObjD
34
14
0
25 Mar 2023
Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
Peng Jin
Jinfa Huang
Pengfei Xiong
Shangxuan Tian
Chang-rui Liu
Xiang Ji
Li-ming Yuan
Jie Chen
50
50
0
25 Mar 2023
Local Contrastive Learning for Medical Image Recognition
S. A. Rizvi
Ruixiang Tang
X. Jiang
X. Ma
X. Hu
33
6
0
24 Mar 2023
MindDiffuser: Controlled Image Reconstruction from Human Brain Activity with Semantic and Structural Diffusion
Yizhuo Lu
Changde Du
Dianpeng Wang
Huiguang He
DiffM
138
42
0
24 Mar 2023
Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data
Paul Hager
M. Menten
Daniel Rueckert
36
48
0
24 Mar 2023
Category Query Learning for Human-Object Interaction Classification
Chi Xie
Fangao Zeng
Yue Hu
Shuang Liang
Yichen Wei
VLM
31
20
0
24 Mar 2023
Aligning Step-by-Step Instructional Diagrams to Video Demonstrations
Jiahao Zhang
A. Cherian
Yanbin Liu
Yizhak Ben-Shabat
Cristian Rodriguez-Opazo
Stephen Gould
37
8
0
24 Mar 2023
Artificial-intelligence-based molecular classification of diffuse gliomas using rapid, label-free optical imaging
Todd C. Hollon
Cheng Jiang
Asadur Chowdury
Mustafa Nasir-Moin
A. Kondepudi
...
M. Snuderl
S. Camelo-Piragua
C. Freudiger
Ho Hin Lee
D. Orringer
40
88
0
23 Mar 2023
Three ways to improve feature alignment for open vocabulary detection
Relja Arandjelović
A. Andonian
A. Mensch
Olivier J. Hénaff
Jean-Baptiste Alayrac
Andrew Zisserman
VLM
ObjD
55
19
0
23 Mar 2023
Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition
Stephanie Milani
Anssi Kanervisto
Karolis Ramanauskas
Sander Schulhoff
Brandon Houghton
...
Vinicius G. Goecks
Nicholas R. Waytowich
David Watkins
J. Miller
Rohin Shah
37
16
0
23 Mar 2023
DreamBooth3D: Subject-Driven Text-to-3D Generation
Amit Raj
S. Kaza
Ben Poole
Michael Niemeyer
Nataniel Ruiz
...
Kfir Aberman
Michael Rubinstein
Jonathan T. Barron
Yuanzhen Li
Varun Jampani
DiffM
24
220
0
23 Mar 2023
ReVersion: Diffusion-Based Relation Inversion from Images
Ziqi Huang
Tianxing Wu
Yuming Jiang
Kelvin C. K. Chan
Ziwei Liu
54
68
0
23 Mar 2023
CoBIT: A Contrastive Bi-directional Image-Text Generation Model
Haoxuan You
Mandy Guo
Zhecan Wang
Kai-Wei Chang
Jason Baldridge
Jiahui Yu
DiffM
54
13
0
23 Mar 2023
TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision
Jiacheng Wei
Hao Wang
Jiashi Feng
Guosheng Lin
Kim-Hui Yap
26
30
0
23 Mar 2023
Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World
Qifan Yu
Juncheng Li
Yuehua Wu
Siliang Tang
Wei Ji
Yueting Zhuang
40
34
0
23 Mar 2023
Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels
Zixuan Ding
Ao Wang
Hui Chen
Qiaosheng Zhang
Pengzhang Liu
Yongjun Bao
Weipeng P. Yan
Jungong Han
29
27
0
23 Mar 2023
Explore the Power of Synthetic Data on Few-shot Object Detection
Shaobo Lin
Kun Wang
Xingyu Zeng
Ruili Zhao
40
32
0
23 Mar 2023
Calibrated Out-of-Distribution Detection with a Generic Representation
Tomás Vojír
Jan Sochman
Rahaf Aljundi
Juan E. Sala Matas
OODD
UQCV
32
6
0
23 Mar 2023
Exploring Visual Prompts for Whole Slide Image Classification with Multiple Instance Learning
Yi-Mou Lin
Zhongchen Zhao
Zhengjie Zhu
Lisheng Wang
Kwang-Ting Cheng
Hao Chen
VLM
37
1
0
23 Mar 2023
Keypoint-Guided Optimal Transport
Xiang Gu
Yucheng Yang
Weizhen Zeng
Jian Sun
Zongben Xu
51
1
0
23 Mar 2023
Top-Down Visual Attention from Analysis by Synthesis
Baifeng Shi
Trevor Darrell
Xin Eric Wang
35
30
0
23 Mar 2023
An Extended Study of Human-like Behavior under Adversarial Training
Paul Gavrikov
J. Keuper
Margret Keuper
AAML
37
9
0
22 Mar 2023
Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval
Ding Jiang
Mang Ye
40
140
0
22 Mar 2023
MV-MR: multi-views and multi-representations for self-supervised learning and knowledge distillation
Vitaliy Kinakh
M. Drozdova
Slava Voloshynovskiy
45
1
0
21 Mar 2023
Machine Learning for Brain Disorders: Transformers and Visual Transformers
Robin Courant
Maika Edberg
Nicolas Dufour
Vicky Kalogeiton
MedIm
ViT
40
1
0
21 Mar 2023
VideoXum: Cross-modal Visual and Textural Summarization of Videos
Jingyang Lin
Hang Hua
Ming Chen
Yikang Li
Jenhao Hsiao
C. Ho
Jiebo Luo
38
30
0
21 Mar 2023
Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models
Lukas Höllein
Ang Cao
Andrew Owens
Justin Johnson
Matthias Nießner
DiffM
40
179
0
21 Mar 2023
Multi-modal Prompting for Low-Shot Temporal Action Localization
Chen Ju
Zeqian Li
Peisen Zhao
Ya Zhang
Xiaopeng Zhang
Qi Tian
Yanfeng Wang
Weidi Xie
44
18
0
21 Mar 2023
DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models
Weijia Wu
Yuzhong Zhao
Mike Zheng Shou
Hong Zhou
Chunhua Shen
50
140
0
21 Mar 2023
Detecting the open-world objects with the help of the Brain
Shuailei Ma
Yuefeng Wang
Ying-yu Wei
Peihao Chen
Zhixiang Ye
Jiaqi Fan
Enming Zhang
Thomas H. Li
VLM
ObjD
32
2
0
21 Mar 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
48
48
0
21 Mar 2023
MT-SNN: Enhance Spiking Neural Network with Multiple Thresholds
Xiaoting Wang
Yanxiang Zhang
Yongzhen Zhang
41
5
0
20 Mar 2023