2103.00020
Learning Transferable Visual Models From Natural Language Supervision
26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
Papers citing
"Learning Transferable Visual Models From Natural Language Supervision"
50 / 9,770 papers shown
Robust fine-tuning of zero-shot models
Mitchell Wortsman
Gabriel Ilharco
Jong Wook Kim
Mike Li
Simon Kornblith
...
Raphael Gontijo-Lopes
Hannaneh Hajishirzi
Ali Farhadi
Hongseok Namkoong
Ludwig Schmidt
VLM
61
689
0
04 Sep 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
348
2,271
0
02 Sep 2021
Aligning Cross-lingual Sentence Representations with Dual Momentum Contrast
Liang Wang
Wei-Ye Zhao
Jingming Liu
32
14
0
01 Sep 2021
Fine-Grained Chemical Entity Typing with Multimodal Knowledge Representation
Chenkai Sun
Weijian Li
Jinfeng Xiao
Nikolaus Nova Parulian
ChengXiang Zhai
Heng Ji
44
4
0
29 Aug 2021
LocTex: Learning Data-Efficient Visual Representations from Localized Textual Supervision
Zhijian Liu
Simon Stent
Jie Li
John Gideon
Song Han
VLM
25
10
0
26 Aug 2021
EncoderMI: Membership Inference against Pre-trained Encoders in Contrastive Learning
Hongbin Liu
Jinyuan Jia
Wenjie Qu
Neil Zhenqiang Gong
4
94
0
25 Aug 2021
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
Zirui Wang
Jiahui Yu
Adams Wei Yu
Zihang Dai
Yulia Tsvetkov
Yuan Cao
VLM
MLLM
51
779
0
24 Aug 2021
Supervised Compression for Resource-Constrained Edge Computing Systems
Yoshitomo Matsubara
Ruihan Yang
Marco Levorato
Stephan Mandt
19
56
0
21 Aug 2021
MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition
Jiawei Chen
C. Ho
ViT
26
77
0
20 Aug 2021
ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis
Patrick Esser
Robin Rombach
A. Blattmann
Bjorn Ommer
DiffM
24
156
0
19 Aug 2021
Contrastive Language-Image Pre-training for the Italian Language
Federico Bianchi
Giuseppe Attanasio
Raphael Pisoni
Silvia Terragni
Gabriele Sarti
S. Lakshmi
VLM
CLIP
31
29
0
19 Aug 2021
MVCNet: Multiview Contrastive Network for Unsupervised Representation Learning for 3D CT Lesions
Penghua Zhai
Huaiwei Cong
Gangming Zhao
Chaowei Fang
Jinpeng Li
Ting Cai
Huiguang He
21
10
0
17 Aug 2021
Billion-Scale Pretraining with Vision Transformers for Multi-Task Visual Representations
Josh Beal
Hao Wu
Dong Huk Park
Andrew Zhai
Dmitry Kislyuk
ViT
18
29
0
12 Aug 2021
SoK: How Robust is Image Classification Deep Neural Network Watermarking? (Extended Version)
Nils Lukas
Edward Jiang
Xinda Li
Florian Kerschbaum
AAML
36
86
0
11 Aug 2021
Knowledge accumulating: The general pattern of learning
Zhuoran Xu
Hao Liu
CLL
24
0
0
09 Aug 2021
BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning
Jinyuan Jia
Yupei Liu
Neil Zhenqiang Gong
SILM
SSL
24
151
0
01 Aug 2021
Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining
Xunlin Zhan
Yangxin Wu
Xiao Dong
Yunchao Wei
Minlong Lu
Yichi Zhang
Hang Xu
Xiaodan Liang
ViT
29
64
0
30 Jul 2021
Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization
Chiyuan Zhang
M. Raghu
Jon M. Kleinberg
Samy Bengio
OOD
32
30
0
27 Jul 2021
Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP
D. Pakhomov
Sanchit Hira
Narayani Wagle
K. Green
Nassir Navab
VLM
32
31
0
26 Jul 2021
LARGE: Latent-Based Regression through GAN Semantics
Yotam Nitzan
Rinon Gal
Ofir Brenner
Daniel Cohen-Or
GAN
29
26
0
22 Jul 2021
Theoretical foundations and limits of word embeddings: what types of meaning can they capture?
Alina Arseniev-Koehler
30
19
0
22 Jul 2021
CycleMLP: A MLP-like Architecture for Dense Prediction
Shoufa Chen
Enze Xie
Chongjian Ge
Runjian Chen
Ding Liang
Ping Luo
33
231
0
21 Jul 2021
QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries
Jie Lei
Tamara L. Berg
Joey Tianyi Zhou
ViT
24
62
0
20 Jul 2021
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
Junnan Li
Ramprasaath R. Selvaraju
Akhilesh Deepak Gotmare
Shafiq R. Joty
Caiming Xiong
S. Hoi
FaML
62
1,889
0
16 Jul 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
67
254
0
14 Jul 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Joey Tianyi Zhou
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
202
405
0
13 Jul 2021
eProduct: A Million-Scale Visual Search Benchmark to Address Product Recognition Challenges
Jiangbo Yuan
An-Ti Chiang
Wen Tang
A. Haro
VLM
22
6
0
13 Jul 2021
Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset
Hannah Rose Kirk
Yennie Jun
Paulius Rauba
Gal Wachtel
Ruining Li
Xingjian Bai
Noah Broestl
Martin Doff-Sotta
Aleksandar Shtedritski
Yuki M. Asano
16
25
0
09 Jul 2021
LanguageRefer: Spatial-Language Model for 3D Visual Grounding
Junha Roh
Karthik Desingh
Ali Farhadi
D. Fox
22
95
0
07 Jul 2021
Predicting with Confidence on Unseen Distributions
Devin Guillory
Vaishaal Shankar
Sayna Ebrahimi
Trevor Darrell
Ludwig Schmidt
UQCV
OOD
20
116
0
07 Jul 2021
CLIP-It! Language-Guided Video Summarization
Medhini Narasimhan
Anna Rohrbach
Trevor Darrell
CLIP
17
113
0
01 Jul 2021
Applications of the Free Energy Principle to Machine Learning and Neuroscience
Beren Millidge
DRL
20
7
0
30 Jun 2021
The Evolution of Out-of-Distribution Robustness Throughout Fine-Tuning
Anders Andreassen
Yasaman Bahri
Behnam Neyshabur
Rebecca Roelofs
OOD
OODD
30
78
0
30 Jun 2021
Data Poisoning Won't Save You From Facial Recognition
Evani Radiya-Dixit
Sanghyun Hong
Nicholas Carlini
Florian Tramèr
AAML
PICV
15
57
0
28 Jun 2021
CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders
Kevin Frans
Lisa Soros
Olaf Witkowski
CLIP
19
204
0
28 Jun 2021
Visual Conceptual Blending with Large-scale Language and Vision Models
Songwei Ge
Devi Parikh
VLM
DiffM
22
14
0
27 Jun 2021
Core Challenges in Embodied Vision-Language Planning
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
LM&Ro
47
45
0
26 Jun 2021
Multimodal Few-Shot Learning with Frozen Language Models
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
MLLM
53
749
0
25 Jun 2021
Fairness for Image Generation with Uncertain Sensitive Attributes
A. Jalal
Sushrut Karmalkar
Jessica Hoffmann
A. Dimakis
Eric Price
DiffM
32
39
0
23 Jun 2021
DocFormer: End-to-End Transformer for Document Understanding
Srikar Appalaraju
Bhavan A. Jasani
Bhargava Urala Kota
Yusheng Xie
R. Manmatha
ViT
32
270
0
22 Jun 2021
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP
Han Fang
Pengfei Xiong
Luhui Xu
Yu Chen
CLIP
VLM
23
292
0
21 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
37
209
0
17 Jun 2021
Poisoning and Backdooring Contrastive Learning
Nicholas Carlini
Andreas Terzis
41
156
0
17 Jun 2021
A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection
Jie Jessie Ren
Stanislav Fort
J. Liu
Abhijit Guha Roy
Shreyas Padhy
Balaji Lakshminarayanan
UQCV
33
216
0
16 Jun 2021
Revisiting the Calibration of Modern Neural Networks
Matthias Minderer
Josip Djolonga
Rob Romijnders
F. Hubis
Xiaohua Zhai
N. Houlsby
Dustin Tran
Mario Lucic
UQCV
51
358
0
15 Jun 2021
Communicating Natural Programs to Humans and Machines
Samuel Acquaviva
Yewen Pu
Marta Kryven
Theo Sechopoulos
Catherine Wong
Gabrielle Ecanow
Maxwell Nye
Michael Henry Tessler
J. Tenenbaum
30
40
0
15 Jun 2021
Improved Transformer for High-Resolution GANs
Long Zhao
Zizhao Zhang
Ting Chen
Dimitris N. Metaxas
Han Zhang
ViT
34
95
0
14 Jun 2021
Partial success in closing the gap between human and machine vision
Robert Geirhos
Kantharaju Narayanappa
Benjamin Mitzkus
Tizian Thieringer
Matthias Bethge
Felix Wichmann
Wieland Brendel
VLM
AAML
48
221
0
14 Jun 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
40
815
0
14 Jun 2021
D2C: Diffusion-Denoising Models for Few-shot Conditional Generation
Abhishek Sinha
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
30
118
0
12 Jun 2021