2103.00020
Learning Transferable Visual Models From Natural Language Supervision
26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
Papers citing
"Learning Transferable Visual Models From Natural Language Supervision"
50 / 10,282 papers shown
Title
Contrastive Language-Image Pre-training for the Italian Language
Federico Bianchi
Giuseppe Attanasio
Raphael Pisoni
Silvia Terragni
Gabriele Sarti
S. Lakshmi
VLM
CLIP
38
30
0
19 Aug 2021
An Information Theory-inspired Strategy for Automatic Network Pruning
Xiawu Zheng
Yuexiao Ma
Teng Xi
Gang Zhang
Errui Ding
Yuchao Li
Jie Chen
Yonghong Tian
Rongrong Ji
54
13
0
19 Aug 2021
MVCNet: Multiview Contrastive Network for Unsupervised Representation Learning for 3D CT Lesions
Penghua Zhai
Huaiwei Cong
Gangming Zhao
Chaowei Fang
Jinpeng Li
Ting Cai
Huiguang He
27
10
0
17 Aug 2021
Billion-Scale Pretraining with Vision Transformers for Multi-Task Visual Representations
Josh Beal
Hao Wu
Dong Huk Park
Andrew Zhai
Dmitry Kislyuk
ViT
23
29
0
12 Aug 2021
SoK: How Robust is Image Classification Deep Neural Network Watermarking? (Extended Version)
Nils Lukas
Edward Jiang
Xinda Li
Florian Kerschbaum
AAML
36
87
0
11 Aug 2021
Knowledge accumulating: The general pattern of learning
Zhuoran Xu
Hao Liu
CLL
26
0
0
09 Aug 2021
BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning
Jinyuan Jia
Yupei Liu
Neil Zhenqiang Gong
SILM
SSL
42
152
0
01 Aug 2021
Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining
Xunlin Zhan
Yangxin Wu
Xiao Dong
Yunchao Wei
Minlong Lu
Yichi Zhang
Hang Xu
Xiaodan Liang
ViT
34
64
0
30 Jul 2021
Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization
Chiyuan Zhang
M. Raghu
Jon M. Kleinberg
Samy Bengio
OOD
32
30
0
27 Jul 2021
Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP
D. Pakhomov
Sanchit Hira
Narayani Wagle
K. Green
Nassir Navab
VLM
32
31
0
26 Jul 2021
Language Grounding with 3D Objects
Jesse Thomason
Mohit Shridhar
Yonatan Bisk
Chris Paxton
Luke Zettlemoyer
LM&Ro
28
53
0
26 Jul 2021
Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph
Wentian Zhao
Yao Hu
Heda Wang
Xinxiao Wu
Jiebo Luo
23
47
0
26 Jul 2021
LARGE: Latent-Based Regression through GAN Semantics
Yotam Nitzan
Rinon Gal
Ofir Brenner
Daniel Cohen-Or
GAN
29
26
0
22 Jul 2021
Theoretical foundations and limits of word embeddings: what types of meaning can they capture?
Alina Arseniev-Koehler
36
19
0
22 Jul 2021
CycleMLP: A MLP-like Architecture for Dense Prediction
Shoufa Chen
Enze Xie
Chongjian Ge
Runjian Chen
Ding Liang
Ping Luo
33
231
0
21 Jul 2021
QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries
Jie Lei
Tamara L. Berg
Joey Tianyi Zhou
ViT
24
63
0
20 Jul 2021
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
Junnan Li
Ramprasaath R. Selvaraju
Akhilesh Deepak Gotmare
Chenyu You
Caiming Xiong
Guosheng Lin
FaML
83
1,893
0
16 Jul 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
69
256
0
14 Jul 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Joey Tianyi Zhou
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
202
406
0
13 Jul 2021
eProduct: A Million-Scale Visual Search Benchmark to Address Product Recognition Challenges
Jiangbo Yuan
An-Ti Chiang
Wen Tang
A. Haro
VLM
22
6
0
13 Jul 2021
Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset
Hannah Rose Kirk
Yennie Jun
Paulius Rauba
Gal Wachtel
Ruining Li
Xingjian Bai
Noah Broestl
Martin Doff-Sotta
Aleksandar Shtedritski
Yuki M. Asano
24
25
0
09 Jul 2021
LanguageRefer: Spatial-Language Model for 3D Visual Grounding
Junha Roh
Karthik Desingh
Ali Farhadi
Dieter Fox
22
95
0
07 Jul 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
86
5,161
0
07 Jul 2021
Predicting with Confidence on Unseen Distributions
Devin Guillory
Vaishaal Shankar
Sayna Ebrahimi
Trevor Darrell
Ludwig Schmidt
UQCV
OOD
30
117
0
07 Jul 2021
CLIP-It! Language-Guided Video Summarization
Medhini Narasimhan
Anna Rohrbach
Trevor Darrell
CLIP
26
113
0
01 Jul 2021
Applications of the Free Energy Principle to Machine Learning and Neuroscience
Beren Millidge
DRL
28
7
0
30 Jun 2021
The Evolution of Out-of-Distribution Robustness Throughout Fine-Tuning
Anders Andreassen
Yasaman Bahri
Behnam Neyshabur
Rebecca Roelofs
OOD
OODD
30
79
0
30 Jun 2021
Data Poisoning Won't Save You From Facial Recognition
Evani Radiya-Dixit
Sanghyun Hong
Nicholas Carlini
Florian Tramèr
AAML
PICV
22
57
0
28 Jun 2021
CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders
Kevin Frans
Lisa Soros
Olaf Witkowski
CLIP
35
205
0
28 Jun 2021
Visual Conceptual Blending with Large-scale Language and Vision Models
Songwei Ge
Devi Parikh
VLM
DiffM
30
14
0
27 Jun 2021
Core Challenges in Embodied Vision-Language Planning
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
LM&Ro
54
45
0
26 Jun 2021
Multimodal Few-Shot Learning with Frozen Language Models
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
MLLM
88
755
0
25 Jun 2021
Fairness for Image Generation with Uncertain Sensitive Attributes
A. Jalal
Sushrut Karmalkar
Jessica Hoffmann
A. Dimakis
Eric Price
DiffM
35
39
0
23 Jun 2021
DocFormer: End-to-End Transformer for Document Understanding
Srikar Appalaraju
Bhavan A. Jasani
Bhargava Urala Kota
Yusheng Xie
R. Manmatha
ViT
41
273
0
22 Jun 2021
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP
Han Fang
Pengfei Xiong
Luhui Xu
Yu Chen
CLIP
VLM
35
292
0
21 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
40
209
0
17 Jun 2021
Poisoning and Backdooring Contrastive Learning
Nicholas Carlini
Andreas Terzis
46
158
0
17 Jun 2021
A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection
Jie Jessie Ren
Stanislav Fort
J. Liu
Abhijit Guha Roy
Shreyas Padhy
Balaji Lakshminarayanan
UQCV
33
219
0
16 Jun 2021
Revisiting the Calibration of Modern Neural Networks
Matthias Minderer
Josip Djolonga
Rob Romijnders
F. Hubis
Xiaohua Zhai
N. Houlsby
Dustin Tran
Mario Lucic
UQCV
51
358
0
15 Jun 2021
Communicating Natural Programs to Humans and Machines
Samuel Acquaviva
Yewen Pu
Marta Kryven
Theo Sechopoulos
Catherine Wong
Gabrielle Ecanow
Maxwell Nye
Michael Henry Tessler
J. Tenenbaum
38
40
0
15 Jun 2021
Improved Transformer for High-Resolution GANs
Long Zhao
Zizhao Zhang
Ting Chen
Dimitris N. Metaxas
Han Zhang
ViT
34
95
0
14 Jun 2021
Partial success in closing the gap between human and machine vision
Robert Geirhos
Kantharaju Narayanappa
Benjamin Mitzkus
Tizian Thieringer
Matthias Bethge
Felix Wichmann
Wieland Brendel
VLM
AAML
50
222
0
14 Jun 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
58
818
0
14 Jun 2021
D2C: Diffusion-Denoising Models for Few-shot Conditional Generation
Abhishek Sinha
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
30
118
0
12 Jun 2021
Assessing Multilingual Fairness in Pre-trained Multimodal Representations
Jialu Wang
Yang Liu
Junfeng Fang
EGVM
26
35
0
12 Jun 2021
Neural Symbolic Regression that Scales
Luca Biggio
Tommaso Bendinelli
Alexander Neitz
Aurelien Lucchi
Giambattista Parascandolo
54
171
0
11 Jun 2021
What Can Knowledge Bring to Machine Learning? -- A Survey of Low-shot Learning for Structured Data
Yang Hu
Adriane P. Chapman
Guihua Wen
Dame Wendy Hall
44
24
0
11 Jun 2021
Learning to See by Looking at Noise
Manel Baradad
Jonas Wulff
Tongzhou Wang
Phillip Isola
Antonio Torralba
33
89
0
10 Jun 2021
Pivotal Tuning for Latent-based Editing of Real Images
Daniel Roich
Ron Mokady
Amit H. Bermano
Daniel Cohen-Or
DiffM
34
522
0
10 Jun 2021
Taxonomy of Machine Learning Safety: A Survey and Primer
Sina Mohseni
Haotao Wang
Zhiding Yu
Chaowei Xiao
Zhangyang Wang
J. Yadawa
31
31
0
09 Jun 2021