2103.00020
Learning Transferable Visual Models From Natural Language Supervision
26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
Papers citing
"Learning Transferable Visual Models From Natural Language Supervision"
50 / 10,414 papers shown
Title
Evaluating Synthetic Pre-Training for Handwriting Processing Tasks
Vittorio Pippi
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
33
5
0
04 Apr 2023
Learning to Name Classes for Vision and Language Models
Sarah Parisot
Yongxin Yang
Jingyu Sun
VLM
19
10
0
04 Apr 2023
Black Box Few-Shot Adaptation for Vision-Language models
Yassine Ouali
Adrian Bulat
Brais Martínez
Georgios Tzimiropoulos
VLM
39
31
0
04 Apr 2023
Towards Open-Vocabulary Video Instance Segmentation
Haochen Wang
Cilin Yan
Shuailong Wang
Xiaolong Jiang
Xu Tang
Yao Hu
Weidi Xie
E. Gavves
VOS
VLM
30
29
0
04 Apr 2023
Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models
Jaewoong Lee
Sang-Sub Jang
Jaehyeong Jo
Jaehong Yoon
Yunji Kim
Jin-Hwa Kim
Jung-Woo Ha
Sung Ju Hwang
DiffM
37
4
0
04 Apr 2023
Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning
Ajinkya Tejankar
Maziar Sanjabi
Qifan Wang
Sinong Wang
Hamed Firooz
Hamed Pirsiavash
L Tan
AAML
30
19
0
04 Apr 2023
The Vector Grounding Problem
Dimitri Coelho Mollo
Raphael Milliere
49
26
0
04 Apr 2023
Unsupervised Brain Tumor Segmentation with Image-based Prompts
Xinru Zhang
N. Ou
Chenghao Liu
Z. Zhuo
Yao Liu
Chuyang Ye
VLM
27
2
0
04 Apr 2023
Exploring Vision-Language Models for Imbalanced Learning
Yidong Wang
Zhuohao Yu
Jindong Wang
Qiang Heng
Haoxing Chen
Wei Ye
Rui Xie
Xingxu Xie
Shi-Bo Zhang
VLM
46
30
0
04 Apr 2023
Navigating to Objects Specified by Images
Jacob Krantz
Théophile Gervet
Karmesh Yadav
Austin S. Wang
Chris Paxton
Roozbeh Mottaghi
Dhruv Batra
Jitendra Malik
Stefan Lee
Devendra Singh Chaplot
50
36
0
03 Apr 2023
ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
Mingyuan Zhang
Xinying Guo
Liang Pan
Zhongang Cai
Fangzhou Hong
Huirong Li
Lei Yang
Ziwei Liu
DiffM
VGen
44
158
0
03 Apr 2023
RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding
Jihan Yang
Runyu Ding
Weipeng Deng
Zhe Wang
Xiaojuan Qi
33
62
0
03 Apr 2023
HypLiLoc: Towards Effective LiDAR Pose Regression with Hyperbolic Fusion
Sijie Wang
Qiyu Kang
Rui She
Wei Wang
K. Zhao
Yang Song
Wee Peng Tay
47
18
0
03 Apr 2023
MetaHead: An Engine to Create Realistic Digital Head
Dingyun Zhang
Chenglai Zhong
Yudong Guo
Yang Hong
Juyong Zhang
3DH
23
4
0
03 Apr 2023
Probabilistic Prompt Learning for Dense Prediction
Hyeongjun Kwon
Taeyong Song
Somi Jeong
Jin-Hwa Kim
Jinhyun Jang
Kwanghoon Sohn
VLM
56
19
0
03 Apr 2023
Multi-Modal Representation Learning with Text-Driven Soft Masks
Jaeyoo Park
Bohyung Han
SSL
30
4
0
03 Apr 2023
SPAN: Learning Similarity between Scene Graphs and Images with Transformers
Yuren Cong
Wentong Liao
Bodo Rosenhahn
M. Yang
42
6
0
02 Apr 2023
Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images
Roberto Amoroso
Davide Morelli
Marcella Cornia
Lorenzo Baraldi
A. Bimbo
Rita Cucchiara
DiffM
54
29
0
02 Apr 2023
Sketch-based Video Object Localization
Sangmin Woo
So-Yeong Jeon
Jinyoung Park
Minji Son
Sumin Lee
Changick Kim
36
0
0
02 Apr 2023
Subject-driven Text-to-Image Generation via Apprenticeship Learning
Wenhu Chen
Hexiang Hu
Yandong Li
Nataniel Ruiz
Xuhui Jia
Ming-Wei Chang
William W. Cohen
DiffM
36
187
0
01 Apr 2023
PrefGen: Preference Guided Image Generation with Relative Attributes
Alec Helbling
Christopher Rozell
Matthew R. O’Shaughnessy
Kion Fallah
CVBM
16
0
0
01 Apr 2023
Weakly-Supervised Text-driven Contrastive Learning for Facial Behavior Understanding
Xiang Zhang
Taoyue Wang
Xiaotian Li
Huiyuan Yang
L. Yin
54
9
0
31 Mar 2023
∞-Diff: Infinite Resolution Diffusion with Subsampled Mollified States
Sam Bond-Taylor
Chris G. Willcocks
39
15
0
31 Mar 2023
DIME-FM: DIstilling Multimodal and Efficient Foundation Models
Ximeng Sun
Pengchuan Zhang
Peizhao Zhang
Hardik Shah
Kate Saenko
Xide Xia
VLM
35
20
0
31 Mar 2023
Procedure-Aware Pretraining for Instructional Video Understanding
Honglu Zhou
Roberto Martín-Martín
Mubbasir Kapadia
Silvio Savarese
Juan Carlos Niebles
59
39
0
31 Mar 2023
Trade-offs in Fine-tuned Diffusion Models Between Accuracy and Interpretability
Mischa Dombrowski
Hadrien Reynaud
Johanna P. Müller
Matthew Baugh
Bernhard Kainz
MedIm
26
6
0
31 Mar 2023
Vision-Language Modelling For Radiological Imaging and Reports In The Low Data Regime
Rhydian Windsor
A. Jamaludin
T. Kadir
Andrew Zisserman
VLM
32
11
0
30 Mar 2023
When Learning Is Out of Reach, Reset: Generalization in Autonomous Visuomotor Reinforcement Learning
Zichen Zhang
Luca Weihs
OffRL
29
5
0
30 Mar 2023
Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
Wen Wang
Yan Jiang
K. Xie
Zide Liu
Hao Chen
Yue Cao
Xinlong Wang
Chunhua Shen
DiffM
VGen
36
112
0
30 Mar 2023
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
Eric Zhang
Kai Wang
Xingqian Xu
Zhangyang Wang
Humphrey Shi
DiffM
51
177
0
30 Mar 2023
FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation
Jie Qin
Jie Wu
Pengxiang Yan
Ming Li
Ren Yuxi
...
Yitong Wang
Rui Wang
Shilei Wen
X. Pan
Xingang Wang
SSeg
VLM
26
89
0
30 Mar 2023
Discriminative Class Tokens for Text-to-Image Diffusion Models
Idan Schwartz
Vésteinn Snaebjarnarson
Hila Chefer
Ryan Cotterell
Serge Belongie
Lior Wolf
Sagie Benaim
38
9
0
30 Mar 2023
Mole Recruitment: Poisoning of Image Classifiers via Selective Batch Sampling
Ethan Wisdom
Tejas Gokhale
Chaowei Xiao
Yezhou Yang
33
0
0
30 Mar 2023
DiffCollage: Parallel Generation of Large Content with Diffusion Models
Qinsheng Zhang
Jiaming Song
Xun Huang
Yongxin Chen
Xuan Li
DiffM
36
82
0
30 Mar 2023
What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
Brian Chen
Nina Shvetsova
Andrew Rouditchenko
D. Kondermann
Samuel Thomas
Shih-Fu Chang
Rogerio Feris
James R. Glass
Hilde Kuehne
42
7
0
29 Mar 2023
AutoAD: Movie Description in Context
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
31
35
0
29 Mar 2023
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks
Weicheng Kuo
A. Piergiovanni
Dahun Kim
Xiyang Luo
Benjamin Caine
...
Luowei Zhou
Andrew M. Dai
Zhifeng Chen
Claire Cui
A. Angelova
MLLM
VLM
44
24
0
29 Mar 2023
Qualitative Failures of Image Generation Models and Their Application in Detecting Deepfakes
Ali Borji
36
30
0
29 Mar 2023
MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion Path
Qian Wang
Biao Zhang
Michael Birsak
Peter Wonka
DiffM
30
17
0
29 Mar 2023
Hierarchical Video-Moment Retrieval and Step-Captioning
Abhaysinh Zala
Jaemin Cho
Satwik Kottur
Xilun Chen
Barlas Oğuz
Yashar Mehdad
Joey Tianyi Zhou
3DV
22
51
0
29 Mar 2023
Towards Foundation Models and Few-Shot Parameter-Efficient Fine-Tuning for Volumetric Organ Segmentation
Julio Silva-Rodríguez
Jose Dolz
Ismail Ben Ayed
66
13
0
29 Mar 2023
Data Efficient Contrastive Learning in Histopathology using Active Sampling
Tahsin Reasat
David S. Smith
MedIm
33
0
0
28 Mar 2023
Your Diffusion Model is Secretly a Zero-Shot Classifier
Alexander C. Li
Mihir Prabhudesai
Shivam Duggal
Ellis L Brown
Deepak Pathak
DiffM
VLM
63
227
0
28 Mar 2023
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
Renrui Zhang
Jiaming Han
Chris Liu
Peng Gao
Aojun Zhou
Xiangfei Hu
Shilin Yan
Pan Lu
Hongsheng Li
Yu Qiao
MLLM
79
751
0
28 Mar 2023
Visual Chain-of-Thought Diffusion Models
William Harvey
Frank Wood
DiffM
VLM
41
7
0
28 Mar 2023
Variational Distribution Learning for Unsupervised Text-to-Image Generation
Minsoo Kang
Doyup Lee
Jiseob Kim
Saehoon Kim
Bohyung Han
DRL
OOD
37
3
0
28 Mar 2023
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Kunchang Li
Yali Wang
Yizhuo Li
Yi Wang
Yinan He
Limin Wang
Yu Qiao
VGen
59
156
0
28 Mar 2023
Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery
Mingxuan Liu
Subhankar Roy
Zhun Zhong
N. Sebe
Elisa Ricci
CLL
SSL
44
10
0
28 Mar 2023
OpenInst: A Simple Query-Based Method for Open-World Instance Segmentation
Cheng Wang
Guoli Wang
Qian Zhang
Pengning Guo
Wenyu Liu
Xinggang Wang
ISeg
VLM
32
7
0
28 Mar 2023
KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation
Xiangyang Li
Zihan Wang
Jiahao Yang
Yaowei Wang
Shuqiang Jiang
LM&Ro
26
38
0
28 Mar 2023