2103.00020
Learning Transferable Visual Models From Natural Language Supervision
26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
Papers citing
"Learning Transferable Visual Models From Natural Language Supervision"
50 / 10,282 papers shown
Title
SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification
Fang Peng
Xiaoshan Yang
Linhui Xiao
Yaowei Wang
Changsheng Xu
VLM
40
44
0
28 Nov 2022
The Myth of Culturally Agnostic AI Models
E. Cetinic
DiffM
VLM
20
10
0
28 Nov 2022
VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
Kashu Yamazaki
Khoa T. Vo
Sang Truong
Bhiksha Raj
Ngan Le
31
35
0
28 Nov 2022
Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models
Peter Henderson
E. Mitchell
Christopher D. Manning
Dan Jurafsky
Chelsea Finn
27
47
0
27 Nov 2022
Post-Processing Temporal Action Detection
Sauradip Nag
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
26
9
0
27 Nov 2022
Learning Object-Language Alignments for Open-Vocabulary Object Detection
Chuang Lin
Pei Sun
Yi-Xin Jiang
Ping Luo
Lizhen Qu
Gholamreza Haffari
Zehuan Yuan
Jianfei Cai
VLM
ObjD
29
95
0
27 Nov 2022
Unified Discrete Diffusion for Simultaneous Vision-Language Generation
Minghui Hu
Chuanxia Zheng
Heliang Zheng
Tat-Jen Cham
Chaoyue Wang
Zuopeng Yang
Dacheng Tao
Ponnuthurai Nagaratnam Suganthan
DiffM
25
23
0
27 Nov 2022
SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation
Huaishao Luo
Junwei Bao
Youzheng Wu
Xiaodong He
Tianrui Li
VLM
32
146
0
27 Nov 2022
Traditional Classification Neural Networks are Good Generators: They are Competitive with DDPMs and GANs
Guangrun Wang
Philip Torr
28
8
0
27 Nov 2022
Navigation as Attackers Wish? Towards Building Robust Embodied Agents under Federated Learning
Yunchao Zhang
Zonglin Di
Kai-Qing Zhou
Cihang Xie
Xin Eric Wang
FedML
AAML
34
2
0
27 Nov 2022
Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges
K. T. Baghaei
Amirreza Payandeh
Pooya Fayyazsanavi
Shahram Rahimi
Zhiqian Chen
Somayeh Bakhtiari Ramezani
FaML
AI4TS
38
6
0
27 Nov 2022
AvatarGen: A 3D Generative Model for Animatable Human Avatars
Jianfeng Zhang
Zihang Jiang
Dingdong Yang
Hongyi Xu
Yichun Shi
Guoxian Song
Zhongcong Xu
Xinchao Wang
Jiashi Feng
3DH
27
76
0
26 Nov 2022
DynaGAN: Dynamic Few-shot Adaptation of GANs to Multiple Domains
S. Kim
Kyoungkook Kang
Geon-Yeong Kim
Seung-Hwan Baek
Sunghyun Cho
32
19
0
26 Nov 2022
PaCMO: Partner Dependent Human Motion Generation in Dyadic Human Activity using Neural Operators
Md Ashiqur Rahman
Jasorsi Ghosh
Hrishikesh Viswanath
Kamyar Azizzadenesheli
Aniket Bera
32
8
0
25 Nov 2022
SpaText: Spatio-Textual Representation for Controllable Image Generation
Omri Avrahami
Thomas Hayes
Oran Gafni
Sonal Gupta
Yaniv Taigman
Devi Parikh
Dani Lischinski
Ohad Fried
Xiaoyue Yin
DiffM
40
203
0
25 Nov 2022
Multiverse: Multilingual Evidence for Fake News Detection
Daryna Dementieva
Mikhail Kuimov
Alexander Panchenko
36
4
0
25 Nov 2022
3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models
Gang Li
Heliang Zheng
Chaoyue Wang
Chang Li
C. Zheng
Dacheng Tao
DiffM
26
59
0
25 Nov 2022
CLIP-ReID: Exploiting Vision-Language Model for Image Re-Identification without Concrete Text Labels
Siyuan Li
Li Sun
Qingli Li
VLM
35
150
0
25 Nov 2022
Expanding Small-Scale Datasets with Guided Imagination
Yifan Zhang
Daquan Zhou
Bryan Hooi
Kaixin Wang
Jiashi Feng
49
46
0
25 Nov 2022
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism
Xupeng Miao
Yujie Wang
Youhe Jiang
Chunan Shi
Xiaonan Nie
Hailin Zhang
Bin Cui
GNN
MoE
45
61
0
25 Nov 2022
Roboflow 100: A Rich, Multi-Domain Object Detection Benchmark
Floriana Ciaglia
Francesco Saverio Zuppichini
Paul Guerrie
Mark McQuade
Jacob Solawetz
ObjD
16
46
0
24 Nov 2022
Multi-Task Learning of Object State Changes from Uncurated Videos
Tomáš Souček
Jean-Baptiste Alayrac
Antoine Miech
Ivan Laptev
Josef Sivic
44
11
0
24 Nov 2022
On the Importance of Image Encoding in Automated Chest X-Ray Report Generation
Otabek Nazarov
Mohammad Yaqub
Karthik Nandakumar
MedIm
24
3
0
24 Nov 2022
Delving into Out-of-Distribution Detection with Vision-Language Representations
Yifei Ming
Ziyan Cai
Jiuxiang Gu
Yiyou Sun
W. Li
Yixuan Li
VLM
OODD
69
161
0
24 Nov 2022
CPPF++: Uncertainty-Aware Sim2Real Object Pose Estimation by Vote Aggregation
Yang You
Wenhao He
Jin Liu
Hongkai Xiong
Weiming Wang
Cewu Lu
3DPC
43
4
0
24 Nov 2022
Shifted Diffusion for Text-to-image Generation
Yufan Zhou
Bingchen Liu
Yizhe Zhu
Xiao Yang
Changyou Chen
Jinhui Xu
DiffM
27
40
0
24 Nov 2022
Ham2Pose: Animating Sign Language Notation into Pose Sequences
Rotem Shalev-Arkushin
Amit Moryossef
Ohad Fried
SLR
38
19
0
24 Nov 2022
Make-A-Story: Visual Memory Conditioned Consistent Story Generation
Tanzila Rahman
Hsin-Ying Lee
Jian Ren
Sergey Tulyakov
Shweta Mahajan
Leonid Sigal
DiffM
19
68
0
23 Nov 2022
How do Cross-View and Cross-Modal Alignment Affect Representations in Contrastive Learning?
Thomas M. Hehn
Julian F. P. Kooij
D. Gavrila
SSL
26
0
0
23 Nov 2022
Paint by Example: Exemplar-based Image Editing with Diffusion Models
Binxin Yang
Shuyang Gu
Bo Zhang
Ting Zhang
Xuejin Chen
Xiaoyan Sun
Dong Chen
Fang Wen
DiffM
60
403
0
23 Nov 2022
Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors
R. Burgert
Kanchana Ranasinghe
Xiang Li
Michael S. Ryoo
DiffM
VLM
34
37
0
23 Nov 2022
TetraDiffusion: Tetrahedral Diffusion Models for 3D Shape Generation
Nikolai Kalischek
T. Peters
Jan Dirk Wegner
Konrad Schindler
DiffM
30
12
0
23 Nov 2022
ReCo: Region-Controlled Text-to-Image Generation
Zhengyuan Yang
Jianfeng Wang
Zhe Gan
Linjie Li
Kevin Qinghong Lin
...
Nan Duan
Zicheng Liu
Ce Liu
Michael Zeng
Lijuan Wang
DiffM
56
140
0
23 Nov 2022
Evolutionary Generalized Zero-Shot Learning
Dubing Chen
Haofeng Zhang
Yang Long
VLM
36
1
0
23 Nov 2022
Schrödinger's Bat: Diffusion Models Sometimes Generate Polysemous Words in Superposition
Jennifer C. White
Ryan Cotterell
DiffM
38
5
0
23 Nov 2022
Powderworld: A Platform for Understanding Generalization via Rich Task Distributions
Kevin Frans
Phillip Isola
OffRL
47
9
0
23 Nov 2022
Open-vocabulary Attribute Detection
M. A. Bravo
Sudhanshu Mittal
Simon Ging
Thomas Brox
VLM
ObjD
19
30
0
23 Nov 2022
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
Tsu-Jui Fu
Licheng Yu
Ning Zhang
Cheng-Yang Fu
Jong-Chyi Su
William Yang Wang
Sean Bell
VGen
63
37
0
23 Nov 2022
VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval
Siteng Huang
Biao Gong
Yulin Pan
Jianwen Jiang
Yiliang Lv
Yuyuan Li
Donglin Wang
VLM
VPVLM
22
41
0
23 Nov 2022
PANeRF: Pseudo-view Augmentation for Improved Neural Radiance Fields Based on Few-shot Inputs
Young Chun Ahn
Seokhwan Jang
Sungheon Park
Ji-Yeon Kim
Nahyup Kang
33
12
0
23 Nov 2022
Texts as Images in Prompt Tuning for Multi-Label Image Recognition
Zixian Guo
Bowen Dong
Zhilong Ji
Jinfeng Bai
Yiwen Guo
W. Zuo
VLM
VPVLM
33
57
0
23 Nov 2022
RoentGen: Vision-Language Foundation Model for Chest X-ray Generation
Pierre J. Chambon
Christian Blüthgen
Jean-Benoit Delbrouck
Rogier van der Sluijs
M. Polacin
Juan Manuel Zambrano Chaves
Tanishq Mathew Abraham
Shivanshu Purohit
C. Langlotz
Akshay S. Chaudhari
LM&MA
DiffM
MedIm
37
99
0
23 Nov 2022
Promises and Pitfalls of Threshold-based Auto-labeling
Harit Vishwakarma
Heguang Lin
Frederic Sala
Ramya Korlakai Vinayak
39
9
0
22 Nov 2022
Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
Narek Tumanyan
Michal Geyer
Shai Bagon
Tali Dekel
77
642
0
22 Nov 2022
Retrieval-Augmented Multimodal Language Modeling
Michihiro Yasunaga
Armen Aghajanyan
Weijia Shi
Rich James
J. Leskovec
Percy Liang
M. Lewis
Luke Zettlemoyer
Wen-tau Yih
RALM
22
95
0
22 Nov 2022
On the Transferability of Visual Features in Generalized Zero-Shot Learning
Paola Cascante-Bonilla
Leonid Karlinsky
James Smith
Yanjun Qi
Vicente Ordonez
33
2
0
22 Nov 2022
ModelDiff: A Framework for Comparing Learning Algorithms
Harshay Shah
Sung Min Park
Andrew Ilyas
A. Madry
SyDa
56
26
0
22 Nov 2022
SinDiffusion: Learning a Diffusion Model from a Single Natural Image
Weilun Wang
Jianmin Bao
Wen-gang Zhou
Dongdong Chen
Dong Chen
Lu Yuan
Houqiang Li
DiffM
36
49
0
22 Nov 2022
DiffDreamer: Towards Consistent Unsupervised Single-view Scene Extrapolation with Conditional Diffusion Models
Shengqu Cai
E. R. Chan
Songyou Peng
Mohamad Shahbazi
Anton Obukhov
Luc Van Gool
Gordon Wetzstein
DiffM
43
34
0
22 Nov 2022
PointCMC: Cross-Modal Multi-Scale Correspondences Learning for Point Cloud Understanding
Honggu Zhou
Xiaogang Peng
Jiawei Mao
Zizhao Wu
Ming Zeng
3DPC
22
3
0
22 Nov 2022