2103.00020
Learning Transferable Visual Models From Natural Language Supervision
26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
Papers citing
"Learning Transferable Visual Models From Natural Language Supervision"
50 / 10,407 papers shown
Title
CHATEDIT: Towards Multi-turn Interactive Facial Image Editing via Dialogue
Xing Cui
Zekun Li
Peipei Li
Yibo Hu
Hailin Shi
Zhaofeng He
41
7
0
20 Mar 2023
Pluralistic Aging Diffusion Autoencoder
Peipei Li
Rui Wang
Huaibo Huang
Ran He
Zhaofeng He
DiffM
35
15
0
20 Mar 2023
Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models
René Haas
Inbar Huberman-Spiegelglas
Rotem Mulayoff
Stella Graßhof
Sami S. Brandt
T. Michaeli
DiffM
40
40
0
20 Mar 2023
Location-Free Scene Graph Generation
Ege Özsoy
Felix Holm
Tobias Czempiel
Benjamin Busam
Nassir Navab
55
4
0
20 Mar 2023
Decomposed Prototype Learning for Few-Shot Scene Graph Generation
Xingchen Li
Long Chen
Guikun Chen
Yinfu Feng
Yi Yang
Jun Xiao
35
6
0
20 Mar 2023
Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models
Matthew Lyle Olson
Shusen Liu
Rushil Anirudh
Jayaraman J. Thiagarajan
P. Bremer
Weng-Keen Wong
31
5
0
19 Mar 2023
A Region-Prompted Adapter Tuning for Visual Abductive Reasoning
Hao Zhang
Yeo Keat Ee
Basura Fernando
VLM
34
3
0
18 Mar 2023
MRIS: A Multi-modal Retrieval Approach for Image Synthesis on Diverse Modalities
Boqi Chen
Marc Niethammer
38
1
0
17 Mar 2023
A Recipe for Watermarking Diffusion Models
Yunqing Zhao
Tianyu Pang
Chao Du
Xiao Yang
Ngai-man Cheung
Min Lin
WIGM
35
115
0
17 Mar 2023
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
Can Qin
Ning Yu
Chen Xing
Shu Zhen Zhang
Zeyuan Chen
Stefano Ermon
Yun Fu
Caiming Xiong
Ran Xu
DiffM
53
20
0
17 Mar 2023
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
Peng Jin
Hao Li
Ze-Long Cheng
Kehan Li
Xiang Ji
Chang-rui Liu
Li-ming Yuan
Jie Chen
DiffM
VGen
35
54
0
17 Mar 2023
PersonalTailor: Personalizing 2D Pattern Design from 3D Garment Point Clouds
Sauradip Nag
Anran Qi
Xiatian Zhu
Ariel Shamir
3DPC
44
6
0
17 Mar 2023
VEIL: Vetting Extracted Image Labels from In-the-Wild Captions for Weakly-Supervised Object Detection
Arushi Rai
Adriana Kovashka
32
0
0
16 Mar 2023
P+: Extended Textual Conditioning in Text-to-Image Generation
A. Voynov
Qinghao Chu
Daniel Cohen-Or
Kfir Aberman
VLM
DiffM
51
176
0
16 Mar 2023
Data Roaming and Quality Assessment for Composed Image Retrieval
Matan Levy
Rami Ben-Ari
N. Darshan
Dani Lischinski
48
23
0
16 Mar 2023
ShabbyPages: A Reproducible Document Denoising and Binarization Dataset
Alexander Groleau
Kok Wei Chee
Stefan Larson
Samay Maini
Jonathan Boarman
24
2
0
16 Mar 2023
SpectralCLIP: Preventing Artifacts in Text-Guided Style Transfer from a Spectral Perspective
Zipeng Xu
Songlong Xing
E. Sangineto
N. Sebe
CLIP
32
2
0
16 Mar 2023
GridCLIP: One-Stage Object Detection by Grid-Level CLIP Representation Learning
Jiaying Lin
S. Gong
VLM
CLIP
ObjD
25
22
0
16 Mar 2023
A Dual Branch Network for Emotional Reaction Intensity Estimation
Jun-chen Yu
Jichao Zhu
Wangyuan Zhu
Zhongpeng Cai
Guochen Xie
Renda Li
Gongpeng Zhao
37
6
0
16 Mar 2023
Rethinking Model Ensemble in Transfer-based Adversarial Attacks
Huanran Chen
Yichi Zhang
Yinpeng Dong
Xiao Yang
Hang Su
Junyi Zhu
AAML
38
57
0
16 Mar 2023
Patch-Prompt Aligned Bayesian Prompt Tuning for Vision-Language Models
Xinyang Liu
Dongsheng Wang
Bowei Fang
Miaoge Li
Zhibin Duan
Yishi Xu
Bo Chen
Mingyuan Zhou
VLM
VPVLM
36
5
0
16 Mar 2023
Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation
Xingyu Chen
Yu Deng
Baoyuan Wang
37
14
0
16 Mar 2023
Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models
D. Kothandaraman
Dinesh Manocha
Ming Lin
36
5
0
15 Mar 2023
Evaluating gesture generation in a large-scale open challenge: The GENEA Challenge 2022
Taras Kucherenko
Pieter Wolfert
Youngwoo Yoon
Carla Viegas
Teodor Nikolov
Mihail Tsakov
G. Henter
37
24
0
15 Mar 2023
Mining False Positive Examples for Text-Based Person Re-identification
Wenhao Xu
Zhiyin Shao
Changxing Ding
35
4
0
15 Mar 2023
Lana: A Language-Capable Navigator for Instruction Following and Generation
Xiaohan Wang
Wenguan Wang
Jiayi Shao
Yi Yang
LLMAG
LM&Ro
46
38
0
15 Mar 2023
Harnessing Low-Frequency Neural Fields for Few-Shot View Synthesis
Liangchen Song
Zhong Li
Xuan Gong
Lele Chen
Zhaoyu Chen
Yinghao Xu
Junsong Yuan
67
6
0
15 Mar 2023
Deep Learning for Cross-Domain Few-Shot Visual Recognition: A Survey
Huali Xu
Shuaifeng Zhi
Shuzhou Sun
Vishal M. Patel
Li Liu
46
13
0
15 Mar 2023
Diversity-Aware Meta Visual Prompting
Qidong Huang
Xiaoyi Dong
DongDong Chen
Weiming Zhang
Feifei Wang
Gang Hua
Neng H. Yu
VLM
VPVLM
46
53
0
14 Mar 2023
Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis
Renrui Zhang
Liuhui Wang
Ziyu Guo
Yali Wang
Peng Gao
Hongsheng Li
Jianbo Shi
3DPC
32
52
0
14 Mar 2023
ViperGPT: Visual Inference via Python Execution for Reasoning
Dídac Surís
Sachit Menon
Carl Vondrick
MLLM
LRM
ReLM
52
437
0
14 Mar 2023
Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
Junyoung Seo
Wooseok Jang
Minseop Kwak
Ines Hyeonsu Kim
Jaehoon Ko
Junho Kim
Jin-Hwa Kim
Jiyoung Lee
Seung Wook Kim
DiffM
46
136
0
14 Mar 2023
BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data
Xuenan Xu
Zhiling Zhang
Zelin Zhou
Pingyue Zhang
Zeyu Xie
Mengyue Wu
Ke Zhu
CLIP
78
14
0
14 Mar 2023
WDiscOOD: Out-of-Distribution Detection via Whitened Linear Discriminant Analysis
Yiye Chen
Yunzhi Lin
Ruinian Xu
Patricio A. Vela
OODD
37
3
0
14 Mar 2023
Align and Attend: Multimodal Summarization with Dual Contrastive Losses
Bo He
Jun Wang
Jielin Qiu
Trung Bui
Abhinav Shrivastava
Zhaowen Wang
27
66
0
13 Mar 2023
Evaluating Visual Number Discrimination in Deep Neural Networks
Ivana Kajić
Aida Nematzadeh
14
0
0
13 Mar 2023
Prompting AI Art: An Investigation into the Creative Skill of Prompt Engineering
J. Oppenlaender
Rhema Linder
Johanna M. Silvennoinen
21
73
0
13 Mar 2023
ViM: Vision Middleware for Unified Downstream Transferring
Yutong Feng
Biao Gong
Jianwen Jiang
Yiliang Lv
Yujun Shen
Deli Zhao
Jingren Zhou
37
1
0
13 Mar 2023
Towards General Purpose Medical AI: Continual Learning Medical Foundation Model
Huahui Yi
Ziyuan Qin
Qicheng Lao
Wei Xu
Zekun Jiang
Dequan Wang
Shaoting Zhang
Kang Li
OOD
MedIm
CLL
40
11
0
12 Mar 2023
Multimodal Data Integration for Oncology in the Era of Deep Neural Networks: A Review
Asim Waqas
Aakash Tripathi
Ravichandran Ramachandran
Paul Stewart
Ghulam Rasool
AI4CE
42
32
0
11 Mar 2023
Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation
Zhao Yang
Jiaqi Wang
Yansong Tang
Kai-xiang Chen
Hengshuang Zhao
Philip Torr
51
23
0
11 Mar 2023
Learning Combinatorial Prompts for Universal Controllable Image Captioning
Zhen Wang
Jun Xiao
Yueting Zhuang
Fei Gao
Jian Shao
Long Chen
60
5
0
11 Mar 2023
Open-Ended Medical Visual Question Answering Through Prefix Tuning of Language Models
Tom van Sonsbeek
Mohammad Mahdi Derakhshani
Ivona Najdenkoska
Cees G. M. Snoek
M. Worring
LM&MA
16
52
0
10 Mar 2023
GECCO: Geometrically-Conditioned Point Diffusion Models
M. Tyszkiewicz
Pascal Fua
Eduard Trulls
DiffM
26
21
0
10 Mar 2023
Distributionally Robust Optimization with Probabilistic Group
Soumya Suvra Ghosal
Yixuan Li
OOD
24
7
0
10 Mar 2023
CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition with Variational Alignment
Jiangbin Zheng
Yile Wang
Cheng Tan
Siyuan Li
Ge Wang
Jun Xia
Yidong Chen
Stan Z. Li
SLR
38
63
0
10 Mar 2023
Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing Mistake Severity
Tong Liang
Jim Davis
43
11
0
10 Mar 2023
HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining
Shixiang Tang
Cheng Chen
Qingsong Xie
Meilin Chen
Yizhou Wang
...
Feng Zhu
Haiyang Yang
Li Yi
Rui Zhao
Wanli Ouyang
VLM
37
36
0
10 Mar 2023
Iterative Few-shot Semantic Segmentation from Image Label Text
Haohan Wang
Lu Liu
Wuhao Zhang
Jiangning Zhang
Zhenye Gan
Yabiao Wang
Chengjie Wang
Haoqian Wang
VLM
24
16
0
10 Mar 2023
Knowledge-augmented Few-shot Visual Relation Detection
Tianyu Yu
Yongqian Li
Jiaoyan Chen
Hai-Tao Zheng
...
Qingbin Liu
Wenqiang Liu
Dongxiao Huang
Bei Wu
Yexin Wang
55
6
0
09 Mar 2023