2103.00020
Cited By
Learning Transferable Visual Models From Natural Language Supervision
26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
Papers citing
"Learning Transferable Visual Models From Natural Language Supervision"
50 / 10,315 papers shown
Title
Boundary Guided Learning-Free Semantic Control with Diffusion Models
Ye Zhu
Yuehua Wu
Zhiwei Deng
Olga Russakovsky
Yan Yan
DiffM
32
23
0
16 Feb 2023
Retrieval-augmented Image Captioning
R. Ramos
Desmond Elliott
Bruno Martins
VLM
36
29
0
16 Feb 2023
MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation
Omer Bar-Tal
Lior Yariv
Y. Lipman
Tali Dekel
45
365
1
16 Feb 2023
MINOTAUR: Multi-task Video Grounding From Multimodal Queries
Raghav Goyal
E. Mavroudi
Xitong Yang
Sainbayar Sukhbaatar
Leonid Sigal
Matt Feiszli
Lorenzo Torresani
Du Tran
36
7
0
16 Feb 2023
Learning to Substitute Ingredients in Recipes
Bahare Fatemi
Quentin Duval
Rohit Girdhar
M. Drozdzal
Adriana Romero Soriano
29
7
0
15 Feb 2023
Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation
Joshua Vendrow
Saachi Jain
Logan Engstrom
Aleksander Madry
OOD
34
34
0
15 Feb 2023
Cliff-Learning
T. T. Wang
I. Zablotchi
Nir Shavit
Jonathan S. Rosenfeld
47
0
0
14 Feb 2023
From paintbrush to pixel: A review of deep neural networks in AI-generated art
Anne-Sofie Maerten
Derya Soydaner
42
23
0
14 Feb 2023
A Friendly Face: Do Text-to-Image Systems Rely on Stereotypes when the Input is Under-Specified?
Kathleen C. Fraser
S. Kiritchenko
I. Nejadgholi
DiffM
40
36
0
14 Feb 2023
Universal Guidance for Diffusion Models
Arpit Bansal
Hong-Min Chu
Avi Schwarzschild
Soumyadip Sengupta
Micah Goldblum
Jonas Geiping
Tom Goldstein
VLM
54
246
0
14 Feb 2023
A Modern Look at the Relationship between Sharpness and Generalization
Maksym Andriushchenko
Francesco Croce
Maximilian Müller
Matthias Hein
Nicolas Flammarion
3DH
29
56
0
14 Feb 2023
Guiding Pretraining in Reinforcement Learning with Large Language Models
Yuqing Du
Olivia Watkins
Zihan Wang
Cédric Colas
Trevor Darrell
Pieter Abbeel
Abhishek Gupta
Jacob Andreas
LM&Ro
27
175
0
13 Feb 2023
Symbolic Discovery of Optimization Algorithms
Xiangning Chen
Chen Liang
Da Huang
Esteban Real
Kaiyuan Wang
...
Xuanyi Dong
Thang Luong
Cho-Jui Hsieh
Yifeng Lu
Quoc V. Le
72
353
0
13 Feb 2023
Explainable Anomaly Detection in Images and Videos: A Survey
Yizhou Wang
Dongliang Guo
Sheng Li
Octavia Camps
Yun Fu
42
5
0
13 Feb 2023
3D-aware Blending with Generative NeRFs
Hyunsung Kim
Gayoung Lee
Yunjey Choi
Jin-Hwa Kim
Jun-Yan Zhu
31
12
0
13 Feb 2023
Paparazzi: A Deep Dive into the Capabilities of Language and Vision Models for Grounding Viewpoint Descriptions
Henrik Voigt
J. Hombeck
M. Meuschke
K. Lawonn
Sina Zarrieß
VLM
35
1
0
13 Feb 2023
Semantic Image Segmentation: Two Decades of Research
G. Csurka
Riccardo Volpi
Boris Chidlovskii
3DV
42
50
0
13 Feb 2023
A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models
J. Allingham
Jie Jessie Ren
Michael W. Dusenberry
Xiuye Gu
Huayu Chen
Dustin Tran
J. Liu
Balaji Lakshminarayanan
LLMAG
VLM
37
33
0
13 Feb 2023
Understanding Multimodal Contrastive Learning and Incorporating Unpaired Data
Ryumei Nakada
Halil Ibrahim Gulluk
Zhun Deng
Wenlong Ji
James Zou
Linjun Zhang
SSL
VLM
42
37
0
13 Feb 2023
Learning to Scale Temperature in Masked Self-Attention for Image Inpainting
Xiang Zhou
Yuan Zeng
Yi Gong
40
2
0
13 Feb 2023
Knowledge from Large-Scale Protein Contact Prediction Models Can Be Transferred to the Data-Scarce RNA Contact Prediction Task
Yiren Jian
Chongyang Gao
Chen Zeng
Yunjie Zhao
Soroush Vosoughi
32
0
0
13 Feb 2023
Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation
Bingqian Lin
Yi Zhu
Xiaodan Liang
Liang Lin
Jian-zhuo Liu
CoGe
LM&Ro
48
3
0
13 Feb 2023
One Transformer for All Time Series: Representing and Training with Time-Dependent Heterogeneous Tabular Data
Simone Luetto
Fabrizio Garuti
E. Sangineto
L. Forni
Rita Cucchiara
LMTD
AI4TS
89
10
0
13 Feb 2023
Rapid Development of Compositional AI
Lee Martie
Jessie Rosenberg
Véronique Demers
Gaoyuan Zhang
Onkar Bhardwaj
...
D. Adesina
Elahe Paikari
Oscar Resendiz
Sarah Shaw
David D. Cox
14
3
0
12 Feb 2023
SemanticAC: Semantics-Assisted Framework for Audio Classification
Yicheng Xiao
Yue Ma
Shuyan Li
Hantao Zhou
Ran Liao
Xiu Li
13
8
0
12 Feb 2023
LipLearner: Customizable Silent Speech Interactions on Mobile Devices
Zixiong Su
Shitao Fang
Jun Rekimoto
18
26
0
12 Feb 2023
Flexible-modal Deception Detection with Audio-Visual Adapter
Zhaoxu Li
Zitong Yu
Nithish Muthuchamy Selvaraj
Xiaobao Guo
Bingquan Shen
A. Kong
Alex C. Kot
40
2
0
11 Feb 2023
Scaling Vision Transformers to 22 Billion Parameters
Mostafa Dehghani
Josip Djolonga
Basil Mustafa
Piotr Padlewski
Jonathan Heek
...
Mario Lučić
Xiaohua Zhai
Daniel Keysers
Jeremiah Harmsen
N. Houlsby
MLLM
90
574
0
10 Feb 2023
Boosting 3D Point Cloud Registration by Transferring Multi-modality Knowledge
Mingzhi Yuan
Xiaoshui Huang
Kexue Fu
Zhihao Li
Manning Wang
3DPC
31
6
0
10 Feb 2023
End-to-end Semantic Object Detection with Cross-Modal Alignment
Silvan Ferreira
Allan Martins
Ivan S. S. Silva
ObjD
27
0
0
10 Feb 2023
Unified Vision-Language Representation Modeling for E-Commerce Same-Style Products Retrieval
Ben Chen
Linbo Jin
Xinxin Wang
D. Gao
Wen Jiang
Wei Ning
22
3
0
10 Feb 2023
Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
Zhuolin Yang
Ming-Yu Liu
Zihan Liu
V. Korthikanti
Weili Nie
...
Yuke Zhu
Mohammad Shoeybi
Bryan Catanzaro
Chaowei Xiao
Anima Anandkumar
VLM
RALM
34
39
0
09 Feb 2023
The Edge of Orthogonality: A Simple View of What Makes BYOL Tick
Pierre Harvey Richemond
Allison C. Tam
Yunhao Tang
Florian Strub
Bilal Piot
Felix Hill
SSL
36
9
0
09 Feb 2023
Lightweight Transformers for Clinical Natural Language Processing
Omid Rohanian
Mohammadmahdi Nouriborji
Hannah Jauncey
Samaneh Kouchaki
ISARIC Clinical Characterisation Group
Lei A. Clifton
L. Merson
David Clifton
MedIm
LM&MA
29
12
0
09 Feb 2023
A Text-guided Protein Design Framework
Shengchao Liu
Yanjing Li
Zhuoxinran Li
A. Gitter
Yutao Zhu
...
Arvind Ramanathan
Chaowei Xiao
Jian Tang
Hongyu Guo
Anima Anandkumar
70
61
0
09 Feb 2023
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
Maor Ivgi
Oliver Hinder
Y. Carmon
ODL
40
57
0
08 Feb 2023
Glaze: Protecting Artists from Style Mimicry by Text-to-Image Models
Shawn Shan
Jenna Cryan
Emily Wenger
Haitao Zheng
Rana Hanocka
Ben Y. Zhao
WIGM
17
177
0
08 Feb 2023
Prompting for Multimodal Hateful Meme Classification
Rui Cao
Roy Ka-wei Lee
Wen-Haw Chong
Jing Jiang
VLM
25
76
0
08 Feb 2023
Neural Congealing: Aligning Images to a Joint Semantic Atlas
Dolev Ofri-Amar
Michal Geyer
Yoni Kasten
Tali Dekel
28
20
0
08 Feb 2023
Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion Models
Hyeonho Jeong
Gihyun Kwon
Jong Chul Ye
42
20
0
08 Feb 2023
Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness
Felix Friedrich
Manuel Brack
Lukas Struppek
Dominik Hintersdorf
P. Schramowski
Sasha Luccioni
Kristian Kersting
45
120
0
07 Feb 2023
SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation
Yash J. Patel
Yusheng Xie
Yi Zhu
Srikar Appalaraju
R. Manmatha
40
4
0
07 Feb 2023
Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval
Kuniaki Saito
Kihyuk Sohn
Xiang Zhang
Chun-Liang Li
Chen-Yu Lee
Kate Saenko
Tomas Pfister
30
107
0
06 Feb 2023
Zero-shot Image-to-Image Translation
Gaurav Parmar
Krishna Kumar Singh
Richard Y. Zhang
Yijun Li
Jingwan Lu
Jun-Yan Zhu
DiffM
24
431
0
06 Feb 2023
AIM: Adapting Image Models for Efficient Video Action Recognition
Taojiannan Yang
Yi Zhu
Yusheng Xie
Aston Zhang
Cen Chen
Mu Li
ViT
65
145
0
06 Feb 2023
Structure and Content-Guided Video Synthesis with Diffusion Models
Patrick Esser
Johnathan Chiu
Parmida Atighehchian
Jonathan Granskog
Anastasis Germanidis
DiffM
VGen
99
509
0
06 Feb 2023
Beyond Statistical Similarity: Rethinking Metrics for Deep Generative Models in Engineering Design
Lyle Regenwetter
Akash Srivastava
Dan Gutfreund
Faez Ahmed
31
28
0
06 Feb 2023
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval
Ziyang Luo
Pu Zhao
Can Xu
Xiubo Geng
Tao Shen
Chongyang Tao
Jing Ma
Qingwen Lin
Daxin Jiang
VLM
CLIP
34
3
0
06 Feb 2023
Perception Datasets for Anomaly Detection in Autonomous Driving: A Survey
Daniel Bogdoll
Svenja Uhlemeyer
K. Kowol
J. Marius Zöllner
38
19
0
06 Feb 2023
The SSL Interplay: Augmentations, Inductive Bias, and Generalization
Vivien A. Cabannes
B. Kiani
Randall Balestriero
Yann LeCun
A. Bietti
SSL
29
31
0
06 Feb 2023