ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2103.00020
  4. Cited By
Learning Transferable Visual Models From Natural Language Supervision

Learning Transferable Visual Models From Natural Language Supervision

26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
    CLIP
    VLM
ArXivPDFHTML

Papers citing "Learning Transferable Visual Models From Natural Language Supervision"

50 / 10,594 papers shown
Title
D2C: Diffusion-Denoising Models for Few-shot Conditional Generation
D2C: Diffusion-Denoising Models for Few-shot Conditional Generation
Abhishek Sinha
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
35
118
0
12 Jun 2021
Assessing Multilingual Fairness in Pre-trained Multimodal
  Representations
Assessing Multilingual Fairness in Pre-trained Multimodal Representations
Jialu Wang
Yang Liu
Xinze Wang
EGVM
33
36
0
12 Jun 2021
Neural Symbolic Regression that Scales
Neural Symbolic Regression that Scales
Luca Biggio
Tommaso Bendinelli
Alexander Neitz
Aurelien Lucchi
Giambattista Parascandolo
59
173
0
11 Jun 2021
What Can Knowledge Bring to Machine Learning? -- A Survey of Low-shot
  Learning for Structured Data
What Can Knowledge Bring to Machine Learning? -- A Survey of Low-shot Learning for Structured Data
Yang Hu
Adriane P. Chapman
Guihua Wen
Dame Wendy Hall
57
24
0
11 Jun 2021
Learning to See by Looking at Noise
Learning to See by Looking at Noise
Manel Baradad
Jonas Wulff
Tongzhou Wang
Phillip Isola
Antonio Torralba
49
90
0
10 Jun 2021
Pivotal Tuning for Latent-based Editing of Real Images
Pivotal Tuning for Latent-based Editing of Real Images
Daniel Roich
Ron Mokady
Amit H. Bermano
Daniel Cohen-Or
DiffM
41
524
0
10 Jun 2021
Taxonomy of Machine Learning Safety: A Survey and Primer
Taxonomy of Machine Learning Safety: A Survey and Primer
Sina Mohseni
Haotao Wang
Zhiding Yu
Chaowei Xiao
Zhangyang Wang
J. Yadawa
31
32
0
09 Jun 2021
Scaling Vision Transformers
Scaling Vision Transformers
Xiaohua Zhai
Alexander Kolesnikov
N. Houlsby
Lucas Beyer
ViT
85
1,067
0
08 Jun 2021
What Makes Multi-modal Learning Better than Single (Provably)
What Makes Multi-modal Learning Better than Single (Provably)
Yu Huang
Chenzhuang Du
Zihui Xue
Xuanyao Chen
Hang Zhao
Longbo Huang
41
254
0
08 Jun 2021
Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained
  Models
Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models
Chenfeng Xu
Shijia Yang
Tomer Galanti
Bichen Wu
Xiangyu Yue
Bohan Zhai
Wei Zhan
Peter Vajda
Kurt Keutzer
Masayoshi Tomizuka
3DPC
39
53
0
08 Jun 2021
Differentiable Quality Diversity
Differentiable Quality Diversity
Matthew C. Fontaine
Stefanos Nikolaidis
60
89
0
07 Jun 2021
On the Expressive Power of Self-Attention Matrices
On the Expressive Power of Self-Attention Matrices
Valerii Likhosherstov
K. Choromanski
Adrian Weller
42
34
0
07 Jun 2021
MERLOT: Multimodal Neural Script Knowledge Models
MERLOT: Multimodal Neural Script Knowledge Models
Rowan Zellers
Ximing Lu
Jack Hessel
Youngjae Yu
J. S. Park
Jize Cao
Ali Farhadi
Yejin Choi
VLM
LRM
41
374
0
04 Jun 2021
A Little Robustness Goes a Long Way: Leveraging Robust Features for
  Targeted Transfer Attacks
A Little Robustness Goes a Long Way: Leveraging Robust Features for Targeted Transfer Attacks
Jacob Mitchell Springer
Melanie Mitchell
Garrett Kenyon
AAML
48
43
0
03 Jun 2021
Effect of Pre-Training Scale on Intra- and Inter-Domain Full and
  Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images
Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images
Mehdi Cherti
J. Jitsev
LM&MA
35
23
0
31 May 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with
  Transformers
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
55
4,890
0
31 May 2021
Connecting Language and Vision for Natural Language-Based Vehicle
  Retrieval
Connecting Language and Vision for Natural Language-Based Vehicle Retrieval
Shuai Bai
Zhedong Zheng
Xiaohan Wang
Junyang Lin
Zhu Zhang
Chang Zhou
Yi Yang
Hongxia Yang
38
27
0
31 May 2021
LIIR at SemEval-2021 task 6: Detection of Persuasion Techniques In Texts
  and Images using CLIP features
LIIR at SemEval-2021 task 6: Detection of Persuasion Techniques In Texts and Images using CLIP features
Erfan Ghadery
Damien Sileo
Marie-Francine Moens
VLM
24
2
0
31 May 2021
Contrastive Fine-tuning Improves Robustness for Neural Rankers
Contrastive Fine-tuning Improves Robustness for Neural Rankers
Xiaofei Ma
Cicero Nogueira dos Santos
Andrew O. Arnold
47
20
0
27 May 2021
CogView: Mastering Text-to-Image Generation via Transformers
CogView: Mastering Text-to-Image Generation via Transformers
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
...
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
ViT
VLM
54
772
0
26 May 2021
True Few-Shot Learning with Language Models
True Few-Shot Learning with Language Models
Ethan Perez
Douwe Kiela
Kyunghyun Cho
32
428
0
24 May 2021
Improved OOD Generalization via Adversarial Training and Pre-training
Improved OOD Generalization via Adversarial Training and Pre-training
Mingyang Yi
Lu Hou
Jiacheng Sun
Lifeng Shang
Xin Jiang
Qun Liu
Zhi-Ming Ma
VLM
35
83
0
24 May 2021
Backdoor Attacks on Self-Supervised Learning
Backdoor Attacks on Self-Supervised Learning
Aniruddha Saha
Ajinkya Tejankar
Soroush Abbasi Koohpayegani
Hamed Pirsiavash
SSL
AAML
27
103
0
21 May 2021
A Review on Explainability in Multimodal Deep Neural Nets
A Review on Explainability in Multimodal Deep Neural Nets
Gargi Joshi
Rahee Walambe
K. Kotecha
51
140
0
17 May 2021
Vision Transformers are Robust Learners
Vision Transformers are Robust Learners
Sayak Paul
Pin-Yu Chen
ViT
28
310
0
17 May 2021
Evading the Simplicity Bias: Training a Diverse Set of Models Discovers
  Solutions with Superior OOD Generalization
Evading the Simplicity Bias: Training a Diverse Set of Models Discovers Solutions with Superior OOD Generalization
Damien Teney
Ehsan Abbasnejad
Simon Lucey
Anton Van Den Hengel
56
87
0
12 May 2021
Diffusion Models Beat GANs on Image Synthesis
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal
Alex Nichol
90
7,547
0
11 May 2021
Contrastive Attraction and Contrastive Repulsion for Representation
  Learning
Contrastive Attraction and Contrastive Repulsion for Representation Learning
Huangjie Zheng
Xu Chen
Jiangchao Yao
Hongxia Yang
Chunyuan Li
Ya Zhang
Hao Zhang
Ivor Tsang
Jingren Zhou
Mingyuan Zhou
SSL
47
12
0
08 May 2021
Open-vocabulary Object Detection via Vision and Language Knowledge
  Distillation
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Nayeon Lee
Weicheng Kuo
Huayu Chen
VLM
ObjD
230
900
0
28 Apr 2021
If your data distribution shifts, use self-learning
If your data distribution shifts, use self-learning
E. Rusak
Steffen Schneider
George Pachitariu
L. Eck
Peter V. Gehler
Oliver Bringmann
Wieland Brendel
Matthias Bethge
VLM
OOD
TTA
88
30
0
27 Apr 2021
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
Aishwarya Kamath
Mannat Singh
Yann LeCun
Gabriel Synnaeve
Ishan Misra
Nicolas Carion
ObjD
VLM
93
864
0
26 Apr 2021
PanGu-$α$: Large-scale Autoregressive Pretrained Chinese Language
  Models with Auto-parallel Computation
PanGu-ααα: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei Zeng
Xiaozhe Ren
Teng Su
Hui Wang
Yi-Lun Liao
...
Gaojun Fan
Yaowei Wang
Xuefeng Jin
Qun Liu
Yonghong Tian
ALM
MoE
AI4CE
40
212
0
26 Apr 2021
Playing Lottery Tickets with Vision and Language
Playing Lottery Tickets with Vision and Language
Zhe Gan
Yen-Chun Chen
Linjie Li
Tianlong Chen
Yu Cheng
Shuohang Wang
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
114
54
0
23 Apr 2021
Multiscale Vision Transformers
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
68
1,232
0
22 Apr 2021
Pri3D: Can 3D Priors Help 2D Representation Learning?
Pri3D: Can 3D Priors Help 2D Representation Learning?
Ji Hou
Saining Xie
Benjamin Graham
Angela Dai
Matthias Nießner
SSL
3DPC
MDE
87
80
0
22 Apr 2021
Understanding Chinese Video and Language via Contrastive Multimodal
  Pre-Training
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
Chenyi Lei
Shixian Luo
Yong Liu
Wanggui He
Jiamang Wang
Guoxin Wang
Haihong Tang
Chunyan Miao
Houqiang Li
35
41
0
19 Apr 2021
Data-Efficient Language-Supervised Zero-Shot Learning with
  Self-Distillation
Data-Efficient Language-Supervised Zero-Shot Learning with Self-Distillation
Rui Cheng
Bichen Wu
Peizhao Zhang
Peter Vajda
Joseph E. Gonzalez
CLIP
VLM
21
31
0
18 Apr 2021
Towards Open-World Text-Guided Face Image Generation and Manipulation
Towards Open-World Text-Guided Face Image Generation and Manipulation
Weihao Xia
Yujiu Yang
Jing-Hao Xue
Baoyuan Wu
DiffM
46
39
0
18 Apr 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip
  Retrieval
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
337
788
0
18 Apr 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Jack Hessel
Ari Holtzman
Maxwell Forbes
Ronan Le Bras
Yejin Choi
CLIP
40
1,477
0
18 Apr 2021
Cross-Modal Retrieval Augmentation for Multi-Modal Classification
Cross-Modal Retrieval Augmentation for Multi-Modal Classification
Shir Gur
Natalia Neverova
C. Stauffer
Ser-Nam Lim
Douwe Kiela
A. Reiter
29
26
0
16 Apr 2021
Exploring Visual Engagement Signals for Representation Learning
Exploring Visual Engagement Signals for Representation Learning
Menglin Jia
Zuxuan Wu
A. Reiter
Claire Cardie
Serge Belongie
Ser-Nam Lim
26
13
0
15 Apr 2021
Self-supervised Video Object Segmentation by Motion Grouping
Self-supervised Video Object Segmentation by Motion Grouping
Charig Yang
Hala Lamdouar
Erika Lu
Andrew Zisserman
Weidi Xie
VOS
OCL
35
157
0
15 Apr 2021
Compressing Visual-linguistic Model via Knowledge Distillation
Compressing Visual-linguistic Model via Knowledge Distillation
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lijuan Wang
Yezhou Yang
Zicheng Liu
VLM
46
97
0
05 Apr 2021
An Empirical Study of Training Self-Supervised Vision Transformers
An Empirical Study of Training Self-Supervised Vision Transformers
Xinlei Chen
Saining Xie
Kaiming He
ViT
75
1,824
0
05 Apr 2021
Towards General Purpose Vision Systems
Towards General Purpose Vision Systems
Tanmay Gupta
Amita Kamath
Aniruddha Kembhavi
Derek Hoiem
20
50
0
01 Apr 2021
Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis
Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis
Ajay Jain
Matthew Tancik
Pieter Abbeel
42
491
0
01 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
74
1,144
0
01 Apr 2021
Composable Augmentation Encoding for Video Representation Learning
Composable Augmentation Encoding for Video Representation Learning
Chen Sun
Arsha Nagrani
Yonglong Tian
Cordelia Schmid
SSL
AI4TS
42
17
0
01 Apr 2021
Diagnosing Vision-and-Language Navigation: What Really Matters
Diagnosing Vision-and-Language Navigation: What Really Matters
Wanrong Zhu
Yuankai Qi
P. Narayana
Kazoo Sone
Sugato Basu
Xinze Wang
Qi Wu
Miguel P. Eckstein
Wenjie Wang
LM&Ro
43
50
0
30 Mar 2021
Previous
123...210211212
Next