ResearchTrend.AI
Learning Transferable Visual Models From Natural Language Supervision

26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
    CLIP
    VLM

Papers citing "Learning Transferable Visual Models From Natural Language Supervision"

50 / 9,984 papers shown
Efficient Self-supervised Vision Transformers for Representation Learning
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
37
209
0
17 Jun 2021
Poisoning and Backdooring Contrastive Learning
Nicholas Carlini
Andreas Terzis
46
156
0
17 Jun 2021
A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection
Jie Jessie Ren
Stanislav Fort
J. Liu
Abhijit Guha Roy
Shreyas Padhy
Balaji Lakshminarayanan
UQCV
33
216
0
16 Jun 2021
Revisiting the Calibration of Modern Neural Networks
Matthias Minderer
Josip Djolonga
Rob Romijnders
F. Hubis
Xiaohua Zhai
N. Houlsby
Dustin Tran
Mario Lucic
UQCV
51
358
0
15 Jun 2021
Communicating Natural Programs to Humans and Machines
Samuel Acquaviva
Yewen Pu
Marta Kryven
Theo Sechopoulos
Catherine Wong
Gabrielle Ecanow
Maxwell Nye
Michael Henry Tessler
J. Tenenbaum
33
40
0
15 Jun 2021
Improved Transformer for High-Resolution GANs
Long Zhao
Zizhao Zhang
Ting Chen
Dimitris N. Metaxas
Han Zhang
ViT
34
95
0
14 Jun 2021
Partial success in closing the gap between human and machine vision
Robert Geirhos
Kantharaju Narayanappa
Benjamin Mitzkus
Tizian Thieringer
Matthias Bethge
Felix Wichmann
Wieland Brendel
VLM
AAML
48
221
0
14 Jun 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
46
815
0
14 Jun 2021
D2C: Diffusion-Denoising Models for Few-shot Conditional Generation
Abhishek Sinha
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
30
118
0
12 Jun 2021
Assessing Multilingual Fairness in Pre-trained Multimodal Representations
Jialu Wang
Yang Liu
Qing Guo
EGVM
26
35
0
12 Jun 2021
Neural Symbolic Regression that Scales
Luca Biggio
Tommaso Bendinelli
Alexander Neitz
Aurelien Lucchi
Giambattista Parascandolo
54
170
0
11 Jun 2021
What Can Knowledge Bring to Machine Learning? -- A Survey of Low-shot Learning for Structured Data
Yang Hu
Adriane P. Chapman
Guihua Wen
Dame Wendy Hall
42
24
0
11 Jun 2021
Learning to See by Looking at Noise
Manel Baradad
Jonas Wulff
Tongzhou Wang
Phillip Isola
Antonio Torralba
28
89
0
10 Jun 2021
Taxonomy of Machine Learning Safety: A Survey and Primer
Sina Mohseni
Haotao Wang
Zhiding Yu
Chaowei Xiao
Zhangyang Wang
J. Yadawa
21
31
0
09 Jun 2021
Scaling Vision Transformers
Xiaohua Zhai
Alexander Kolesnikov
N. Houlsby
Lucas Beyer
ViT
79
1,060
0
08 Jun 2021
Differentiable Quality Diversity
Matthew C. Fontaine
Stefanos Nikolaidis
51
89
0
07 Jun 2021
On the Expressive Power of Self-Attention Matrices
Valerii Likhosherstov
K. Choromanski
Adrian Weller
37
34
0
07 Jun 2021
MERLOT: Multimodal Neural Script Knowledge Models
Rowan Zellers
Ximing Lu
Jack Hessel
Youngjae Yu
J. S. Park
Jize Cao
Ali Farhadi
Yejin Choi
VLM
LRM
24
372
0
04 Jun 2021
A Little Robustness Goes a Long Way: Leveraging Robust Features for Targeted Transfer Attacks
Jacob Mitchell Springer
Melanie Mitchell
Garrett Kenyon
AAML
31
43
0
03 Jun 2021
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie
Wenhai Wang
Zhiding Yu
Anima Anandkumar
J. Álvarez
Ping Luo
ViT
50
4,836
0
31 May 2021
Contrastive Fine-tuning Improves Robustness for Neural Rankers
Xiaofei Ma
Cicero Nogueira dos Santos
Andrew O. Arnold
18
20
0
27 May 2021
CogView: Mastering Text-to-Image Generation via Transformers
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
...
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
ViT
VLM
24
762
0
26 May 2021
Improved OOD Generalization via Adversarial Training and Pre-training
Mingyang Yi
Lu Hou
Jiacheng Sun
Lifeng Shang
Xin Jiang
Qun Liu
Zhi-Ming Ma
VLM
31
83
0
24 May 2021
Backdoor Attacks on Self-Supervised Learning
Aniruddha Saha
Ajinkya Tejankar
Soroush Abbasi Koohpayegani
Hamed Pirsiavash
SSL
AAML
27
101
0
21 May 2021
A Review on Explainability in Multimodal Deep Neural Nets
Gargi Joshi
Rahee Walambe
K. Kotecha
29
139
0
17 May 2021
Vision Transformers are Robust Learners
Sayak Paul
Pin-Yu Chen
ViT
28
307
0
17 May 2021
Evading the Simplicity Bias: Training a Diverse Set of Models Discovers Solutions with Superior OOD Generalization
Damien Teney
Ehsan Abbasnejad
Simon Lucey
Anton Van Den Hengel
40
87
0
12 May 2021
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal
Alex Nichol
83
7,441
0
11 May 2021
Contrastive Attraction and Contrastive Repulsion for Representation Learning
Huangjie Zheng
Xu Chen
Jiangchao Yao
Hongxia Yang
Chunyuan Li
Ya Zhang
Hao Zhang
Ivor Tsang
Jingren Zhou
Mingyuan Zhou
SSL
42
12
0
08 May 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Nayeon Lee
Weicheng Kuo
Huayu Chen
VLM
ObjD
225
899
0
28 Apr 2021
If your data distribution shifts, use self-learning
E. Rusak
Steffen Schneider
George Pachitariu
L. Eck
Peter V. Gehler
Oliver Bringmann
Wieland Brendel
Matthias Bethge
VLM
OOD
TTA
79
29
0
27 Apr 2021
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
Aishwarya Kamath
Mannat Singh
Yann LeCun
Gabriel Synnaeve
Ishan Misra
Nicolas Carion
ObjD
VLM
66
861
0
26 Apr 2021
PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei Zeng
Xiaozhe Ren
Teng Su
Hui Wang
Yi-Lun Liao
...
Gaojun Fan
Yaowei Wang
Xuefeng Jin
Qun Liu
Yonghong Tian
ALM
MoE
AI4CE
35
212
0
26 Apr 2021
Playing Lottery Tickets with Vision and Language
Zhe Gan
Yen-Chun Chen
Linjie Li
Tianlong Chen
Yu Cheng
Shuohang Wang
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
109
54
0
23 Apr 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
63
1,224
0
22 Apr 2021
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
Chenyi Lei
Shixian Luo
Yong-jin Liu
Wanggui He
Jiamang Wang
Guoxin Wang
Haihong Tang
Chunyan Miao
Houqiang Li
30
41
0
19 Apr 2021
Data-Efficient Language-Supervised Zero-Shot Learning with Self-Distillation
Rui Cheng
Bichen Wu
Peizhao Zhang
Peter Vajda
Joseph E. Gonzalez
CLIP
VLM
21
31
0
18 Apr 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Jack Hessel
Ari Holtzman
Maxwell Forbes
Ronan Le Bras
Yejin Choi
CLIP
17
1,442
0
18 Apr 2021
Cross-Modal Retrieval Augmentation for Multi-Modal Classification
Shir Gur
Natalia Neverova
C. Stauffer
Ser-Nam Lim
Douwe Kiela
A. Reiter
19
26
0
16 Apr 2021
Exploring Visual Engagement Signals for Representation Learning
Menglin Jia
Zuxuan Wu
A. Reiter
Claire Cardie
Serge Belongie
Ser-Nam Lim
21
13
0
15 Apr 2021
Self-supervised Video Object Segmentation by Motion Grouping
Charig Yang
Hala Lamdouar
Erika Lu
Andrew Zisserman
Weidi Xie
VOS
OCL
30
157
0
15 Apr 2021
Compressing Visual-linguistic Model via Knowledge Distillation
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lijuan Wang
Yezhou Yang
Zicheng Liu
VLM
39
97
0
05 Apr 2021
Towards General Purpose Vision Systems
Tanmay Gupta
Amita Kamath
Aniruddha Kembhavi
Derek Hoiem
11
50
0
01 Apr 2021
Composable Augmentation Encoding for Video Representation Learning
Chen Sun
Arsha Nagrani
Yonglong Tian
Cordelia Schmid
SSL
AI4TS
37
17
0
01 Apr 2021
Diagnosing Vision-and-Language Navigation: What Really Matters
Wanrong Zhu
Yuankai Qi
P. Narayana
Kazoo Sone
Sugato Basu
Qing Guo
Qi Wu
Miguel P. Eckstein
Luu Anh Tuan
LM&Ro
27
50
0
30 Mar 2021
LatentKeypointGAN: Controlling GANs via Latent Keypoints
Xingzhe He
Bastian Wandt
Helge Rhodin
GAN
30
6
0
29 Mar 2021
Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers
Hila Chefer
Shir Gur
Lior Wolf
ViT
31
303
0
29 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng-Wei Zhang
Stephen Lin
B. Guo
ViT
148
20,710
0
25 Mar 2021
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval
Gregor Geigle
Jonas Pfeiffer
Nils Reimers
Ivan Vulić
Iryna Gurevych
35
59
0
22 Mar 2021
Paint by Word
A. Andonian
David Bau
Audrey Cui
YeonHwan Park
Ali Jahanian
Antonio Torralba
A. Oliva
DiffM
20
125
0
19 Mar 2021