Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2111.07991
Cited By
LiT: Zero-Shot Transfer with Locked-image text Tuning
15 November 2021
Xiaohua Zhai
Tianlin Li
Basil Mustafa
Andreas Steiner
Daniel Keysers
Alexander Kolesnikov
Lucas Beyer
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"LiT: Zero-Shot Transfer with Locked-image text Tuning"
22 / 422 papers shown
Title
Single-Stream Multi-Level Alignment for Vision-Language Pretraining
Zaid Khan
B. Vijaykumar
Xiang Yu
S. Schulter
Manmohan Chandraker
Y. Fu
CLIP
VLM
25
16
0
27 Mar 2022
A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning
Hugo Elias Berg
S. Hall
Yash Bhalgat
Wonsuk Yang
Hannah Rose Kirk
Aleksandar Shtedritski
Max Bain
VLM
22
99
0
22 Mar 2022
CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation
S. Gadre
Mitchell Wortsman
Gabriel Ilharco
Ludwig Schmidt
Shuran Song
CLIP
LM&Ro
32
142
0
20 Mar 2022
Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation
Wenliang Dai
Lu Hou
Lifeng Shang
Xin Jiang
Qun Liu
Pascale Fung
VLM
22
90
0
12 Mar 2022
Democratizing Contrastive Language-Image Pre-training: A CLIP Benchmark of Data, Model, and Supervision
Yufeng Cui
Lichen Zhao
Feng Liang
Yangguang Li
Jing Shao
UQCV
VLM
CLIP
19
43
0
11 Mar 2022
Geodesic Multi-Modal Mixup for Robust Fine-Tuning
Changdae Oh
Junhyuk So
Hoyoon Byun
Yongtaek Lim
Minchul Shin
Jong-June Jeon
Kyungwoo Song
33
26
0
08 Mar 2022
Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection
Jing Tan
Yuhong Wang
Gangshan Wu
Limin Wang
43
14
0
01 Mar 2022
Delving Deeper into Cross-lingual Visual Question Answering
Chen Cecilia Liu
Jonas Pfeiffer
Anna Korhonen
Ivan Vulić
Iryna Gurevych
28
8
0
15 Feb 2022
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Jiaxi Gu
Xiaojun Meng
Guansong Lu
Lu Hou
Minzhe Niu
...
Runhu Huang
Wei Zhang
Xingda Jiang
Chunjing Xu
Hang Xu
VLM
40
88
0
14 Feb 2022
Zero-Shot Text-Guided Object Generation with Dream Fields
Ajay Jain
B. Mildenhall
Jonathan T. Barron
Pieter Abbeel
Ben Poole
35
560
0
02 Dec 2021
Combined Scaling for Zero-shot Transfer Learning
Hieu H. Pham
Zihang Dai
Golnaz Ghiasi
Kenji Kawaguchi
Hanxiao Liu
...
Yi-Ting Chen
Minh-Thang Luong
Yonghui Wu
Mingxing Tan
Quoc V. Le
VLM
8
193
0
19 Nov 2021
CoLLIE: Continual Learning of Language Grounding from Language-Image Embeddings
Gabriel Skantze
Bram Willemsen
VLM
16
13
0
15 Nov 2021
MURAL: Multimodal, Multitask Retrieval Across Languages
Aashi Jain
Mandy Guo
Krishna Srinivasan
Ting-Li Chen
Sneha Kudugunta
Chao Jia
Yinfei Yang
Jason Baldridge
VLM
115
52
0
10 Sep 2021
Robust fine-tuning of zero-shot models
Mitchell Wortsman
Gabriel Ilharco
Jong Wook Kim
Mike Li
Simon Kornblith
...
Raphael Gontijo-Lopes
Hannaneh Hajishirzi
Ali Farhadi
Hongseok Namkoong
Ludwig Schmidt
VLM
31
689
0
04 Sep 2021
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
39
614
0
18 Jun 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
271
2,603
0
04 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
317
5,785
0
29 Apr 2021
Dataset Summarization by K Principal Concepts
Niv Cohen
Yedid Hoshen
18
1
0
08 Apr 2021
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
Krishna Srinivasan
K. Raman
Jiecao Chen
Michael Bendersky
Marc Najork
VLM
208
310
0
02 Mar 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
278
1,082
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
298
3,700
0
11 Feb 2021
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,743
0
26 Sep 2016
Previous
1
2
3
4
5
6
7
8
9