2209.04372
Pre-training image-language transformers for open-vocabulary tasks
9 September 2022
A. Piergiovanni
Weicheng Kuo
A. Angelova
VLM
ViT
Papers citing "Pre-training image-language transformers for open-vocabulary tasks"
8 / 8 papers shown
Leveraging VLM-Based Pipelines to Annotate 3D Objects
Rishabh Kabra
Loic Matthey
Alexander Lerchner
Niloy J. Mitra
24
6
0
29 Nov 2023
Joint Adaptive Representations for Image-Language Learning
A. Piergiovanni
A. Angelova
VLM
34
0
0
31 May 2023
PaLI-X: On Scaling up a Multilingual Vision and Language Model
Xi Chen
Josip Djolonga
Piotr Padlewski
Basil Mustafa
Soravit Changpinyo
...
Mojtaba Seyedhosseini
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
VLM
56
187
0
29 May 2023
What Makes for Good Visual Tokenizers for Large Language Models?
Guangzhi Wang
Yixiao Ge
Xiaohan Ding
Mohan S. Kankanhalli
Ying Shan
MLLM
VLM
27
38
0
20 May 2023
A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision
Lucas Beyer
Bo Wan
Gagan Madan
Filip Pavetić
Andreas Steiner
...
Emanuele Bugliarello
Tianlin Li
Qihang Yu
Liang-Chieh Chen
Xiaohua Zhai
54
8
0
30 Mar 2023
PaLM-E: An Embodied Multimodal Language Model
Danny Driess
F. Xia
Mehdi S. M. Sajjadi
Corey Lynch
Aakanksha Chowdhery
...
Marc Toussaint
Klaus Greff
Andy Zeng
Igor Mordatch
Peter R. Florence
LM&Ro
22
1,565
0
06 Mar 2023
PaLI: A Jointly-Scaled Multilingual Language-Image Model
Xi Chen
Tianlin Li
Soravit Changpinyo
A. Piergiovanni
Piotr Padlewski
...
Andreas Steiner
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
MLLM
VLM
37
687
0
14 Sep 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
310
3,708
0
11 Feb 2021