APE: Aligning Pretrained Encoders to Quickly Learn Aligned Multimodal Representations
arXiv: 2210.03927
8 October 2022
Authors: Elan Rosenfeld, Preetum Nakkiran, Hadi Pouransari, Oncel Tuzel, Fartash Faghri
Papers citing "APE: Aligning Pretrained Encoders to Quickly Learn Aligned Multimodal Representations" (5 of 5 papers shown)
Babel: A Scalable Pre-trained Model for Multi-Modal Sensing via Expandable Modality Alignment
Shenghong Dai, Shiqi Jiang, Yifan Yang, Ting Cao, Mo Li, Suman Banerjee, Lili Qiu
25 Jul 2024
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Sachin Mehta, Maxwell Horton, Fartash Faghri, Mohammad Hossein Sekhavat, Mahyar Najibi, Mehrdad Farajtabar, Oncel Tuzel, Mohammad Rastegari
Tags: VLM, CLIP
24 Apr 2024
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding
Haoxiang Wang, Pavan Kumar Anasosalu Vasu, Fartash Faghri, Raviteja Vemulapalli, Mehrdad Farajtabar, Sachin Mehta, Mohammad Rastegari, Oncel Tuzel, Hadi Pouransari
Tags: VLM
23 Oct 2023
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo, P. Sharma, Nan Ding, Radu Soricut
Tags: VLM
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu H. Pham, Quoc V. Le, Yun-hsuan Sung, Zhen Li, Tom Duerig
Tags: VLM, CLIP
11 Feb 2021