2212.07143
Reproducible scaling laws for contrastive language-image learning
14 December 2022
Mehdi Cherti
Romain Beaumont
Ross Wightman
Mitchell Wortsman
Gabriel Ilharco
Cade Gordon
Christoph Schuhmann
Ludwig Schmidt
J. Jitsev
VLM
CLIP
Papers citing
"Reproducible scaling laws for contrastive language-image learning"
34 / 134 papers shown
Title
InstructTA: Instruction-Tuned Targeted Attack for Large Vision-Language Models
Xunguang Wang
Zhenlan Ji
Pingchuan Ma
Zongjie Li
Shuai Wang
MLLM
41
11
0
04 Dec 2023
Meta ControlNet: Enhancing Task Adaptation via Meta Learning
Junjie Yang
Jinze Zhao
Peihao Wang
Zhangyang Wang
Yingbin Liang
31
2
0
03 Dec 2023
Zooming Out on Zooming In: Advancing Super-Resolution for Remote Sensing
Piper Wolters
F. Bastani
Aniruddha Kembhavi
24
2
0
29 Nov 2023
Event Camera Data Dense Pre-training
Yan Yang
Liyuan Pan
Liu Liu
30
4
0
20 Nov 2023
What's left can't be right -- The remaining positional incompetence of contrastive vision-language models
Nils Hoehing
Ellen Rushe
Anthony Ventresque
VLM
18
2
0
20 Nov 2023
Fast Certification of Vision-Language Models Using Incremental Randomized Smoothing
Ashutosh Nirala
Ameya Joshi
Chinmay Hegde
S Sarkar
VLM
36
0
0
15 Nov 2023
LipSim: A Provably Robust Perceptual Similarity Metric
Sara Ghazanfari
Alexandre Araujo
Prashanth Krishnamurthy
Farshad Khorrami
Siddharth Garg
38
5
0
27 Oct 2023
Kiki or Bouba? Sound Symbolism in Vision-and-Language Models
Morris Alper
Hadar Averbuch-Elor
40
10
0
25 Oct 2023
Improving Discriminative Multi-Modal Learning with Large-Scale Pre-Trained Models
Chenzhuang Du
Yue Zhao
Chonghua Liao
Jiacheng You
Jie Fu
Hang Zhao
39
2
0
08 Oct 2023
Species196: A One-Million Semi-supervised Dataset for Fine-grained Species Recognition
W. He
Kai Han
Ying Nie
Chengcheng Wang
Yunhe Wang
VLM
45
6
0
25 Sep 2023
MOSAIC: Multi-Object Segmented Arbitrary Stylization Using CLIP
Prajwal Ganugula
Y. Kumar
N. Reddy
Prabhath Chellingi
A. Thakur
Neeraj Kasera
C. S. Anand
CLIP
DiffM
11
3
0
24 Sep 2023
Uncovering Neural Scaling Laws in Molecular Representation Learning
Dingshuo Chen
Yanqiao Zhu
Jieyu Zhang
Yuanqi Du
Zhixun Li
Qiang Liu
Shu Wu
Liang Wang
32
16
0
15 Sep 2023
Multimodal Foundation Models For Echocardiogram Interpretation
M. Christensen
Milos Vukadinovic
N. Yuan
David Ouyang
MedIm
21
7
0
29 Aug 2023
Adversarial Illusions in Multi-Modal Embeddings
Tingwei Zhang
Rishi Jha
Eugene Bagdasaryan
Vitaly Shmatikov
AAML
34
8
0
22 Aug 2023
AltDiffusion: A Multilingual Text-to-Image Diffusion Model
Fulong Ye
Guangyi Liu
Xinya Wu
Ledell Yu Wu
VLM
39
25
0
19 Aug 2023
Distributionally Robust Classification on a Data Budget
Ben Feuer
Ameya Joshi
Minh Pham
C. Hegde
OOD
37
2
0
07 Aug 2023
A Parameter-efficient Multi-subject Model for Predicting fMRI Activity
Connor Lane
Gregory Kiar
22
2
0
04 Aug 2023
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Muhammad Awais
Muzammal Naseer
Salman Khan
Rao Muhammad Anwer
Hisham Cholakkal
M. Shah
Ming Yang
F. Khan
VLM
38
118
0
25 Jul 2023
MultiVENT: Multilingual Videos of Events with Aligned Natural Text
Kate Sanders
David Etter
Reno Kriz
Benjamin Van Durme
VGen
39
7
0
06 Jul 2023
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
S. Hall
F. G. Abrantes
Hanwen Zhu
Grace A. Sodunke
Aleksandar Shtedritski
Hannah Rose Kirk
CoGe
21
39
0
21 Jun 2023
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models
Lingxi Xie
Longhui Wei
Xiaopeng Zhang
Kaifeng Bi
Xiaotao Gu
Jianlong Chang
Qi Tian
33
7
0
14 Jun 2023
Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust
Yuxin Wen
John Kirchenbauer
Jonas Geiping
Tom Goldstein
WIGM
46
100
0
31 May 2023
Learning without Forgetting for Vision-Language Models
Da-Wei Zhou
Yuanhan Zhang
Jingyi Ning
De-Chuan Zhan
Ziwei Liu
VLM
CLL
71
37
0
30 May 2023
In-Context Impersonation Reveals Large Language Models' Strengths and Biases
Leonard Salewski
Stephan Alaniz
Isabel Rio-Torto
Eric Schulz
Zeynep Akata
41
149
0
24 May 2023
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
110
3,030
0
14 Apr 2023
OPI at SemEval 2023 Task 1: Image-Text Embeddings and Multimodal Information Retrieval for Visual Word Sense Disambiguation
Slawomir Dadas
22
5
0
14 Apr 2023
DIME-FM: DIstilling Multimodal and Efficient Foundation Models
Ximeng Sun
Pengchuan Zhang
Peizhao Zhang
Hardik Shah
Kate Saenko
Xide Xia
VLM
25
20
0
31 Mar 2023
Your Diffusion Model is Secretly a Zero-Shot Classifier
Alexander C. Li
Mihir Prabhudesai
Shivam Duggal
Ellis L Brown
Deepak Pathak
DiffM
VLM
43
224
0
28 Mar 2023
A Region-Prompted Adapter Tuning for Visual Abductive Reasoning
Hao Zhang
Yeo Keat Ee
Basura Fernando
VLM
27
3
0
18 Mar 2023
A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models
J. Allingham
Jie Jessie Ren
Michael W. Dusenberry
Xiuye Gu
Huayu Chen
Dustin Tran
J. Liu
Balaji Lakshminarayanan
LLMAG
VLM
35
33
0
13 Feb 2023
Discovering and Mitigating Visual Biases through Keyword Explanation
Younghyun Kim
Sangwoo Mo
Minkyu Kim
Kyungmin Lee
Jaeho Lee
Jinwoo Shin
40
32
0
26 Jan 2023
Where Should I Spend My FLOPS? Efficiency Evaluations of Visual Pre-training Methods
Skanda Koppula
Yazhe Li
Evan Shelhamer
Andrew Jaegle
Nikhil Parthasarathy
Relja Arandjelović
João Carreira
Olivier J. Hénaff
33
9
0
30 Sep 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
298
3,700
0
11 Feb 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
240
4,469
0
23 Jan 2020