2401.01862
Cited By
A Vision Check-up for Language Models
3 January 2024
Pratyusha Sharma
Tamar Rott Shaham
Manel Baradad
Stephanie Fu
Adrian Rodriguez-Munoz
Shivam Duggal
Phillip Isola
Antonio Torralba
VLM
LRM
Papers citing "A Vision Check-up for Language Models"
19 / 19 papers shown
Title
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis
Jonas Belouadi
Eddy Ilg
Margret Keuper
Hideki Tanaka
Masao Utiyama
Raj Dabre
Steffen Eger
Simone Paolo Ponzetto
99
0
0
14 Mar 2025
MET-Bench: Multimodal Entity Tracking for Evaluating the Limitations of Vision-Language and Reasoning Models
Vanya Cohen
Raymond J. Mooney
64
0
0
15 Feb 2025
CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers
Dimitrios Mallis
Ahmet Serdar Karadeniz
Sebastian Cavada
Danila Rukhovich
Niki Maria Foteinopoulou
K. Cherenkova
Anis Kacem
Djamila Aouada
117
5
0
18 Dec 2024
SceneLLM: Implicit Language Reasoning in LLM for Dynamic Scene Graph Generation
Hang Zhang
Zhuoling Li
Jun Liu
LRM
128
1
0
15 Dec 2024
Perception-guided Jailbreak against Text-to-Image Models
Yihao Huang
Le Liang
Tianlin Li
Xiaojun Jia
Run Wang
Weikai Miao
G. Pu
Yang Liu
56
8
0
20 Aug 2024
Synthetic Data from Diffusion Models Improves ImageNet Classification
Shekoofeh Azizi
Simon Kornblith
Chitwan Saharia
Mohammad Norouzi
David J. Fleet
VLM
DiffM
80
307
0
17 Apr 2023
MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Kevin Qinghong Lin
E. Azarnasab
Faisal Ahmed
Zicheng Liu
Ce Liu
Michael Zeng
Lijuan Wang
ReLM
KELM
LRM
61
379
0
20 Mar 2023
Procedural Image Programs for Representation Learning
Manel Baradad
Chun-Fu
Jonas Wulff
Tongzhou Wang
Rogerio Feris
Antonio Torralba
Phillip Isola
46
20
0
29 Nov 2022
MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning
C. Eichenberg
Sid Black
Samuel Weinbach
Letitia Parcalabescu
Anette Frank
MLLM
VLM
49
100
0
09 Dec 2021
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Yoad Tewel
Yoav Shalev
Idan Schwartz
Lior Wolf
VLM
50
193
0
29 Nov 2021
Improving Fractal Pre-training
Connor Anderson
Ryan Farrell
104
27
0
06 Oct 2021
Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color
Mostafa Abdou
Artur Kulmizev
Daniel Hershcovich
Stella Frank
Ellie Pavlick
Anders Søgaard
51
117
0
13 Sep 2021
Multimodal Few-Shot Learning with Frozen Language Models
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
MLLM
127
766
0
25 Jun 2021
Learning to See by Looking at Noise
Manel Baradad
Jonas Wulff
Tongzhou Wang
Phillip Isola
Antonio Torralba
59
91
0
10 Jun 2021
Constructing Taxonomies from Pretrained Language Models
Catherine Chen
Kevin Lin
Dan Klein
93
32
0
24 Oct 2020
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
426
3,397
0
09 Mar 2020
What do you learn from context? Probing for sentence structure in contextualized word representations
Ian Tenney
Patrick Xia
Berlin Chen
Alex Jinpeng Wang
Adam Poliak
...
Najoung Kim
Benjamin Van Durme
Samuel R. Bowman
Dipanjan Das
Ellie Pavlick
159
853
0
15 May 2019
Learning Semantic Segmentation from Synthetic Data: A Geometrically Guided Input-Output Adaptation Approach
Yuhua Chen
Wen Li
Xiaoran Chen
Luc Van Gool
57
248
0
12 Dec 2018
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
Richard Y. Zhang
Phillip Isola
Alexei A. Efros
Eli Shechtman
Oliver Wang
EGVM
299
11,610
0
11 Jan 2018