2302.10282
Cited By
Paparazzi: A Deep Dive into the Capabilities of Language and Vision Models for Grounding Viewpoint Descriptions
13 February 2023
Henrik Voigt
J. Hombeck
M. Meuschke
K. Lawonn
Sina Zarrieß
VLM
Papers citing
"Paparazzi: A Deep Dive into the Capabilities of Language and Vision Models for Grounding Viewpoint Descriptions"
4 / 4 papers shown
MemoVis: A GenAI-Powered Tool for Creating Companion Reference Images for 3D Design Feedback
Chen Chen
Cuong Nguyen
Thibault Groueix
Vladimir G. Kim
Nadir Weibel
DiffM
26
3
0
09 Sep 2024
Aerial Vision-and-Dialog Navigation
Yue Fan
Winson X. Chen
Tongzhou Jiang
Chun-ni Zhou
Yi Zhang
X. Wang
44
19
0
24 May 2022
VOS: Learning What You Don't Know by Virtual Outlier Synthesis
Xuefeng Du
Zhaoning Wang
Mu Cai
Yixuan Li
OODD
178
220
0
02 Feb 2022
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
196
405
0
13 Jul 2021