Towards Perceiving Small Visual Details in Zero-shot Visual Question Answering with Multimodal LLMs

Papers citing "Towards Perceiving Small Visual Details in Zero-shot Visual Question Answering with Multimodal LLMs"