QAVA: Query-Agnostic Visual Attack to Large Vision-Language Models

15 April 2025
Yudong Zhang
Ruobing Xie
Jiansheng Chen
Xingwu Sun
Zhanhui Kang
Yu Wang
    AAML
Abstract

In typical multimodal tasks such as Visual Question Answering (VQA), adversarial attacks targeting a specific image and question can lead large vision-language models (LVLMs) to provide incorrect answers. However, a single image is commonly associated with multiple questions, and LVLMs may still answer the other questions correctly even when the image has been adversarially perturbed for one specific question. To address this, we introduce the query-agnostic visual attack (QAVA), which aims to create robust adversarial examples that elicit incorrect responses to unspecified and unknown questions. Compared with traditional adversarial attacks focused on specific images and questions, QAVA significantly improves both the effectiveness and the efficiency of attacks on images when the question is unknown, achieving performance comparable to attacks on known target questions. Our research broadens the scope of visual adversarial attacks on LVLMs in practical settings, uncovering previously overlooked vulnerabilities. The code is available at this https URL.
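To make the query-agnostic objective concrete, here is a minimal sketch of the core idea: instead of ascending the loss for one known question, a PGD-style loop ascends the loss averaged over a pool of sampled questions, so the single perturbation transfers to questions not seen during the attack. This is an illustrative stand-in with a toy linear "model" and hypothetical names (`query_agnostic_attack`, `loss_grad`), not the authors' implementation.

```python
import numpy as np

def query_agnostic_attack(x, questions, loss_grad, eps=0.03, alpha=0.005, steps=40):
    """Find one L_inf-bounded perturbation that raises the attack loss
    averaged over a pool of questions (query-agnostic objective)."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        # average the loss gradient over all sampled questions
        g = np.mean([loss_grad(x + delta, q) for q in questions], axis=0)
        delta = np.clip(delta + alpha * np.sign(g), -eps, eps)  # ascend, stay in the eps-ball
        delta = np.clip(x + delta, 0.0, 1.0) - x                # keep pixel values valid
    return x + delta

# Toy stand-in for an LVLM: a per-question linear scorer over a 16-dim "image".
# The attack loss is the negative score of the correct answer, so ascending it
# drives the correct-answer score down for every question at once.
rng = np.random.default_rng(0)
W = {q: rng.normal(size=16) for q in ["color?", "count?", "where?"]}

def loss_grad(img, q):
    # gradient of (-correct-answer score) w.r.t. the image pixels
    return -W[q]

x = rng.uniform(0.2, 0.8, size=16)
x_adv = query_agnostic_attack(x, list(W), loss_grad)
# average drop in the correct-answer score across all questions
drop = np.mean([W[q] @ x - W[q] @ x_adv for q in W])
```

In this toy setting the averaged gradient is constant, so the perturbation saturates at the ball's boundary; with a real LVLM, `loss_grad` would backpropagate through the vision encoder for each sampled question.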

@article{zhang2025_2504.11038,
  title={QAVA: Query-Agnostic Visual Attack to Large Vision-Language Models},
  author={Yudong Zhang and Ruobing Xie and Jiansheng Chen and Xingwu Sun and Zhanhui Kang and Yu Wang},
  journal={arXiv preprint arXiv:2504.11038},
  year={2025}
}