Beyond Text: Unveiling Privacy Vulnerabilities in Multi-modal Retrieval-Augmented Generation

20 May 2025
Jiankun Zhang
Shenglai Zeng
Jie Ren
Tianqi Zheng
Hui Liu
Xianfeng Tang
Hui Liu
Yi Chang
Main: 7 pages
Figures: 19
Bibliography: 4 pages
Tables: 12
Appendix: 11 pages
Abstract

Multimodal Retrieval-Augmented Generation (MRAG) systems enhance large multimodal models (LMMs) by integrating external multimodal databases, but they also introduce privacy vulnerabilities that remain largely unexplored. While the privacy risks of text-based RAG have been studied, multimodal data presents unique challenges. We provide the first systematic analysis of MRAG privacy vulnerabilities across vision-language and speech-language modalities. Using a novel compositional structured prompt attack in a black-box setting, we demonstrate how attackers can extract private information by manipulating queries. Our experiments reveal that LMMs can both directly generate outputs resembling retrieved content and produce descriptions that indirectly expose sensitive information, highlighting the urgent need for robust privacy-preserving MRAG techniques.
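
To make the notion of indirect exposure concrete, the sketch below scores how much of a private record reappears in a model's textual description. It is an illustrative assumption rather than the metric used in the paper: the function names `tokenize` and `leakage_recall`, the token-overlap measure, and the sample record are all hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def leakage_recall(private_text: str, model_output: str) -> float:
    """Fraction of unique private-record tokens that reappear in the output.

    Hypothetical toy measure: 0.0 means no overlap, 1.0 means every
    token of the private record appears somewhere in the output.
    """
    private_tokens = set(tokenize(private_text))
    if not private_tokens:
        return 0.0
    output_tokens = set(tokenize(model_output))
    return len(private_tokens & output_tokens) / len(private_tokens)


# Made-up example: a private caption stored in the retrieval database
# and a model description that paraphrases it.
record = "Patient John Doe, chest X-ray, diagnosis pneumonia"
output = "The image shows a chest X-ray of John Doe with pneumonia."
print(leakage_recall(record, output))  # 0.75
```

A token-overlap score like this only captures surface-level reuse; it would miss paraphrases and says nothing about direct reproduction of images or audio, which would need modality-specific similarity measures.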

@article{zhang2025_2505.13957,
  title={Beyond Text: Unveiling Privacy Vulnerabilities in Multi-modal Retrieval-Augmented Generation},
  author={Jiankun Zhang and Shenglai Zeng and Jie Ren and Tianqi Zheng and Hui Liu and Xianfeng Tang and Hui Liu and Yi Chang},
  journal={arXiv preprint arXiv:2505.13957},
  year={2025}
}