Mapping User Trust in Vision Language Models: Research Landscape, Challenges, and Prospects

Abstract

The rapid adoption of Vision Language Models (VLMs), pre-trained on large image-text and video-text datasets, calls for protecting and informing users about when to trust these systems. This survey reviews studies on trust dynamics in user-VLM interactions, through a multi-disciplinary taxonomy encompassing different cognitive science capabilities, collaboration modes, and agent behaviours. Literature insights and findings from a workshop with prospective VLM users inform preliminary requirements for future VLM trust studies.

@article{chiatti2025_2505.05318,
  title={Mapping User Trust in Vision Language Models: Research Landscape, Challenges, and Prospects},
  author={Agnese Chiatti and Sara Bernardini and Lara Shibelski Godoy Piccolo and Viola Schiaffonati and Matteo Matteucci},
  journal={arXiv preprint arXiv:2505.05318},
  year={2025}
}