Domain Adaptation of VLM for Soccer Video Understanding

20 May 2025
Tiancheng Jiang
Henry Wang
Md Sirajus Salekin
Parmida Atighehchian
Shinan Zhang
Abstract

Vision Language Models (VLMs) have demonstrated strong performance in multi-modal tasks by effectively aligning visual and textual representations. However, most video understanding VLM research has been domain-agnostic, leaving their transfer-learning capability for specialized domains under-explored. In this work, we address this gap by exploring the adaptability of open-source VLMs to specific domains, focusing on soccer as an initial case study. Our approach uses large-scale soccer datasets and an LLM to create instruction-following data, and uses this data to iteratively fine-tune a general-domain VLM in a curriculum learning fashion (first teaching the model key soccer concepts, then moving on to question-answering tasks). The final adapted model, trained on a curated dataset of 20k video clips, exhibits significant improvement on soccer-specific tasks compared to the base model, with a 37.5% relative improvement on the visual question-answering task and an accuracy improvement from 11.8% to 63.5% on the downstream soccer action classification task.
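
A minimal sketch of the two-stage curriculum described in the abstract: one stage of supervised fine-tuning on soccer-concept instruction data, followed by a second stage on question-answering data that continues from the stage-one weights. The dataset contents, the toy model, the loss, and the hyperparameters below are placeholders for illustration only, not the authors' implementation.

# Curriculum-style two-stage fine-tuning sketch (placeholder data and model).
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset


class InstructionDataset(Dataset):
    """Placeholder dataset yielding (features, target) pairs standing in for
    LLM-generated instruction-following examples built from soccer clips."""

    def __init__(self, examples):
        self.examples = examples

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        return self.examples[idx]


def run_stage(model, dataset, epochs, lr):
    """One curriculum stage: ordinary supervised fine-tuning on `dataset`."""
    loader = DataLoader(dataset, batch_size=4, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()  # stands in for the VLM's language-modeling loss
    model.train()
    for _ in range(epochs):
        for features, target in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(features), target)
            loss.backward()
            optimizer.step()
    return model


# Toy stand-in for the general-domain VLM being adapted.
model = nn.Linear(16, 16)

# Stage 1: key soccer concepts (events, entities, rules) as instruction data.
concept_data = InstructionDataset([(torch.randn(16), torch.randn(16)) for _ in range(32)])
model = run_stage(model, concept_data, epochs=1, lr=1e-4)

# Stage 2: visual question answering, continuing from the stage-1 checkpoint.
qa_data = InstructionDataset([(torch.randn(16), torch.randn(16)) for _ in range(32)])
model = run_stage(model, qa_data, epochs=1, lr=5e-5)

The point of the schedule is simply that stage 2 starts from the weights produced by stage 1, so the harder question-answering objective builds on the domain concepts learned first.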

View on arXiv
@article{jiang2025_2505.13860,
  title={Domain Adaptation of VLM for Soccer Video Understanding},
  author={Tiancheng Jiang and Henry Wang and Md Sirajus Salekin and Parmida Atighehchian and Shinan Zhang},
  journal={arXiv preprint arXiv:2505.13860},
  year={2025}
}