Generative Sign-description Prompts with Multi-positive Contrastive Learning for Sign Language Recognition

5 May 2025

Abstract

Sign language recognition (SLR) faces fundamental challenges in creating accurate annotations due to the inherent complexity of simultaneous manual and non-manual signals. To the best of our knowledge, this is the first work to integrate generative large language models (LLMs) into SLR tasks. We propose a novel Generative Sign-description Prompts Multi-positive Contrastive learning (GSP-MC) method that leverages retrieval-augmented generation (RAG) with domain-specific LLMs, incorporating multi-step prompt engineering and expert-validated sign language corpora to produce precise multipart descriptions. The GSP-MC method also employs a dual-encoder architecture that bidirectionally aligns hierarchical skeleton features with multiple text descriptions (global, synonym, and part level) through probabilistic matching. Our approach combines global and part-level losses, optimizing KL divergence to ensure robust alignment across all relevant text-skeleton pairs while capturing both sign-level semantics and detailed part dynamics. Experiments demonstrate state-of-the-art performance on the Chinese SLR500 (97.1% accuracy) and Turkish AUTSL (97.07% accuracy) datasets. The method's cross-lingual effectiveness highlights its potential for developing inclusive communication technologies.
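As a rough illustration of the multi-positive alignment the abstract describes, the sketch below computes a bidirectional KL-divergence loss between skeleton and text embeddings in which each skeleton sample may match several text descriptions. It is a minimal NumPy sketch under stated assumptions, not the paper's implementation: the function names, the uniform target distribution over positives, the cosine similarity, and the `temperature` value are all hypothetical choices made for illustration.

```python
import numpy as np

def log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def l2_normalize(x):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def kl_to_targets(logits, positives, axis):
    # Assumed target: uniform probability over the positives along `axis`.
    # Every slice along `axis` must contain at least one positive.
    target = positives / positives.sum(axis=axis, keepdims=True)
    log_pred = log_softmax(logits, axis=axis)
    # KL(target || pred), using the convention 0 * log(0) = 0.
    safe = np.where(target > 0, target, 1.0)
    kl = (target * (np.log(safe) - log_pred)).sum(axis=axis)
    return kl.mean()

def multi_positive_loss(skel, text, positives, temperature=0.1):
    # skel: (N, D) skeleton embeddings; text: (M, D) text embeddings.
    # positives: (N, M) array with 1.0 where a text describes a skeleton sample.
    logits = l2_normalize(skel) @ l2_normalize(text).T / temperature
    skel_to_text = kl_to_targets(logits, positives, axis=1)
    text_to_skel = kl_to_targets(logits, positives, axis=0)
    # Average both directions for a bidirectional objective.
    return 0.5 * (skel_to_text + text_to_skel)

# Toy example: two signs, each with two matching descriptions.
skel = np.array([[1.0, 0.0], [0.0, 1.0]])
text = np.array([[1.0, 0.1], [1.0, -0.1], [0.1, 1.0], [-0.1, 1.0]])
positives = np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]])
loss = multi_positive_loss(skel, text, positives)
```

Under the same assumptions, a combined objective could apply this function once to global embeddings and once to part-level embeddings and sum the two terms; how the paper actually weights them is not stated in the abstract.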


@article{liang2025_2505.02304,
  title={Generative Sign-description Prompts with Multi-positive Contrastive Learning for Sign Language Recognition},
  author={Siyu Liang and Yunan Li and Wentian Xin and Huizhou Chen and Xujie Liu and Kang Liu and Qiguang Miao},
  journal={arXiv preprint arXiv:2505.02304},
  year={2025}
}
