CMD-HAR: Cross-Modal Disentanglement for Wearable Human Activity Recognition

Abstract

Human Activity Recognition (HAR) is a fundamental technology for numerous human-centered intelligent applications. Although deep learning methods have accelerated feature extraction, issues such as multimodal data mixing, activity heterogeneity, and complex model deployment remain largely unresolved. This paper addresses these issues in sensor-based human activity recognition. We propose a spatiotemporal attention modal decomposition alignment fusion strategy to tackle the mixed distribution of sensor data. Key discriminative features of activities are captured through cross-modal spatio-temporal disentangled representation, and gradient modulation is combined to alleviate data heterogeneity. In addition, a wearable deployment simulation system is constructed. Experiments on a large number of public datasets demonstrate the effectiveness of the model.
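The abstract does not give the exact form of the gradient modulation; as a rough illustration of the general idea (damping the update of whichever modality currently dominates so the weaker one can catch up), one could sketch it as follows. The function name, the tanh-based damping rule, and the `alpha` parameter are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def gradient_modulation_coeffs(score_a, score_b, alpha=0.5):
    """Per-modality gradient scaling coefficients (hypothetical sketch).

    score_a, score_b: confidence scores (e.g. mean correct-class softmax)
    of the two modalities on the current batch. The stronger modality's
    gradient is scaled down; the weaker one is left untouched.
    """
    ratio_a = score_a / score_b  # how much modality A dominates B
    ratio_b = score_b / score_a  # how much modality B dominates A
    # Damp only the currently dominant modality.
    coeff_a = 1.0 - np.tanh(alpha * ratio_a) if ratio_a > 1.0 else 1.0
    coeff_b = 1.0 - np.tanh(alpha * ratio_b) if ratio_b > 1.0 else 1.0
    return coeff_a, coeff_b
```

During training, each modality encoder's gradients would be multiplied by its coefficient before the optimizer step, which slows the dominant modality and mitigates the data-heterogeneity imbalance the abstract refers to.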

@article{liu2025_2503.21843,
  title={CMD-HAR: Cross-Modal Disentanglement for Wearable Human Activity Recognition},
  author={Hanyu Liu and Siyao Li and Ying Yu and Yixuan Jiang and Hang Xiao and Jingxi Long and Haotian Tang},
  journal={arXiv preprint arXiv:2503.21843},
  year={2025}
}