
Extreme Low Resolution Activity Recognition with Spatial-Temporal Attention Transfer

Abstract

Activity recognition on extremely low-resolution videos, e.g., at a resolution of 12 × 16 pixels, plays a vital role in far-view surveillance and privacy-preserving multimedia analysis. Low-resolution videos contain only limited information. Because the same activity may be captured in both high-resolution (HR) and low-resolution (LR) videos, it is worth studying how relevant HR data can be used to improve LR activity recognition. In this work, we propose a novel Spatial-Temporal Attention Transfer (STAT) method for LR activity recognition. STAT acquires information from HR data by reducing the differences between HR and LR attention with a transfer-learning strategy. Experimental results on two well-known datasets, UCF101 and HMDB51, demonstrate that the proposed method effectively improves the accuracy of LR activity recognition and achieves an accuracy of 58.12% on 12 × 16 videos in HMDB51, a state-of-the-art result.
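
The idea of reducing attention differences between an HR model and an LR model can be illustrated with a small sketch. The abstract does not specify how attention is computed or compared, so everything below is an assumption: the attention map is taken to be the channel-wise sum of squared activations, normalized to unit length, and the loss is the squared L2 distance between the two maps. The function names `attention_map` and `attention_transfer_loss` are hypothetical, and the flattened positions could in principle cover both spatial and temporal locations.

```python
import math

def attention_map(features):
    """Collapse a C x N feature map (C channels, N flattened positions) into a
    unit-length attention vector by summing squared activations over channels.
    This definition is an assumption, not the paper's stated formulation."""
    n = len(features[0])
    energy = [sum(channel[i] ** 2 for channel in features) for i in range(n)]
    norm = math.sqrt(sum(e * e for e in energy)) or 1.0  # guard against all-zero input
    return [e / norm for e in energy]

def attention_transfer_loss(hr_features, lr_features):
    """Squared L2 distance between HR (teacher) and LR (student) attention maps."""
    a_hr = attention_map(hr_features)
    a_lr = attention_map(lr_features)
    return sum((h - l) ** 2 for h, l in zip(a_hr, a_lr))

# Toy feature maps with 2 channels and 4 positions (hypothetical values).
hr = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 2.0, 0.0]]
lr_aligned = [[0.5, 0.0, 1.0, 0.0], [0.0, 0.5, 1.0, 0.0]]     # same pattern, smaller scale
lr_misaligned = [[0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 1.0]]  # attends elsewhere

same = attention_transfer_loss(hr, lr_aligned)
diff = attention_transfer_loss(hr, lr_misaligned)
```

Because the maps are normalized, the loss ignores the overall activation scale and penalizes only disagreement about where the two models attend. In a training setup of this kind, such a term would typically be added to the LR model's classification loss.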
