ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2008.01065
25
241

Memory-augmented Dense Predictive Coding for Video Representation Learning

3 August 2020
Tengda Han
Weidi Xie
Andrew Zisserman
    SSL
ArXivPDFHTML
Abstract

The objective of this paper is self-supervised learning from video, in particular for representations for action recognition. We make the following contributions: (i) We propose a new architecture and learning framework Memory-augmented Dense Predictive Coding (MemDPC) for the task. It is trained with a predictive attention mechanism over the set of compressed memories, such that any future states can always be constructed by a convex combination of the condense representations, allowing to make multiple hypotheses efficiently. (ii) We investigate visual-only self-supervised video representation learning from RGB frames, or from unsupervised optical flow, or both. (iii) We thoroughly evaluate the quality of learnt representation on four different downstream tasks: action recognition, video retrieval, learning with scarce annotations, and unintentional action classification. In all cases, we demonstrate state-of-the-art or comparable performance over other approaches with orders of magnitude fewer training data.

View on arXiv
Comments on this paper