ResearchTrend.AI


Context-Enhanced Memory-Refined Transformer for Online Action Detection

24 March 2025
Zhanzhong Pang
Fadime Sener
Angela Yao
Abstract

Online Action Detection (OAD) detects actions in streaming videos using only past observations. State-of-the-art OAD approaches model past observations and their interactions with an anticipated future. The past is encoded using short- and long-term memories to capture immediate and long-range dependencies, while anticipation compensates for the missing future context. We identify a training-inference discrepancy in existing OAD methods that hinders learning effectiveness: training uses varying lengths of short-term memory, while inference relies on a full-length short-term memory. As a remedy, we propose a Context-enhanced Memory-Refined Transformer (CMeRT). CMeRT introduces a context-enhanced encoder that improves frame representations using additional near-past context. It also features a memory-refined decoder that leverages near-future generation to enhance performance. CMeRT achieves state-of-the-art results in online detection and anticipation on THUMOS'14, CrossTask, and EPIC-Kitchens-100.
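
The abstract does not say how the varying short-term memory lengths arise during training. One plausible reading, assumed here and not taken from the paper, is that every frame in the short-term window is supervised under a causal attention mask, so earlier frames see less context than later ones. The sketch below only counts the visible context per position under that assumption; the function name and window length are hypothetical.

```python
import numpy as np

def causal_context_lengths(short_len: int) -> np.ndarray:
    """Count how many short-term frames each position can attend to
    under a causal (lower-triangular) mask. Illustrative only."""
    # Row i is True for columns 0..i: frame i sees itself and earlier frames.
    mask = np.tril(np.ones((short_len, short_len), dtype=bool))
    return mask.sum(axis=1)

short_len = 8  # hypothetical short-term memory length

# Training (assumed): all positions are supervised, so the effective
# context length varies from 1 up to short_len.
train_lengths = causal_context_lengths(short_len)

# Inference: only the newest frame is predicted, and it sees the full window.
inference_length = int(train_lengths[-1])

print(train_lengths.tolist(), inference_length)
```

Under this assumption only the last position is trained with the context length that inference actually uses, which is one way to picture the discrepancy the abstract describes.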

@article{pang2025_2503.18359,
  title={Context-Enhanced Memory-Refined Transformer for Online Action Detection},
  author={Zhanzhong Pang and Fadime Sener and Angela Yao},
  journal={arXiv preprint arXiv:2503.18359},
  year={2025}
}