With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition

1 November 2021

Evangelos Kazakos

Andrew Zisserman

Dima Damen

Papers citing "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition"

Title | Authors | Topic | Metrics | Date
An Outlook into the Future of Egocentric Vision | Chiara Plizzari, Gabriele Goletto, Antonino Furnari, Siddhant Bansal, Francesco Ragusa, G. Farinella, Dima Damen, Tatiana Tommasi | EgoV | 40 38 0 | 14 Aug 2023
Multimodal Distillation for Egocentric Action Recognition | Gorjan Radevski, Dusan Grujicic, Marie-Francine Moens, Matthew Blaschko, Tinne Tuytelaars | EgoV | 23 23 0 | 14 Jul 2023
Procedure-Aware Pretraining for Instructional Video Understanding | Honglu Zhou, Roberto Martín-Martín, Mubbasir Kapadia, Silvio Savarese, Juan Carlos Niebles | - | 31 39 0 | 31 Mar 2023
Co-Occurrence Matters: Learning Action Relation for Temporal Action Localization | Congqi Cao, Yizhe Wang, Yuelie Lu, X. Zhang, Yanning Zhang | - | 33 4 0 | 15 Mar 2023
Epic-Sounds: A Large-scale Dataset of Actions That Sound | Jaesung Huh, Jacob Chalk, Evangelos Kazakos, Dima Damen, Andrew Zisserman | EgoV | 18 41 0 | 01 Feb 2023
Ego-Only: Egocentric Action Detection without Exocentric Transferring | Huiyu Wang, Mitesh Singh, Lorenzo Torresani | EgoV | 72 23 0 | 03 Jan 2023
A Survey on Human Action Recognition | Zhou Shuchang | - | 29 0 0 | 20 Dec 2022
EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries | Jinjie Mai, Abdullah Hamdi, Silvio Giancola, Chen Zhao, Guohao Li | EgoV | 38 14 0 | 14 Dec 2022
Bringing Online Egocentric Action Recognition into the wild | Gabriele Goletto, M. Planamente, Barbara Caputo, Giuseppe Averta | EgoV | 19 3 0 | 06 Nov 2022
Learning State-Aware Visual Representations from Audible Interactions | Himangi Mittal, Pedro Morgado, Unnat Jain, Abhinav Gupta | - | 78 23 0 | 27 Sep 2022
Vision Transformers for Action Recognition: A Survey | Anwaar Ulhaq, Naveed Akhtar, Ganna Pogrebna, Ajmal Mian | ViT | 19 44 0 | 13 Sep 2022
ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound | Yan-Bo Lin, Jie Lei, Joey Tianyi Zhou, Gedas Bertasius | - | 46 39 0 | 06 Apr 2022
MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection | Rui Dai, Srijan Das, Kumara Kahatapitiya, Michael S. Ryoo, F. Brémond | ViT | 42 73 0 | 07 Dec 2021
Is Space-Time Attention All You Need for Video Understanding? | Gedas Bertasius, Heng Wang, Lorenzo Torresani | ViT | 283 1,984 0 | 09 Feb 2021
Video Transformer Network | Daniel Neimark, Omri Bar, Maya Zohar, Dotan Asselmann | ViT | 204 422 0 | 01 Feb 2021
AI Choreographer: Music Conditioned 3D Dance Generation with AIST++ | Ruilong Li, Sha Yang, David A. Ross, Angjoo Kanazawa | ViT | 219 479 0 | 21 Jan 2021
Multi-modal Transformer for Video Retrieval | Valentin Gabeur, Chen Sun, Alahari Karteek, Cordelia Schmid | ViT | 424 596 0 | 21 Jul 2020
Audiovisual SlowFast Networks for Video Recognition | Fanyi Xiao, Yong Jae Lee, Kristen Grauman, Jitendra Malik, Christoph Feichtenhofer | - | 197 207 0 | 23 Jan 2020