Rolling-Unrolling LSTMs for Action Anticipation from First-Person Video

4 May 2020

Papers citing "Rolling-Unrolling LSTMs for Action Anticipation from First-Person Video"

Title | Authors | Tags | Counts | Date
Vision and Intention Boost Large Language Model in Long-Term Action Anticipation | Congqi Cao, Lanshu Hu, Yating Yu, Y. Zhang | VLM | 185 0 0 | 03 May 2025
Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos | Luigi Seminara, G. Farinella, Antonino Furnari | | 64 8 0 | 10 Jan 2025
Interact with me: Joint Egocentric Forecasting of Intent to Interact, Attitude and Social Actions | Tongfei Bian, Yiming Ma, Mathieu Chollet, Victor Sanchez, T. Guha | EgoV | 97 1 0 | 21 Dec 2024
ExpertAF: Expert Actionable Feedback from Video | Kumar Ashutosh, Tushar Nagarajan, Georgios Pavlakos, Kris Kitani, Kristen Grauman | VGen | 44 2 0 | 01 Aug 2024
Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models | Himangi Mittal, Nakul Agarwal, Shao-Yuan Lo, Kwonjoon Lee | | 44 14 0 | 30 May 2024
EMAG: Ego-motion Aware and Generalizable 2D Hand Forecasting from Egocentric Videos | Masashi Hatano, Ryo Hachiuma, Hideo Saito | EgoV | 37 3 0 | 30 May 2024
Bidirectional Progressive Transformer for Interaction Intention Anticipation | Zichen Zhang, Hongcheng Luo, Wei Zhai, Yang Cao, Yu Kang | | 41 5 0 | 09 May 2024
LEAP: LLM-Generation of Egocentric Action Programs | Eadom Dessalene, Michael Maynord, Cornelia Fermuller, Yiannis Aloimonos | | 38 3 0 | 29 Nov 2023
Action Anticipation with Goal Consistency | Olga Zatsarynna, Juergen Gall | | 25 10 0 | 26 Jun 2023
Affordances from Human Videos as a Versatile Representation for Robotics | Shikhar Bahl, Russell Mendonca, Lili Chen, Unnat Jain, Deepak Pathak | | 53 164 0 | 17 Apr 2023
HierVL: Learning Hierarchical Video-Language Embeddings | Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman | VLM AI4TS | 26 53 0 | 05 Jan 2023
What You Say Is What You Show: Visual Narration Detection in Instructional Videos | Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman | | 24 4 0 | 05 Jan 2023
NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory | Santhosh Kumar Ramakrishnan, Ziad Al-Halah, Kristen Grauman | | 119 39 0 | 02 Jan 2023
A Survey on Human Action Recognition | Zhou Shuchang | | 29 0 0 | 20 Dec 2022
Inductive Attention for Video Action Anticipation | Tsung-Ming Tai, G. Fiameni, Cheng-Kuang Lee, Simon See, Oswald Lanz | | 39 1 0 | 17 Dec 2022
Bringing Online Egocentric Action Recognition into the wild | Gabriele Goletto, M. Planamente, Barbara Caputo, Giuseppe Averta | EgoV | 19 3 0 | 06 Nov 2022
Rethinking Learning Approaches for Long-Term Action Anticipation | Megha Nawhal, Akash Abdu Jyothi, Greg Mori | AI4TS | 39 26 0 | 20 Oct 2022
Learning State-Aware Visual Representations from Audible Interactions | Himangi Mittal, Pedro Morgado, Unnat Jain, Abhinav Gupta | | 78 23 0 | 27 Sep 2022
MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain | Francesco Ragusa, Antonino Furnari, G. Farinella | EgoV | 43 24 0 | 19 Sep 2022
Predicting the Next Action by Modeling the Abstract Goal | Debaditya Roy, Basura Fernando | EgoV | 26 18 0 | 12 Sep 2022
Unified Recurrence Modeling for Video Action Anticipation | Tsung-Ming Tai, G. Fiameni, Cheng-Kuang Lee, Simon See, Oswald Lanz | | 21 8 0 | 02 Jun 2022
The Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction | Alexandros Stergiou, Dima Damen | AI4TS EgoV EDL | 17 7 0 | 28 Apr 2022
Weakly Supervised Attended Object Detection Using Gaze Data as Annotations | Michele Mazzamuto, Francesco Ragusa, Antonino Furnari, G. Signorello, G. Farinella | | 23 9 0 | 14 Apr 2022
Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos | Shao-Wei Liu, Subarna Tripathi, Somdeb Majumdar, Xiaolong Wang | EgoV | 35 93 0 | 04 Apr 2022
On the Pitfalls of Batch Normalization for End-to-End Video Learning: A Study on Surgical Workflow Analysis | Dominik Rivoir, Isabel Funke, Stefanie Speidel | | 24 17 0 | 15 Mar 2022
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition | Chao-Yuan Wu, Yanghao Li, K. Mangalam, Haoqi Fan, Bo Xiong, Jitendra Malik, Christoph Feichtenhofer | ViT | 48 198 0 | 20 Jan 2022
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound | Rowan Zellers, Jiasen Lu, Ximing Lu, Youngjae Yu, Yanpeng Zhao, Mohammadreza Salehi, Aditya Kusupati, Jack Hessel, Ali Farhadi, Yejin Choi | | 36 207 0 | 07 Jan 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video | Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, ..., Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik | EgoV | 269 1,024 0 | 13 Oct 2021
Towards Streaming Egocentric Action Anticipation | Antonino Furnari, G. Farinella | EgoV | 33 6 0 | 11 Oct 2021
SlowFast Rolling-Unrolling LSTMs for Action Anticipation in Egocentric Videos | Nada Osman, Guglielmo Camporese, Pasquale Coscia, Lamberto Ballan | EgoV | 39 20 0 | 02 Sep 2021
Is First Person Vision Challenging for Object Tracking? | Matteo Dunnhofer, Antonino Furnari, G. Farinella, C. Micheloni | | 27 23 0 | 31 Aug 2021
TransAction: ICL-SJTU Submission to EPIC-Kitchens Action Anticipation Challenge 2021 | Xiao Gu, Jianing Qiu, Yao Guo, Benny Lo, Guang-Zhong Yang | | 21 12 0 | 28 Jul 2021
Higher Order Recurrent Space-Time Transformer for Video Action Prediction | Tsung-Ming Tai, G. Fiameni, Cheng-Kuang Lee, Oswald Lanz | | 36 9 0 | 17 Apr 2021
Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos | Yanghao Li, Tushar Nagarajan, Bo Xiong, Kristen Grauman | EgoV | 51 84 0 | 16 Apr 2021
Look Before you Speak: Visually Contextualized Utterances | Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid | | 21 66 0 | 10 Dec 2020
Rescaling Egocentric Vision | Dima Damen, Hazel Doughty, G. Farinella, Antonino Furnari, Evangelos Kazakos, ..., Davide Moltisanti, Jonathan Munro, Toby Perrett, Will Price, Michael Wray | EgoV | 19 437 0 | 23 Jun 2020
Knowledge Distillation for Action Anticipation via Label Smoothing | Guglielmo Camporese, Pasquale Coscia, Antonino Furnari, G. Farinella, Lamberto Ballan | EgoV | 40 36 0 | 16 Apr 2020
Prediction and Description of Near-Future Activities in Video | T. Mahmud, Mohammad Billah, Mahmudul Hasan, A. Roy-Chowdhury | | 28 16 0 | 02 Aug 2019