Long-Term Feature Banks for Detailed Video Understanding

12 December 2018

Chao-Yuan Wu

Christoph Feichtenhofer

Papers citing "Long-Term Feature Banks for Detailed Video Understanding"

50 / 306 papers shown

A Grammatical Compositional Model for Video Action Detection Zhijun Zhang Xu Zou Jiahuan Zhou Sheng Zhong Ying Wu 31 0 0 04 Oct 2023
A Survey on Deep Learning Techniques for Action Anticipation Zeyun Zhong Manuel Martin Michael Voit Juergen Gall Jürgen Beyerer 24 7 0 29 Sep 2023
SkeleTR: Towards Skeleton-based Action Recognition in the Wild Haodong Duan Mingze Xu Bing Shuai Davide Modolo Zhuowen Tu Joseph Tighe Alessandro Bergamo ViT 35 1 0 20 Sep 2023
JOADAA: joint online action detection and action anticipation Mohammed Guermal François Brémond Rui Dai Abid Ali 34 6 0 12 Sep 2023
Object-Centric Multiple Object Tracking Zixu Zhao Jiaze Wang Max Horn Yizhuo Ding Tong He ... Bernt Schiele Yanwei Fu Francesco Locatello Zheng-Wei Zhang Tianjun Xiao VOT OCL 24 6 0 01 Sep 2023
MOFO: MOtion FOcused Self-Supervision for Video Understanding Mona Ahmadian Frank Guerin Andrew Gilbert 34 2 0 23 Aug 2023
Video BagNet: short temporal receptive fields increase robustness in long-term action recognition Ombretta Strafforello X. Liu Klamer Schutte J. C. V. Gemert 29 2 0 22 Aug 2023
View while Moving: Efficient Video Recognition in Long-untrimmed Videos Ye Tian Meng Yang Lanshan Zhang Zhizhen Zhang Yang Liu Xiao-Zhu Xie Xirong Que Wendong Wang 24 7 0 09 Aug 2023
A Survey on Deep Learning-based Spatio-temporal Action Detection Peng Wang Fanwei Zeng Yu Qian 34 5 0 03 Aug 2023
Relation-Aware Distribution Representation Network for Person Clustering with Multiple Modalities Kaijian Liu Shixiang Tang Ziyue Li Zhishuai Li Lei Bai Feng Zhu Rui Zhao 3DH 16 3 0 01 Aug 2023
MovieChat: From Dense Token to Sparse Memory for Long Video Understanding Enxin Song Wenhao Chai Guanhong Wang Yucheng Zhang Haoyang Zhou ... Tianbo Ye Yanting Zhang Yang Lu Lei Li Gaoang Wang VLM MLLM 22 262 0 31 Jul 2023
TUNeS: A Temporal U-Net with Self-Attention for Video-based Surgical Phase Recognition Isabel Funke Dominik Rivoir Stefanie Krell Stefanie Speidel 31 3 0 19 Jul 2023
What Can Simple Arithmetic Operations Do for Temporal Modeling? Wenhao Wu Yuxin Song Zhun Sun Jingdong Wang Chang Xu Wanli Ouyang 40 8 0 18 Jul 2023
Human-to-Human Interaction Detection Zhenhua Wang Kaining Ying Jiajun Meng J. Ning 30 2 0 02 Jul 2023
How can objects help action recognition? Xingyi Zhou Anurag Arnab Chen Sun Cordelia Schmid 38 14 0 20 Jun 2023
Of Mice and Mates: Automated Classification and Modelling of Mouse Behaviour in Groups using a Single Model across Cages Michael P. J. Camilleri R. Bains Christopher K. I. Williams 11 0 0 05 Jun 2023
Metrics Matter in Surgical Phase Recognition Isabel Funke Dominik Rivoir Stefanie Speidel 30 8 0 23 May 2023
Modelling Spatio-Temporal Interactions for Compositional Action Recognition Ramanathan Rajendiran Debaditya Roy Basura Fernando 43 1 0 04 May 2023
End-to-End Spatio-Temporal Action Localisation with Video Transformers A. Gritsenko Xuehan Xiong Josip Djolonga Mostafa Dehghani Chen Sun Mario Lucic Cordelia Schmid Anurag Arnab ViT 37 13 0 24 Apr 2023
MRSN: Multi-Relation Support Network for Video Action Detection Yin-Dong Zheng Guo Chen Minglei Yuan Tong Lu 33 8 0 24 Apr 2023
Efficient Video Action Detection with Token Dropout and Context Refinement Lei Chen Zhan Tong Yibing Song Gangshan Wu Limin Wang 36 14 0 17 Apr 2023
Verbs in Action: Improving verb understanding in video-language models Liliane Momeni Mathilde Caron Arsha Nagrani Andrew Zisserman Cordelia Schmid 37 70 0 13 Apr 2023
Interaction-Aware Prompting for Zero-Shot Spatio-Temporal Action Detection Wei-Jhe Huang Jheng-Hsien Yeh Min-Hung Chen Gueter Josmy Faure S. Lai 38 3 0 10 Apr 2023
Boundary-Denoising for Video Activity Localization Mengmeng Xu Mattia Soldan Jialin Gao Shuming Liu Juan-Manuel Perez-Rua Guohao Li 34 10 0 06 Apr 2023
VicTR: Video-conditioned Text Representations for Activity Recognition Kumara Kahatapitiya Anurag Arnab Arsha Nagrani Michael S. Ryoo 36 19 0 05 Apr 2023
DOAD: Decoupled One Stage Action Detection Network Shuning Chang Pichao Wang Fan Wang Jiashi Feng Mike Zheng Shou 26 4 0 01 Apr 2023
Streaming Video Model Yucheng Zhao Chong Luo Chuanxin Tang Dongdong Chen Noel Codella Zhengjun Zha 36 12 0 30 Mar 2023
CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection Lei Chen Zhan Tong Yibing Song Gangshan Wu Limin Wang 25 3 0 28 Mar 2023
Open Set Action Recognition via Multi-Label Evidential Learning Chen Zhao Dawei Du A. Hoogs Christopher Funk EDL 17 23 0 27 Feb 2023
YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection Jianhua Yang Kun Dai ObjD 27 17 0 14 Feb 2023
Program Generation from Diverse Video Demonstrations Anthony Manchin Jamie Sherrah Qi Wu Anton Van Den Hengel VGen 10 0 0 01 Feb 2023
Video Semantic Segmentation with Inter-Frame Feature Fusion and Inner-Frame Feature Refinement Jiafan Zhuang Zilei Wang Junjie Li VOS 17 1 0 10 Jan 2023
HierVL: Learning Hierarchical Video-Language Embeddings Kumar Ashutosh Rohit Girdhar Lorenzo Torresani Kristen Grauman VLM AI4TS 22 52 0 05 Jan 2023
Deep set conditioned latent representations for action recognition Akash Singh Tom De Schepper Kevin Mets P. Hellinckx José Oramas Steven Latré BDL 14 2 0 21 Dec 2022
A Survey on Human Action Recognition Zhou Shuchang 29 0 0 20 Dec 2022
Weakly Supervised Video Anomaly Detection Based on Cross-Batch Clustering Guidance Congqi Cao Xin Zhang Shizhou Zhang Peng Wang Yanning Zhang 24 6 0 16 Dec 2022
EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries Jinjie Mai Abdullah Hamdi Silvio Giancola Chen Zhao Guohao Li EgoV 38 14 0 14 Dec 2022
Ego Vehicle Speed Estimation using 3D Convolution with Masked Attention Athul M. Mathew Thariq Khalid 18 2 0 11 Dec 2022
Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection Chen Zhang Guorong Li Yuankai Qi Shuhui Wang Laiyun Qing Qingming Huang Ming-Hsuan Yang 35 53 0 08 Dec 2022
Spatio-Temporal Crop Aggregation for Video Representation Learning Sepehr Sameni Simon Jenni Paolo Favaro 24 3 0 30 Nov 2022
Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization Chen Zhao Shuming Liu K. Mangalam Guohao Li 38 17 0 25 Nov 2022
Multi-Task Learning of Object State Changes from Uncurated Videos Tomáš Souček Jean-Baptiste Alayrac Antoine Miech Ivan Laptev Josef Sivic 34 11 0 24 Nov 2022
Discovering A Variety of Objects in Spatio-Temporal Human-Object Interactions Yong-Lu Li Hongwei Fan Zuoyu Qiu Yiming Dou Liang Xu ... Peiyang Guo Haisheng Su Dongliang Wang Wei Yu Wu Cewu Lu 35 7 0 14 Nov 2022
End-to-end Transformer for Compressed Video Quality Enhancement Li Yu Wenshuai Chang Shiyu Wu Moncef Gabbouj ViT 24 8 0 25 Oct 2022
Holistic Interaction Transformer Network for Action Detection Gueter Josmy Faure Min-Hung Chen S. Lai 33 37 0 23 Oct 2022
YOWO-Plus: An Incremental Improvement Jianhua Yang ViT 11 5 0 20 Oct 2022
Grounded Video Situation Recognition Zeeshan Khan C. V. Jawahar Makarand Tapaswi 22 13 0 19 Oct 2022
Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning Yuchong Sun Hongwei Xue Ruihua Song Bei Liu Huan Yang Jianlong Fu AI4TS VLM 20 68 0 12 Oct 2022
ConTra: (Con)text (Tra)nsformer for Cross-Modal Video Retrieval A. Fragomeni Michael Wray Dima Damen CLIP ViT 25 3 0 09 Oct 2022
Compressed Vision for Efficient Video Understanding Olivia Wiles João Carreira Iain Barr Andrew Zisserman Mateusz Malinowski 27 7 0 06 Oct 2022