1812.05038
Cited By
Long-Term Feature Banks for Detailed Video Understanding
12 December 2018
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krähenbühl
Ross B. Girshick
Papers citing
"Long-Term Feature Banks for Detailed Video Understanding"
50 / 306 papers shown
Title
A Grammatical Compositional Model for Video Action Detection
Zhijun Zhang
Xu Zou
Jiahuan Zhou
Sheng Zhong
Ying Wu
31
0
0
04 Oct 2023
A Survey on Deep Learning Techniques for Action Anticipation
Zeyun Zhong
Manuel Martin
Michael Voit
Juergen Gall
Jürgen Beyerer
24
7
0
29 Sep 2023
SkeleTR: Towards Skeleton-based Action Recognition in the Wild
Haodong Duan
Mingze Xu
Bing Shuai
Davide Modolo
Zhuowen Tu
Joseph Tighe
Alessandro Bergamo
ViT
35
1
0
20 Sep 2023
JOADAA: joint online action detection and action anticipation
Mohammed Guermal
François Brémond
Rui Dai
Abid Ali
34
6
0
12 Sep 2023
Object-Centric Multiple Object Tracking
Zixu Zhao
Jiaze Wang
Max Horn
Yizhuo Ding
Tong He
...
Bernt Schiele
Yanwei Fu
Francesco Locatello
Zheng-Wei Zhang
Tianjun Xiao
VOT
OCL
24
6
0
01 Sep 2023
MOFO: MOtion FOcused Self-Supervision for Video Understanding
Mona Ahmadian
Frank Guerin
Andrew Gilbert
34
2
0
23 Aug 2023
Video BagNet: short temporal receptive fields increase robustness in long-term action recognition
Ombretta Strafforello
X. Liu
Klamer Schutte
J. C. V. Gemert
29
2
0
22 Aug 2023
View while Moving: Efficient Video Recognition in Long-untrimmed Videos
Ye Tian
Meng Yang
Lanshan Zhang
Zhizhen Zhang
Yang Liu
Xiao-Zhu Xie
Xirong Que
Wendong Wang
24
7
0
09 Aug 2023
A Survey on Deep Learning-based Spatio-temporal Action Detection
Peng Wang
Fanwei Zeng
Yu Qian
34
5
0
03 Aug 2023
Relation-Aware Distribution Representation Network for Person Clustering with Multiple Modalities
Kaijian Liu
Shixiang Tang
Ziyue Li
Zhishuai Li
Lei Bai
Feng Zhu
Rui Zhao
3DH
16
3
0
01 Aug 2023
MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
Enxin Song
Wenhao Chai
Guanhong Wang
Yucheng Zhang
Haoyang Zhou
...
Tianbo Ye
Yanting Zhang
Yang Lu
Lei Li
Gaoang Wang
VLM
MLLM
22
262
0
31 Jul 2023
TUNeS: A Temporal U-Net with Self-Attention for Video-based Surgical Phase Recognition
Isabel Funke
Dominik Rivoir
Stefanie Krell
Stefanie Speidel
31
3
0
19 Jul 2023
What Can Simple Arithmetic Operations Do for Temporal Modeling?
Wenhao Wu
Yuxin Song
Zhun Sun
Jingdong Wang
Chang Xu
Wanli Ouyang
40
8
0
18 Jul 2023
Human-to-Human Interaction Detection
Zhenhua Wang
Kaining Ying
Jiajun Meng
J. Ning
30
2
0
02 Jul 2023
How can objects help action recognition?
Xingyi Zhou
Anurag Arnab
Chen Sun
Cordelia Schmid
38
14
0
20 Jun 2023
Of Mice and Mates: Automated Classification and Modelling of Mouse Behaviour in Groups using a Single Model across Cages
Michael P. J. Camilleri
R. Bains
Christopher K. I. Williams
11
0
0
05 Jun 2023
Metrics Matter in Surgical Phase Recognition
Isabel Funke
Dominik Rivoir
Stefanie Speidel
30
8
0
23 May 2023
Modelling Spatio-Temporal Interactions for Compositional Action Recognition
Ramanathan Rajendiran
Debaditya Roy
Basura Fernando
43
1
0
04 May 2023
End-to-End Spatio-Temporal Action Localisation with Video Transformers
A. Gritsenko
Xuehan Xiong
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
Anurag Arnab
ViT
37
13
0
24 Apr 2023
MRSN: Multi-Relation Support Network for Video Action Detection
Yin-Dong Zheng
Guo Chen
Minglei Yuan
Tong Lu
33
8
0
24 Apr 2023
Efficient Video Action Detection with Token Dropout and Context Refinement
Lei Chen
Zhan Tong
Yibing Song
Gangshan Wu
Limin Wang
36
14
0
17 Apr 2023
Verbs in Action: Improving verb understanding in video-language models
Liliane Momeni
Mathilde Caron
Arsha Nagrani
Andrew Zisserman
Cordelia Schmid
37
70
0
13 Apr 2023
Interaction-Aware Prompting for Zero-Shot Spatio-Temporal Action Detection
Wei-Jhe Huang
Jheng-Hsien Yeh
Min-Hung Chen
Gueter Josmy Faure
S. Lai
38
3
0
10 Apr 2023
Boundary-Denoising for Video Activity Localization
Mengmeng Xu
Mattia Soldan
Jialin Gao
Shuming Liu
Juan-Manuel Perez-Rua
Guohao Li
34
10
0
06 Apr 2023
VicTR: Video-conditioned Text Representations for Activity Recognition
Kumara Kahatapitiya
Anurag Arnab
Arsha Nagrani
Michael S. Ryoo
36
19
0
05 Apr 2023
DOAD: Decoupled One Stage Action Detection Network
Shuning Chang
Pichao Wang
Fan Wang
Jiashi Feng
Mike Zheng Shou
26
4
0
01 Apr 2023
Streaming Video Model
Yucheng Zhao
Chong Luo
Chuanxin Tang
Dongdong Chen
Noel Codella
Zhengjun Zha
36
12
0
30 Mar 2023
CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection
Lei Chen
Zhan Tong
Yibing Song
Gangshan Wu
Limin Wang
25
3
0
28 Mar 2023
Open Set Action Recognition via Multi-Label Evidential Learning
Chen Zhao
Dawei Du
A. Hoogs
Christopher Funk
EDL
17
23
0
27 Feb 2023
YOWOv2: A Stronger yet Efficient Multi-level Detection Framework for Real-time Spatio-temporal Action Detection
Jianhua Yang
Kun Dai
ObjD
27
17
0
14 Feb 2023
Program Generation from Diverse Video Demonstrations
Anthony Manchin
Jamie Sherrah
Qi Wu
Anton Van Den Hengel
VGen
10
0
0
01 Feb 2023
Video Semantic Segmentation with Inter-Frame Feature Fusion and Inner-Frame Feature Refinement
Jiafan Zhuang
Zilei Wang
Junjie Li
VOS
17
1
0
10 Jan 2023
HierVL: Learning Hierarchical Video-Language Embeddings
Kumar Ashutosh
Rohit Girdhar
Lorenzo Torresani
Kristen Grauman
VLM
AI4TS
22
52
0
05 Jan 2023
Deep set conditioned latent representations for action recognition
Akash Singh
Tom De Schepper
Kevin Mets
P. Hellinckx
José Oramas
Steven Latré
BDL
14
2
0
21 Dec 2022
A Survey on Human Action Recognition
Zhou Shuchang
29
0
0
20 Dec 2022
Weakly Supervised Video Anomaly Detection Based on Cross-Batch Clustering Guidance
Congqi Cao
Xin Zhang
Shizhou Zhang
Peng Wang
Yanning Zhang
24
6
0
16 Dec 2022
EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries
Jinjie Mai
Abdullah Hamdi
Silvio Giancola
Chen Zhao
Guohao Li
EgoV
38
14
0
14 Dec 2022
Ego Vehicle Speed Estimation using 3D Convolution with Masked Attention
Athul M. Mathew
Thariq Khalid
18
2
0
11 Dec 2022
Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection
Chen Zhang
Guorong Li
Yuankai Qi
Shuhui Wang
Laiyun Qing
Qingming Huang
Ming-Hsuan Yang
35
53
0
08 Dec 2022
Spatio-Temporal Crop Aggregation for Video Representation Learning
Sepehr Sameni
Simon Jenni
Paolo Favaro
24
3
0
30 Nov 2022
Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization
Chen Zhao
Shuming Liu
K. Mangalam
Guohao Li
38
17
0
25 Nov 2022
Multi-Task Learning of Object State Changes from Uncurated Videos
Tomáš Souček
Jean-Baptiste Alayrac
Antoine Miech
Ivan Laptev
Josef Sivic
34
11
0
24 Nov 2022
Discovering A Variety of Objects in Spatio-Temporal Human-Object Interactions
Yong-Lu Li
Hongwei Fan
Zuoyu Qiu
Yiming Dou
Liang Xu
...
Peiyang Guo
Haisheng Su
Dongliang Wang
Wei Yu Wu
Cewu Lu
35
7
0
14 Nov 2022
End-to-end Transformer for Compressed Video Quality Enhancement
Li Yu
Wenshuai Chang
Shiyu Wu
Moncef Gabbouj
ViT
24
8
0
25 Oct 2022
Holistic Interaction Transformer Network for Action Detection
Gueter Josmy Faure
Min-Hung Chen
S. Lai
33
37
0
23 Oct 2022
YOWO-Plus: An Incremental Improvement
Jianhua Yang
ViT
11
5
0
20 Oct 2022
Grounded Video Situation Recognition
Zeeshan Khan
C. V. Jawahar
Makarand Tapaswi
22
13
0
19 Oct 2022
Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning
Yuchong Sun
Hongwei Xue
Ruihua Song
Bei Liu
Huan Yang
Jianlong Fu
AI4TS
VLM
20
68
0
12 Oct 2022
ConTra: (Con)text (Tra)nsformer for Cross-Modal Video Retrieval
A. Fragomeni
Michael Wray
Dima Damen
CLIP
ViT
25
3
0
09 Oct 2022
Compressed Vision for Efficient Video Understanding
Olivia Wiles
João Carreira
Iain Barr
Andrew Zisserman
Mateusz Malinowski
27
7
0
06 Oct 2022