Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1812.05038
Cited By
Long-Term Feature Banks for Detailed Video Understanding
12 December 2018
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Long-Term Feature Banks for Detailed Video Understanding"
50 / 306 papers shown
Title
Learning Representational Invariances for Data-Efficient Action Recognition
Yuliang Zou
Jinwoo Choi
Qitong Wang
Jia-Bin Huang
22
39
0
30 Mar 2021
Temporal Memory Relation Network for Workflow Recognition from Surgical Video
Yueming Jin
Yonghao Long
Cheng Chen
Zixu Zhao
Qi Dou
Pheng-Ann Heng
34
89
0
30 Mar 2021
Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation
Shuning Chang
Pichao Wang
F. Wang
Hao Li
Jiashi Feng
ViT
42
41
0
30 Mar 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
30
2,088
0
29 Mar 2021
Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval
Rui Zhao
Kecheng Zheng
Zhengjun Zha
Hongtao Xie
Jiebo Luo
30
3
0
29 Mar 2021
Unified Graph Structured Models for Video Understanding
Anurag Arnab
Chen Sun
Cordelia Schmid
35
44
0
29 Mar 2021
Regular Polytope Networks
F. Pernici
Matteo Bruni
C. Baecchi
A. Bimbo
29
26
0
29 Mar 2021
On the hidden treasure of dialog in video question answering
Deniz Engin
Franccois Schnitzler
Ngoc Q. K. Duong
Yannis Avrithis
29
10
0
26 Mar 2021
Temporal Context Aggregation Network for Temporal Action Proposal Refinement
Zhiwu Qing
Haisheng Su
Weihao Gan
Dongliang Wang
Wei Wu
Xiang Wang
Yu Qiao
Junjie Yan
Changxin Gao
Nong Sang
22
173
0
24 Mar 2021
Context-aware Biaffine Localizing Network for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Yu Cheng
Wei Wei
Zichuan Xu
Yulai Xie
16
145
0
22 Mar 2021
PGT: A Progressive Method for Training Models on Long Videos
Bo Pang
Gao Peng
Yizhuo Li
Cewu Lu
VLM
19
12
0
21 Mar 2021
Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training
Saurabh Sahu
Palash Goyal
ViT
29
2
0
18 Mar 2021
ROAD: The ROad event Awareness Dataset for Autonomous Driving
Gurkirt Singh
Stephen Akrigg
Manuele Di Maio
Valentina Fontana
Reza Javanmard Alitappeh
...
Salman Khan
S. Grazioso
Andrew Bradley
G. Gironimo
Fabio Cuzzolin
32
89
0
23 Feb 2021
Learning to Recognize Actions on Objects in Egocentric Video with Attention Dictionaries
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
EgoV
27
15
0
16 Feb 2021
Win-Fail Action Recognition
Paritosh Parmar
B. Morris
24
5
0
15 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
280
1,984
0
09 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
204
422
0
01 Feb 2021
Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting
Sovan Biswas
Juergen Gall
24
2
0
21 Jan 2021
Smoothed Gaussian Mixture Models for Video Classification and Recommendation
Sirjan Kafle
Aman Gupta
Xue Xia
A. Sankar
Xi Chen
Di Wen
Liang Zhang
16
0
0
17 Dec 2020
NUTA: Non-uniform Temporal Aggregation for Action Recognition
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Hao Chen
Joseph Tighe
ViT
14
16
0
15 Dec 2020
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
38
185
0
11 Dec 2020
CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation
Yang Fu
Linjie Yang
Ding Liu
Thomas S. Huang
Humphrey Shi
VOS
32
71
0
07 Dec 2020
SAFCAR: Structured Attention Fusion for Compositional Action Recognition
Tae Soo Kim
Gregory Hager
CoGe
16
10
0
03 Dec 2020
Recent Progress in Appearance-based Action Recognition
J. Humphreys
Zhe Chen
Dacheng Tao
24
0
0
25 Nov 2020
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks
Humam Alwassel
Silvio Giancola
Guohao Li
33
123
0
23 Nov 2020
Memory Optimization for Deep Networks
Aashaka Shah
Chaoxia Wu
Jayashree Mohan
Vijay Chidambaram
Philipp Krahenbuhl
14
24
0
27 Oct 2020
Hierarchical Conditional Relation Networks for Multimodal Video Question Answering
T. Le
Vuong Le
Svetha Venkatesh
T. Tran
BDL
21
22
0
18 Oct 2020
Pose And Joint-Aware Action Recognition
Anshul B. Shah
Shlok Kumar Mishra
Ankan Bansal
Jun-Cheng Chen
Ramalingam Chellappa
Abhinav Shrivastava
39
33
0
16 Oct 2020
Deep Sequence Learning for Video Anticipation: From Discrete and Deterministic to Continuous and Stochastic
S. Aliakbarian
AI4TS
21
0
0
09 Oct 2020
Dissected 3D CNNs: Temporal Skip Connections for Efficient Online Video Processing
Okan Kopuklu
Stefan Hormann
Fabian Herzog
Hakan Çevikalp
Gerhard Rigoll
3DPC
16
15
0
30 Sep 2020
Texture Memory-Augmented Deep Patch-Based Image Inpainting
Rui Xu
Minghao Guo
Jiaqi Wang
Xiaoxiao Li
Bolei Zhou
Chen Change Loy
3DV
31
39
0
28 Sep 2020
Multi-Label Activity Recognition using Activity-specific Features and Activity Correlations
Yanyi Zhang
Xinyu Li
I. Marsic
HAI
26
23
0
16 Sep 2020
Online Spatiotemporal Action Detection and Prediction via Causal Representations
Gurkirt Singh
3DPC
CML
19
0
0
31 Aug 2020
A Prospective Study on Sequence-Driven Temporal Sampling and Ego-Motion Compensation for Action Recognition in the EPIC-Kitchens Dataset
Alejandro López-Cifuentes
Marcos Escudero-Viñolo
Jesús Bescós
EgoV
13
1
0
26 Aug 2020
Query Twice: Dual Mixture Attention Meta Learning for Video Summarization
Junyan Wang
Yang Bai
Yang Long
Bingzhang Hu
Z. Chai
Yu Guan
Xiaolin K. Wei
EgoV
13
15
0
19 Aug 2020
AssembleNet++: Assembling Modality Representations via Attention Connections
Michael S. Ryoo
A. Piergiovanni
Juhana Kangaspunta
A. Angelova
15
44
0
18 Aug 2020
Land Cover Classification from Remote Sensing Images Based on Multi-Scale Fully Convolutional Network
Rui Li
Shunyi Zheng
Chenxi Duan
Ce Zhang
18
97
0
01 Aug 2020
LEMMA: A Multi-view Dataset for Learning Multi-agent Multi-task Activities
Baoxiong Jia
Yixin Chen
Siyuan Huang
Yixin Zhu
Song-Chun Zhu
13
51
0
31 Jul 2020
Directional Temporal Modeling for Action Recognition
Xinyu Li
Bing Shuai
Joseph Tighe
6
41
0
21 Jul 2020
Context-Aware RCNN: A Baseline for Action Detection in Videos
Jianchao Wu
Zhanghui Kuang
Limin Wang
Wayne Zhang
Gangshan Wu
30
79
0
20 Jul 2020
Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions
Noa Garcia
Yuta Nakashima
20
32
0
17 Jul 2020
Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision
Peng Wu
Jing Liu
Yujia Shi
Yujia Sun
Fang Shao
Zhaoyang Wu
Zhiwei Yang
20
298
0
09 Jul 2020
Aligning Videos in Space and Time
Senthil Purushwalkam
Tian-Chun Ye
Saurabh Gupta
Abhinav Gupta
24
23
0
09 Jul 2020
Joint Learning of Social Groups, Individuals Action and Sub-group Activities in Videos
Mahsa Ehsanpour
Alireza Abedin
F. Saleh
Javen Qinfeng Shi
Ian Reid
Hamid Rezatofighi
29
71
0
06 Jul 2020
Video Representation Learning with Visual Tempo Consistency
Ceyuan Yang
Yinghao Xu
Bo Dai
Bolei Zhou
13
89
0
28 Jun 2020
1st place solution for AVA-Kinetics Crossover in AcitivityNet Challenge 2020
Siyu Chen
Junting Pan
Guanglu Song
Manyuan Zhang
Hao Shao
Ziyi Lin
Jing Shao
Hongsheng Li
Yu Liu
3DPC
14
4
0
16 Jun 2020
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
Junting Pan
Siyu Chen
Zheng Shou
Yu Liu
Jing Shao
Hongsheng Li
3DPC
19
150
0
14 Jun 2020
Temporal Aggregate Representations for Long-Range Video Understanding
Fadime Sener
Dipika Singhania
Angela Yao
AI4TS
25
7
0
01 Jun 2020
In the Eye of the Beholder: Gaze and Actions in First Person Video
Yin Li
Miao Liu
James M. Rehg
EgoV
25
69
0
31 May 2020
Complex Sequential Understanding through the Awareness of Spatial and Temporal Concepts
Bo Pang
Kaiwen Zha
Hanwen Cao
Jiajun Tang
Minghui Yu
Cewu Lu
12
25
0
30 May 2020
Previous
1
2
3
4
5
6
7
Next