Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1912.00869
Cited By
More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation
2 December 2019
Quanfu Fan
Chun-Fu Chen
Hilde Kuehne
Marco Pistoia
David D. Cox
Re-assign community
ArXiv
PDF
HTML
Papers citing
"More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation"
29 / 29 papers shown
Title
Density-Guided Label Smoothing for Temporal Localization of Driving Actions
Tunç Alkanat
Erkut Akdag
Egor Bondarev
Peter H. N. de With
38
4
0
11 Mar 2024
Sample Less, Learn More: Efficient Action Recognition via Frame Feature Restoration
Harry Cheng
Yangyang Guo
Liqiang Nie
Zhiyong Cheng
Mohan S. Kankanhalli
37
7
0
27 Jul 2023
Is end-to-end learning enough for fitness activity recognition?
Antoine Mercier
Guillaume Berger
Sunny Panchal
Florian Letsch
Cornelius Boehm
Nahua Kang
Ingo Bax
Roland Memisevic
23
2
0
14 May 2023
AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning
Xijun Wang
Ruiqi Xian
Tianrui Guan
Celso M. de Melo
Stephen M. Nogar
Aniket Bera
Tianyi Zhou
16
11
0
02 Mar 2023
Look More but Care Less in Video Recognition
Yitian Zhang
Yue Bai
Haiquan Wang
Yi Xu
Yun Fu
27
9
0
18 Nov 2022
Fully-attentive and interpretable: vision and video vision transformers for pain detection
Giacomo Fiorentini
Itir Onal Ertugrul
A. A. Salah
MedIm
ViT
18
2
0
27 Oct 2022
A Novel Self-Knowledge Distillation Approach with Siamese Representation Learning for Action Recognition
Duc-Quang Vu
T. Phung
Jia-Ching Wang
27
9
0
03 Sep 2022
Efficient Attention-free Video Shift Transformers
Adrian Bulat
Brais Martínez
Georgios Tzimiropoulos
ViT
29
1
0
23 Aug 2022
Deformable Video Transformer
Jue Wang
Lorenzo Torresani
ViT
27
28
0
31 Mar 2022
Gate-Shift-Fuse for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
22
22
0
16 Mar 2022
Action Keypoint Network for Efficient Video Recognition
Xu Chen
Yahong Han
Xiaohan Wang
Yifang Sun
Yi Yang
3DPC
27
6
0
17 Jan 2022
Representing Videos as Discriminative Sub-graphs for Action Recognition
Dong Li
Zhaofan Qiu
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
42
25
0
11 Jan 2022
BEVT: BERT Pretraining of Video Transformers
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Yu-Gang Jiang
Luowei Zhou
Lu Yuan
ViT
39
203
0
02 Dec 2021
PolyViT: Co-training Vision Transformers on Images, Videos and Audio
Valerii Likhosherstov
Anurag Arnab
K. Choromanski
Mario Lucic
Yi Tay
Adrian Weller
Mostafa Dehghani
ViT
35
73
0
25 Nov 2021
Efficient Video Transformers with Spatial-Temporal Token Selection
Junke Wang
Xitong Yang
Hengduo Li
Li Liu
Zuxuan Wu
Yu-Gang Jiang
ViT
21
63
0
23 Nov 2021
Temporal-attentive Covariance Pooling Networks for Video Recognition
Zilin Gao
Qilong Wang
Bingbing Zhang
Q. Hu
P. Li
21
24
0
27 Oct 2021
Searching for Two-Stream Models in Multivariate Space for Video Recognition
Xinyu Gong
Heng Wang
Zheng Shou
Matt Feiszli
Zhangyang Wang
Zhicheng Yan
30
9
0
30 Aug 2021
MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition
Jiawei Chen
C. Ho
ViT
26
77
0
20 Aug 2021
Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
25
543
0
30 Jun 2021
Space-time Mixing Attention for Video Transformer
Adrian Bulat
Juan-Manuel Perez-Rua
Swathikiran Sudhakaran
Brais Martínez
Georgios Tzimiropoulos
ViT
30
124
0
10 Jun 2021
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition
Yikang Shen
Chun-Fu Chen
Quanfu Fan
Ximeng Sun
Kate Saenko
A. Oliva
Rogerio Feris
33
47
0
11 May 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
30
2,088
0
29 Mar 2021
VA-RED
2
^2
2
: Video Adaptive Redundancy Reduction
Bowen Pan
Yikang Shen
Camilo Luciano Fosco
Chung-Ching Lin
A. Andonian
Yue Meng
Kate Saenko
A. Oliva
Rogerio Feris
15
19
0
15 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
283
1,984
0
09 Feb 2021
TDN: Temporal Difference Networks for Efficient Action Recognition
Limin Wang
Zhan Tong
Bin Ji
Gangshan Wu
17
391
0
18 Dec 2020
GTA: Global Temporal Attention for Video Action Understanding
Bo He
Xitong Yang
Zuxuan Wu
Hao Chen
Ser-Nam Lim
Abhinav Shrivastava
ViT
33
27
0
15 Dec 2020
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
38
185
0
11 Dec 2020
Knowing What, Where and When to Look: Efficient Video Action Modeling with Attention
Juan-Manuel Perez-Rua
Brais Martínez
Xiatian Zhu
Antoine Toisoul
Victor Escorcia
Tao Xiang
48
19
0
02 Apr 2020
ECO: Efficient Convolutional Network for Online Video Understanding
Mohammadreza Zolfaghari
Kamaljeet Singh
Thomas Brox
139
496
0
24 Apr 2018
1