Gate-Shift-Fuse for Video Action Recognition
arXiv: 2203.08897

16 March 2022
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz

Papers citing "Gate-Shift-Fuse for Video Action Recognition"

50 / 66 papers shown
Title
CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition
Jongseo Lee
Joohyun Chang
Dongho Lee
Jinwoo Choi
211
0
0
30 Mar 2025
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
363
1,086
0
13 Oct 2021
CT-Net: Channel Tensorization Network for Video Classification
Kunchang Li
Xianhang Li
Yali Wang
Jun Wang
Yu Qiao
ViT
57
55
0
03 Jun 2021
Temporal Query Networks for Fine-grained Video Understanding
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
75
85
0
19 Apr 2021
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition
Yue Meng
Yikang Shen
Chung-Ching Lin
P. Sattigeri
Leonid Karlinsky
Kate Saenko
A. Oliva
Rogerio Feris
118
62
0
10 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
362
2,045
0
09 Feb 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
555
40,961
0
22 Oct 2020
A Short Note on the Kinetics-700-2020 Human Action Dataset
Lucas Smaira
João Carreira
Eric Noland
Ellen Clancy
Amy Wu
Andrew Zisserman
78
139
0
21 Oct 2020
Maximum-Entropy Adversarial Data Augmentation for Improved Generalization and Robustness
Long Zhao
Ting Liu
Xi Peng
Dimitris N. Metaxas
OOD
AAML
94
168
0
15 Oct 2020
AssembleNet++: Assembling Modality Representations via Attention Connections
Michael S. Ryoo
A. Piergiovanni
Juhana Kangaspunta
A. Angelova
36
44
0
18 Aug 2020
AR-Net: Adaptive Frame Resolution for Efficient Action Recognition
Yue Meng
Chung-Ching Lin
Yikang Shen
P. Sattigeri
Leonid Karlinsky
A. Oliva
Kate Saenko
Rogerio Feris
49
144
0
31 Jul 2020
Approximated Bilinear Modules for Temporal Modeling
Xinqi Zhu
Chang Xu
Langwen Hui
Cewu Lu
Dacheng Tao
51
24
0
25 Jul 2020
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification
Xiaofang Wang
Xuehan Xiong
Maxim Neumann
A. Piergiovanni
Michael S. Ryoo
A. Angelova
Kris Kitani
Wei Hua
62
51
0
23 Jul 2020
Directional Temporal Modeling for Action Recognition
Xinyu Li
Bing Shuai
Joseph Tighe
48
41
0
21 Jul 2020
MotionSqueeze: Neural Motion Feature Learning for Video Understanding
Heeseung Kwon
Manjin Kim
Suha Kwak
Minsu Cho
FAtt
73
128
0
20 Jul 2020
X3D: Expanding Architectures for Efficient Video Recognition
Christoph Feichtenhofer
125
1,018
0
09 Apr 2020
Temporal Pyramid Network for Action Recognition
Ceyuan Yang
Yinghao Xu
Jianping Shi
Bo Dai
Bolei Zhou
47
372
0
07 Apr 2020
TEA: Temporal Excitation and Aggregation for Action Recognition
Yan Li
Bin Ji
Xintian Shi
Jianguo Zhang
Bin Kang
Limin Wang
ViT
82
447
0
03 Apr 2020
V4D: 4D Convolutional Neural Networks for Video-level Representation Learning
Shiwen Zhang
Sheng Guo
Weilin Huang
Matthew R. Scott
Limin Wang
38
70
0
18 Feb 2020
LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition
Zuxuan Wu
Caiming Xiong
Yu-Gang Jiang
L. Davis
69
108
0
03 Dec 2019
More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation
Quanfu Fan
Chun-Fu Chen
Hilde Kuehne
Marco Pistoia
David D. Cox
74
126
0
02 Dec 2019
Gate-Shift Networks for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
3DPC
55
155
0
01 Dec 2019
Grouped Spatial-Temporal Aggregation for Efficient Action Recognition
Chenxu Luo
Alan Yuille
151
151
0
28 Sep 2019
Action recognition with spatial-temporal discriminative filter banks
Brais Martínez
Davide Modolo
Yuanjun Xiong
Joseph Tighe
51
66
0
20 Aug 2019
STM: SpatioTemporal and Motion Encoding for Action Recognition
Boyuan Jiang
Mengmeng Wang
Weihao Gan
Wei Wu
Junjie Yan
79
382
0
07 Aug 2019
Video Modeling with Correlation Networks
Heng Wang
Du Tran
Lorenzo Torresani
Matt Feiszli
52
128
0
07 Jun 2019
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
105
1,199
0
07 Jun 2019
AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures
Michael S. Ryoo
A. Piergiovanni
Mingxing Tan
A. Angelova
50
102
0
30 May 2019
VideoGraph: Recognizing Minutes-Long Human Activities in Videos
Noureldien Hussein
E. Gavves
A. Smeulders
135
77
0
13 May 2019
SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition
Bruno Korbar
Du Tran
Lorenzo Torresani
64
224
0
08 Apr 2019
Video Classification with Channel-Separated Convolutional Networks
Du Tran
Heng Wang
Lorenzo Torresani
Matt Feiszli
3DV
61
586
0
04 Apr 2019
Collaborative Spatio-temporal Feature Learning for Video Action Recognition
Chong Li
Qiaoyong Zhong
Di Xie
Shiliang Pu
58
82
0
04 Mar 2019
Efficient Video Classification Using Fewer Frames
S. Bhardwaj
Mukundhan Srinivasan
Mitesh M. Khapra
68
88
0
27 Feb 2019
Saliency Tubes: Visual Explanations for Spatio-Temporal Convolutions
Alexandros Stergiou
G. Kapidis
Grigorios Kalliatakis
C. Chrysoulas
R. Veltkamp
R. Poppe
FAtt
49
47
0
04 Feb 2019
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition
Zheng Shou
Xudong Lin
Yannis Kalantidis
Laura Sevilla-Lara
Marcus Rohrbach
Shih-Fu Chang
Zhicheng Yan
VGen
75
120
0
11 Jan 2019
Long-Term Feature Banks for Detailed Video Understanding
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
161
480
0
12 Dec 2018
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
162
3,262
0
10 Dec 2018
Timeception for Complex Action Recognition
Noureldien Hussein
E. Gavves
A. Smeulders
101
214
0
04 Dec 2018
AdaFrame: Adaptive Frame Selection for Fast Video Recognition
Zuxuan Wu
Caiming Xiong
Chih-Yao Ma
R. Socher
L. Davis
160
198
0
29 Nov 2018
LSTA: Long Short-Term Attention for Egocentric Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
EgoV
55
143
0
26 Nov 2018
Representation Flow for Action Recognition
A. Piergiovanni
Michael S. Ryoo
75
147
0
02 Oct 2018
Multi-Fiber Networks for Video Recognition
Yunpeng Chen
Yannis Kalantidis
Jianshu Li
Shuicheng Yan
Jiashi Feng
CVBM
100
218
0
30 Jul 2018
Motion Feature Network: Fixed Motion Filter for Action Recognition
Myunggi Lee
Seungeui Lee
S. Son
Gyutae Park
Nojun Kwak
72
122
0
26 Jul 2018
Videos as Space-Time Region Graphs
Xiaolong Wang
Abhinav Gupta
83
756
0
05 Jun 2018
ECO: Efficient Convolutional Network for Online Video Understanding
Mohammadreza Zolfaghari
Kamaljeet Singh
Thomas Brox
183
498
0
24 Apr 2018
End-to-End Learning of Motion Representation for Video Understanding
Lijie Fan
Wen-bing Huang
Chuang Gan
Stefano Ermon
Boqing Gong
Junzhou Huang
68
214
0
02 Apr 2018
Moments in Time Dataset: one million videos for event understanding
Mathew Monfort
A. Andonian
Bolei Zhou
K. Ramakrishnan
Sarah Adel Bargal
...
L. Brown
Quanfu Fan
Dan Gutfreund
Carl Vondrick
A. Oliva
92
548
0
09 Jan 2018
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification
Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Patrick Murphy
3DH
137
1,328
0
13 Dec 2017
A Closer Look at Spatiotemporal Convolutions for Action Recognition
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
196
3,029
0
30 Nov 2017
Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition
Shuyang Sun
Zhanghui Kuang
Wanli Ouyang
Lu Sheng
Wayne Zhang
74
296
0
29 Nov 2017