ResearchTrend.AI
Spatiotemporal Residual Networks for Video Action Recognition

7 November 2016
Christoph Feichtenhofer
A. Pinz
Richard P. Wildes

Papers citing "Spatiotemporal Residual Networks for Video Action Recognition"

50 / 273 papers shown
DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition
Thanh-Dat Truong
Quoc-Huy Bui
C. Duong
Han-Seok Seo
S. L. Phung
Xin Li
Khoa Luu
ViT
42
49
0
19 Mar 2022
Gate-Shift-Fuse for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
22
22
0
16 Mar 2022
Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution
Jinsu Yoo
Taehoon Kim
Sihaeng Lee
Seunghyeon Kim
Hankook Lee
Tae Hyun Kim
SupR
ViT
31
51
0
15 Mar 2022
PAMI-AD: An Activity Detector Exploiting Part-attention and Motion Information in Surveillance Videos
Yunhao Du
Zhihang Tong
Jun-Jun Wan
Binyu Zhang
Yanyun Zhao
24
3
0
08 Mar 2022
RadioTransformer: A Cascaded Global-Focal Transformer for Visual Attention-guided Disease Classification
Moinak Bhattacharya
Shubham Jain
Prateek Prasanna
ViT
MedIm
22
33
0
23 Feb 2022
Multiview Transformers for Video Recognition
Shen Yan
Xuehan Xiong
Anurag Arnab
Zhichao Lu
Mi Zhang
Chen Sun
Cordelia Schmid
ViT
26
212
0
12 Jan 2022
3D Skeleton-based Few-shot Action Recognition with JEANIE is not so Naïve
Lei Wang
Jun Liu
Piotr Koniusz
42
20
0
23 Dec 2021
Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos
Pengfei Pei
Xianfeng Zhao
Yun Cao
Jinchuan Li
Xiaowei Yi
ViT
24
8
0
15 Dec 2021
SVIP: Sequence VerIfication for Procedures in Videos
Yichen Qian
Weixin Luo
Dongze Lian
Xu Tang
P. Zhao
Shenghua Gao
ViT
29
17
0
13 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
75
678
0
02 Dec 2021
A Critical Study on the Recent Deep Learning Based Semi-Supervised Video Anomaly Detection Methods
M. Baradaran
R. Bergevin
24
16
0
02 Nov 2021
Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition
Liang Xu
Cuiling Lan
Wenjun Zeng
Cewu Lu
16
24
0
28 Oct 2021
Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation
Zi-Jun Li
Bo Xu
Han Huang
Cheng Lu
Yandong Guo
3DH
23
13
0
22 Oct 2021
High-order Tensor Pooling with Attention for Action Recognition
Lei Wang
Ke Sun
Piotr Koniusz
38
14
0
11 Oct 2021
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device
Ji Lin
Chuang Gan
Kuan-Chieh Jackson Wang
Song Han
40
64
0
27 Sep 2021
Searching for Two-Stream Models in Multivariate Space for Video Recognition
Xinyu Gong
Heng Wang
Zheng Shou
Matt Feiszli
Zhangyang Wang
Zhicheng Yan
42
9
0
30 Aug 2021
Shifted Chunk Transformer for Spatio-Temporal Representational Learning
Xuefan Zha
Wentao Zhu
Tingxun Lv
Sen Yang
Ji Liu
AI4TS
ViT
33
27
0
26 Aug 2021
When Video Classification Meets Incremental Classes
Hanbin Zhao
Xin Qin
Shihao Su
Yongjian Fu
Zibo Lin
Xi Li
CLL
21
28
0
30 Jun 2021
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
Michael S. Ryoo
A. Piergiovanni
Anurag Arnab
Mostafa Dehghani
A. Angelova
ViT
37
127
0
21 Jun 2021
MaCLR: Motion-aware Contrastive Learning of Representations for Videos
Fanyi Xiao
Joseph Tighe
Davide Modolo
SSL
18
13
0
17 Jun 2021
SSAN: Separable Self-Attention Network for Video Representation Learning
Xudong Guo
Xun Guo
Yan Lu
ViT
AI4TS
14
26
0
27 May 2021
Anabranch Network for Camouflaged Object Segmentation
Trung-Nghia Le
Tam V. Nguyen
Zhongliang Nie
M. Tran
Akihiro Sugimoto
24
477
0
20 May 2021
What can human minimal videos tell us about dynamic recognition models?
Guy Ben-Yosef
Gabriel Kreiman
S. Ullman
19
2
0
19 Apr 2021
Adaptive Intermediate Representations for Video Understanding
Juhana Kangaspunta
A. Piergiovanni
Rico Jonschkowski
Michael S. Ryoo
A. Angelova
26
3
0
14 Apr 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
30
2,088
0
29 Mar 2021
Unified Graph Structured Models for Video Understanding
Anurag Arnab
Chen Sun
Cordelia Schmid
38
44
0
29 Mar 2021
Busy-Quiet Video Disentangling for Video Classification
Guoxi Huang
A. Bors
28
6
0
29 Mar 2021
Learning to Recognize Actions on Objects in Egocentric Video with Attention Dictionaries
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
EgoV
27
15
0
16 Feb 2021
RMS-Net: Regression and Masking for Soccer Event Spotting
Matteo Tomei
Lorenzo Baraldi
Simone Calderara
Simone Bronzin
Rita Cucchiara
35
28
0
15 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
204
422
0
01 Feb 2021
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning
Sangho Lee
Jiwan Chung
Youngjae Yu
Gunhee Kim
Thomas Breuel
Gal Chechik
Yale Song
71
46
0
26 Jan 2021
RGB-D Salient Object Detection via 3D Convolutional Neural Networks
Qian Chen
Ze Liu
Y. Zhang
Keren Fu
Qijun Zhao
H. Du
3DPC
35
150
0
25 Jan 2021
A Layer-Wise Information Reinforcement Approach to Improve Learning in Deep Belief Networks
Mateus Roder
L. A. Passos
L. C. Ribeiro
C. R. Pereira
João Paulo Papa
14
1
0
17 Jan 2021
Human Action Recognition from Various Data Modalities: A Review
Zehua Sun
Qiuhong Ke
Hossein Rahmani
Mohammed Bennamoun
Gang Wang
Jun Liu
MU
56
504
0
22 Dec 2020
Multi-shot Temporal Event Localization: a Benchmark
Xiaolong Liu
Yao Hu
S. Bai
Fei Ding
X. Bai
Philip Torr
46
81
0
17 Dec 2020
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
38
185
0
11 Dec 2020
Spatial-Temporal Alignment Network for Action Recognition and Detection
Junwei Liang
Liangliang Cao
Xuehan Xiong
Ting Yu
Alexander G. Hauptmann
3DPC
16
9
0
04 Dec 2020
Recent Progress in Appearance-based Action Recognition
J. Humphreys
Zhe Chen
Dacheng Tao
24
0
0
25 Nov 2020
Play Fair: Frame Attributions in Video Models
Will Price
Dima Damen
FAtt
28
5
0
24 Nov 2020
Improved Soccer Action Spotting using both Audio and Video Streams
Bastien Vanderplaetse
Stéphane Dupont
41
42
0
09 Nov 2020
Multi-Temporal Convolutions for Human Action Recognition in Videos
Alexandros Stergiou
R. Poppe
24
1
0
08 Nov 2020
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
Chun-Fu Chen
Yikang Shen
K. Ramakrishnan
Rogerio Feris
J. M. Cohn
A. Oliva
Quanfu Fan
23
95
0
22 Oct 2020
Unsupervised Video Anomaly Detection via Normalizing Flows with Implicit Latent Features
Myeongah Cho
Taeoh Kim
Woojin Kim
Suhwan Cho
Sangyoun Lee
14
90
0
15 Oct 2020
The MECCANO Dataset: Understanding Human-Object Interactions from Egocentric Videos in an Industrial-like Domain
Francesco Ragusa
Antonino Furnari
S. Livatino
G. Farinella
EgoV
24
99
0
12 Oct 2020
Adversarial Semi-Supervised Multi-Domain Tracking
Kourosh Meshgi
Maryam Sadat Mirzaei
14
1
0
30 Sep 2020
AssembleNet++: Assembling Modality Representations via Attention Connections
Michael S. Ryoo
A. Piergiovanni
Juhana Kangaspunta
A. Angelova
15
44
0
18 Aug 2020
A Unified Framework for Shot Type Classification Based on Subject Centric Lens
Anyi Rao
Jiaze Wang
Linning Xu
Xuekun Jiang
Qingqiu Huang
Bolei Zhou
Dahua Lin
18
60
0
08 Aug 2020
Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework
Li Tao
Xueting Wang
T. Yamasaki
SSL
25
106
0
06 Aug 2020
HAMLET: A Hierarchical Multimodal Attention-based Human Activity Recognition Algorithm
Md. Mofijul Islam
Tariq Iqbal
22
80
0
03 Aug 2020
AR-Net: Adaptive Frame Resolution for Efficient Action Recognition
Yue Meng
Chung-Ching Lin
Yikang Shen
P. Sattigeri
Leonid Karlinsky
A. Oliva
Kate Saenko
Rogerio Feris
23
141
0
31 Jul 2020