A Closer Look at Spatiotemporal Convolutions for Action Recognition

30 November 2017
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri

Papers citing "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

50 / 477 papers shown
Tensor Representations for Action Recognition
Piotr Koniusz
Lei Wang
A. Cherian
39
69
0
28 Dec 2020
Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset
Cristina Palmero
Javier Selva
Sorina Smeureanu
Julio C. S. Jacques Junior
Albert Clapés
...
Zejian Zhang
D. Gallardo-Pujol
G. Guilera
D. Leiva
Sergio Escalera
28
53
0
28 Dec 2020
Learning Inter- and Intraframe Representations for Non-Lambertian Photometric Stereo
Yanlong Cao
Binjie Ding
Zewei He
Jiangxin Yang
Jingxi Chen
Yanpeng Cao
Xin Li
24
13
0
26 Dec 2020
A Multi-View Dynamic Fusion Framework: How to Improve the Multimodal Brain Tumor Segmentation from Multi-Views?
Yi Ding
Wei Zheng
Guozheng Wu
Ji Geng
Mingsheng Cao
Zhiguang Qin
17
1
0
21 Dec 2020
TDN: Temporal Difference Networks for Efficient Action Recognition
Limin Wang
Zhan Tong
Bin Ji
Gangshan Wu
28
391
0
18 Dec 2020
Multi-shot Temporal Event Localization: a Benchmark
Xiaolong Liu
Yao Hu
S. Bai
Fei Ding
X. Bai
Philip Torr
46
81
0
17 Dec 2020
GTA: Global Temporal Attention for Video Action Understanding
Bo He
Xitong Yang
Zuxuan Wu
Hao Chen
Ser-Nam Lim
Abhinav Shrivastava
ViT
33
27
0
15 Dec 2020
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
38
185
0
11 Dec 2020
CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions
Tayfun Ates
Muhammed Samil Atesoglu
Cagatay Yigit
İlker Kesen
Mert Kobaş
Erkut Erdem
Aykut Erdem
T. Goksun
Deniz Yuret
27
31
0
08 Dec 2020
Spatial-Temporal Alignment Network for Action Recognition and Detection
Junwei Liang
Liangliang Cao
Xuehan Xiong
Ting Yu
Alexander G. Hauptmann
3DPC
16
9
0
04 Dec 2020
Video Self-Stitching Graph Network for Temporal Action Localization
Chen Zhao
Ali K. Thabet
Guohao Li
26
138
0
30 Nov 2020
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks
Humam Alwassel
Silvio Giancola
Guohao Li
33
123
0
23 Nov 2020
Hierarchically Decoupled Spatial-Temporal Contrast for Self-supervised Video Representation Learning
Zehua Zhang
David J. Crandall
AI4TS
SSL
28
23
0
23 Nov 2020
ActBERT: Learning Global-Local Video-Text Representations
Linchao Zhu
Yi Yang
ViT
46
417
0
14 Nov 2020
Ontology-driven Event Type Classification in Images
Eric Müller-Budack
Matthias Springstein
Sherzod Hakimov
Kevin Mrutzek
Ralph Ewerth
14
9
0
09 Nov 2020
Multi-Temporal Convolutions for Human Action Recognition in Videos
Alexandros Stergiou
R. Poppe
24
1
0
08 Nov 2020
Learning Representations from Audio-Visual Spatial Alignment
Pedro Morgado
Yi Li
Nuno Vasconcelos
SSL
27
121
0
03 Nov 2020
Pretext-Contrastive Learning: Toward Good Practices in Self-supervised Video Representation Leaning
L. Tao
Xueting Wang
T. Yamasaki
VLM
SSL
23
14
0
29 Oct 2020
SAR-NAS: Skeleton-based Action Recognition via Neural Architecture Searching
Haoyuan Zhang
Yonghong Hou
Pichao Wang
Zihui Guo
Wanqing Li
29
15
0
29 Oct 2020
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
Chun-Fu Chen
Yikang Shen
K. Ramakrishnan
Rogerio Feris
J. M. Cohn
A. Oliva
Quanfu Fan
23
95
0
22 Oct 2020
Pose And Joint-Aware Action Recognition
Anshul B. Shah
Shlok Kumar Mishra
Ankan Bansal
Jun-Cheng Chen
Ramalingam Chellappa
Abhinav Shrivastava
39
33
0
16 Oct 2020
PERF-Net: Pose Empowered RGB-Flow Net
Yinxiao Li
Zhichao Lu
Xuehan Xiong
Jonathan Huang
3DH
40
17
0
28 Sep 2020
DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis
J. Ortega
Neslihan Köse
P. Cañas
Min-An Chao
A. Unnervik
Marcos Nieto
Oihana Otaegui
L. Salgado
24
91
0
27 Aug 2020
Making a Case for 3D Convolutions for Object Segmentation in Videos
Sabarinath Mahadevan
A. Athar
Aljosa Osep
Sebastian Hennen
Laura Leal-Taixé
Bastian Leibe
VOS
21
86
0
26 Aug 2020
Effective Action Recognition with Embedded Key Point Shifts
Haozhi Cao
Yuecong Xu
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
15
7
0
26 Aug 2020
Quantitative Survey of the State of the Art in Sign Language Recognition
Oscar Koller
SLR
27
94
0
22 Aug 2020
AssembleNet++: Assembling Modality Representations via Attention Connections
Michael S. Ryoo
A. Piergiovanni
Juhana Kangaspunta
A. Angelova
15
44
0
18 Aug 2020
DFEW: A Large-Scale Database for Recognizing Dynamic Facial Expressions in the Wild
Xingxun Jiang
Yuan Zong
Wenming Zheng
Chuangao Tang
Wanchuang Xia
Cheng Lu
Jiateng Liu
27
153
0
13 Aug 2020
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
Ying Cheng
Ruize Wang
Zhihao Pan
Rui Feng
Yuejie Zhang
SSL
36
106
0
13 Aug 2020
2nd Place Scheme on Action Recognition Track of ECCV 2020 VIPriors Challenges: An Efficient Optical Flow Stream Guided Framework
Haoyu Chen
Zitong Yu
Xin Liu
Wei Peng
Yoon Lee
Guoying Zhao
3DPC
31
4
0
10 Aug 2020
Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition
M. E. Kalfaoglu
Sinan Kalkan
A. Aydin Alatan
3DPC
39
140
0
03 Aug 2020
HAMLET: A Hierarchical Multimodal Attention-based Human Activity Recognition Algorithm
Md. Mofijul Islam
Tariq Iqbal
22
80
0
03 Aug 2020
Representation Learning with Video Deep InfoMax
R. Devon Hjelm
Philip Bachman
SSL
MDE
26
28
0
27 Jul 2020
Approximated Bilinear Modules for Temporal Modeling
Xinqi Zhu
Chang Xu
Langwen Hui
Cewu Lu
Dacheng Tao
25
23
0
25 Jul 2020
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing
Yapeng Tian
Dingzeyu Li
Chenliang Xu
34
180
0
21 Jul 2020
Hierarchical Contrastive Motion Learning for Video Action Recognition
Xitong Yang
Xiaodong Yang
Sifei Liu
Deqing Sun
L. Davis
Jan Kautz
SSL
35
13
0
20 Jul 2020
MotionSqueeze: Neural Motion Feature Learning for Video Understanding
Heeseung Kwon
Manjin Kim
Suha Kwak
Minsu Cho
FAtt
20
128
0
20 Jul 2020
Context-Aware RCNN: A Baseline for Action Detection in Videos
Jianchao Wu
Zhanghui Kuang
Limin Wang
Wayne Zhang
Gangshan Wu
30
79
0
20 Jul 2020
Temporal Distinct Representation Learning for Action Recognition
Junwu Weng
Donghao Luo
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Xudong Jiang
Junsong Yuan
17
26
0
15 Jul 2020
Fusing Motion Patterns and Key Visual Information for Semantic Event Recognition in Basketball Videos
Lifang Wu
Zhou Yang
Qi Wang
Meng Jian
Boxuan Zhao
Junchi Yan
Chang Wen Chen
29
33
0
13 Jul 2020
AViD Dataset: Anonymized Videos from Diverse Countries
A. Piergiovanni
Michael S. Ryoo
27
35
0
10 Jul 2020
Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation
Yongqin Xian
Bruno Korbar
Matthijs Douze
Lorenzo Torresani
Bernt Schiele
Zeynep Akata
VGen
18
18
0
09 Jul 2020
Ultra2Speech -- A Deep Learning Framework for Formant Frequency Estimation and Tracking from Ultrasound Tongue Images
Pramit Saha
Yadong Liu
B. Gick
S. Fels
19
11
0
29 Jun 2020
Self-Supervised MultiModal Versatile Networks
Jean-Baptiste Alayrac
Adrià Recasens
R. Schneider
Relja Arandjelović
Jason Ramapuram
J. Fauw
Lucas Smaira
Sander Dieleman
Andrew Zisserman
SSL
40
371
0
29 Jun 2020
Deepfake Detection using Spatiotemporal Convolutional Networks
Oscar de Lima
Sean Franklin
Shreshtha Basu
Blake Karwoski
A. George
3DPC
20
110
0
26 Jun 2020
Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning
Yuan Yao
Chang-rui Liu
Dezhao Luo
Yu Zhou
QiXiang Ye
29
169
0
20 Jun 2020
Learn to cycle: Time-consistent feature discovery for action recognition
Alexandros Stergiou
R. Poppe
22
23
0
15 Jun 2020
Telling Left from Right: Learning Spatial Correspondence of Sight and Sound
Karren D. Yang
Bryan C. Russell
Justin Salamon
SSL
24
75
0
11 Jun 2020
Preterm infants' pose estimation with spatio-temporal features
S. Moccia
Lucia Migliorelli
V. Carnielli
Emanuele Frontoni
3DH
25
44
0
08 May 2020
Adaptive Interaction Modeling via Graph Operations Search
Haoxin Li
Weishi Zheng
Yu Tao
Haifeng Hu
Jianhuang Lai
26
5
0
05 May 2020