Learning Spatio-Temporal Features with 3D Residual Networks for Action Recognition

25 August 2017
Kensho Hara
Hirokatsu Kataoka
Y. Satoh
    3DPC

Papers citing "Learning Spatio-Temporal Features with 3D Residual Networks for Action Recognition"

50 / 74 papers shown
TSLFormer: A Lightweight Transformer Model for Turkish Sign Language Recognition Using Skeletal Landmarks
Kutay Ertürk
Furkan Altınışık
İrem Sarıaltın
Ömer Nezih Gerek
SLR
42
0
0
11 May 2025
LMLCC-Net: A Semi-Supervised Deep Learning Model for Lung Nodule Malignancy Prediction from CT Scans using a Novel Hounsfield Unit-Based Intensity Filtering
Adhora Madhuri
Nusaiba Sobir
Tasnia Binte Mamun
Taufiq Hasan
29
0
0
09 May 2025
Cross-Modal Consistency Learning for Sign Language Recognition
Kepeng Wu
Zecheng Li
Weichao Zhao
Hezhen Hu
Wengang Zhou
SLR
47
0
0
16 Mar 2025
BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference
Van Thien Nguyen
William Guicquero
Gilles Sicard
3DV
MQ
82
2
0
24 Jan 2025
Automated Detection of Epileptic Spikes and Seizures Incorporating a Novel Spatial Clustering Prior
Hanyang Dong
Shurong Sheng
Xiongfei Wang
Jiahong Gao
Yi Sun
Wanli Yang
Kuntao Xiao
Pengfei Teng
Guoming Luan
Zhao Lv
21
0
0
05 Jan 2025
Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model
Lu Xu
Sijie Zhu
Chunyuan Li
Chia-Wen Kuo
Fan Chen
Xinyao Wang
Guang Chen
Dawei Du
Ye Yuan
Longyin Wen
44
4
0
15 Jun 2024
Video-based Exercise Classification and Activated Muscle Group Prediction with Hybrid X3D-SlowFast Network
Manvik Pasula
Pramit Saha
29
0
0
10 Jun 2024
MaskFi: Unsupervised Learning of WiFi and Vision Representations for Multimodal Human Activity Recognition
Jianfei Yang
Shijie Tang
Yuecong Xu
Yunjiao Zhou
Lihua Xie
35
4
0
29 Feb 2024
Automated Sperm Assessment Framework and Neural Network Specialized for Sperm Video Recognition
T. Fujii
Hayato Nakagawa
T. Takeshima
Y. Yumura
T. Hamagami
30
3
0
10 Nov 2023
Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook
Ming Jin
Qingsong Wen
Keli Zhang
Chaoli Zhang
Siqiao Xue
...
Shirui Pan
Vincent S. Tseng
Yu Zheng
Lei Chen
Hui Xiong
AI4TS
SyDa
40
118
0
16 Oct 2023
DeePoint: Visual Pointing Recognition and Direction Estimation
Shu Nakamura
Yasutomo Kawanishi
S. Nobuhara
Ko Nishino
3DH
3DPC
28
2
0
14 Apr 2023
VE-KWS: Visual Modality Enhanced End-to-End Keyword Spotting
Aoting Zhang
He Wang
Pengcheng Guo
Yihui Fu
Linfu Xie
Yingying Gao
Shilei Zhang
Junlan Feng
13
4
0
27 Feb 2023
Exploiting Optical Flow Guidance for Transformer-Based Video Inpainting
Kaiwen Zhang
Jialun Peng
Jingjing Fu
Dong Liu
ViT
27
8
0
24 Jan 2023
SVFormer: Semi-supervised Video Transformer for Action Recognition
Zhen Xing
Qi Dai
Hang-Rui Hu
Jingjing Chen
Zuxuan Wu
Yu-Gang Jiang
ViT
33
69
0
23 Nov 2022
ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations
Zhiying Xu
Jiafan Xu
H. Peng
Wei Wang
Xiaoliang Wang
...
Haipeng Dai
Yixu Xu
Hao Cheng
Kun Wang
Guihai Chen
35
0
0
22 Oct 2022
S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces
Eric N. D. Nguyen
Karan Goel
Albert Gu
Gordon W. Downs
Preey Shah
Tri Dao
S. Baccus
Christopher Ré
VLM
22
39
0
12 Oct 2022
Neighbourhood Representative Sampling for Efficient End-to-end Video Quality Assessment
Haoning Wu
Chaofeng Chen
Liang Liao
Jingwen Hou
Wenxiu Sun
Qiong Yan
Liang Feng
Weisi Lin
59
44
0
11 Oct 2022
Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding
Stephen Su
Sam Kwong
Qingyu Zhao
De-An Huang
Juan Carlos Niebles
Ehsan Adeli
27
0
0
22 Aug 2022
Blockwise Temporal-Spatial Pathway Network
SeulGi Hong
Min-Kook Choi
26
1
0
05 Aug 2022
Multimodal Generation of Novel Action Appearances for Synthetic-to-Real Recognition of Activities of Daily Living
Zdravko Marinov
David Schneider
Alina Roitberg
Rainer Stiefelhagen
VGen
32
2
0
03 Aug 2022
BYOLMed3D: Self-Supervised Representation Learning of Medical Videos using Gradient Accumulation Assisted 3D BYOL Framework
Siladittya Manna
Rakesh Dey
Souvik Chakraborty
SSL
18
0
0
31 Jul 2022
Adaptive occlusion sensitivity analysis for visually explaining video recognition networks
Tomoki Uchiyama
Naoya Sogi
S. Iizuka
Koichiro Niinuma
Kazuhiro Fukui
24
2
0
26 Jul 2022
Vision-based Human Fall Detection Systems using Deep Learning: A Review
Ekram Alam
Abu Sufian
P. Dutta
Marco Leo
36
82
0
22 Jul 2022
Spatial-Temporal Frequency Forgery Clue for Video Forgery Detection in VIS and NIR Scenario
Yukai Wang
Chunlei Peng
Decheng Liu
N. Wang
Xinbo Gao
47
14
0
05 Jul 2022
Large-scale Robustness Analysis of Video Action Recognition Models
Madeline Chantry Schiappa
Naman Biyani
Prudvi Kamtam
Shruti Vyas
Hamid Palangi
Vibhav Vineet
Yogesh S Rawat
AAML
37
24
0
04 Jul 2022
Surgical-VQA: Visual Question Answering in Surgical Scenes using Transformer
Lalithkumar Seenivasan
Mobarakol Islam
Adithya K. Krishna
Hongliang Ren
MedIm
21
45
0
22 Jun 2022
A Deeper Dive Into What Deep Spatiotemporal Networks Encode: Quantifying Static vs. Dynamic Information
M. Kowal
Mennatullah Siam
Md. Amirul Islam
Neil D. B. Bruce
Richard P. Wildes
Konstantinos G. Derpanis
23
25
0
06 Jun 2022
Micro-Expression Recognition Based on Attribute Information Embedding and Cross-modal Contrastive Learning
Yanxing Song
Jianzong Wang
Tianbo Wu
Zhangcheng Huang
Jing Xiao
CVBM
37
2
0
29 May 2022
Model-agnostic Multi-Domain Learning with Domain-Specific Adapters for Action Recognition
Kazuki Omi
Jun Kimata
Toru Tamaki
23
7
0
15 Apr 2022
Probabilistic Representations for Video Contrastive Learning
Jungin Park
Jiyoung Lee
Ig-Jae Kim
Kwanghoon Sohn
SSL
31
43
0
08 Apr 2022
Robust Deepfake On Unrestricted Media: Generation And Detection
Trung-Nghia Le
H. Nguyen
Junichi Yamagishi
Isao Echizen
36
7
0
13 Feb 2022
Video Violence Recognition and Localization Using a Semi-Supervised Hard Attention Model
Hamid Reza Mohammadi
Ehsan Nazerfard
27
24
0
04 Feb 2022
Should I take a walk? Estimating Energy Expenditure from Video Data
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
16
4
0
01 Feb 2022
A New Measure of Model Redundancy for Compressed Convolutional Neural Networks
Feiqing Huang
Yuefeng Si
Yao Zheng
Guodong Li
39
1
0
09 Dec 2021
DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
Keli Zhang
Pan Zhou
Roger Zimmermann
Shuicheng Yan
ViT
32
21
0
09 Dec 2021
MASTAF: A Model-Agnostic Spatio-Temporal Attention Fusion Network for Few-shot Video Classification
Rex Liu
Huan Zhang
Hamed Pirsiavash
Xin Liu
ViT
25
11
0
08 Dec 2021
Unsupervised Few-Shot Action Recognition via Action-Appearance Aligned Meta-Adaptation
Jay Patravali
Gaurav Mittal
Ye Yu
Fuxin Li
Mei Chen
18
19
0
30 Sep 2021
Multi-Source Video Domain Adaptation with Temporal Attentive Moment Alignment
Yuecong Xu
Jianfei Yang
Haozhi Cao
Keyu Wu
Min-man Wu
Rui Zhao
Zhenghua Chen
TTA
32
22
0
21 Sep 2021
4D-Net for Learned Multi-Modal Alignment
A. Piergiovanni
Vincent Casser
Michael S. Ryoo
A. Angelova
3DPC
99
55
0
02 Sep 2021
Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception
Bowen Li
Weixia Zhang
Meng Tian
Guangtao Zhai
Xianpei Wang
43
120
0
19 Aug 2021
Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization
Rui Qian
Yuxi Li
Huabin Liu
John See
Shuangrui Ding
Xian Liu
Dian Li
Weiyao Lin
35
42
0
04 Aug 2021
EAN: Event Adaptive Network for Enhanced Action Recognition
Yuan Tian
Yichao Yan
Guangtao Zhai
G. Guo
Zhiyong Gao
35
41
0
22 Jul 2021
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition
Di Yang
Yaohui Wang
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
F. Brémond
27
47
0
19 Jul 2021
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
Michael S. Ryoo
A. Piergiovanni
Anurag Arnab
Mostafa Dehghani
A. Angelova
ViT
37
127
0
21 Jun 2021
MutualNet: Adaptive ConvNet via Mutual Learning from Different Model Configurations
Taojiannan Yang
Sijie Zhu
Matías Mendieta
Pu Wang
Ravikumar Balakrishnan
Minwoo Lee
T. Han
M. Shah
Chong Chen
3DH
OOD
28
23
0
14 May 2021
Deepfake Detection Scheme Based on Vision Transformer and Distillation
Young-Jin Heo
Y. Choi
Young-Woon Lee
Byung-Gyu Kim
ViT
17
55
0
03 Apr 2021
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning
Mandela Patrick
Yuki M. Asano
Bernie Huang
Ishan Misra
Florian Metze
Joao Henriques
Andrea Vedaldi
AI4TS
29
33
0
18 Mar 2021
Coarse-Fine Networks for Temporal Activity Detection in Videos
Kumara Kahatapitiya
Michael S. Ryoo
AI4TS
53
38
0
01 Mar 2021
Predicting post-operative right ventricular failure using video-based deep learning
R. Shad
Nicolas Quach
R. Fong
P. Kasinpila
C. Bowles
...
Y. Woo
J. Teuteberg
John P. Cunningham
C. Langlotz
W. Hiesinger
22
40
0
28 Feb 2021
Patch-VQ: 'Patching Up' the Video Quality Problem
Zhenqiang Ying
Maniratnam Mandal
Deepti Ghadiyaram
AI Facebook
8
164
0
27 Nov 2020