1711.11248
Cited By
A Closer Look at Spatiotemporal Convolutions for Action Recognition
30 November 2017
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
Papers citing "A Closer Look at Spatiotemporal Convolutions for Action Recognition"
50 / 1,270 papers shown
Text-Conditioned Resampler For Long Form Video Understanding
Bruno Korbar
Yongqin Xian
A. Tonioni
Andrew Zisserman
Federico Tombari
38
12
0
19 Dec 2023
Deep Learning Approaches for Seizure Video Analysis: A Review
David Ahmedt-Aristizabal
M. Armin
Zeeshan Hayder
Norberto Garcia-Cairasco
Lars Petersson
Clinton Fookes
Simon Denman
A. McGonigal
32
21
0
18 Dec 2023
Benchmarks for Physical Reasoning AI
Andrew Melnik
Robin Schiewer
Moritz Lange
Andrei Muresanu
Mozhgan Saeidi
Animesh Garg
Helge J. Ritter
29
8
0
17 Dec 2023
Hourglass-AVSR: Down-Up Sampling-based Computational Efficiency Model for Audio-Visual Speech Recognition
Fan Yu
Haoxu Wang
Ziyang Ma
Shiliang Zhang
57
2
0
14 Dec 2023
Generative Model-based Feature Knowledge Distillation for Action Recognition
Guiqin Wang
Peng Zhao
Yanjiang Shi
Cong Zhao
Shusen Yang
VLM
49
3
0
14 Dec 2023
ConFormer: A Novel Collection of Deep Learning Models to Assist Cardiologists in the Assessment of Cardiac Function
Ethan Thomas
Salman Aslam
MedIm
34
0
0
13 Dec 2023
From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos
Yin Chen
Jia Li
Shiguang Shan
Meng Wang
Richang Hong
48
32
0
09 Dec 2023
MuRF: Multi-Baseline Radiance Fields
Haofei Xu
Anpei Chen
Yuedong Chen
Daniel Gehrig
Yulun Zhang
Marc Pollefeys
Andreas Geiger
Fisher Yu
18
26
0
07 Dec 2023
Low-power, Continuous Remote Behavioral Localization with Event Cameras
Friedhelm Hamann
Suman Ghosh
Ignacio Juarez Martinez
Tom Hart
Alex Kacelnik
Guillermo Gallego
32
7
0
06 Dec 2023
From Detection to Action Recognition: An Edge-Based Pipeline for Robot Human Perception
Petros Toupas
Georgios Tsamis
Dimitrios Giakoumis
K. Votis
Dimitrios Tzovaras
32
0
0
06 Dec 2023
D²ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition
Wenjie Pei
Qizhong Tan
Guangming Lu
Jiandong Tian
41
3
0
03 Dec 2023
Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement
Ziyu Wang
Yue Xu
Cewu Lu
Yong-Lu Li
DD
41
8
0
01 Dec 2023
CAST: Cross-Attention in Space and Time for Video Action Recognition
Dongho Lee
Jongseo Lee
Jinwoo Choi
EgoV
35
12
0
30 Nov 2023
DEVIAS: Learning Disentangled Video Representations of Action and Scene for Holistic Video Understanding
Kyungho Bae
Geo Ahn
Youngrae Kim
Jinwoo Choi
30
3
0
30 Nov 2023
Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models
Dong Li
Jiandong Jin
Yuhao Zhang
Yanlin Zhong
Yaoyang Wu
Lan Chen
Tianlin Li
Bin Luo
71
6
0
30 Nov 2023
OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
Tom Tongjia Chen
Hongshan Yu
Zhengeng Yang
Zechuan Li
Wei Sun
Chen Chen
23
8
0
30 Nov 2023
Combined Scheduling, Memory Allocation and Tensor Replacement for Minimizing Off-Chip Data Accesses of DNN Accelerators
Yi Li
Aarti Gupta
Sharad Malik
13
1
0
30 Nov 2023
GeoDeformer: Geometric Deformable Transformer for Action Recognition
Jinhui Ye
Jiaming Zhou
Hui Xiong
Junwei Liang
ViT
23
1
0
29 Nov 2023
F4D: Factorized 4D Convolutional Neural Network for Efficient Video-level Representation Learning
Mohammad Al-Saad
Lakshmish Ramaswamy
S. Bhandarkar
AI4TS
24
0
0
28 Nov 2023
Align before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition
Yifei Chen
Dapeng Chen
Ruijin Liu
Sai Zhou
Wenyuan Xue
Wei Peng
33
6
0
27 Nov 2023
Introducing SSBD+ Dataset with a Convolutional Pipeline for detecting Self-Stimulatory Behaviours in Children using raw videos
Vaibhavi Lokegaonkar
Vijay Jaisankar
Pon Deepika
Madhav Rao
T. Srikanth
Sarbani Mallick
Manjit Sodhi
11
1
0
25 Nov 2023
VSViG: Real-time Video-based Seizure Detection via Skeleton-based Spatiotemporal ViG
Yankun Xu
Junzhe Wang
Yun-Hsuan Chen
Jie Yang
Wenjie Ming
Shuangquan Wang
Mohamad Sawan
17
0
0
24 Nov 2023
Modality Mixer Exploiting Complementary Information for Multi-modal Action Recognition
Sumin Lee
Sangmin Woo
Muhammad Adi Nugroho
Changick Kim
30
0
0
21 Nov 2023
Rethink Cross-Modal Fusion in Weakly-Supervised Audio-Visual Video Parsing
Yating Xu
Conghui Hu
Gim Hee Lee
22
2
0
14 Nov 2023
ELF: An End-to-end Local and Global Multimodal Fusion Framework for Glaucoma Grading
Wenyun Li
Chi-Man Pun
14
1
0
14 Nov 2023
INCODE: Implicit Neural Conditioning with Prior Knowledge Embeddings
A. Kazerouni
Reza Azad
Alireza Hosseini
Dorit Merhof
Ulas Bagci
AI4CE
30
15
0
28 Oct 2023
Diversifying Spatial-Temporal Perception for Video Domain Generalization
Kun-Yu Lin
Jia-Run Du
Yipeng Gao
Jiaming Zhou
Wei-Shi Zheng
45
14
0
27 Oct 2023
Deepfake Detection: Leveraging the Power of 2D and 3D CNN Ensembles
Aagam Bakliwal
Amit D. Joshi
19
1
0
25 Oct 2023
Remote Heart Rate Monitoring in Smart Environments from Videos with Self-supervised Pre-training
Divij Gupta
Ali Etemad
47
2
0
23 Oct 2023
3M-TRANSFORMER: A Multi-Stage Multi-Stream Multimodal Transformer for Embodied Turn-Taking Prediction
Mehdi Fatan
Emanuele Mincato
Dimitra Pintzou
Mariella Dimiccoli
30
1
0
23 Oct 2023
ConViViT -- A Deep Neural Network Combining Convolutions and Factorized Self-Attention for Human Activity Recognition
Rachid Reda Dokkar
F. Chaieb
Hassen Drira
Arezki Aberkane
ViT
30
2
0
22 Oct 2023
On the Relevance of Temporal Features for Medical Ultrasound Video Recognition
D. H. Smith
J. P. Lineberger
G. H. Baker
8
2
0
16 Oct 2023
CM-PIE: Cross-modal perception for interactive-enhanced audio-visual video parsing
Yaru Chen
Ruohao Guo
Xubo Liu
Peipei Wu
Guangyao Li
Zhenbo Li
Wenwu Wang
34
7
0
11 Oct 2023
Boundary Discretization and Reliable Classification Network for Temporal Action Detection
Zhenying Fang
Jun Yu
Richang Hong
28
0
0
10 Oct 2023
Automatic nodule identification and differentiation in ultrasound videos to facilitate per-nodule examination
Siyuan Jiang
Yan Ding
Yuling Wang
Lei Xu
Wenli Dai
...
Jie Yu
Jianqiao Zhou
Chunquan Zhang
Ping Liang
Dexing Kong
19
0
0
10 Oct 2023
Semantic-aware Temporal Channel-wise Attention for Cardiac Function Assessment
Guanqi Chen
Guanbin Li
11
0
0
09 Oct 2023
In the Blink of an Eye: Event-based Emotion Recognition
Haiwei Zhang
Jiqing Zhang
B. Dong
Pieter Peers
Wenwei Wu
Xiaopeng Wei
Felix Heide
Xin Yang
CVBM
32
12
0
06 Oct 2023
Multi-Resolution Audio-Visual Feature Fusion for Temporal Action Localization
Edward Fish
Jon Weinbren
Andrew Gilbert
36
0
0
05 Oct 2023
FashionFlow: Leveraging Diffusion Models for Dynamic Fashion Video Synthesis from Static Imagery
Tasin Islam
A. Miron
Xiaohui Liu
Yongmin Li
DiffM
31
3
0
29 Sep 2023
A Survey on Deep Learning Techniques for Action Anticipation
Zeyun Zhong
Manuel Martin
Michael Voit
Juergen Gall
Jürgen Beyerer
26
7
0
29 Sep 2023
End-to-End Streaming Video Temporal Action Segmentation with Reinforce Learning
Jinrong Zhang
Wu Wen
Sheng-lan Liu
Yunheng Li
Qifeng Li
Lin Feng
31
0
0
27 Sep 2023
Egocentric RGB+Depth Action Recognition in Industry-Like Settings
Jyoti Kini
Sarah Fleischer
I. Dave
Mubarak Shah
EgoV
31
2
0
25 Sep 2023
Speed Co-Augmentation for Unsupervised Audio-Visual Pre-training
Jiangliu Wang
Jianbo Jiao
Yibing Song
Stephen James
Zhan Tong
Chongjian Ge
Pieter Abbeel
Yunhui Liu
20
0
0
25 Sep 2023
S3TC: Spiking Separated Spatial and Temporal Convolutions with Unsupervised STDP-based Learning for Action Recognition
Mireille el Assal
Pierre Tirilly
Ioan Marius Bilasco
26
2
0
22 Sep 2023
TMac: Temporal Multi-Modal Graph Learning for Acoustic Event Classification
Meng Liu
K. Liang
Dayu Hu
Hao Yu
Yue Liu
Lingyuan Meng
Wenxuan Tu
Sihang Zhou
Xinwang Liu
18
25
0
21 Sep 2023
Selective Volume Mixup for Video Action Recognition
Yi Tan
Zhaofan Qiu
Y. Hao
Ting Yao
Xiangnan He
Tao Mei
ViT
35
2
0
18 Sep 2023
A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying Mechanism
I. Gurvich
Ido Leichter
Dharmendar Reddy Palle
Yossi Asher
Alon Vinnikov
Igor Abramovski
Vishak Gopal
Ross Cutler
Eyal Krupka
34
4
0
15 Sep 2023
Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning
Zhiwu Qing
Shiwei Zhang
Ziyuan Huang
Yingya Zhang
Changxin Gao
Deli Zhao
Nong Sang
27
18
0
14 Sep 2023
TransNet: A Transfer Learning-Based Network for Human Action Recognition
Khaled Alomar
Xiaohao Cai
38
1
0
13 Sep 2023
STUPD: A Synthetic Dataset for Spatial and Temporal Relation Reasoning
Palaash Agrawal
Haidi Azaman
Cheston Tan
51
3
0
13 Sep 2023