2210.07503
Cited By
STAR-Transformer: A Spatio-temporal Cross Attention Transformer for Human Action Recognition
14 October 2022
Dasom Ahn
Sangwon Kim
H. Hong
ByoungChul Ko
ViT
Papers citing "STAR-Transformer: A Spatio-temporal Cross Attention Transformer for Human Action Recognition"
44 / 44 papers shown
MoFM: A Large-Scale Human Motion Foundation Model
Mohammadreza Baharani
Ghazal Alinezhad Noghre
Armin Danesh Pazho
Gabriel Maldonado
Hamed Tabkhi
AI4CE
426
1
0
08 Feb 2025
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
Ahmed Abdelkawy
Asem A. Ali
Asem Ali
3DPC
81
0
0
10 Aug 2024
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
224
1,191
0
23 Mar 2022
Co-training Transformer with Videos and Images Improves Action Recognition
Bowen Zhang
Jiahui Yu
Christopher Fifty
Wei Han
Andrew M. Dai
Ruoming Pang
Fei Sha
ViT
70
54
0
14 Dec 2021
DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
Yuxuan Liang
Pan Zhou
Roger Zimmermann
Shuicheng Yan
ViT
68
21
0
09 Dec 2021
Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition
Tianyu Guo
Hong Liu
Zhan Chen
Mengyuan Liu
Tao Wang
Runwei Ding
SSL
62
154
0
07 Dec 2021
AFTer-UNet: Axial Fusion Transformer UNet for Medical Image Segmentation
Xiangyi Yan
Hao Tang
Shanlin Sun
Haoyu Ma
Deying Kong
Xiaohui Xie
ViT
MedIm
80
129
0
20 Oct 2021
ASFormer: Transformer for Action Segmentation
Fangqiu Yi
Hongyu Wen
Tingting Jiang
ViT
119
176
0
16 Oct 2021
MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition
Jiawei Chen
C. Ho
ViT
77
77
0
20 Aug 2021
Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation
Ailing Zeng
Xiao Sun
Lei Yang
Nanxuan Zhao
Minhao Liu
Qiang Xu
3DH
76
112
0
16 Aug 2021
Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition
Tailin Chen
Desen Zhou
Jian Wang
Shidong Wang
Yu Guan
Xuming He
Errui Ding
92
75
0
10 Aug 2021
Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition
Yuxin Chen
Ziqi Zhang
Chunfen Yuan
Bing Li
Ying Deng
Weiming Hu
66
584
0
26 Jul 2021
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition
Di Yang
Yaohui Wang
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
Francois Bremond
55
49
0
19 Jul 2021
Long Short-Term Transformer for Online Action Detection
Mingze Xu
Yuanjun Xiong
Hao Chen
Xinyu Li
Wei Xia
Zhuowen Tu
Stefano Soatto
ViT
90
134
0
07 Jul 2021
Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition
Vittorio Mazzia
Simone Angarano
Francesco Salvetti
Federico Angelini
Marcello Chiaberge
ViT
79
140
0
01 Jul 2021
OadTR: Online Action Detection with Transformers
Xiang Wang
Shiwei Zhang
Zhiwu Qing
Yuanjie Shao
Zhe Zuo
Changxin Gao
Nong Sang
OffRL
ViT
80
115
0
21 Jun 2021
MOTR: End-to-End Multiple-Object Tracking with Transformer
Fangao Zeng
Bin Dong
Cheng Chen
Tiancai Wang
Xinming Zhang
Yichen Wei
VOT
56
516
0
07 May 2021
Revisiting Skeleton-based Action Recognition
Haodong Duan
Yue Zhao
Kai-xiang Chen
Dahua Lin
Bo Dai
3DH
69
498
0
28 Apr 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
222
2,150
0
29 Mar 2021
No frame left behind: Full Video Action Recognition
X. Liu
S. Pintea
Fatemeh Karimi Nejadasl
Olaf Booij
Jan van Gemert
73
41
0
29 Mar 2021
ACTION-Net: Multipath Excitation for Action Recognition
Zhengwei Wang
Qi She
A. Smolic
3DPC
68
170
0
11 Mar 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
381
2,053
0
09 Feb 2021
MSAF: Multimodal Split Attention Fusion
Lang Su
Chuqing Hu
Guofa Li
Dongpu Cao
63
38
0
13 Dec 2020
JOLO-GCN: Mining Joint-Centered Light-Weight Information for Skeleton-Based Action Recognition
Jinmiao Cai
Nianjuan Jiang
Xiaoguang Han
Kui Jia
Jiangbo Lu
51
85
0
16 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
657
41,103
0
22 Oct 2020
Skeleton-based Action Recognition via Spatial and Temporal Transformer Networks
Chiara Plizzari
Marco Cannici
Matteo Matteucci
ViT
MedIm
81
306
0
17 Aug 2020
Hierarchical Action Classification with Network Pruning
Mahdi Davoodikakhki
KangKang Yin
65
19
0
30 Jul 2020
VPN: Learning Video-Pose Embedding for Activities of Daily Living
Srijan Das
Saurav Sharma
Rui Dai
Francois Bremond
Monique Thonnat
ViT
77
126
0
06 Jul 2020
Quantifying Attention Flow in Transformers
Samira Abnar
Willem H. Zuidema
157
796
0
02 May 2020
X3D: Expanding Architectures for Efficient Video Recognition
Christoph Feichtenhofer
134
1,020
0
09 Apr 2020
View-Invariant Probabilistic Embedding for Human Pose
Jennifer J. Sun
Jiaping Zhao
Liang-Chieh Chen
Florian Schroff
Hartwig Adam
Ting Liu
58
77
0
02 Dec 2019
MMTM: Multimodal Transfer Module for CNN Fusion
Hamid Reza Vaezi Joze
Amirreza Shaban
Michael L. Iuzzolino
K. Koishida
85
282
0
20 Nov 2019
SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition
C. Caetano
Jessica Sena
Francois Bremond
J. A. dos Santos
William Robson Schwartz
120
171
0
30 Jul 2019
NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding
Jun Liu
Amir Shahroudy
Mauricio Perez
G. Wang
Ling-yu Duan
Alex C. Kot
80
1,289
0
12 May 2019
Action Machine: Rethinking Action Recognition in Trimmed Videos
Jiagang Zhu
Wei Zou
Liang Xu
Yiming Hu
Zheng Zhu
Manyu Chang
Junjie Huang
Guan Huang
Dalong Du
80
37
0
14 Dec 2018
Video Action Transformer Network
Rohit Girdhar
João Carreira
Carl Doersch
Andrew Zisserman
ViT
126
709
0
06 Dec 2018
Part-based Graph Convolutional Network for Action Recognition
Kalpit C. Thakkar
P. J. Narayanan
3DH
GNN
62
165
0
13 Sep 2018
2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning
D. Luvizon
David Picard
Hedi Tabia
3DH
163
484
0
26 Feb 2018
Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points
Fabien Baradel
Christian Wolf
J. Mille
Graham W. Taylor
157
154
0
22 Feb 2018
A Closer Look at Spatiotemporal Convolutions for Action Recognition
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
218
3,030
0
30 Nov 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
701
131,652
0
12 Jun 2017
The Kinetics Human Action Video Dataset
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
...
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
250
3,806
0
19 May 2017
Body Joint guided 3D Deep Convolutional Descriptors for Action Recognition
Congqi Cao
Yifan Zhang
Chunjie Zhang
Hanqing Lu
3DH
47
66
0
24 Apr 2017
NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis
Amir Shahroudy
Jun Liu
T. Ng
G. Wang
250
2,490
0
11 Apr 2016