Vision Transformer with Cross-attention by Temporal Shift for Efficient Action Recognition

1 April 2022

Ryota Hashiguchi

Papers citing "Vision Transformer with Cross-attention by Temporal Shift for Efficient Action Recognition"

Shift and matching queries for video semantic segmentation
Tsubasa Mizuno, Toru Tamaki. 10 Oct 2024.

Query matching for spatio-temporal action detection with query-based object detector
Shimon Hori, Kazuki Omi, Toru Tamaki. 27 Sep 2024.

S3Aug: Segmentation, Sampling, and Shift for Action Recognition
Taiki Sugiura, Toru Tamaki. [AI4TS] 23 Oct 2023.

Joint learning of images and videos with a single Vision Transformer
Shuki Shimizu, Toru Tamaki. [ViT] 21 Aug 2023.

3Mformer: Multi-order Multi-mode Transformer for Skeletal Action Recognition
Lei Wang, Piotr Koniusz. [ViT] 25 Mar 2023.