ResearchTrend.AI
GliTr: Glimpse Transformers with Spatiotemporal Consistency for Online Action Prediction

24 October 2022
Samrudhdhi B. Rangrej
Kevin J. Liang
Tal Hassner
James J. Clark

Papers citing "GliTr: Glimpse Transformers with Spatiotemporal Consistency for Online Action Prediction"

42 / 42 papers shown
Title
Efficient Human Vision Inspired Action Recognition using Adaptive Spatiotemporal Sampling
Khoi-Nguyen C. Mac
Minh Do
Minh Vo
TTA
53
1
0
12 Jul 2022
Consistency driven Sequential Transformers Attention Model for Partially Observable Scenes
Samrudhdhi B. Rangrej
C. Srinidhi
J. Clark
53
12
0
01 Apr 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
201
1,181
0
23 Mar 2022
Glance and Focus Networks for Dynamic Visual Recognition
Gao Huang
Yulin Wang
Kangchen Lv
Haojun Jiang
Wenhui Huang
Pengfei Qi
S. Song
3DH
109
50
0
09 Jan 2022
AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition
Yulin Wang
Yang Yue
Yuanze Lin
Haojun Jiang
Zihang Lai
V. Kulikov
Nikita Orlov
Humphrey Shi
Gao Huang
53
50
0
28 Dec 2021
A Probabilistic Hard Attention Model For Sequentially Observed Scenes
Samrudhdhi B. Rangrej
James J. Clark
44
12
0
15 Nov 2021
Video Swin Transformer
Ze Liu
Jia Ning
Yue Cao
Yixuan Wei
Zheng Zhang
Stephen Lin
Han Hu
ViT
94
1,474
0
24 Jun 2021
Anticipative Video Transformer
Rohit Girdhar
Kristen Grauman
ViT
53
210
0
03 Jun 2021
Anticipating human actions by correlating past with the future with Jaccard similarity measures
Basura Fernando
Samitha Herath
EgoV
56
58
0
26 May 2021
Adaptive Focus for Efficient Video Recognition
Yulin Wang
Zhaoxi Chen
Haojun Jiang
Shiji Song
Yizeng Han
Gao Huang
64
99
0
07 May 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
201
2,137
0
29 Mar 2021
Hard-Attention for Scalable Image Classification
Athanasios Papadopoulos
Pawel Korus
N. Memon
87
25
0
20 Feb 2021
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
359
6,731
0
23 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
550
40,739
0
22 Oct 2020
Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification
Yulin Wang
Kangchen Lv
Rui Huang
Shiji Song
Le Yang
Gao Huang
3DH
40
150
0
11 Oct 2020
X3D: Expanding Architectures for Efficient Video Recognition
Christoph Feichtenhofer
125
1,018
0
09 Apr 2020
Meta Pseudo Labels
Hieu H. Pham
Zihang Dai
Qizhe Xie
Minh-Thang Luong
Quoc V. Le
VLM
335
667
0
23 Mar 2020
FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
Kihyuk Sohn
David Berthelot
Chun-Liang Li
Zizhao Zhang
Nicholas Carlini
E. D. Cubuk
Alexey Kurakin
Han Zhang
Colin Raffel
AAML
153
3,545
0
21 Jan 2020
Self-training with Noisy Student improves ImageNet classification
Qizhe Xie
Minh-Thang Luong
Eduard H. Hovy
Quoc V. Le
NoLa
296
2,387
0
11 Nov 2019
Knowledge Distillation from Internal Representations
Gustavo Aguilar
Yuan Ling
Yu Zhang
Benjamin Yao
Xing Fan
Edward Guo
70
181
0
08 Oct 2019
Saccader: Improving Accuracy of Hard Attention Models for Vision
Gamaleldin F. Elsayed
Simon Kornblith
Quoc V. Le
VLM
42
73
0
20 Aug 2019
Unsupervised Data Augmentation for Consistency Training
Qizhe Xie
Zihang Dai
Eduard H. Hovy
Minh-Thang Luong
Quoc V. Le
124
2,314
0
29 Apr 2019
Video Classification with Channel-Separated Convolutional Networks
Du Tran
Heng Wang
Lorenzo Torresani
Matt Feiszli
3DV
61
586
0
04 Apr 2019
Cross-lingual Language Model Pretraining
Guillaume Lample
Alexis Conneau
73
2,735
0
22 Jan 2019
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
162
3,262
0
10 Dec 2018
TSM: Temporal Shift Module for Efficient Video Understanding
Ji Lin
Chuang Gan
Song Han
85
1,683
0
20 Nov 2018
Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points
Fabien Baradel
Christian Wolf
J. Mille
Graham W. Taylor
139
154
0
22 Feb 2018
Human Action Recognition: Pose-based Attention draws focus to Hands
Fabien Baradel
Christian Wolf
J. Mille
130
108
0
20 Dec 2017
A Closer Look at Spatiotemporal Convolutions for Action Recognition
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
196
3,021
0
30 Nov 2017
Temporal Relational Reasoning in Videos
Bolei Zhou
A. Andonian
Aude Oliva
Antonio Torralba
NAI
91
1,037
0
22 Nov 2017
Non-local Neural Networks
Xiaolong Wang
Ross B. Girshick
Abhinav Gupta
Kaiming He
OffRL
273
8,888
0
21 Nov 2017
The "something something" video database for learning and evaluating visual common sense
Raghav Goyal
Samira Ebrahimi Kahou
Vincent Michalski
Joanna Materzynska
S. Westphal
...
Moritz Mueller-Freitag
F. Hoppe
Christian Thurau
Ingo Bax
Roland Memisevic
VLM
82
1,529
0
13 Jun 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
651
130,942
0
12 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
219
7,989
0
22 May 2017
Temporal Ensembling for Semi-Supervised Learning
S. Laine
Timo Aila
UQCV
181
2,552
0
07 Oct 2016
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
ViT
98
3,825
0
02 Aug 2016
Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning
Mehdi S. M. Sajjadi
Mehran Javanmardi
Tolga Tasdizen
BDL
80
1,111
0
14 Jun 2016
Spatial Transformer Networks
Max Jaderberg
Karen Simonyan
Andrew Zisserman
Koray Kavukcuoglu
292
7,379
0
05 Jun 2015
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
322
19,609
0
09 Mar 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
324
10,050
0
10 Feb 2015
Learning with Pseudo-Ensembles
Philip Bachman
O. Alsharif
Doina Precup
70
598
0
16 Dec 2014
Recurrent Models of Visual Attention
Volodymyr Mnih
N. Heess
Alex Graves
Koray Kavukcuoglu
VLM
142
3,651
0
24 Jun 2014