Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2106.05392
Cited By
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
9 June 2021
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Ishan Misra Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers"
50 / 189 papers shown
Title
Reinforcement Learning meets Masked Video Modeling : Trajectory-Guided Adaptive Token Selection
Ayush K. Rai
Kyle Min
Tarun Krishna
Feiyan Hu
Alan F. Smeaton
Noel E. O'Connor
VGen
31
0
0
13 May 2025
Learning Multi-frame and Monocular Prior for Estimating Geometry in Dynamic Scenes
S. Park
Jinwoo Shin
42
0
0
03 May 2025
PCBEAR: Pose Concept Bottleneck for Explainable Action Recognition
Jongseo Lee
Wooil Lee
Gyeong-Moon Park
Seong Tae Kim
Jinwoo Choi
33
0
0
17 Apr 2025
SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning
Fida Mohammad Thoker
Letian Jiang
Chen Zhao
Bernard Ghanem
59
0
0
01 Apr 2025
CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition
Jongseo Lee
Joohyun Chang
Dongho Lee
Jinwoo Choi
56
0
0
30 Mar 2025
Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better
Zihang Lai
Andrea Vedaldi
45
0
0
25 Mar 2025
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
Shehreen Azad
Vibhav Vineet
Y. S. Rawat
VLM
139
1
0
11 Mar 2025
Do Language Models Understand Time?
Xi Ding
Lei Wang
181
0
0
18 Dec 2024
GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-grained Video-language Learning
Yunhong Wang
Zhikang Zhang
Jue Wang
D. Fan
Zhenlin Xu
Linda Liu
Xiang Hao
Vimal Bhat
Xinyu Li
VLM
82
1
0
10 Dec 2024
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
Claudia Cuttano
Gabriele Trivigno
Gabriele Rosi
Carlo Masone
Giuseppe Averta
VOS
106
2
0
26 Nov 2024
When Spatial meets Temporal in Action Recognition
H. Chen
Lei Wang
Y. Chen
Tom Gedeon
Piotr Koniusz
97
2
0
22 Nov 2024
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao
Gen Li
Shreyank N. Gowda
Robert B Fisher
Jonathan Huang
Anurag Arnab
Laura Sevilla-Lara
98
0
0
20 Nov 2024
Video Token Merging for Long-form Video Understanding
Seon-Ho Lee
Jue Wang
Zhikang Zhang
D. Fan
Xinyu Li
45
5
0
31 Oct 2024
SOAR: Self-supervision Optimized UAV Action Recognition with Efficient Object-Aware Pretraining
Ruiqi Xian
Xiyang Wu
Tianrui Guan
Xijun Wang
Boqing Gong
Dinesh Manocha
ViT
39
0
0
26 Sep 2024
Mamba Fusion: Learning Actions Through Questioning
Zhikang Dong
Apoorva Beedu
Jason Sheinkopf
Irfan Essa
Mamba
70
2
0
17 Sep 2024
MultiCounter: Multiple Action Agnostic Repetition Counting in Untrimmed Videos
Yin Tang
Wei Luo
Jinrui Zhang
Wei Huang
Ruihai Jing
Deyu Zhang
46
0
0
06 Sep 2024
Vote&Mix: Plug-and-Play Token Reduction for Efficient Vision Transformer
Shuai Peng
Di Fu
Baole Wei
Yong Cao
Liangcai Gao
Zhi Tang
ViT
45
1
0
30 Aug 2024
DEAR: Depth-Enhanced Action Recognition
Sadegh Rahmaniboldaji
Filip Rybansky
Quoc Vuong
Frank Guerin
Andrew Gilbert
23
0
0
28 Aug 2024
OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning
Mushui Liu
Bozheng Li
Yunlong Yu
VLM
28
10
0
12 Aug 2024
ReToMe-VA: Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack
Ziyi Gao
Kai-xiang Chen
Zhipeng Wei
Tingshu Mou
Jingjing Chen
Zhiyu Tan
Hao Li
Yu-Gang Jiang
VGen
AAML
36
2
0
10 Aug 2024
From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation
Xin Liu
Chao Hao
Zitong Yu
Huanjing Yue
Jingyu Yang
41
1
0
05 Aug 2024
Read, Watch and Scream! Sound Generation from Text and Video
Yujin Jeong
Yunji Kim
Sanghyuk Chun
Jiyoung Lee
VGen
DiffM
31
12
0
08 Jul 2024
PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition
Y. Hao
Diansong Zhou
Zhicai Wang
Chong-Wah Ngo
Meng Wang
ViT
40
4
0
03 Jul 2024
FILS: Self-Supervised Video Feature Prediction In Semantic Language Space
Mona Ahmadian
Frank Guerin
Andrew Gilbert
44
1
0
05 Jun 2024
ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning
Sucheng Ren
Hongru Zhu
Chen Wei
Yijiang Li
Alan L. Yuille
Cihang Xie
AI4TS
VGen
SSL
59
1
0
24 May 2024
A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection
Matthew Korban
Peter Youngs
Scott T. Acton
ViT
29
6
0
13 May 2024
Deep video representation learning: a survey
Elham Ravanbakhsh
Yongqing Liang
J. Ramanujam
Xin Li
49
3
0
10 May 2024
A Survey on Backbones for Deep Video Action Recognition
Zixuan Tang
Youjun Zhao
Yuhang Wen
Mengyuan Liu
35
1
0
09 May 2024
Transfer-LMR: Heavy-Tail Driving Behavior Recognition in Diverse Traffic Scenarios
Chirag Parikh
Ravi Shankar Mishra
Rohan Chandra
Ravi Kiran Sarvadevabhatla
39
1
0
08 May 2024
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
Badri N. Patro
Vijay Srinivas Agneeswaran
Mamba
46
38
0
24 Apr 2024
Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition
Xunsong Li
Pengzhan Sun
Yangcen Liu
Lixin Duan
Wen Li
43
3
0
18 Apr 2024
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Bo He
Hengduo Li
Young Kyun Jang
Menglin Jia
Xuefei Cao
Ashish Shah
Abhinav Shrivastava
Ser-Nam Lim
MLLM
83
88
0
08 Apr 2024
TIM: A Time Interval Machine for Audio-Visual Action Recognition
Jacob Chalk
Jaesung Huh
Evangelos Kazakos
Andrew Zisserman
Dima Damen
40
9
0
08 Apr 2024
Learning Correlation Structures for Vision Transformers
Manjin Kim
Paul Hongsuck Seo
Cordelia Schmid
Minsu Cho
ViT
40
7
0
05 Apr 2024
X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization
Anna Kukleva
Fadime Sener
Edoardo Remelli
Bugra Tekin
Eric Sauser
Bernt Schiele
Shugao Ma
VLM
EgoV
45
1
0
28 Mar 2024
OmniVid: A Generative Framework for Universal Video Understanding
Junke Wang
Dongdong Chen
Chong Luo
Bo He
Lu Yuan
Zuxuan Wu
Yu-Gang Jiang
VLM
VGen
69
14
0
26 Mar 2024
Enhancing Video Transformers for Action Understanding with VLM-aided Training
Hui Lu
Hu Jian
Ronald Poppe
A. A. Salah
42
1
0
24 Mar 2024
On the Utility of 3D Hand Poses for Action Recognition
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
40
5
0
14 Mar 2024
FocusMAE: Gallbladder Cancer Detection from Ultrasound Videos with Focused Masked Autoencoders
Soumen Basu
Mayuna Gupta
Chetan Madan
Pankaj Gupta
Chetan Arora
36
4
0
13 Mar 2024
VideoMamba: State Space Model for Efficient Video Understanding
Kunchang Li
Xinhao Li
Yi Wang
Yinan He
Yali Wang
Limin Wang
Yu Qiao
Mamba
37
180
0
11 Mar 2024
Improving Legal Judgement Prediction in Romanian with Long Text Encoders
Mihai Masala
Traian Rebedea
Horia Velicu
AILaw
43
2
0
29 Feb 2024
Computer Vision for Primate Behavior Analysis in the Wild
Richard Vogg
Timo Lüddecke
Jonathan Henrich
Sharmita Dey
Matthias Nuske
...
Alexander Gail
Stefan Treue
H. Scherberger
F. Worgotter
Alexander S. Ecker
33
3
0
29 Jan 2024
Synchformer: Efficient Synchronization from Sparse Cues
Vladimir E. Iashin
Weidi Xie
Esa Rahtu
Andrew Zisserman
24
11
0
29 Jan 2024
M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Boyuan Jiang
Jun Chen
Jianbiao Mei
Xingxing Zuo
Guang Dai
Jingdong Wang
Yong-Jin Liu
VLM
28
4
0
22 Jan 2024
GPT4Ego: Unleashing the Potential of Pre-trained Models for Zero-Shot Egocentric Action Recognition
Guangzhao Dai
Xiangbo Shu
Wenhao Wu
Rui Yan
Jiachao Zhang
VLM
27
5
0
18 Jan 2024
Motion Guided Token Compression for Efficient Masked Video Modeling
Yukun Feng
Yangming Shi
Fengze Liu
Tan Yan
43
0
0
10 Jan 2024
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
41
5
0
08 Jan 2024
Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
37
4
0
08 Jan 2024
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
30
3
0
21 Dec 2023
Bootstrap Masked Visual Modeling via Hard Patches Mining
Haochen Wang
Junsong Fan
Yuxi Wang
Kaiyou Song
Tiancai Wang
Xiangyu Zhang
Zhaoxiang Zhang
42
5
0
21 Dec 2023
1
2
3
4
Next