Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2408.15679
Cited By
DEAR: Depth-Enhanced Action Recognition
28 August 2024
Sadegh Rahmaniboldaji
Filip Rybansky
Quoc Vuong
Frank Guerin
Andrew Gilbert
Re-assign community
ArXiv
PDF
HTML
Papers citing
"DEAR: Depth-Enhanced Action Recognition"
13 / 13 papers shown
Title
Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning
Huanjin Yao
Wenhao Wu
Zhiheng Li
VLM
117
10
0
27 Nov 2023
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
144
689
0
02 Dec 2021
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning
Hao Tan
Jie Lei
Thomas Wolf
Joey Tianyi Zhou
78
66
0
21 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Ishan Misra Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
80
279
0
09 Jun 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
217
2,149
0
29 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
918
29,372
0
26 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
367
2,048
0
09 Feb 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
637
41,003
0
22 Oct 2020
Rescaling Egocentric Vision
Dima Damen
Hazel Doughty
G. Farinella
Antonino Furnari
Evangelos Kazakos
...
Davide Moltisanti
Jonathan Munro
Toby Perrett
Will Price
Michael Wray
EgoV
65
458
0
23 Jun 2020
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer
René Ranftl
Katrin Lasinger
David Hafner
Konrad Schindler
V. Koltun
MDE
201
1,790
0
02 Jul 2019
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
164
3,273
0
10 Dec 2018
TSM: Temporal Shift Module for Efficient Video Understanding
Ji Lin
Chuang Gan
Song Han
98
1,689
0
20 Nov 2018
Modality Distillation with Multiple Stream Networks for Action Recognition
Nuno C. Garcia
Pietro Morerio
Vittorio Murino
65
183
0
19 Jun 2018
1