Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1812.03982
Cited By
SlowFast Networks for Video Recognition
10 December 2018
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
Re-assign community
ArXiv
PDF
HTML
Papers citing
"SlowFast Networks for Video Recognition"
50 / 655 papers shown
Title
DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
Keli Zhang
Pan Zhou
Roger Zimmermann
Shuicheng Yan
ViT
32
21
0
09 Dec 2021
Exploring Temporal Granularity in Self-Supervised Video Representation Learning
Rui Qian
Yeqing Li
Liangzhe Yuan
Boqing Gong
Ting Liu
Matthew A. Brown
Serge Belongie
Ming-Hsuan Yang
Hartwig Adam
Huayu Chen
AI4TS
61
6
0
08 Dec 2021
MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection
Rui Dai
Srijan Das
Kumara Kahatapitiya
Michael S. Ryoo
F. Brémond
ViT
42
73
0
07 Dec 2021
DCAN: Improving Temporal Action Detection via Dual Context Aggregation
Guo Chen
Yin-Dong Zheng
Limin Wang
Tong Lu
AI4TS
37
70
0
07 Dec 2021
Gesture Recognition with a Skeleton-Based Keyframe Selection Module
Yunsoo Kim
Hyun Myung
SLR
27
1
0
03 Dec 2021
BEVT: BERT Pretraining of Video Transformers
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Yu-Gang Jiang
Luowei Zhou
Lu Yuan
ViT
39
203
0
02 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
75
679
0
02 Dec 2021
Self-supervised Video Transformer
Kanchana Ranasinghe
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
Michael S. Ryoo
ViT
39
84
0
02 Dec 2021
Video-Text Pre-training with Learned Regions
Rui Yan
Mike Zheng Shou
Yixiao Ge
Alex Jinpeng Wang
Xudong Lin
Guanyu Cai
Jinhui Tang
33
23
0
02 Dec 2021
Adaptive Token Sampling For Efficient Vision Transformers
Mohsen Fayyaz
Soroush Abbasi Koohpayegani
F. Jafari
Sunando Sengupta
Hamid Reza Vaezi Joze
Eric Sommerlade
Hamed Pirsiavash
Juergen Gall
ViT
16
148
0
30 Nov 2021
UBoCo : Unsupervised Boundary Contrastive Learning for Generic Event Boundary Detection
Hyolim Kang
Jinwoo Kim
Taehyun Kim
Seon Joo Kim
43
25
0
29 Nov 2021
Learning from Temporal Gradient for Semi-supervised Action Recognition
Junfei Xiao
Longlong Jing
Lin Zhang
Ju He
Qi She
Zongwei Zhou
Alan Yuille
Yingwei Li
12
51
0
25 Nov 2021
PolyViT: Co-training Vision Transformers on Images, Videos and Audio
Valerii Likhosherstov
Anurag Arnab
K. Choromanski
Mario Lucic
Yi Tay
Adrian Weller
Mostafa Dehghani
ViT
35
73
0
25 Nov 2021
VIOLET : End-to-End Video-Language Transformers with Masked Visual-token Modeling
Tsu-Jui Fu
Linjie Li
Zhe Gan
Kevin Qinghong Lin
Wenjie Wang
Lijuan Wang
Zicheng Liu
VLM
53
218
0
24 Nov 2021
Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis
Zhaobo Qi
Shuhui Wang
Chi Su
Li Su
Weigang Zhang
Qingming Huang
27
10
0
23 Nov 2021
Self-Regulated Learning for Egocentric Video Activity Anticipation
Zhaobo Qi
Shuhui Wang
Chi Su
Li Su
Qingming Huang
Q. Tian
EgoV
47
52
0
23 Nov 2021
Efficient Video Transformers with Spatial-Temporal Token Selection
Junke Wang
Xitong Yang
Hengduo Li
Li Liu
Zuxuan Wu
Yu-Gang Jiang
ViT
21
63
0
23 Nov 2021
Exploring Segment-level Semantics for Online Phase Recognition from Surgical Videos
Xinpeng Ding
Xiaomeng Li
22
33
0
22 Nov 2021
Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions
Hongwei Xue
Tiankai Hang
Yanhong Zeng
Yuchong Sun
Bei Liu
Huan Yang
Jianlong Fu
B. Guo
AI4TS
VLM
31
189
0
19 Nov 2021
Evaluating Transformers for Lightweight Action Recognition
Raivo Koot
Markus Hennerbichler
Haiping Lu
ViT
30
8
0
18 Nov 2021
Will You Ever Become Popular? Learning to Predict Virality of Dance Clips
Jiahao Wang
Yunhong Wang
Nina Weng
Tianrui Chai
Annan Li
Faxi Zhang
Sansi Yu
27
13
0
06 Nov 2021
Sequence-to-Sequence Modeling for Action Identification at High Temporal Resolution
Aakash Kaku
Kangning Liu
A. Parnandi
H. Rajamohan
Kannan Venkataramanan
Anita Venkatesan
Audre Wirtanen
Natasha Pandit
Heidi M. Schambra
C. Fernandez‐Granda
27
5
0
03 Nov 2021
Relational Self-Attention: What's Missing in Attention for Video Understanding
Manjin Kim
Heeseung Kwon
Chunyu Wang
Suha Kwak
Minsu Cho
ViT
27
28
0
02 Nov 2021
AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling
Alexandros Stergiou
R. Poppe
42
79
0
01 Nov 2021
Temporal-attentive Covariance Pooling Networks for Video Recognition
Zilin Gao
Qilong Wang
Bingbing Zhang
Q. Hu
P. Li
21
25
0
27 Oct 2021
The Efficiency Misnomer
Daoyuan Chen
Liuyi Yao
Dawei Gao
Ashish Vaswani
Yaliang Li
39
99
0
25 Oct 2021
A Closer Look at Few-Shot Video Classification: A New Baseline and Benchmark
Zhenxi Zhu
Limin Wang
Sheng Guo
Gangshan Wu
45
32
0
24 Oct 2021
Hand-Object Contact Prediction via Motion-Based Pseudo-Labeling and Guided Progressive Label Correction
Takuma Yagi
Md. Tasnimul Hasan
Yoichi Sato
22
5
0
19 Oct 2021
TEAM-Net: Multi-modal Learning for Video Action Recognition with Partial Decoding
Zhengwei Wang
Qi She
A. Smolic
21
9
0
17 Oct 2021
Shaping embodied agent behavior with activity-context priors from egocentric video
Tushar Nagarajan
Kristen Grauman
EgoV
LM&Ro
63
13
0
14 Oct 2021
The Impact of Spatiotemporal Augmentations on Self-Supervised Audiovisual Representation Learning
Haider Al-Tahan
Y. Mohsenzadeh
SSL
AI4TS
34
0
0
13 Oct 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
278
1,026
0
13 Oct 2021
Object-Region Video Transformers
Roei Herzig
Elad Ben-Avraham
K. Mangalam
Amir Bar
Gal Chechik
Anna Rohrbach
Trevor Darrell
Amir Globerson
ViT
30
82
0
13 Oct 2021
TAda! Temporally-Adaptive Convolutions for Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Mingqian Tang
Ziwei Liu
M. Ang
53
49
0
12 Oct 2021
Video Is Graph: Structured Graph Module for Video Action Recognition
Rongjie Li
Xiaojun Wu
Tianyang Xu
46
12
0
12 Oct 2021
Joint Learning On The Hierarchy Representation for Fine-Grained Human Action Recognition
M. C. Leong
Hui Li Tan
Haosong Zhang
Liyuan Li
Feng Lin
J. Lim
40
10
0
12 Oct 2021
Towards Streaming Egocentric Action Anticipation
Antonino Furnari
G. Farinella
EgoV
33
6
0
11 Oct 2021
SignBERT: Pre-Training of Hand-Model-Aware Representation for Sign Language Recognition
Hezhen Hu
Weichao Zhao
Wen-gang Zhou
Yuechen Wang
Houqiang Li
ViT
32
63
0
11 Oct 2021
Deep Learning-based Action Detection in Untrimmed Videos: A Survey
Elahe Vahdani
Yingli Tian
52
60
0
30 Sep 2021
Unsupervised Few-Shot Action Recognition via Action-Appearance Aligned Meta-Adaptation
Jay Patravali
Gaurav Mittal
Ye Yu
Fuxin Li
Mei Chen
18
19
0
30 Sep 2021
Motion-aware Contrastive Video Representation Learning via Foreground-background Merging
Shuangrui Ding
Maomao Li
Tianyu Yang
Rui Qian
Haohang Xu
Qingyi Chen
Jue Wang
Hongkai Xiong
SSL
30
49
0
30 Sep 2021
Convolutional Neural Network Compression through Generalized Kronecker Product Decomposition
Marawan Gamal Abdel Hameed
Marzieh S. Tahaei
A. Mosleh
V. Nia
47
25
0
29 Sep 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Florian Metze Luke Zettlemoyer Christoph Feichtenhofer
CLIP
VLM
259
561
0
28 Sep 2021
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device
Ji Lin
Chuang Gan
Kuan-Chieh Jackson Wang
Song Han
42
64
0
27 Sep 2021
CLIPort: What and Where Pathways for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
Dieter Fox
LM&Ro
65
633
0
24 Sep 2021
DeepStroke: An Efficient Stroke Screening Framework for Emergency Rooms with Multimodal Adversarial Deep Learning
Tongan Cai
Haomiao Ni
Ming-Chieh Yu
Xiaolei Huang
K. Wong
John Volpi
Jianmin Wang
Stephen T. C. Wong
34
14
0
24 Sep 2021
Survey: Transformer based Video-Language Pre-training
Ludan Ruan
Qin Jin
VLM
ViT
72
44
0
21 Sep 2021
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
152
362
0
17 Sep 2021
METEOR:A Dense, Heterogeneous, and Unstructured Traffic Dataset With Rare Behaviors
Rohan Chandra
Xijun Wang
Mridul Mahajan
Rahul Kala
Rishitha Palugulla
Chandrababu Naidu
Alok Jain
Tianyi Zhou
37
15
0
16 Sep 2021
Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding
Zhenzhi Wang
Limin Wang
Tao Wu
Tianhao Li
Gangshan Wu
AI4TS
28
116
0
10 Sep 2021
Previous
1
2
3
...
8
9
10
...
12
13
14
Next