ResearchTrend.AI
SlowFast Networks for Video Recognition

10 December 2018
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He

Papers citing "SlowFast Networks for Video Recognition"

50 / 610 papers shown
Thinking in Frequency: Face Forgery Detection by Mining Frequency-aware Clues
Yuyang Qian
Guojun Yin
Lu Sheng
Zixuan Chen
Jing Shao
CVBM
40
662
0
18 Jul 2020
Temporal Distinct Representation Learning for Action Recognition
Junwu Weng
Donghao Luo
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Xudong Jiang
Junsong Yuan
17
26
0
15 Jul 2020
AViD Dataset: Anonymized Videos from Diverse Countries
A. Piergiovanni
Michael S. Ryoo
30
35
0
10 Jul 2020
Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation
Yongqin Xian
Bruno Korbar
Matthijs Douze
Lorenzo Torresani
Bernt Schiele
Zeynep Akata
VGen
18
18
0
09 Jul 2020
Aligning Videos in Space and Time
Senthil Purushwalkam
Tian-Chun Ye
Saurabh Gupta
Abhinav Gupta
30
23
0
09 Jul 2020
Decoupled Spatial-Temporal Attention Network for Skeleton-Based Action Recognition
Lei Shi
Yifan Zhang
Jian Cheng
Hanqing Lu
16
49
0
07 Jul 2020
Domain Adaptation without Source Data
Youngeun Kim
Donghyeon Cho
Kyeongtak Han
Priyadarshini Panda
Sungeun Hong
TTA
11
174
0
03 Jul 2020
Self-Supervised MultiModal Versatile Networks
Jean-Baptiste Alayrac
Adrià Recasens
R. Schneider
Relja Arandjelović
Jason Ramapuram
J. Fauw
Lucas Smaira
Sander Dieleman
Andrew Zisserman
SSL
40
371
0
29 Jun 2020
Rescaling Egocentric Vision
Dima Damen
Hazel Doughty
G. Farinella
Antonino Furnari
Evangelos Kazakos
...
Davide Moltisanti
Jonathan Munro
Toby Perrett
Will Price
Michael Wray
EgoV
19
437
0
23 Jun 2020
Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning
Yuan Yao
Chang-rui Liu
Dezhao Luo
Yu Zhou
QiXiang Ye
29
169
0
20 Jun 2020
Learn to cycle: Time-consistent feature discovery for action recognition
Alexandros Stergiou
R. Poppe
22
23
0
15 Jun 2020
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
Junting Pan
Siyu Chen
Zheng Shou
Yu Liu
Jing Shao
Hongsheng Li
3DPC
19
150
0
14 Jun 2020
SpotFast Networks with Memory Augmented Lateral Transformers for Lipreading
Peratham Wiriyathammabhum
23
8
0
21 May 2020
Towards Streaming Perception
Mengtian Li
Yu-xiong Wang
Deva Ramanan
18
5
0
21 May 2020
Retrieving and Highlighting Action with Spatiotemporal Reference
Seito Kasai
Yuchi Ishikawa
Masaki Hayashi
Y. Aoki
Kensho Hara
Hirokatsu Kataoka
11
0
0
19 May 2020
The AVA-Kinetics Localized Human Actions Video Dataset
Ang Li
Meghana Thotakuri
David A. Ross
João Carreira
Alexander Vostrikov
Andrew Zisserman
VGen
19
133
0
01 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
46
493
0
01 May 2020
Local-Global Video-Text Interactions for Temporal Grounding
Jonghwan Mun
Minsu Cho
Bohyung Han
36
267
0
16 Apr 2020
X3D: Expanding Architectures for Efficient Video Recognition
Christoph Feichtenhofer
73
1,001
0
09 Apr 2020
TEA: Temporal Excitation and Aggregation for Action Recognition
Yan-Ran Li
Bin Ji
Xintian Shi
Jianguo Zhang
Bin Kang
Limin Wang
ViT
25
439
0
03 Apr 2020
Knowing What, Where and When to Look: Efficient Video Action Modeling with Attention
Juan-Manuel Perez-Rua
Brais Martínez
Xiatian Zhu
Antoine Toisoul
Victor Escorcia
Tao Xiang
48
19
0
02 Apr 2020
Explaining Motion Relevance for Activity Recognition in Video Deep Learning Models
Liam Hiley
Alun D. Preece
Y. Hicks
Supriyo Chakraborty
Prudhvi K. Gurram
Richard J. Tomsett
FAtt
25
14
0
31 Mar 2020
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation
Boxiao Pan
Haoye Cai
De-An Huang
Kuan-Hui Lee
Adrien Gaidon
Ehsan Adeli
Juan Carlos Niebles
31
235
0
31 Mar 2020
Speech2Action: Cross-modal Supervision for Action Recognition
Arsha Nagrani
Chen Sun
David A. Ross
Rahul Sukthankar
Cordelia Schmid
Andrew Zisserman
33
54
0
30 Mar 2020
PIC: Permutation Invariant Convolution for Recognizing Long-range Activities
Noureldien Hussein
E. Gavves
A. Smeulders
VLM
26
13
0
18 Mar 2020
SF-Net: Single-Frame Supervision for Temporal Action Localization
Fan Ma
Linchao Zhu
Yi Yang
Shengxin Cindy Zha
Gourab Kundu
Matt Feiszli
Zheng Shou
18
139
0
15 Mar 2020
PANDA: A Gigapixel-level Human-centric Video Dataset
Xueyan Wang
Xiya Zhang
Yinheng Zhu
Yuchen Guo
Xiaoyun Yuan
...
Zerun Wang
Guiguang Ding
D. Brady
Qionghai Dai
Lu Fang
VGen
44
79
0
10 Mar 2020
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
Biagio Brattoli
Joseph Tighe
Fedor Zhdanov
Pietro Perona
Krzysztof Chalupka
VLM
137
127
0
03 Mar 2020
Evolving Losses for Unsupervised Video Representation Learning
A. Piergiovanni
A. Angelova
Michael S. Ryoo
SSL
27
138
0
26 Feb 2020
A Survey on 3D Skeleton-Based Action Recognition Using Learning Method
Bin Ren
Mengyuan Liu
Runwei Ding
Hong Liu
27
121
0
14 Feb 2020
Over-the-Air Adversarial Flickering Attacks against Video Recognition Networks
Roi Pony
I. Naeh
Shie Mannor
AAML
18
51
0
12 Feb 2020
Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks
Joonatan Mänttäri
Sofia Broomé
John Folkesson
Hedvig Kjellström
FAtt
24
27
0
02 Feb 2020
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
197
207
0
23 Jan 2020
Temporal Interlacing Network
Hao Shao
Shengju Qian
Yu Liu
29
92
0
17 Jan 2020
Action Genome: Actions as Composition of Spatio-temporal Scene Graphs
Jingwei Ji
Ranjay Krishna
Li Fei-Fei
Juan Carlos Niebles
39
336
0
15 Dec 2019
Listen to Look: Action Recognition by Previewing Audio
Ruohan Gao
Tae-Hyun Oh
Kristen Grauman
Lorenzo Torresani
VLM
29
251
0
10 Dec 2019
Synthetic Humans for Action Recognition from Unseen Viewpoints
Gül Varol
Ivan Laptev
Cordelia Schmid
Andrew Zisserman
33
96
0
09 Dec 2019
A Multigrid Method for Efficiently Training Video Models
Chao-Yuan Wu
Ross B. Girshick
Kaiming He
Christoph Feichtenhofer
Philipp Krahenbuhl
21
94
0
02 Dec 2019
More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation
Quanfu Fan
Chun-Fu Chen
Hilde Kuehne
Marco Pistoia
David D. Cox
32
126
0
02 Dec 2019
You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization
Okan Kopuklu
Xiangyu Wei
Gerhard Rigoll
28
143
0
15 Nov 2019
Comprehensive Video Understanding: Video summarization with content-based video recommender design
Yudong Jiang
Kaixu Cui
B. Peng
Changliang Xu
BDL
14
28
0
30 Oct 2019
AFO-TAD: Anchor-free One-Stage Detector for Temporal Action Detection
Yiping Tang
Chuang Niu
Minghao Dong
Shenghan Ren
Jimin Liang
27
10
0
18 Oct 2019
Grouped Spatial-Temporal Aggregation for Efficient Action Recognition
Chenxu Luo
Alan Yuille
130
150
0
28 Sep 2019
Learning deep representations for video-based intake gesture detection
Philipp V. Rouast
M. Adam
20
39
0
24 Sep 2019
Class Feature Pyramids for Video Explanation
Alexandros Stergiou
G. Kapidis
Grigorios Kalliatakis
C. Chrysoulas
R. Poppe
R. Veltkamp
FAtt
33
18
0
18 Sep 2019
Action recognition with spatial-temporal discriminative filter banks
Brais Martínez
Davide Modolo
Yuanjun Xiong
Joseph Tighe
18
66
0
20 Aug 2019
SF-Net: Structured Feature Network for Continuous Sign Language Recognition
Zhaoyang Yang
Zhenmei Shi
Xiaoyong Shen
Yu-Wing Tai
SLR
27
63
0
04 Aug 2019
Only Time Can Tell: Discovering Temporal Data for Temporal Modeling
Laura Sevilla-Lara
Shengxin Cindy Zha
Zhicheng Yan
Vedanuj Goswami
Matt Feiszli
Lorenzo Torresani
50
75
0
19 Jul 2019
Deformable Tube Network for Action Detection in Videos
Wei Li
Zehuan Yuan
Dashan Guo
Lei Huang
Xiangzhong Fang
Changhu Wang
ViT
MedIm
33
5
0
03 Jul 2019
Video Modeling with Correlation Networks
Heng Wang
Du Tran
Lorenzo Torresani
Matt Feiszli
24
127
0
07 Jun 2019