ResearchTrend.AI
SlowFast Networks for Video Recognition

10 December 2018
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He

Papers citing "SlowFast Networks for Video Recognition"

50 / 641 papers shown
Searching for Two-Stream Models in Multivariate Space for Video Recognition
Xinyu Gong
Heng Wang
Zheng Shou
Matt Feiszli
Zhangyang Wang
Zhicheng Yan
42
9
0
30 Aug 2021
A Multimodal Framework for Video Ads Understanding
Zejia Weng
Lingjiang Meng
Rui Wang
Zuxuan Wu
Yu-Gang Jiang
33
1
0
29 Aug 2021
MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition
Jiawei Chen
C. Ho
ViT
26
77
0
20 Aug 2021
Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception
Bowen Li
Weixia Zhang
Meng Tian
Guangtao Zhai
Xianpei Wang
43
120
0
19 Aug 2021
Look Who's Talking: Active Speaker Detection in the Wild
You Jin Kim
Hee-Soo Heo
Soyeon Choe
Soo-Whan Chung
Yoohwan Kwon
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
49
20
0
17 Aug 2021
Temporal Action Segmentation with High-level Complex Activity Labels
Guodong Ding
Angela Yao
33
18
0
15 Aug 2021
Learning to Cut by Watching Movies
Alejandro Pardo
Fabian Caba Heilbron
Juan Carlos León Alcázar
Ali K. Thabet
Guohao Li
VGen
58
20
0
09 Aug 2021
Token Shift Transformer for Video Classification
Hao Zhang
Y. Hao
Chong-Wah Ngo
ViT
29
116
0
05 Aug 2021
Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization
Rui Qian
Yuxi Li
Huabin Liu
John See
Shuangrui Ding
Xian Liu
Dian Li
Weiyao Lin
35
42
0
04 Aug 2021
Spatio-Temporal Representation Factorization for Video-based Person Re-Identification
Abhishek Aich
Meng Zheng
Srikrishna Karanam
Terrence Chen
A. Roy-Chowdhury
Ziyan Wu
37
70
0
25 Jul 2021
Adaptive Recursive Circle Framework for Fine-grained Action Recognition
Hanxi Lin
Xinxiao Wu
Jiebo Luo
25
1
0
25 Jul 2021
EAN: Event Adaptive Network for Enhanced Action Recognition
Yuan Tian
Yichao Yan
Guangtao Zhai
G. Guo
Zhiyong Gao
35
41
0
22 Jul 2021
Evidential Deep Learning for Open Set Action Recognition
Wentao Bao
Qi Yu
Yu Kong
CML
EDL
19
135
0
21 Jul 2021
QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries
Jie Lei
Tamara L. Berg
Joey Tianyi Zhou
ViT
24
62
0
20 Jul 2021
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition
Di Yang
Yaohui Wang
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
F. Brémond
27
47
0
19 Jul 2021
Fine-Grained AutoAugmentation for Multi-Label Classification
Y. Wang
Hesen Chen
Fangyi Zhang
Yaohua Wang
Xiuyu Sun
Ming Lin
Hao Li
29
2
0
12 Jul 2021
Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
42
543
0
30 Jun 2021
Spatio-Temporal Context for Action Detection
Manuel Sarmiento Calderó
David Varas
Elisenda Bou
27
2
0
29 Jun 2021
Feature Combination Meets Attention: Baidu Soccer Embeddings and Transformer based Temporal Detection
Xin Zhou
Le Kang
Zhiyu Cheng
Bo He
Jingyu Xin
51
34
0
28 Jun 2021
Can An Image Classifier Suffice For Action Recognition?
Quanfu Fan
Chun-Fu Chen
Yikang Shen
ViT
34
33
0
26 Jun 2021
Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
Lei Ke
Xia Li
Martin Danelljan
Yu-Wing Tai
Chi-Keung Tang
Feng Yu
VOS
21
71
0
22 Jun 2021
Towards Long-Form Video Understanding
Chaoxia Wu
Philipp Krahenbuhl
VLM
ViT
54
166
0
21 Jun 2021
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
Michael S. Ryoo
A. Piergiovanni
Anurag Arnab
Mostafa Dehghani
A. Angelova
ViT
37
127
0
21 Jun 2021
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP
Han Fang
Pengfei Xiong
Luhui Xu
Yu Chen
CLIP
VLM
35
292
0
21 Jun 2021
Augmented 2D-TAN: A Two-stage Approach for Human-centric Spatio-Temporal Video Grounding
Chaolei Tan
Zihang Lin
Jianfang Hu
Xiang Li
Weishi Zheng
28
9
0
20 Jun 2021
Weakly-Supervised Temporal Action Localization Through Local-Global Background Modeling
Xiang Wang
Zhiwu Qing
Ziyuan Huang
Yutong Feng
Shiwei Zhang
Jianwen Jiang
Mingqian Tang
Yuanjie Shao
Nong Sang
29
4
0
20 Jun 2021
Attend What You Need: Motion-Appearance Synergistic Networks for Video Question Answering
Ahjeong Seo
Gi-Cheon Kang
J. Park
Byoung-Tak Zhang
18
53
0
19 Jun 2021
Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting
Martine Toering
Ioannis Gatopoulos
M. Stol
Vincent Tao Hu
SSL
40
11
0
18 Jun 2021
VidHarm: A Clip Based Dataset for Harmful Content Detection
Johan Edstedt
Amanda Berg
M. Felsberg
Johan Karlsson
Francisca Benavente
Anette Novak
G. Pihlgren
28
2
0
15 Jun 2021
Relation Modeling in Spatio-Temporal Action Localization
Yutong Feng
Jianwen Jiang
Ziyuan Huang
Zhiwu Qing
Xiang Wang
Shiwei Zhang
Mingqian Tang
Yue Gao
33
11
0
15 Jun 2021
Multi-level Attention Fusion Network for Audio-visual Event Recognition
Mathilde Brousmiche
Jean Rouat
Stéphane Dupont
27
11
0
12 Jun 2021
Space-time Mixing Attention for Video Transformer
Adrian Bulat
Juan-Manuel Perez-Rua
Swathikiran Sudhakaran
Brais Martínez
Georgios Tzimiropoulos
ViT
36
124
0
10 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
30
274
0
09 Jun 2021
Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition
Ziyuan Huang
Zhiwu Qing
Xiang Wang
Yutong Feng
Shiwei Zhang
Jianwen Jiang
Zhurong Xia
Mingqian Tang
Nong Sang
M. Ang
ViT
27
11
0
09 Jun 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
Okan Kopuklu
Maja Taseska
Gerhard Rigoll
3DV
29
45
0
07 Jun 2021
CT-Net: Channel Tensorization Network for Video Classification
Kunchang Li
Xianhang Li
Yali Wang
Jun Wang
Yu Qiao
ViT
30
55
0
03 Jun 2021
Continual 3D Convolutional Neural Networks for Real-time Processing of Videos
Lukas Hedegaard
Alexandros Iosifidis
3DPC
23
14
0
31 May 2021
SSCAP: Self-supervised Co-occurrence Action Parsing for Unsupervised Temporal Action Segmentation
Zhe Wang
Hao Chen
Xinyu Li
Chunhui Liu
Yuanjun Xiong
Joseph Tighe
Charless C. Fowlkes
30
20
0
29 May 2021
DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
Wenhao Wu
Yuxiang Zhao
Yanwu Xu
Xiao Tan
Dongliang He
...
Jinxing Ye
Yingying Li
Mingde Yao
Zichao Dong
Yifeng Shi
AI4TS
30
27
0
25 May 2021
Temporal Action Proposal Generation with Transformers
Lining Wang
Haosen Yang
Wenhao Wu
H. Yao
Hujie Huang
ViT
38
27
0
25 May 2021
ST-HOI: A Spatial-Temporal Baseline for Human-Object Interaction Detection in Videos
Meng-Jiun Chiou
Chun-Yu Liao
Li-Wei Wang
Roger Zimmermann
Jiashi Feng
41
24
0
25 May 2021
FineAction: A Fine-Grained Video Dataset for Temporal Action Localization
Yi Liu
Limin Wang
Yali Wang
Xiao Ma
Yu Qiao
24
56
0
24 May 2021
Coarse to Fine Multi-Resolution Temporal Convolutional Network
Dipika Singhania
R. Rahaman
Angela Yao
AI4TS
16
55
0
23 May 2021
PLM: Partial Label Masking for Imbalanced Multi-label Classification
Kevin Duarte
Yogesh S Rawat
M. Shah
39
15
0
22 May 2021
Parallel Attention Network with Sequence Matching for Video Grounding
Hao Zhang
Aixin Sun
Wei Jing
Liangli Zhen
Qiufeng Wang
Rick Siow Mong Goh
18
40
0
18 May 2021
VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living
Srijan Das
Rui Dai
Di Yang
F. Brémond
ViT
43
67
0
17 May 2021
MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions
Yixuan Li
Lei Chen
Runyu He
Zhenzhi Wang
Gangshan Wu
Limin Wang
27
97
0
16 May 2021
MutualNet: Adaptive ConvNet via Mutual Learning from Different Model Configurations
Taojiannan Yang
Sijie Zhu
Matías Mendieta
Pu Wang
Ravikumar Balakrishnan
Minwoo Lee
T. Han
M. Shah
Chong Chen
3DH
OOD
28
23
0
14 May 2021
Representation Learning via Global Temporal Alignment and Cycle-Consistency
Isma Hadji
Konstantinos G. Derpanis
Allan D. Jepson
AI4TS
27
54
0
11 May 2021
Coupling Intent and Action for Pedestrian Crossing Behavior Prediction
Yu Yao
E. Atkins
Matthew Johnson-Roberson
Ram Vasudevan
Xiaoxiao Du
21
33
0
10 May 2021