ResearchTrend.AI
SlowFast Networks for Video Recognition

10 December 2018
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
arXiv: 1812.03982

Papers citing "SlowFast Networks for Video Recognition"

50 / 610 papers shown
Faster Diffusion Action Segmentation
Shuai Wang
Shunli Wang
Mingcheng Li
Dingkang Yang
Haopeng Kuang
Ziyun Qian
Lihua Zhang
42
0
0
04 Aug 2024
MambaST: A Plug-and-Play Cross-Spectral Spatial-Temporal Fuser for Efficient Pedestrian Detection
Xiangbo Gao
A. Kanu-Asiegbu
Xiaoxiao Du
Mamba
38
0
0
02 Aug 2024
Learning Video Context as Interleaved Multimodal Sequences
S. Shao
Pengchuan Zhang
Y. Li
Xide Xia
A. Meso
Ziteng Gao
Jinheng Xie
N. Holliman
Mike Zheng Shou
49
5
0
31 Jul 2024
Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos
Dhruv Verma
Debaditya Roy
Basura Fernando
32
1
0
30 Jul 2024
Start from Video-Music Retrieval: An Inter-Intra Modal Loss for Cross Modal Retrieval
Zeyu Chen
Pengfei Zhang
Kai Ye
Wei Dong
Xin Feng
Yana Zhang
43
0
0
28 Jul 2024
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
Mingze Xu
Mingfei Gao
Zhe Gan
Hong-You Chen
Zhengfeng Lai
Haiming Gang
Kai Kang
Afshin Dehghan
62
49
0
22 Jul 2024
Self-Supervised Video Representation Learning in a Heuristic Decoupled Perspective
Zeen Song
Wenwen Qiang
Jianqi Zhang
Changwen Zheng
Wenwen Qiang
SSL
66
0
0
19 Jul 2024
ActionSwitch: Class-agnostic Detection of Simultaneous Actions in Streaming Videos
Hyolim Kang
Jeongseok Hyun
Joungbin An
Youngjae Yu
Seon Joo Kim
38
0
0
17 Jul 2024
SUMix: Mixup with Semantic and Uncertain Information
Huafeng Qin
Xin Jin
Hongyu Zhu
Hongchao Liao
M. El-Yacoubi
Xinbo Gao
UQCV
51
6
0
10 Jul 2024
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian
Shuangrui Ding
Dahua Lin
OCL
52
1
0
09 Jul 2024
MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices
Jianwen Jiang
Gaojie Lin
Zhengkun Rong
Chao Liang
Yongming Zhu
Jiaqi Yang
Tianyun Zhong
3DH
90
8
0
08 Jul 2024
MMAD: Multi-label Micro-Action Detection in Videos
Kun Li
Pengyu Liu
Pengyu Liu
Guoliang Chen
Zhiliang Wu
Hehe Fan
Meng Wang
47
7
0
07 Jul 2024
Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model
Lu Xu
Sijie Zhu
Chunyuan Li
Chia-Wen Kuo
Fan Chen
Xinyao Wang
Guang Chen
Dawei Du
Ye Yuan
Longyin Wen
44
4
0
15 Jun 2024
MeMSVD: Long-Range Temporal Structure Capturing Using Incremental SVD
Ioanna Ntinou
Enrique Sanchez
Georgios Tzimiropoulos
45
0
0
11 Jun 2024
Video-based Exercise Classification and Activated Muscle Group Prediction with Hybrid X3D-SlowFast Network
Manvik Pasula
Pramit Saha
29
0
0
10 Jun 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
Zeyue Tian
Zhaoyang Liu
Ruibin Yuan
Jiahao Pan
Xiaoqiang Huang
Xu Tan
Xu Tan
Qifeng Chen
Yu Guo
VGen
104
16
0
06 Jun 2024
Context-Enhanced Video Moment Retrieval with Large Language Models
Weijia Liu
Bo Miao
Jiuxin Cao
Xueling Zhu
Bo Liu
Mehwish Nasim
Ajmal Mian
50
2
0
21 May 2024
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Rong Gao
Xin Liu
Bohao Xing
Zitong Yu
Björn W. Schuller
Heikki Kälviäinen
57
3
0
21 May 2024
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding
Yingjie Zhai
Wenshuo Li
Yehui Tang
Xinghao Chen
Yunhe Wang
ViT
30
0
0
14 May 2024
Bidirectional Progressive Transformer for Interaction Intention Anticipation
Zichen Zhang
Hongcheng Luo
Wei Zhai
Yang Cao
Yu Kang
41
5
0
09 May 2024
Light-VQA+: A Video Quality Assessment Model for Exposure Correction with Vision-Language Guidance
Xunchu Zhou
Xiaohong Liu
Yunlong Dong
Tengchuan Kou
Yixuan Gao
Zicheng Zhang
Chunyi Li
Haoning Wu
Guangtao Zhai
41
3
0
06 May 2024
MVP-Shot: Multi-Velocity Progressive-Alignment Framework for Few-Shot Action Recognition
Hongyu Qu
Rui Yan
Xiangbo Shu
Haoliang Gao
Peng Huang
Guo-Sen Xie
61
4
0
03 May 2024
EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model
Deng Li
Xin Liu
Bohao Xing
Baiqiang Xia
Yuan Zong
Bihan Wen
Heikki Kälviäinen
42
3
0
01 May 2024
Unifying Global and Local Scene Entities Modelling for Precise Action Spotting
Kim Hoang Tran
Phuc Vuong Do
Ngoc Quoc Ly
Ngan Le
36
4
0
15 Apr 2024
Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection
Jin Yang
Ping Wei
Huan Li
Ziyang Ren
51
8
0
14 Apr 2024
Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation
Swati Jindal
Mohit Yadav
Roberto Manduchi
37
5
0
08 Apr 2024
Study of the effect of Sharpness on Blind Video Quality Assessment
Anantha Prabhu
David Pratap
Narayana Darapeni
R. AnweshP
28
0
0
06 Apr 2024
Koala: Key frame-conditioned long video-LLM
Reuben Tan
Ximeng Sun
Ping Hu
Jui-hsien Wang
Hanieh Deilamsalehy
Bryan A. Plummer
Bryan C. Russell
Kate Saenko
38
36
0
05 Apr 2024
Unleash the Potential of CLIP for Video Highlight Detection
D. Han
Seunghyeon Seo
Eunhwan Park
Seong-Uk Nam
Nojun Kwak
VLM
29
2
0
02 Apr 2024
A Closer Look at Spatial-Slice Features Learning for COVID-19 Detection
Chih-Chung Hsu
Chia-Ming Lee
Chiang Fan Yang
Yi-Shiuan Chou
Chih-Yu Jiang
Shen-Chieh Tai
Chin-Han Tsai
44
0
0
02 Apr 2024
$R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding
Ye Liu
Jixuan He
Wanhua Li
Junsik Kim
D. Wei
Hanspeter Pfister
Chang Wen Chen
46
13
0
31 Mar 2024
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Yash Jain
David M. Chan
Pranav Dheram
Aparna Khare
Olabanji Shonibare
Venkatesh Ravichandran
Shalini Ghosh
40
2
0
28 Mar 2024
Hierarchical Deep Learning for Intention Estimation of Teleoperation Manipulation in Assembly Tasks
Mingyu Cai
Karankumar Patel
Soshi Iba
Songpo Li
36
1
0
28 Mar 2024
PLOT-TAL -- Prompt Learning with Optimal Transport for Few-Shot Temporal Action Localization
Edward Fish
Jon Weinbren
Andrew Gilbert
49
1
0
27 Mar 2024
Edit3K: Universal Representation Learning for Video Editing Components
Xin Gu
Libo Zhang
Fan Chen
Longyin Wen
Yufei Wang
Tiejian Luo
Sijie Zhu
43
4
0
24 Mar 2024
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World
Yifei Huang
Guo Chen
Jilan Xu
Mingfang Zhang
Lijin Yang
...
Hongjie Zhang
Lu Dong
Yali Wang
Limin Wang
Yu Qiao
EgoV
66
38
0
24 Mar 2024
Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding
Jingjing Hu
Dan Guo
Kun Li
Zhan Si
Xun Yang
Xiaojun Chang
Meng Wang
61
3
0
21 Mar 2024
On the Utility of 3D Hand Poses for Action Recognition
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
40
5
0
14 Mar 2024
Density-Guided Label Smoothing for Temporal Localization of Driving Actions
Tunç Alkanat
Erkut Akdag
Egor Bondarev
Peter H. N. de With
38
4
0
11 Mar 2024
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao
N. B. Gundavarapu
Liangzhe Yuan
Hao Zhou
Shen Yan
...
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Ting Liu
Boqing Gong
VGen
45
29
0
20 Feb 2024
Collaboratively Self-supervised Video Representation Learning for Action Recognition
Jie Zhang
Zhifan Wan
Lanqing Hu
Stephen Lin
Shuzhe Wu
Shiguang Shan
TTA
67
1
0
15 Jan 2024
Efficient Bitrate Ladder Construction using Transfer Learning and Spatio-Temporal Features
A. Falahati
Mohammad Karim Safavi
Ardavan Elahi
Farhad Pakdaman
Moncef Gabbouj
AI4TS
32
1
0
06 Jan 2024
3DTINC: Time-Equivariant Non-Contrastive Learning for Predicting Disease Progression from Longitudinal OCTs
T. Emre
A. Chakravarty
Antoine Rivail
Dmitrii Lachinov
Oliver Leingang
...
S. Sivaprasad
Daniel Rueckert
A. Lotery
U. Schmidt-Erfurth
Hrvoje Bogunović
MedIm
29
3
0
28 Dec 2023
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
30
3
0
21 Dec 2023
SADA: Semantic adversarial unsupervised domain adaptation for Temporal Action Localization
David Pujol-Perich
Albert Clapés
Sergio Escalera
37
0
0
20 Dec 2023
LMDrive: Closed-Loop End-to-End Driving with Large Language Models
Hao Shao
Yuxuan Hu
Letian Wang
Steven L. Waslander
Yu Liu
Hongsheng Li
ELM
33
113
0
12 Dec 2023
From Detection to Action Recognition: An Edge-Based Pipeline for Robot Human Perception
Petros Toupas
Georgios Tsamis
Dimitrios Giakoumis
K. Votis
Dimitrios Tzovaras
32
0
0
06 Dec 2023
Action Scene Graphs for Long-Form Understanding of Egocentric Videos
Ivan Rodin
Antonino Furnari
Kyle Min
Subarna Tripathi
G. Farinella
EgoV
27
12
0
06 Dec 2023
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Arun V. Reddy
William Paul
Corban Rivera
Ketul Shah
Celso M. de Melo
Rama Chellappa
37
4
0
05 Dec 2023
Overcoming Label Noise for Source-free Unsupervised Video Domain Adaptation
A. Dasgupta
C. V. Jawahar
Karteek Alahari
TTA
VLM
24
10
0
30 Nov 2023