ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2004.04730
  4. Cited By
X3D: Expanding Architectures for Efficient Video Recognition

X3D: Expanding Architectures for Efficient Video Recognition

9 April 2020
Christoph Feichtenhofer
ArXivPDFHTML

Papers citing "X3D: Expanding Architectures for Efficient Video Recognition"

50 / 93 papers shown
Title
Perception Encoder: The best visual embeddings are not at the output of the network
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD
VOS
174
5
0
17 Apr 2025
CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition
CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition
Jongseo Lee
Joohyun Chang
Dongho Lee
Jinwoo Choi
144
0
0
30 Mar 2025
Action tube generation by person query matching for spatio-temporal action detection
Action tube generation by person query matching for spatio-temporal action detection
Kazuki Omi
Jion Oshima
Toru Tamaki
105
0
0
17 Mar 2025
AI-Based Thermal Video Analysis in Privacy-Preserving Healthcare: A Case Study on Detecting Time of Birth
AI-Based Thermal Video Analysis in Privacy-Preserving Healthcare: A Case Study on Detecting Time of Birth
Jorge García-Torres
Øyvind Meinich-Bache
Siren Rettedal
K. Engan
65
2
0
05 Feb 2025
Can masking background and object reduce static bias for zero-shot action recognition?
Can masking background and object reduce static bias for zero-shot action recognition?
Takumi Fukuzawa
Kensho Hara
Hirokatsu Kataoka
Toru Tamaki
87
1
0
22 Jan 2025
Human Activity Recognition in an Open World
Human Activity Recognition in an Open World
D. Prijatelj
Samuel Grieggs
Jin Huang
Dawei Du
Ameya Shringi
Christopher Funk
Adam Kaufman
Eric Robertson
Walter J. Scheirer University of Notre Dame
103
3
0
17 Jan 2025
Progress-Aware Video Frame Captioning
Progress-Aware Video Frame Captioning
Zihui Xue
Joungbin An
Xitong Yang
Kristen Grauman
156
1
0
03 Dec 2024
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Zhichao Zhang
Wei Sun
Xinyue Li
Yunhao Li
Qihang Ge
...
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guangtao Zhai
EGVM
145
1
0
25 Nov 2024
Principles of Visual Tokens for Efficient Video Understanding
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao
Gen Li
Shreyank N. Gowda
Robert B Fisher
Jonathan Huang
Anurag Arnab
Laura Sevilla-Lara
123
0
0
20 Nov 2024
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
Ahmed Abdelkawy
Asem A. Ali
Asem Ali
3DPC
58
0
0
10 Aug 2024
Out-of-Distribution Detection & Applications With Ablated Learned Temperature Energy
Out-of-Distribution Detection & Applications With Ablated Learned Temperature Energy
Will LeVine
Benjamin Pikus
Jacob Phillips
Berk Norman
Fernando Amat Gil
Sean Hendryx
OODD
119
1
0
22 Jan 2024
Collaboratively Self-supervised Video Representation Learning for Action Recognition
Collaboratively Self-supervised Video Representation Learning for Action Recognition
Jie Zhang
Zhifan Wan
Lanqing Hu
Stephen Lin
Shuzhe Wu
Shiguang Shan
TTA
89
1
0
15 Jan 2024
Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition
Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition
Tianlin Li
Yao Rong
Shiao Wang
Yuan Chen
Zhe Wu
Bowei Jiang
Yonghong Tian
Jin Tang
ViT
97
3
0
18 Dec 2023
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Arun V. Reddy
William Paul
Corban Rivera
Ketul Shah
Celso M. de Melo
Rama Chellappa
70
4
0
05 Dec 2023
Transformer-Based Model for Monocular Visual Odometry: A Video Understanding Approach
Transformer-Based Model for Monocular Visual Odometry: A Video Understanding Approach
André O. Françani
Marcos R. O. A. Máximo
52
8
0
10 May 2023
Grouped Spatial-Temporal Aggregation for Efficient Action Recognition
Grouped Spatial-Temporal Aggregation for Efficient Action Recognition
Chenxu Luo
Alan Yuille
149
150
0
28 Sep 2019
Multi-Agent Reinforcement Learning Based Frame Sampling for Effective
  Untrimmed Video Recognition
Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition
Wenhao Wu
Dongliang He
Xiao Tan
Shifeng Chen
Shilei Wen
42
128
0
31 Jul 2019
A Short Note on the Kinetics-700 Human Action Dataset
A Short Note on the Kinetics-700 Human Action Dataset
João Carreira
Eric Noland
Chloe Hillier
Andrew Zisserman
52
446
0
15 Jul 2019
FASTER Recurrent Networks for Efficient Video Classification
FASTER Recurrent Networks for Efficient Video Classification
Linchao Zhu
Laura Sevilla-Lara
Du Tran
Matt Feiszli
Yi Yang
Heng Wang
60
6
0
10 Jun 2019
Video Modeling with Correlation Networks
Video Modeling with Correlation Networks
Heng Wang
Du Tran
Lorenzo Torresani
Matt Feiszli
47
127
0
07 Jun 2019
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan
Quoc V. Le
3DV
MedIm
109
17,950
0
28 May 2019
Searching for MobileNetV3
Searching for MobileNetV3
Andrew G. Howard
Mark Sandler
Grace Chu
Liang-Chieh Chen
Bo Chen
...
Yukun Zhu
Ruoming Pang
Vijay Vasudevan
Quoc V. Le
Hartwig Adam
269
6,685
0
06 May 2019
STEP: Spatio-Temporal Progressive Learning for Video Action Detection
STEP: Spatio-Temporal Progressive Learning for Video Action Detection
Xitong Yang
Xiaodong Yang
Ming-Yuan Liu
Fanyi Xiao
L. Davis
Jan Kautz
47
138
0
19 Apr 2019
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural
  Networks with Octave Convolution
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution
Yunpeng Chen
Haoqi Fan
Bing Xu
Zhicheng Yan
Yannis Kalantidis
Marcus Rohrbach
Shuicheng Yan
Jiashi Feng
89
556
0
10 Apr 2019
SCSampler: Sampling Salient Clips from Video for Efficient Action
  Recognition
SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition
Bruno Korbar
Du Tran
Lorenzo Torresani
46
224
0
08 Apr 2019
Video Classification with Channel-Separated Convolutional Networks
Video Classification with Channel-Separated Convolutional Networks
Du Tran
Heng Wang
Lorenzo Torresani
Matt Feiszli
3DV
50
583
0
04 Apr 2019
Resource Efficient 3D Convolutional Neural Networks
Resource Efficient 3D Convolutional Neural Networks
Okan Kopuklu
Neslihan Köse
Ahmet Gunduz
Gerhard Rigoll
43
187
0
04 Apr 2019
Long-Term Feature Banks for Detailed Video Understanding
Long-Term Feature Banks for Detailed Video Understanding
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
151
479
0
12 Dec 2018
SlowFast Networks for Video Recognition
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
144
3,244
0
10 Dec 2018
A Structured Model For Action Detection
A Structured Model For Action Detection
Yubo Zhang
P. Tokmakov
M. Hebert
Cordelia Schmid
57
101
0
09 Dec 2018
Video Action Transformer Network
Video Action Transformer Network
Rohit Girdhar
João Carreira
Carl Doersch
Andrew Zisserman
ViT
117
706
0
06 Dec 2018
Timeception for Complex Action Recognition
Timeception for Complex Action Recognition
Noureldien Hussein
E. Gavves
A. Smeulders
96
213
0
04 Dec 2018
AdaFrame: Adaptive Frame Selection for Fast Video Recognition
AdaFrame: Adaptive Frame Selection for Fast Video Recognition
Zuxuan Wu
Caiming Xiong
Chih-Yao Ma
R. Socher
L. Davis
142
195
0
29 Nov 2018
TSM: Temporal Shift Module for Efficient Video Understanding
TSM: Temporal Shift Module for Efficient Video Understanding
Ji Lin
Chuang Gan
Song Han
78
1,677
0
20 Nov 2018
Representation Flow for Action Recognition
Representation Flow for Action Recognition
A. Piergiovanni
Michael S. Ryoo
69
147
0
02 Oct 2018
A Short Note about Kinetics-600
A Short Note about Kinetics-600
João Carreira
Eric Noland
Andras Banki-Horvath
Chloe Hillier
Andrew Zisserman
77
520
0
03 Aug 2018
MnasNet: Platform-Aware Neural Architecture Search for Mobile
MnasNet: Platform-Aware Neural Architecture Search for Mobile
Mingxing Tan
Bo Chen
Ruoming Pang
Vijay Vasudevan
Mark Sandler
Andrew G. Howard
Quoc V. Le
MQ
102
2,995
0
31 Jul 2018
Multi-Fiber Networks for Video Recognition
Multi-Fiber Networks for Video Recognition
Yunpeng Chen
Yannis Kalantidis
Jianshu Li
Shuicheng Yan
Jiashi Feng
CVBM
97
217
0
30 Jul 2018
Actor-Centric Relation Network
Actor-Centric Relation Network
Chen Sun
Abhinav Shrivastava
Carl Vondrick
Kevin Patrick Murphy
Rahul Sukthankar
Cordelia Schmid
78
220
0
28 Jul 2018
Motion Feature Network: Fixed Motion Filter for Action Recognition
Motion Feature Network: Fixed Motion Filter for Action Recognition
Myunggi Lee
Seungeui Lee
S. Son
Gyutae Park
Nojun Kwak
56
122
0
26 Jul 2018
Spatio-Temporal Channel Correlation Networks for Action Classification
Spatio-Temporal Channel Correlation Networks for Action Classification
Ali Diba
Mohsen Fayyaz
Vivek Sharma
M. M. Arzani
Rahman Yousefzadeh
Juergen Gall
Luc Van Gool
3DPC
56
181
0
19 Jun 2018
Massively Parallel Video Networks
Massively Parallel Video Networks
João Carreira
Viorica Patraucean
L. Mazaré
Andrew Zisserman
Simon Osindero
45
43
0
11 Jun 2018
Videos as Space-Time Region Graphs
Videos as Space-Time Region Graphs
Xinyu Wang
Abhinav Gupta
67
753
0
05 Jun 2018
ECO: Efficient Convolutional Network for Online Video Understanding
ECO: Efficient Convolutional Network for Online Video Understanding
Mohammadreza Zolfaghari
Kamaljeet Singh
Thomas Brox
170
498
0
24 Apr 2018
End-to-End Learning of Motion Representation for Video Understanding
End-to-End Learning of Motion Representation for Video Understanding
Lijie Fan
Wen-bing Huang
Chuang Gan
Stefano Ermon
Boqing Gong
Junzhou Huang
54
214
0
02 Apr 2018
MobileNetV2: Inverted Residuals and Linear Bottlenecks
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler
Andrew G. Howard
Menglong Zhu
A. Zhmoginov
Liang-Chieh Chen
148
19,124
0
13 Jan 2018
Compressed Video Action Recognition
Compressed Video Action Recognition
Chao-Yuan Wu
Manzil Zaheer
Hexiang Hu
R. Manmatha
Alex Smola
Philipp Krahenbuhl
125
325
0
02 Dec 2017
Super SloMo: High Quality Estimation of Multiple Intermediate Frames for
  Video Interpolation
Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation
Huaizu Jiang
Deqing Sun
Varun Jampani
Ming-Hsuan Yang
Erik Learned-Miller
Jan Kautz
95
785
0
30 Nov 2017
A Closer Look at Spatiotemporal Convolutions for Action Recognition
A Closer Look at Spatiotemporal Convolutions for Action Recognition
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
184
3,007
0
30 Nov 2017
Optical Flow Guided Feature: A Fast and Robust Motion Representation for
  Video Action Recognition
Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition
Shuyang Sun
Zhanghui Kuang
Wanli Ouyang
Lu Sheng
Wayne Zhang
66
296
0
29 Nov 2017
12
Next