1711.10305
Cited By
Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks
28 November 2017
Zhaofan Qiu
Ting Yao
Tao Mei
Papers citing
"Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks"
50 / 263 papers shown
Title
Robust Dynamic Facial Expression Recognition
Feng Liu
Hanyang Wang
Siyuan Shen
50
1
0
22 Feb 2025
SurgPLAN++: Universal Surgical Phase Localization Network for Online and Offline Inference
Zhen Chen
Xingjian Luo
Jinlin Wu
Long Bai
Zhen Lei
Hongliang Ren
Sebastien Ourselin
Hongbin Liu
66
0
0
17 Feb 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
45
0
0
11 Feb 2025
MD-BERT: Action Recognition in Dark Videos via Dynamic Multi-Stream Fusion and Temporal Modeling
Sharana Dharshikgan Suresh Dass
H. Barua
Ganesh Krishnasamy
Raveendran Paramesran
Raphael C.-W. Phan
69
0
0
06 Feb 2025
BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference
Van Thien Nguyen
William Guicquero
Gilles Sicard
3DV
MQ
82
2
0
24 Jan 2025
Multimodal 3D Brain Tumor Segmentation with Adversarial Training and Conditional Random Field
Lan Jiang
Yuchao Zheng
Miao Yu
Haiqing Zhang
Fatemah Aladwani
Alessandro Perelli
MedIm
71
0
0
21 Nov 2024
GMFL-Net: A Global Multi-geometric Feature Learning Network for Repetitive Action Counting
Jun Li
Jinying Wu
Qiming Li
Feifei Guo
49
0
0
31 Aug 2024
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
Ahmed Abdelkawy
Asem A. Ali
Asem Ali
3DPC
31
0
0
10 Aug 2024
An Empirical Comparison of Video Frame Sampling Methods for Multi-Modal RAG Retrieval
Mahesh Kandhare
Thibault Gisselbrecht
35
5
0
22 Jul 2024
Self-Supervised Representation Learning with Spatial-Temporal Consistency for Sign Language Recognition
Weichao Zhao
Wengang Zhou
Hezhen Hu
Min Wang
Houqiang Li
SLR
45
2
0
15 Jun 2024
OUS: Scene-Guided Dynamic Facial Expression Recognition
Xinji Mai
Haoran Wang
Zeng Tao
Junxiong Lin
Shaoqi Yan
...
Jing Liu
Jiawen Yu
Xuan Tong
Yating Li
Wenqiang Zhang
36
3
0
29 May 2024
ToonCrafter: Generative Cartoon Interpolation
Jinbo Xing
Hanyuan Liu
Menghan Xia
Yong Zhang
Xintao Wang
Ying Shan
Tien-Tsin Wong
62
28
0
28 May 2024
A Lightweight Spatiotemporal Network for Online Eye Tracking with Event Camera
Yan Ru Pei
Sasskia Brüers
Sébastien Crouzet
Douglas McLelland
Olivier Coenen
33
8
0
13 Apr 2024
Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition
Xuzheng Yu
Chen Jiang
Wei Zhang
Tian Gan
Linlin Chao
Jianan Zhao
Yuan Cheng
Qingpei Guo
Wei Chu
28
0
0
09 Jan 2024
Large-scale Long-tailed Disease Diagnosis on Radiology Images
Qiaoyu Zheng
Weike Zhao
Chaoyi Wu
Xiaoman Zhang
Lisong Dai
Hengyu Guan
Yuehua Li
Ya Zhang
Yanfeng Wang
Weidi Xie
LM&MA
MedIm
40
5
0
26 Dec 2023
ConFormer: A Novel Collection of Deep Learning Models to Assist Cardiologists in the Assessment of Cardiac Function
Ethan Thomas
Salman Aslam
MedIm
34
0
0
13 Dec 2023
Boundary Discretization and Reliable Classification Network for Temporal Action Detection
Zhenying Fang
Jun Yu
Richang Hong
28
0
0
10 Oct 2023
TransNet: A Transfer Learning-Based Network for Human Action Recognition
Khaled Alomar
Xiaohao Cai
43
1
0
13 Sep 2023
UnLoc: A Unified Framework for Video Localization Tasks
Shengjia Yan
Xuehan Xiong
Arsha Nagrani
Anurag Arnab
Zhonghao Wang
Weina Ge
David A. Ross
Cordelia Schmid
36
53
0
21 Aug 2023
Histogram-guided Video Colorization Structure with Spatial-Temporal Connection
Zheyuan Liu
Pan Mu
Hanning Xu
Cong Bai
24
0
0
09 Aug 2023
View while Moving: Efficient Video Recognition in Long-untrimmed Videos
Ye Tian
Meng Yang
Lanshan Zhang
Zhizhen Zhang
Yang Liu
Xiao-Zhu Xie
Xirong Que
Wendong Wang
24
7
0
09 Aug 2023
What Can Simple Arithmetic Operations Do for Temporal Modeling?
Wenhao Wu
Yuxin Song
Zhun Sun
Jingdong Wang
Chang Xu
Wanli Ouyang
42
8
0
18 Jul 2023
SwiFT: Swin 4D fMRI Transformer
P. Y. Kim
Junbeom Kwon
Sunghwan Joo
Sang-Peel Bae
Donggyu Lee
Yoonho Jung
Shinjae Yoo
Jiook Cha
Taesup Moon
MedIm
35
21
0
12 Jul 2023
VideoComposer: Compositional Video Synthesis with Motion Controllability
Xiang Wang
Hangjie Yuan
Shiwei Zhang
Dayou Chen
Jiuniu Wang
Yingya Zhang
Yujun Shen
Deli Zhao
Jingren Zhou
VGen
DiffM
33
319
0
03 Jun 2023
GoferBot: A Visual Guided Human-Robot Collaborative Assembly System
Zheyu Zhuang
Yizhak Ben-Shabat
Jiahao Zhang
Stephen Gould
Robert E. Mahony
40
6
0
18 Apr 2023
Zoom-VQA: Patches, Frames and Clips Integration for Video Quality Assessment
Kai Zhao
Kun Yuan
Ming-Ting Sun
Xingsen Wen
21
20
0
13 Apr 2023
Focalized Contrastive View-invariant Learning for Self-supervised Skeleton-based Action Recognition
Qianhui Men
Edmond S. L. Ho
Hubert P. H. Shum
Howard Leung
SSL
37
19
0
03 Apr 2023
DOAD: Decoupled One Stage Action Detection Network
Shuning Chang
Pichao Wang
Fan Wang
Jiashi Feng
Mike Zheng Shou
26
4
0
01 Apr 2023
TemporalMaxer: Maximize Temporal Context with only Max Pooling for Temporal Action Localization
Tuan N. Tang
Kwonyoung Kim
Kwanghoon Sohn
29
29
0
16 Mar 2023
VE-KWS: Visual Modality Enhanced End-to-End Keyword Spotting
Aoting Zhang
He Wang
Pengcheng Guo
Yihui Fu
Linfu Xie
Yingying Gao
Shilei Zhang
Junlan Feng
21
4
0
27 Feb 2023
LIT-Former: Linking In-plane and Through-plane Transformers for Simultaneous CT Image Denoising and Deblurring
Zhihao Chen
Chuang Niu
Qi Gao
Ge Wang
Hongming Shan
MedIm
ViT
3DV
46
20
0
21 Feb 2023
Toward Extremely Lightweight Distracted Driver Recognition With Distillation-Based Neural Architecture Search and Knowledge Transfer
Dichao Liu
T. Yamasaki
Yu Wang
K. Mase
Jien Kato
30
27
0
09 Feb 2023
Exploiting Optical Flow Guidance for Transformer-Based Video Inpainting
Kaiwen Zhang
Jialun Peng
Jingjing Fu
Dong Liu
ViT
27
8
0
24 Jan 2023
Deep Diversity-Enhanced Feature Representation of Hyperspectral Images
Jinhui Hou
Zhiyu Zhu
Junhui Hou
Hui Liu
Huanqiang Zeng
Deyu Meng
18
6
0
15 Jan 2023
StepNet: Spatial-temporal Part-aware Network for Isolated Sign Language Recognition
Xi Shen
Zhedong Zheng
Yi Yang
SLR
35
13
0
25 Dec 2022
Solve the Puzzle of Instance Segmentation in Videos: A Weakly Supervised Framework with Spatio-Temporal Collaboration
Liqi Yan
Qifan Wang
Siqi Ma
Jingang Wang
Changbin (Brad) Yu
VOS
32
38
0
15 Dec 2022
Weakly Supervised Semantic Segmentation for Large-Scale Point Cloud
Yachao Zhang
Zhonghao Li
Yuan Xie
Yanyun Qu
Cuihua Li
Tao Mei
3DPC
24
94
0
09 Dec 2022
Multimodal Vision Transformers with Forced Attention for Behavior Analysis
Tanay Agrawal
Michal Balazia
Philippe Muller
François Brémond
ViT
28
9
0
07 Dec 2022
DroneAttention: Sparse Weighted Temporal Attention for Drone-Camera Based Activity Recognition
Santosh Kumar Yadav
Achleshwar Luthra
Esha Pahwa
K. Tiwari
Heena Rathore
Hari Mohan Pandey
Peter Corcoran
36
12
0
07 Dec 2022
Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization
Chen Zhao
Shuming Liu
K. Mangalam
Guohao Li
40
17
0
25 Nov 2022
Video Test-Time Adaptation for Action Recognition
Wei Lin
M. Jehanzeb Mirza
Mateusz Koziński
Horst Possegger
Hilde Kuehne
Horst Bischof
TTA
47
31
0
24 Nov 2022
Dynamic Appearance: A Video Representation for Action Recognition with Joint Training
Guoxi Huang
A. Bors
27
1
0
23 Nov 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
Limin Wang
Yu Qiao
ViT
30
107
0
17 Nov 2022
Exploring State Change Capture of Heterogeneous Backbones @ Ego4D Hands and Objects Challenge 2022
Yin-Dong Zheng
Guo Chen
Jiahao Wang
Tong Lu
Liming Wang
45
0
0
16 Nov 2022
Dynamic Temporal Filtering in Video Models
Fuchen Long
Zhaofan Qiu
Yingwei Pan
Ting Yao
Chong-Wah Ngo
Tao Mei
AI4TS
32
17
0
15 Nov 2022
MARLIN: Masked Autoencoder for facial video Representation LearnINg
Zhixi Cai
Shreya Ghosh
Kalin Stefanov
Abhinav Dhall
Jianfei Cai
Hamid Rezatofighi
Reza Haffari
Munawar Hayat
ViT
CVBM
27
60
0
12 Nov 2022
Eat-Radar: Continuous Fine-Grained Intake Gesture Detection Using FMCW Radar and 3D Temporal Convolutional Network with Attention
C. Wang
T. S. Kumar
W. de Raedt
Guido Camps
Hans Hallez
Bart Vanrumste
24
12
0
08 Nov 2022
Holistic Interaction Transformer Network for Action Detection
Gueter Josmy Faure
Min-Hung Chen
S. Lai
33
37
0
23 Oct 2022
Semantic Video Moments Retrieval at Scale: A New Task and a Baseline
Na Li
26
0
0
15 Oct 2022
S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces
Eric N. D. Nguyen
Karan Goel
Albert Gu
Gordon W. Downs
Preey Shah
Tri Dao
S. Baccus
Christopher Ré
VLM
22
39
0
12 Oct 2022