1711.09577
Cited By
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
27 November 2017
Kensho Hara
Hirokatsu Kataoka
Y. Satoh
3DPC
Papers citing
"Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?"
50 / 282 papers shown
Title
BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation
Haiquan Wen
Yiwei He
Zhenglin Huang
Tianxiao Li
Zihan Yu
Xingru Huang
Lu Qi
Baoyuan Wu
Xuelong Li
Guangliang Cheng
VGen
9
0
0
19 May 2025
Multi-modal Collaborative Optimization and Expansion Network for Event-assisted Single-eye Expression Recognition
Runduo Han
Xiuping Liu
Shangxuan Yi
Yi Zhang
Hongchen Tan
6
0
0
17 May 2025
AI-Enabled Accurate Non-Invasive Assessment of Pulmonary Hypertension Progression via Multi-Modal Echocardiography
Jiewen Yang
Taoran Huang
Shangwei Ding
Xiaowei Xu
Qinhua Zhao
...
Bin Pu
Jiexuan Zheng
Caojin Zhang
Hongwen Fei
Xuelong Li
16
0
0
12 May 2025
Fast Adversarial Training with Weak-to-Strong Spatial-Temporal Consistency in the Frequency Domain on Videos
Songping Wang
Hanqing Liu
Yueming Lyu
Xiantao Hu
Ziwen He
Wei Wang
Caifeng Shan
Lei Wang
AAML
130
0
0
21 Apr 2025
TRIDENT: Tri-modal Real-time Intrusion Detection Engine for New Targets
Ildi Alla
Selma Yahia
Valeria Loscri
20
0
0
08 Apr 2025
MIDAS: Mixing Ambiguous Data with Soft Labels for Dynamic Facial Expression Recognition
Ryosuke Kawamura
Hideaki Hayashi
Noriko Takemura
Hajime Nagahara
CVBM
3DH
65
4
0
28 Feb 2025
MD-BERT: Action Recognition in Dark Videos via Dynamic Multi-Stream Fusion and Temporal Modeling
Sharana Dharshikgan Suresh Dass
H. Barua
Ganesh Krishnasamy
Raveendran Paramesran
Raphael C.-W. Phan
69
0
0
06 Feb 2025
BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference
Van Thien Nguyen
William Guicquero
Gilles Sicard
3DV
MQ
82
2
0
24 Jan 2025
Measuring Error Alignment for Decision-Making Systems
Binxia Xu
Antonis Bikakis
Daniel Onah
A. Vlachidis
Luke Dickens
41
0
0
03 Jan 2025
Diffusion Models in 3D Vision: A Survey
Zhen Wang
Dongyuan Li
Renhe Jiang
Tianyu He
Jiang Bian
Renhe Jiang
MedIm
70
4
0
07 Oct 2024
Deep Learning for Video Anomaly Detection: A Review
Peng Wu
Chengyu Pan
Yuting Yan
Guansong Pang
Peng Wang
Yanning Zhang
VLM
AI4TS
45
6
0
09 Sep 2024
Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts
Peng Wu
Xuerong Zhou
Guansong Pang
Zhiwei Yang
Qingsen Yan
Peng Wang
Yanning Zhang
33
9
0
12 Aug 2024
Causal Understanding For Video Question Answering
Bhanu Prakash Reddy Guda
Tanmay Kulkarni
Adithya Sampath
Swarnashree Mysore Sathyendra
CML
54
0
0
23 Jul 2024
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
VLM
80
3
0
20 Jul 2024
Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering
Zhaohe Liao
Jiangtong Li
Li Niu
Liqing Zhang
CoGe
39
3
0
03 Jul 2024
FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs
Haodong Chen
Haojian Huang
Junhao Dong
Mingzhe Zheng
Dian Shao
45
16
0
02 Jul 2024
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition
Yang Wang
Haiyang Mei
Qirui Bao
Ziqi Wei
Mike Zheng Shou
Haizhou Li
Bo Dong
Xin Yang
46
1
0
20 Jun 2024
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Rong Gao
Xin Liu
Bohao Xing
Zitong Yu
Björn W. Schuller
Heikki Kälviäinen
57
3
0
21 May 2024
ViViD: Video Virtual Try-on using Diffusion Models
Zixun Fang
Wei Zhai
Aimin Su
Hongliang Song
Kai Zhu
Mao Wang
Yu Chen
Zhiheng Liu
Yang Cao
Zheng-jun Zha
DiffM
VGen
50
7
0
20 May 2024
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding
Yingjie Zhai
Wenshuo Li
Yehui Tang
Xinghao Chen
Yunhe Wang
ViT
30
0
0
14 May 2024
Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba
Hongwei Ren
Yue Zhou
Jiadong Zhu
Haotian Fu
Yulong Huang
Xiaopeng Lin
Yuetong Fang
Fei Ma
Hao Yu
Bo-Xun Cheng
Mamba
43
9
0
09 May 2024
Unified Dynamic Scanpath Predictors Outperform Individually Trained Neural Models
Fares Abawi
Di Fu
Stefan Wermter
38
0
0
05 May 2024
An Animation-based Augmentation Approach for Action Recognition from Discontinuous Video
Xingyu Song
Zhan Li
Shi Chen
Xin-Qiang Cai
K. Demachi
28
2
0
10 Apr 2024
A self-attention model for robust rigid slice-to-volume registration of functional MRI
Samah Khawaled
Simon K. Warfield
Moti Freiman
48
1
0
06 Apr 2024
iMD4GC: Incomplete Multimodal Data Integration to Advance Precise Treatment Response Prediction and Survival Analysis for Gastric Cancer
Fengtao Zhou
Ying Xu
Yanfen Cui
Shenyang Zhang
Yun Zhu
...
Louis Ho Shing Lau
Chu Han
Dafu Zhang
Zhenhui Li
Hao Chen
30
1
0
01 Apr 2024
CMViM: Contrastive Masked Vim Autoencoder for 3D Multi-modal Representation Learning for AD classification
Guangqian Yang
Kangrui Du
Zhihan Yang
Ye Du
Yongping Zheng
Shujun Wang
45
16
0
25 Mar 2024
Ensembling and Test Augmentation for Covid-19 Detection and Covid-19 Domain Adaptation from 3D CT-Scans
F. Bougourzi
Féryal Windal Moulaï
H. Benhabiles
Fadi Dornaika
Abdelmalik Taleb-Ahmed
3DPC
49
3
0
17 Mar 2024
DF4LCZ: A SAM-Empowered Data Fusion Framework for Scene-Level Local Climate Zone Classification
Qianqian Wu
Xianping Ma
Jialu Sui
Man-On Pun
34
4
0
14 Mar 2024
Density-Guided Label Smoothing for Temporal Localization of Driving Actions
Tunç Alkanat
Erkut Akdag
Egor Bondarev
Peter H. N. de With
38
4
0
11 Mar 2024
DISCOVER: 2-D Multiview Summarization of Optical Coherence Tomography Angiography for Automatic Diabetic Retinopathy Diagnosis
Mostafa El Habib Daho
Yi-Hsuan Li
Rachid Zeghlache
Hugo Le Boité
Pierre Deman
...
A. Couturier
R. Tadayoni
Pierre-Henri Conze
M. Lamard
G. Quellec
46
7
0
10 Jan 2024
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
30
3
0
21 Dec 2023
Semi-supervised Active Learning for Video Action Detection
Aayush Singh
A. J. Rana
Akash Kumar
Shruti Vyas
Yogesh S Rawat
36
7
0
12 Dec 2023
Overcoming Label Noise for Source-free Unsupervised Video Domain Adaptation
A. Dasgupta
C. V. Jawahar
Karteek Alahari
TTA
VLM
24
10
0
30 Nov 2023
Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook
Ming Jin
Qingsong Wen
Keli Zhang
Chaoli Zhang
Siqiao Xue
...
Shirui Pan
Vincent S. Tseng
Yu Zheng
Lei Chen
Hui Xiong
AI4TS
SyDa
40
118
0
16 Oct 2023
Metadata-Conditioned Generative Models to Synthesize Anatomically-Plausible 3D Brain MRIs
Wei Peng
Tomas Bosschieter
J. Ouyang
Robert Paul
Ehsan Adeli
Qingyu Zhao
K. Pohl
MedIm
38
9
0
07 Oct 2023
Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation
Suhwan Cho
Minhyeok Lee
Jungho Lee
Myeongah Cho
Seungwook Park
Jaeyeob Kim
Hyunsung Jang
Sangyoun Lee
VOS
68
2
0
26 Sep 2023
View while Moving: Efficient Video Recognition in Long-untrimmed Videos
Ye Tian
Meng Yang
Lanshan Zhang
Zhizhen Zhang
Yang Liu
Xiao-Zhu Xie
Xirong Que
Wendong Wang
24
7
0
09 Aug 2023
Long-Distance Gesture Recognition using Dynamic Neural Networks
Shubhang Bhatnagar
S. Gopal
Narendra Ahuja
Liu Ren
34
3
0
09 Aug 2023
Atrial Septal Defect Detection in Children Based on Ultrasound Video Using Multiple Instances Learning
Yiman Liu
Qingming Huang
Xiaoxiang Han
Tongtong Liang
Zhi-fang Zhang
...
Angelos Stefanidis
Jionglong Su
Jiangang Chen
Qingli Li
Yuqi Zhang
23
7
0
06 Jun 2023
ViDaS Video Depth-aware Saliency Network
Ioanna Diamanti
A. Tsiami
Petros Koutras
Petros Maragos
MDE
40
0
0
19 May 2023
Helping Visually Impaired People Take Better Quality Pictures
Maniratnam Mandal
Deepti Ghadiyaram
Danna Gurari
A. Bovik
13
3
0
14 May 2023
Learning Summary-Worthy Visual Representation for Abstractive Summarization in Video
Zenan Xu
Xiaojun Meng
Yasheng Wang
Qinliang Su
Zexuan Qiu
Xin Jiang
Qun Liu
33
3
0
08 May 2023
Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering
Hung-Ting Su
Yulei Niu
Xudong Lin
Winston H. Hsu
Shih-Fu Chang
VGen
ELM
29
6
0
07 Apr 2023
Black Box Few-Shot Adaptation for Vision-Language models
Yassine Ouali
Adrian Bulat
Brais Martínez
Georgios Tzimiropoulos
VLM
39
31
0
04 Apr 2023
MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos
Zicheng Zhang
Wei Wu
Wei Sun
Dangyang Tu
Wei Lu
Xiongkuo Min
Ying-Cong Chen
Guangtao Zhai
58
40
0
27 Mar 2023
Text with Knowledge Graph Augmented Transformer for Video Captioning
Xin Gu
G. Chen
Yufei Wang
Libo Zhang
Tiejian Luo
Longyin Wen
32
47
0
22 Mar 2023
Machine Learning for Brain Disorders: Transformers and Visual Transformers
Robin Courant
Maika Edberg
Nicolas Dufour
Vicky Kalogeiton
MedIm
ViT
40
1
0
21 Mar 2023
Evaluating the Fairness of Deep Learning Uncertainty Estimates in Medical Image Analysis
Raghav Mehta
Changjian Shui
Tal Arbel
24
12
0
06 Mar 2023
Maximizing Spatio-Temporal Entropy of Deep 3D CNNs for Efficient Video Recognition
Junyan Wang
Zhenhong Sun
Yichen Qian
Dong Gong
Xiuyu Sun
Ming Lin
Maurice Pagnucco
Yang Song
3DPC
20
11
0
05 Mar 2023
Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer
N. H. Phong
B. Ribeiro
29
15
0
17 Feb 2023