2008.01232
Cited By
Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition
3 August 2020
M. E. Kalfaoglu
Sinan Kalkan
A. Aydin Alatan
3DPC
Papers citing
"Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition"
50 / 57 papers shown
Title
MD-BERT: Action Recognition in Dark Videos via Dynamic Multi-Stream Fusion and Temporal Modeling
Sharana Dharshikgan Suresh Dass
H. Barua
Ganesh Krishnasamy
Raveendran Paramesran
Raphael C.-W. Phan
64
0
0
06 Feb 2025
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luis Vilaca
Yi Yu
Paula Vinan
75
0
0
24 Nov 2024
Enhancing Action Recognition by Leveraging the Hierarchical Structure of Actions and Textual Context
Manuel Benavent-Lledo
David Mulero-Pérez
David Ortiz-Perez
José García Rodríguez
Antonis Argyros
24
0
0
28 Oct 2024
MPT-PAR: Mix-Parameters Transformer for Panoramic Activity Recognition
Wenqing Gan
Yaoyu Li
Jian Li
Zhangang Lin
ViT
32
0
0
01 Aug 2024
Pose-guided multi-task video transformer for driver action recognition
Ricardo Pizarro
Roberto Valle
L. Bergasa
J. M. Buenaposada
Luis Baumela
ViT
40
0
0
18 Jul 2024
From CNNs to Transformers in Multimodal Human Action Recognition: A Survey
Muhammad Bilal Shaikh
Syed Mohammed Shamsul Islam
Douglas Chai
Naveed Akhtar
35
9
0
22 May 2024
Continuous Sign Language Recognition with Adapted Conformer via Unsupervised Pretraining
Neena Aloysius
M. Geetha
Prema Nedungadi
SLR
27
2
0
20 May 2024
TransNet: A Transfer Learning-Based Network for Human Action Recognition
Khaled Alomar
Xiaohao Cai
34
1
0
13 Sep 2023
Predicting Routine Object Usage for Proactive Robot Assistance
Maithili Patel
Aswin Prakash
Sonia Chernova
AI4TS
37
8
0
12 Sep 2023
IndGIC: Supervised Action Recognition under Low Illumination
Jing-Teng Zeng
35
1
0
29 Aug 2023
Actor-agnostic Multi-label Action Recognition with Multi-modal Query
Anindya Mondal
Sauradip Nag
J. Prada
Xiatian Zhu
Anjan Dutta
23
9
0
20 Jul 2023
UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning
Heqing Zou
Meng Shen
Chen Chen
Yuchen Hu
D. Rajan
Chng Eng Siong
SSL
40
15
0
16 May 2023
Physical Adversarial Attacks for Surveillance: A Survey
Kien Nguyen Thanh
Tharindu Fernando
Clinton Fookes
Sridha Sridharan
AAML
36
8
0
01 May 2023
Weakly Supervised Detection of Baby Cry
Weijun Tan
Qi Yao
Jingfeng Liu
18
1
0
19 Apr 2023
Multimodal Data Integration for Oncology in the Era of Deep Neural Networks: A Review
Asim Waqas
Aakash Tripathi
Ravichandran Ramachandran
Paul Stewart
Ghulam Rasool
AI4CE
37
31
0
11 Mar 2023
Capsules as viewpoint learners for human pose estimation
Nicola Garau
Nicola Conci
3DH
24
0
0
13 Feb 2023
Triple-stream Deep Metric Learning of Great Ape Behavioural Actions
Otto Brookes
Majid Mirmehdi
H. Kühl
T. Burghardt
25
14
0
06 Jan 2023
Transformers in Action Recognition: A Review on Temporal Modeling
Elham Shabaninia
Hossein Nezamabadi-pour
Fatemeh Shafizadegan
ViT
24
8
0
29 Dec 2022
Simultaneous Multiple Object Detection and Pose Estimation using 3D Model Infusion with Monocular Vision
Cong Li
Shijie Sun
Xiangyu Song
Huansheng Song
Naveed Akhtar
Ajmal Mian
3DPC
33
1
0
21 Nov 2022
A Unified Multimodal De- and Re-coupling Framework for RGB-D Motion Recognition
Benjia Zhou
Pichao Wang
Jun Wan
Yan-Ni Liang
Fan Wang
37
17
0
16 Nov 2022
Overlooked Video Classification in Weakly Supervised Video Anomaly Detection
Weijun Tan
Qi Yao
Jingfeng Liu
AI4TS
24
10
0
13 Oct 2022
Vision Transformers for Action Recognition: A Survey
Anwaar Ulhaq
Naveed Akhtar
Ganna Pogrebna
Ajmal Mian
ViT
19
44
0
13 Sep 2022
Robotic Detection of a Human-Comprehensible Gestural Language for Underwater Multi-Human-Robot Collaboration
Sadman Sakib Enan
Michael Fulton
Junaed Sattar
39
8
0
12 Jul 2022
Two-Stage COVID19 Classification Using BERT Features
Weijun Tan
Qi Yao
Jingfeng Liu
24
9
0
29 Jun 2022
Detection of Fights in Videos: A Comparison Study of Anomaly Detection and Action Recognition
Weijun Tan
Jingfeng Liu
21
8
0
23 May 2022
Tragedy Plus Time: Capturing Unintended Human Activities from Weakly-labeled Videos
Arnav Chakravarthy
Zhiyuan Fang
Yezhou Yang
35
2
0
28 Apr 2022
3D Convolutional Networks for Action Recognition: Application to Sport Gesture Recognition
Pierre-Etienne Martin
J. Benois-Pineau
Renaud Péteri
A. Zemmari
J. Morlier
27
5
0
13 Apr 2022
Going Deeper into Recognizing Actions in Dark Environments: A Comprehensive Benchmark Study
Yuecong Xu
Jianfei Yang
Haozhi Cao
Jianxiong Yin
Zhenghua Chen
Xiaoli Li
Zhengguo Li
Qiaoqiao Xu
43
2
0
19 Feb 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
22
103
0
16 Jan 2022
Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition
Benjia Zhou
Pichao Wang
Jun Wan
Yanyan Liang
Fan Wang
Du Zhang
Zhen Lei
Hao Li
Rong Jin
36
29
0
16 Dec 2021
Evaluating Transformers for Lightweight Action Recognition
Raivo Koot
Markus Hennerbichler
Haiping Lu
ViT
28
8
0
18 Nov 2021
Co-segmentation Inspired Attention Module for Video-based Computer Vision Tasks
Arulkumar Subramaniam
Jayesh Vaidya
Muhammed Ameen
Athira M. Nambiar
Anurag Mittal
27
7
0
14 Nov 2021
Sparse Adversarial Video Attacks with Spatial Transformations
Ronghui Mu
Wenjie Ruan
Leandro Soriano Marcolino
Q. Ni
AAML
30
18
0
10 Nov 2021
Unsupervised View-Invariant Human Posture Representation
Faegheh Sardari
Bjorn Ommer
Majid Mirmehdi
3DH
31
3
0
17 Sep 2021
Deep Learning for Fitness
N. Mahendran
3DH
21
4
0
03 Sep 2021
Multi-Modal Zero-Shot Sign Language Recognition
R. Rastgoo
Kourosh Kiani
Sergio Escalera
Mohammad Sabokrou
SLR
19
5
0
02 Sep 2021
LIGAR: Lightweight General-purpose Action Recognition
Evgeny Izutov
15
3
0
30 Aug 2021
ZS-SLR: Zero-Shot Sign Language Recognition from RGB-D Videos
R. Rastgoo
Kourosh Kiani
Sergio Escalera
SLR
24
10
0
23 Aug 2021
DECA: Deep viewpoint-Equivariant human pose estimation using Capsule Autoencoders
Nicola Garau
N. Bisagno
Piotr Bródka
Nicola Conci
16
27
0
19 Aug 2021
Boosting Salient Object Detection with Transformer-based Asymmetric Bilateral U-Net
Yu Qiu
Yun-Hai Liu
Le Zhang
Jing Xu
ViT
21
30
0
17 Aug 2021
Temporal Action Localization Using Gated Recurrent Units
Hassan Keshvari Khojasteh
Hoda Mohammadzade
H. Behroozi
21
3
0
07 Aug 2021
Federated Action Recognition on Heterogeneous Embedded Devices
Pranjali Jain
Shreyas Goenka
S. Bagchi
Biplab Banerjee
Somali Chaterji
FedML
45
7
0
18 Jul 2021
Training for temporal sparsity in deep neural networks, application in video processing
Amirreza Yousefzadeh
Manolis Sifalakis
26
3
0
15 Jul 2021
Delta Sampling R-BERT for limited data and low-light action recognition
Sanchit Hira
Ritwik Das
Abhinav Modi
D. Pakhomov
75
17
0
12 Jul 2021
Ultrasound Video Transformers for Cardiac Ejection Fraction Estimation
Hadrien Reynaud
Athanasios Vlontzos
Benjamin Hou
A. Beqiri
Paul Leeson
Bernhard Kainz
MedIm
ViT
33
53
0
02 Jul 2021
A 3D CNN Network with BERT For Automatic COVID-19 Diagnosis From CT-Scan Images
Weijun Tan
Jingfeng Liu
3DPC
MedIm
28
17
0
28 Jun 2021
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning
Hao Tan
Jie Lei
Thomas Wolf
Joey Tianyi Zhou
36
65
0
21 Jun 2021
What Makes Multi-modal Learning Better than Single (Provably)
Yu Huang
Chenzhuang Du
Zihui Xue
Xuanyao Chen
Hang Zhao
Longbo Huang
33
249
0
08 Jun 2021
Personalizing Pre-trained Models
Mina Khan
P. Srivatsa
Advait Rane
Shriram Chenniappa
A. Hazariwala
Pattie Maes
VLM
55
5
0
02 Jun 2021
An Image is Worth 16x16 Words, What is a Video Worth?
Gilad Sharir
Asaf Noy
Lihi Zelnik-Manor
ViT
24
120
0
25 Mar 2021