Two-Stream Convolutional Networks for Action Recognition in Videos

9 June 2014
Karen Simonyan
Andrew Zisserman

Papers citing "Two-Stream Convolutional Networks for Action Recognition in Videos"

50 / 2,289 papers shown
Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting
Zhengqi Zhao
Xiaohu Huang
Hao Zhou
Kun Yao
Errui Ding
Jingdong Wang
Xinggang Wang
Wenyu Liu
Bin Feng
44
1
0
13 Jun 2024
Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
Chenyu Yang
Xizhou Zhu
Jinguo Zhu
Weijie Su
Junjie Wang
...
Lewei Lu
Bin Li
Jie Zhou
Yu Qiao
Jifeng Dai
VLM, CLIP
87
6
0
11 Jun 2024
Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
Yuanhao Zhai
Kevin Lin
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Chung-Ching Lin
David Doermann
Junsong Yuan
Lijuan Wang
VGen, DiffM
93
13
0
11 Jun 2024
ALGO: Object-Grounded Visual Commonsense Reasoning for Open-World Egocentric Action Recognition
Sanjoy Kundu
Shubham Trehan
Sathyanarayanan N. Aakur
LM&Ro, LRM
59
1
0
09 Jun 2024
Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives
Thong Nguyen
Yi Bin
Junbin Xiao
Leigang Qu
Yicong Li
Jay Zhangjie Wu
Cong-Duy Nguyen
See-Kiong Ng
Luu Anh Tuan
VLM
170
13
1
09 Jun 2024
DL-KDD: Dual-Light Knowledge Distillation for Action Recognition in the Dark
Chi-Jui Chang
Oscar Tai-Yuan Chen
Vincent S. Tseng
VLM
63
2
0
04 Jun 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
Khaled Alomar
Halil Ibrahim Aysel
Xiaohao Cai
MedIm, ViT
83
9
0
02 Jun 2024
Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space Model
Wenbing Li
Hang Zhou
Junqing Yu
Zikai Song
Wei Yang
Mamba
91
5
0
28 May 2024
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions
Rui Zhang
Shuailong Li
Junxiao Xue
Feng Lin
Qing Zhang
Xiao Ma
Xiaoran Yan
84
0
0
28 May 2024
MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities
Hao Dong
Yue Zhao
Eleni Chatzi
Olga Fink
OODD
85
18
0
27 May 2024
Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
Shuangpeng Han
Ziyu Wang
Mengmi Zhang
100
0
0
26 May 2024
Enhancing Feature Diversity Boosts Channel-Adaptive Vision Transformers
Chau Pham
Bryan A. Plummer
72
6
0
26 May 2024
Planted: a dataset for planted forest identification from multi-satellite time series
L. M. Pazos-Outón
Cristina Nader Vasconcelos
Anton Raichuk
Anurag Arnab
Dan Morris
Maxim Neumann
80
5
0
24 May 2024
ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning
Sucheng Ren
Hongru Zhu
Chen Wei
Yijiang Li
Alan Yuille
Cihang Xie
AI4TS, VGen, SSL
83
2
0
24 May 2024
From CNNs to Transformers in Multimodal Human Action Recognition: A Survey
Muhammad Bilal Shaikh
Syed Mohammed Shamsul Islam
Douglas Chai
Naveed Akhtar
106
10
0
22 May 2024
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Rong Gao
Xin Liu
Bohao Xing
Zitong Yu
Björn W. Schuller
Heikki Kälviäinen
153
3
0
21 May 2024
GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
SLR, ViT
80
5
0
18 May 2024
A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection
Matthew Korban
Peter Youngs
Scott T. Acton
ViT
75
7
0
13 May 2024
Deep video representation learning: a survey
Elham Ravanbakhsh
Yongqing Liang
J. Ramanujam
Xin Li
80
3
0
10 May 2024
Multi-Stream Keypoint Attention Network for Sign Language Recognition and Translation
Mo Guan
Yan Wang
Guangkun Ma
Jiarui Liu
Mingzu Sun
SLR
75
7
0
09 May 2024
A Survey on Backbones for Deep Video Action Recognition
Zixuan Tang
Youjun Zhao
Yuhang Wen
Mengyuan Liu
60
1
0
09 May 2024
MERIT: Multi-view evidential learning for reliable and interpretable liver fibrosis staging
Yuanye Liu
Zheyao Gao
Nannan Shi
Fuping Wu
Yuxin Shi
Qingchao Chen
Xiahai Zhuang
120
0
0
05 May 2024
Multi-view Action Recognition via Directed Gromov-Wasserstein Discrepancy
Hoang-Quan Nguyen
Thanh-Dat Truong
Khoa Luu
89
1
0
02 May 2024
Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition
Xunsong Li
Pengzhan Sun
Yangcen Liu
Lixin Duan
Wen Li
124
3
0
18 Apr 2024
Vision Augmentation Prediction Autoencoder with Attention Design (VAPAAD)
Yiqiao Yin
47
0
0
15 Apr 2024
A Survey on Multimodal Wearable Sensor-based Human Action Recognition
Jianyuan Ni
Hao Tang
Syed Tousiful Haque
Yan Yan
A. Ngu
122
9
0
14 Apr 2024
ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition
Otto Brookes
Majid Mirmehdi
H. Kühl
T. Burghardt
66
3
0
13 Apr 2024
Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation
Swati Jindal
Mohit Yadav
Roberto Manduchi
67
6
0
08 Apr 2024
Towards more realistic human motion prediction with attention to motion coordination
Pengxiang Ding
Jianqin Yin
76
16
0
04 Apr 2024
Language Model Guided Interpretable Video Action Reasoning
Ning Wang
Guangming Zhu
HS Li
Liang Zhang
Syed Afaq Ali Shah
Mohammed Bennamoun
82
3
0
02 Apr 2024
Dual DETRs for Multi-Label Temporal Action Detection
Yuhan Zhu
Guozhen Zhang
Jing Tan
Gangshan Wu
Limin Wang
113
12
0
31 Mar 2024
Hypergraph-based Multi-View Action Recognition using Event Cameras
Yue Gao
Jiaxuan Lu
Siqi Li
Yipeng Li
Shaoyi Du
116
13
0
28 Mar 2024
OmniVid: A Generative Framework for Universal Video Understanding
Junke Wang
Dongdong Chen
Chong Luo
Bo He
Lu Yuan
Zuxuan Wu
Yu-Gang Jiang
VLM, VGen
119
16
0
26 Mar 2024
Emotion Recognition from the perspective of Activity Recognition
Savinay Nagendra
Prapti Panigrahi
68
2
0
24 Mar 2024
Enhancing Video Transformers for Action Understanding with VLM-aided Training
Hui Lu
Hu Jian
Ronald Poppe
A. A. Salah
74
2
0
24 Mar 2024
Towards Two-Stream Foveation-based Active Vision Learning
Timur Ibrayev
Amitangshu Mukherjee
Sai Aparna Aketi
Kaushik Roy
68
2
0
24 Mar 2024
VidLA: Video-Language Alignment at Scale
Mamshad Nayeem Rizve
Fan Fei
Jayakrishnan Unnikrishnan
Son Tran
Benjamin Z. Yao
Belinda Zeng
Mubarak Shah
Trishul Chilimbi
VLM, AI4TS
90
4
0
21 Mar 2024
Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition
Filip Ilic
Henghui Zhao
Thomas Pock
Richard P. Wildes
PICV, AAML
68
3
0
19 Mar 2024
VideoBadminton: A Video Dataset for Badminton Action Recognition
Qi Li
Tzu-Chen Chiu
Hsiang-Wei Huang
Minmin Sun
Wei-Shinn Ku
44
1
0
19 Mar 2024
Multi-View Video-Based Learning: Leveraging Weak Labels for Frame-Level Perception
Vijay John
Yasutomo Kawanishi
71
0
0
18 Mar 2024
A Survey of IMU Based Cross-Modal Transfer Learning in Human Activity Recognition
Abhi Kamboj
Minh Do
90
4
0
17 Mar 2024
Audio-Visual Segmentation via Unlabeled Frame Exploitation
Jinxiang Liu
Yikun Liu
Fei Zhang
Chen Ju
Ya Zhang
Yanfeng Wang
98
13
0
17 Mar 2024
On the Utility of 3D Hand Poses for Action Recognition
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
77
6
0
14 Mar 2024
SLCF-Net: Sequential LiDAR-Camera Fusion for Semantic Scene Completion using a 3D Recurrent U-Net
Helin Cao
Sven Behnke
3DPC, 3DGS
63
4
0
13 Mar 2024
VideoMamba: State Space Model for Efficient Video Understanding
Kunchang Li
Xinhao Li
Yi Wang
Yinan He
Yali Wang
Limin Wang
Yu Qiao
Mamba
67
214
0
11 Mar 2024
Deep Learning Approaches for Human Action Recognition in Video Data
Yufei Xie
53
0
0
11 Mar 2024
Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition
Erkut Akdag
Zeqi Zhu
Egor Bondarev
Peter H. N. de With
ViT
80
5
0
11 Mar 2024
A spatiotemporal style transfer algorithm for dynamic visual stimulus generation
Antonino Greco
Markus Siegel
70
2
0
07 Mar 2024
Credibility-Aware Multi-Modal Fusion Using Probabilistic Circuits
Sahil Sidheekh
Pranuthi Tenali
Saurabh Mathur
Erik Blasch
Kristian Kersting
S. Natarajan
63
1
0
05 Mar 2024
Enhancing Long-Term Person Re-Identification Using Global, Local Body Part, and Head Streams
Duy Tran Thanh
Yeejin Lee
Byeongkeun Kang
111
2
0
05 Mar 2024