Convolutional Two-Stream Network Fusion for Video Action Recognition

22 April 2016
Christoph Feichtenhofer
A. Pinz
Andrew Zisserman

Papers citing "Convolutional Two-Stream Network Fusion for Video Action Recognition"

50 / 853 papers shown
Title
Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning
Juncheng Li
Junlin Xie
Long Qian
Linchao Zhu
Siliang Tang
Fei Wu
Yi Yang
Yueting Zhuang
Xinze Wang
39
73
0
24 Mar 2022
No Pain, Big Gain: Classify Dynamic Point Cloud Sequences with Static Models by Fitting Feature-level Space-time Surfaces
Jianqi Zhong
Kaichen Zhou
Qingyong Hu
Bing Wang
Niki Trigoni
Andrew Markham
3DPC
31
21
0
21 Mar 2022
FAR: Fourier Aerial Video Recognition
D. Kothandaraman
Tianrui Guan
Xijun Wang
Sean Hu
Ming-Shun Lin
Tianyi Zhou
21
13
0
21 Mar 2022
DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition
Thanh-Dat Truong
Quoc-Huy Bui
C. Duong
Han-Seok Seo
Son Lam Phung
Xin Li
Khoa Luu
ViT
42
49
0
19 Mar 2022
ABN: Agent-Aware Boundary Networks for Temporal Action Proposal Generation
Khoa T. Vo
Kashu Yamazaki
Sang Truong
M. Tran
Akihiro Sugimoto
Ngan Le
EgoV
27
9
0
16 Mar 2022
Gate-Shift-Fuse for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
25
22
0
16 Mar 2022
End-to-End Semantic Video Transformer for Zero-Shot Action Recognition
Keval Doshi
Yasin Yılmaz
ViT
35
2
0
10 Mar 2022
Universal Prototype Transport for Zero-Shot Action Recognition and Localization
Pascal Mettes
19
5
0
08 Mar 2022
Behavior Recognition Based on the Integration of Multigranular Motion Features
Lizong Zhang
Yiming Wang
Bei Hui
Xiu Zhang
Sijuan Liu
Shuxin Feng
16
0
0
07 Mar 2022
Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos
Saghir Alfasly
Jian Lu
C. Xu
Yuru Zou
42
18
0
06 Mar 2022
Exploiting long-term temporal dynamics for video captioning
Yuyu Guo
Jingqiu Zhang
Lianli Gao
19
18
0
22 Feb 2022
HAKE: A Knowledge Engine Foundation for Human Activity Understanding
Yong-Lu Li
Xinpeng Liu
Xiaoqian Wu
Yizhuo Li
Zuoyu Qiu
Liang Xu
Yue Xu
Haoshu Fang
Cewu Lu
32
38
0
14 Feb 2022
A Coding Framework and Benchmark towards Compressed Video Understanding
Yuan Tian
Guo Lu
Yichao Yan
Guangtao Zhai
L. Chen
Zhiyong Gao
41
21
0
06 Feb 2022
Video Violence Recognition and Localization Using a Semi-Supervised Hard Attention Model
Hamid Reza Mohammadi
Ehsan Nazerfard
27
24
0
04 Feb 2022
Should I take a walk? Estimating Energy Expenditure from Video Data
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
16
4
0
01 Feb 2022
Riemannian Local Mechanism for SPD Neural Networks
Ziheng Chen
Tianyang Xu
Xiaojun Wu
Rui Wang
Zhiwu Huang
J. Kittler
19
16
0
25 Jan 2022
Temporal Sentence Grounding in Videos: A Survey and Future Directions
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
3DGS
36
38
0
20 Jan 2022
Hand-Object Interaction Reasoning
Jian Ma
Dima Damen
27
7
0
13 Jan 2022
OCSampler: Compressing Videos to One Clip with Single-step Sampling
Jintao Lin
Haodong Duan
Kai-xiang Chen
Dahua Lin
Limin Wang
42
24
0
12 Jan 2022
Motion-Focused Contrastive Learning of Video Representations
Rui Li
Yiheng Zhang
Zhaofan Qiu
Ting Yao
Dong Liu
Tao Mei
SSL
39
34
0
11 Jan 2022
Representing Videos as Discriminative Sub-graphs for Action Recognition
Dong Li
Zhaofan Qiu
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
44
26
0
11 Jan 2022
Boosting Video Representation Learning with Multi-Faceted Integration
Zhaofan Qiu
Ting Yao
Chong-Wah Ngo
Xiaoping Zhang
Dong Wu
Tao Mei
31
8
0
11 Jan 2022
Condensing a Sequence to One Informative Frame for Video Recognition
Zhaofan Qiu
Ting Yao
Y. Shu
Chong-Wah Ngo
Tao Mei
42
9
0
11 Jan 2022
Optimization Planning for 3D ConvNets
Zhaofan Qiu
Ting Yao
Chong-Wah Ngo
Tao Mei
3DPC
3DH
42
9
0
11 Jan 2022
TSA-Net: Tube Self-Attention Network for Action Quality Assessment
Shunli Wang
Dingkang Yang
Peng Zhai
Chixiao Chen
Lihua Zhang
ViT
32
63
0
11 Jan 2022
AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition
Yulin Wang
Yang Yue
Yuanze Lin
Haojun Jiang
Zihang Lai
V. Kulikov
Nikita Orlov
Humphrey Shi
Gao Huang
16
50
0
28 Dec 2021
3D Skeleton-based Few-shot Action Recognition with JEANIE is not so Naïve
Lei Wang
Jun Liu
Piotr Koniusz
42
20
0
23 Dec 2021
Precondition and Effect Reasoning for Action Recognition
Hongsang Yoo
Haopeng Li
Qiuhong Ke
Liangchen Liu
Rui Zhang
CML
49
4
0
19 Dec 2021
Adversarial Memory Networks for Action Prediction
Zhiqiang Tao
Yue Bai
Handong Zhao
Sheng Li
Yuanyuan Kong
Y. Fu
GAN
18
2
0
18 Dec 2021
Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition
Yinghao Xu
Fangyun Wei
Xiao Sun
Ceyuan Yang
Yujun Shen
Bo Dai
Bolei Zhou
Stephen Lin
VLM
33
52
0
17 Dec 2021
Spatio-Temporal CNN baseline method for the Sports Video Task of MediaEval 2021 benchmark
Pierre-Etienne Martin
14
7
0
16 Dec 2021
SVIP: Sequence VerIfication for Procedures in Videos
Yichen Qian
Weixin Luo
Dongze Lian
Xu Tang
P. Zhao
Shenghua Gao
ViT
29
17
0
13 Dec 2021
Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection
Jiaqi Tang
Zhaoyang Liu
Chao Qian
Wayne Wu
Limin Wang
17
17
0
09 Dec 2021
Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju
Tengda Han
Kunhao Zheng
Ya Zhang
Weidi Xie
VPVLM
VLM
33
364
0
08 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
75
679
0
02 Dec 2021
Video-Text Pre-training with Learned Regions
Rui Yan
Mike Zheng Shou
Yixiao Ge
Alex Jinpeng Wang
Xudong Lin
Guanyu Cai
Jinhui Tang
33
23
0
02 Dec 2021
Weakly-guided Self-supervised Pretraining for Temporal Activity Detection
Kumara Kahatapitiya
Zhou Ren
Haoxiang Li
Zhenyu Wu
Michael S. Ryoo
G. Hua
ViT
28
6
0
26 Nov 2021
Learning from Temporal Gradient for Semi-supervised Action Recognition
Junfei Xiao
Longlong Jing
Lin Zhang
Ju He
Qi She
Zongwei Zhou
Alan Yuille
Yingwei Li
12
51
0
25 Nov 2021
Sparse Fusion for Multimodal Transformers
Yi Ding
Alex Rich
Mason Wang
Noah Stier
M. Turk
P. Sen
Tobias Höllerer
ViT
27
7
0
23 Nov 2021
Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis
Zhaobo Qi
Shuhui Wang
Chi Su
Li Su
Weigang Zhang
Qingming Huang
27
10
0
23 Nov 2021
Self-Regulated Learning for Egocentric Video Activity Anticipation
Zhaobo Qi
Shuhui Wang
Chi Su
Li Su
Qingming Huang
Q. Tian
EgoV
47
52
0
23 Nov 2021
Efficient Video Transformers with Spatial-Temporal Token Selection
Junke Wang
Xitong Yang
Hengduo Li
Li Liu
Zuxuan Wu
Yu-Gang Jiang
ViT
21
63
0
23 Nov 2021
Co-segmentation Inspired Attention Module for Video-based Computer Vision Tasks
Arulkumar Subramaniam
Jayesh Vaidya
Muhammed Ameen
Athira M. Nambiar
Anurag Mittal
30
7
0
14 Nov 2021
Relational Self-Attention: What's Missing in Attention for Video Understanding
Manjin Kim
Heeseung Kwon
Chunyu Wang
Suha Kwak
Minsu Cho
ViT
27
28
0
02 Nov 2021
Multi-Task and Multi-Modal Learning for RGB Dynamic Gesture Recognition
Dinghao Fan
Hengjie Lu
Shugong Xu
Shan Cao
32
15
0
29 Oct 2021
ST-ABN: Visual Explanation Taking into Account Spatio-temporal Information for Video Recognition
Masahiro Mitsuhara
Tsubasa Hirakawa
Takayoshi Yamashita
H. Fujiyoshi
27
1
0
29 Oct 2021
Temporal-attentive Covariance Pooling Networks for Video Recognition
Zilin Gao
Qilong Wang
Bingbing Zhang
Q. Hu
P. Li
21
25
0
27 Oct 2021
A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction
Gamze Akyol
Sanem Sariel
E. Aksoy
GNN
DRL
BDL
41
2
0
25 Oct 2021
Multimodal Semi-Supervised Learning for 3D Objects
Zhimin Chen
Longlong Jing
Yang Liang
Yingli Tian
Bing Li
3DPC
18
28
0
22 Oct 2021
GTM: Gray Temporal Model for Video Recognition
Yanping Zhang
Yongxin Yu
33
0
0
20 Oct 2021