arXiv: 2210.12686
Holistic Interaction Transformer Network for Action Detection

23 October 2022
Gueter Josmy Faure
Min-Hung Chen
S. Lai

Papers citing "Holistic Interaction Transformer Network for Action Detection"

45 / 45 papers shown
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
Wanhua Li
Renping Zhou
Jiawei Zhou
Yingwei Song
Johannes Herter
Minghan Qin
Gao Huang
Hanspeter Pfister
3DGS
VLM
109
1
0
13 Mar 2025
JoVALE: Detecting Human Actions in Video Using Audiovisual and Language Contexts
Taein Son
Soo Won Seo
Jisong Kim
S. Lee
Jun Won Choi
VGen
111
0
0
18 Dec 2024
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition
Chao-Yuan Wu
Yanghao Li
K. Mangalam
Haoqi Fan
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
81
199
0
20 Jan 2022
Identity-aware Graph Memory Network for Action Detection
Jingcheng Ni
Jie Qin
Di Huang
73
9
0
26 Aug 2021
Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition
Yuxin Chen
Ziqi Zhang
Chunfeng Yuan
Bing Li
Ying Deng
Weiming Hu
59
583
0
26 Jul 2021
Towards Long-Form Video Understanding
Chao-Yuan Wu
Philipp Krahenbuhl
VLM
ViT
111
169
0
21 Jun 2021
MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions
Yixuan Li
Lei Chen
Runyu He
Zhenzhi Wang
Gangshan Wu
Limin Wang
73
98
0
16 May 2021
Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories
Xitong Yang
Haoqi Fan
Lorenzo Torresani
L. Davis
Heng Wang
VLM
54
21
0
02 Apr 2021
TubeR: Tubelet Transformer for Video Action Detection
Jiaojiao Zhao
Yanyi Zhang
Xinyu Li
Hao Chen
Bing Shuai
...
Yuanjun Xiong
Davide Modolo
I. Marsic
Cees G. M. Snoek
Joseph Tighe
ViT
56
73
0
02 Apr 2021
3D Human Pose Estimation with Spatial and Temporal Transformers
Ce Zheng
Sijie Zhu
Matías Mendieta
Taojiannan Yang
Chong Chen
Zhengming Ding
ViT
118
452
0
18 Mar 2021
ACDnet: An action detection network for real-time edge computing based on flow-guided feature approximation and memory aggregation
Yu Liu
Fan Yang
D. Ginhac
82
13
0
26 Feb 2021
Finding Action Tubes with a Sparse-to-Dense Framework
Yuxi Li
Weiyao Lin
Tao Wang
John See
Rui Qian
N. Xu
Limin Wang
Shugong Xu
ViT
136
17
0
30 Aug 2020
Context-Aware RCNN: A Baseline for Action Detection in Videos
Jianchao Wu
Zhanghui Kuang
Limin Wang
Wayne Zhang
Gangshan Wu
107
80
0
20 Jul 2020
Quo Vadis, Skeleton Action Recognition ?
Pranay Gupta
Anirudh Thatipelli
Aditya Aggarwal
Shubhanshu Maheshwari
Neel Trivedi
Sourav Das
Ravi Kiran Sarvadevabhatla
60
64
0
04 Jul 2020
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
Junting Pan
Siyu Chen
Zheng Shou
Yu Liu
Jing Shao
Hongsheng Li
3DPC
60
150
0
14 Jun 2020
Asynchronous Interaction Aggregation for Action Detection
Jiajun Tang
Jinchao Xia
Xinzhi Mu
Bo Pang
Cewu Lu
57
120
0
16 Apr 2020
X3D: Expanding Architectures for Efficient Video Recognition
Christoph Feichtenhofer
128
1,019
0
09 Apr 2020
Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition
Ziyu Liu
Hongwen Zhang
Zhenghao Chen
Zhiyong Wang
Wanli Ouyang
84
835
0
31 Mar 2020
Actions as Moving Points
Yixuan Li
Zixu Wang
Limin Wang
Gangshan Wu
106
106
0
14 Jan 2020
Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks
Joanna Materzynska
Tete Xiao
Roei Herzig
Huijuan Xu
Xiaolong Wang
Trevor Darrell
CoGe
51
176
0
20 Dec 2019
You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization
Okan Kopuklu
Xiangyu Wei
Gerhard Rigoll
74
144
0
15 Nov 2019
TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection
Lin Song
Shiwei Zhang
Gang Yu
Hongbin Sun
129
83
0
31 May 2019
Improving Action Localization by Progressive Cross-stream Cooperation
Rui Su
Wanli Ouyang
Luping Zhou
Dong Xu
32
23
0
28 May 2019
Collaborative Spatio-temporal Feature Learning for Video Action Recognition
Chong Li
Qiaoyong Zhong
Di Xie
Shiliang Pu
61
82
0
04 Mar 2019
Long-Term Feature Banks for Detailed Video Understanding
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
169
480
0
12 Dec 2018
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
164
3,273
0
10 Dec 2018
Video Action Transformer Network
Rohit Girdhar
João Carreira
Carl Doersch
Andrew Zisserman
ViT
126
708
0
06 Dec 2018
Actor-Centric Relation Network
Chen Sun
Abhinav Shrivastava
Carl Vondrick
Kevin Patrick Murphy
Rahul Sukthankar
Cordelia Schmid
97
221
0
28 Jul 2018
Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks
Zhaofan Qiu
Ting Yao
Tao Mei
84
1,662
0
28 Nov 2017
Temporal Relational Reasoning in Videos
Bolei Zhou
A. Andonian
Aude Oliva
Antonio Torralba
NAI
96
1,039
0
22 Nov 2017
Non-local Neural Networks
Xiaolong Wang
Ross B. Girshick
Abhinav Gupta
Kaiming He
OffRL
286
8,905
0
21 Nov 2017
Attend and Interact: Higher-Order Object Interactions for Video Understanding
Chih-Yao Ma
Asim Kadav
I. Melvin
Z. Kira
G. Al-Regib
H. Graf
52
145
0
16 Nov 2017
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
Chunhui Gu
Chen Sun
David A. Ross
Carl Vondrick
C. Pantofaru
...
G. Toderici
Susanna Ricco
Rahul Sukthankar
Cordelia Schmid
Jitendra Malik
VGen
101
1,030
0
23 May 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
229
8,015
0
22 May 2017
Action Tubelet Detector for Spatio-Temporal Action Localization
Vicky Kalogeiton
Philippe Weinzaepfel
V. Ferrari
Cordelia Schmid
66
325
0
04 May 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
350
27,181
0
20 Mar 2017
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
468
22,102
0
09 Dec 2016
Online Real-time Multiple Spatiotemporal Action Localisation and Prediction
Gurkirt Singh
Suman Saha
Michael Sapienza
Philip Torr
Fabio Cuzzolin
63
288
0
25 Nov 2016
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
509
10,322
0
16 Nov 2016
Convolutional Two-Stream Network Fusion for Video Action Recognition
Christoph Feichtenhofer
A. Pinz
Andrew Zisserman
160
2,611
0
22 Apr 2016
Human Action Recognition using Factorized Spatio-Temporal Convolutional Networks
Lin Sun
Kui Jia
Dit-Yan Yeung
Bertram E. Shi
74
532
0
02 Oct 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
499
62,270
0
04 Jun 2015
Fast R-CNN
Ross B. Girshick
ObjD
301
25,051
0
30 Apr 2015
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
413
43,638
0
01 May 2014
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
K. Soomro
Amir Zamir
M. Shah
CLIP
VGen
143
6,147
0
03 Dec 2012