1506.01929
Cited By
v1
v2 (latest)
Learning to track for spatio-temporal action localization
5 June 2015
Philippe Weinzaepfel
Zaïd Harchaoui
Cordelia Schmid
Papers citing
"Learning to track for spatio-temporal action localization"
15 / 15 papers shown
Title
Action tube generation by person query matching for spatio-temporal action detection
Kazuki Omi
Jion Oshima
Toru Tamaki
136
0
0
17 Mar 2025
A Novel Online Action Detection Framework from Untrimmed Video Streams
Da-Hye Yoon
Nam-Gyu Cho
Seong-Whan Lee
73
20
0
17 Mar 2020
Discovering Spatio-Temporal Action Tubes
Yuancheng Ye
Xiaodong Yang
Yingli Tian
65
14
0
29 Nov 2018
Action Tubelet Detector for Spatio-Temporal Action Localization
Vicky Kalogeiton
Philippe Weinzaepfel
V. Ferrari
Cordelia Schmid
68
325
0
04 May 2017
DAP3D-Net: Where, What and How Actions Occur in Videos?
Li Liu
Yi Zhou
Ling Shao
57
14
0
10 Feb 2016
A robust and efficient video representation for action recognition
Heng Wang
Dan Oneaţă
Jakob Verbeek
Cordelia Schmid
58
326
0
21 Apr 2015
What makes for effective detection proposals?
J. Hosang
Rodrigo Benenson
Piotr Dollár
Bernt Schiele
ObjD
110
735
0
17 Feb 2015
Learning Spatiotemporal Features with 3D Convolutional Networks
Du Tran
Lubomir D. Bourdev
Rob Fergus
Lorenzo Torresani
Manohar Paluri
3DPC
77
411
0
02 Dec 2014
Finding Action Tubes
Georgia Gkioxari
Jitendra Malik
77
599
0
21 Nov 2014
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Jeff Donahue
Lisa Anne Hendricks
Marcus Rohrbach
Subhashini Venugopalan
S. Guadarrama
Kate Saenko
Trevor Darrell
VLM
173
6,057
0
17 Nov 2014
Caffe: Convolutional Architecture for Fast Feature Embedding
Yangqing Jia
Evan Shelhamer
Jeff Donahue
Sergey Karayev
Jonathan Long
Ross B. Girshick
S. Guadarrama
Trevor Darrell
VLM
BDL
3DV
280
14,715
0
20 Jun 2014
Two-Stream Convolutional Networks for Action Recognition in Videos
Karen Simonyan
Andrew Zisserman
259
7,545
0
09 Jun 2014
Rich feature hierarchies for accurate object detection and semantic segmentation
Ross B. Girshick
Jeff Donahue
Trevor Darrell
Jitendra Malik
ObjD
291
26,223
0
11 Nov 2013
Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks
Alessandro Giusti
D. Ciresan
Jonathan Masci
L. Gambardella
Jürgen Schmidhuber
178
346
0
07 Feb 2013
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
K. Soomro
Amir Zamir
M. Shah
CLIP
VGen
160
6,164
0
03 Dec 2012