1808.03766
The ActivityNet Large-Scale Activity Recognition Challenge 2018 Summary
11 August 2018
Bernard Ghanem
Juan Carlos Niebles
Cees G. M. Snoek
Fabian Caba Heilbron
Humam Alwassel
Victor Escorcia
Ranjay Krishna
S. Buch
Cuong Duc Dao
Papers citing
"The ActivityNet Large-Scale Activity Recognition Challenge 2018 Summary"
27 / 27 papers shown
Title
ECO: Efficient Convolutional Network for Online Video Understanding
Mohammadreza Zolfaghari
Kamaljeet Singh
Thomas Brox
168
499
0
24 Apr 2018
Jointly Localizing and Describing Events for Dense Video Captioning
Yehao Li
Ting Yao
Yingwei Pan
Hongyang Chao
Tao Mei
41
170
0
23 Apr 2018
Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning
Jingwen Wang
Wenhao Jiang
Lin Ma
Wei Liu
Yong Xu
54
204
0
31 Mar 2018
Moments in Time Dataset: one million videos for event understanding
Mathew Monfort
A. Andonian
Bolei Zhou
K. Ramakrishnan
Sarah Adel Bargal
...
L. Brown
Quanfu Fan
Dan Gutfreund
Carl Vondrick
A. Oliva
67
543
0
09 Jan 2018
Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks
Zhaofan Qiu
Ting Yao
Tao Mei
63
1,655
0
28 Nov 2017
Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification
Xiang Long
Chuang Gan
Gerard de Melo
Jiajun Wu
Xiao Liu
Shilei Wen
48
209
0
27 Nov 2017
Appearance-and-Relation Networks for Video Classification
Limin Wang
Wei Li
Wen Li
Luc Van Gool
54
351
0
24 Nov 2017
Non-local Neural Networks
Xiaolong Wang
Ross B. Girshick
Abhinav Gupta
Kaiming He
OffRL
194
8,867
0
21 Nov 2017
Generating Video Descriptions with Topic Guidance
Shizhe Chen
Jia Chen
Qin Jin
54
21
0
31 Aug 2017
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
Chunhui Gu
Chen Sun
David A. Ross
Carl Vondrick
C. Pantofaru
...
G. Toderici
Susanna Ricco
Rahul Sukthankar
Cordelia Schmid
Jitendra Malik
VGen
82
1,021
0
23 May 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
194
7,961
0
22 May 2017
The Kinetics Human Action Video Dataset
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
...
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
193
3,771
0
19 May 2017
Temporal Segment Networks for Action Recognition in Videos
Limin Wang
Yuanjun Xiong
Zhe Wang
Yu Qiao
Dahua Lin
Xiaoou Tang
Luc Van Gool
ViT
81
807
0
08 May 2017
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
118
1,225
0
02 May 2017
ActionVLAD: Learning spatio-temporal aggregation for action classification
Rohit Girdhar
Deva Ramanan
Abhinav Gupta
Josef Sivic
Bryan C. Russell
AI4TS
56
451
0
10 Apr 2017
Self-critical Sequence Training for Image Captioning
Steven J. Rennie
E. Marcheret
Youssef Mroueh
Jerret Ross
Vaibhava Goel
96
1,880
0
02 Dec 2016
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
44
328
0
23 Nov 2016
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
415
10,281
0
16 Nov 2016
CNN Architectures for Large-Scale Audio Classification
Shawn Hershey
Sourish Chaudhuri
D. Ellis
J. Gemmeke
A. Jansen
...
Rif A. Saurous
Bryan Seybold
M. Slaney
Ron J. Weiss
K. Wilson
86
2,488
0
29 Sep 2016
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Christian Szegedy
Sergey Ioffe
Vincent Vanhoucke
Alexander A. Alemi
270
14,196
0
23 Feb 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
1.3K
192,638
0
10 Dec 2015
NetVLAD: CNN architecture for weakly supervised place recognition
Relja Arandjelović
Petr Gronát
Akihiko Torii
Tomas Pajdla
Josef Sivic
3DV
SSL
111
2,618
0
23 Nov 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
60
534
0
07 May 2015
Sequence to Sequence -- Video to Text
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
91
1,417
0
03 May 2015
Describing Videos by Exploiting Temporal Structure
L. Yao
Atousa Torabi
Kyunghyun Cho
Nicolas Ballas
C. Pal
Hugo Larochelle
Aaron Courville
118
1,063
0
27 Feb 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
277
10,034
0
10 Feb 2015
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Jeff Donahue
Lisa Anne Hendricks
Marcus Rohrbach
Subhashini Venugopalan
S. Guadarrama
Kate Saenko
Trevor Darrell
VLM
117
6,037
0
17 Nov 2014