1803.07950
End-to-End Video Captioning with Multitask Reinforcement Learning
21 March 2018
Lijun Li
Boqing Gong
Papers citing
"End-to-End Video Captioning with Multitask Reinforcement Learning"
14 / 14 papers shown
Title
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
202
7,961
0
22 May 2017
Hierarchical Boundary-Aware Neural Encoder for Video Captioning
Lorenzo Baraldi
C. Grana
Rita Cucchiara
44
191
0
28 Nov 2016
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
50
329
0
23 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
70
621
0
05 Nov 2016
Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation
Rakshith Shetty
Jorma T. Laaksonen
36
94
0
17 Aug 2016
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data
Lisa Anne Hendricks
Subhashini Venugopalan
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
Trevor Darrell
CoGe
39
284
0
17 Nov 2015
Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning
Pingbo Pan
Zhongwen Xu
Yi Yang
Fei Wu
Yueting Zhuang
38
385
0
11 Nov 2015
Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
Haonan Yu
Jiang Wang
Zhiheng Huang
Yi Yang
Wei Xu
75
560
0
26 Oct 2015
Listen, Attend and Spell
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
131
2,261
0
05 Aug 2015
Microsoft COCO Captions: Data Collection and Evaluation Server
Xinlei Chen
Hao Fang
Tsung-Yi Lin
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollár
C. L. Zitnick
153
2,461
0
01 Apr 2015
Describing Videos by Exploiting Temporal Structure
L. Yao
Atousa Torabi
Kyunghyun Cho
Nicolas Ballas
C. Pal
Hugo Larochelle
Aaron Courville
128
1,063
0
27 Feb 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
286
10,034
0
10 Feb 2015
Translating Videos to Natural Language Using Deep Recurrent Neural Networks
Subhashini Venugopalan
Huijuan Xu
Jeff Donahue
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
91
951
0
15 Dec 2014
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
227
4,451
0
20 Nov 2014