1504.00325
Microsoft COCO Captions: Data Collection and Evaluation Server
1 April 2015
Xinlei Chen
Hao Fang
Nayeon Lee
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollar
C. L. Zitnick
Papers citing "Microsoft COCO Captions: Data Collection and Evaluation Server"
50 / 1,391 papers shown
NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems
Ozan Caglayan
Mercedes García-Martínez
Adrien Bardet
Walid Aransa
Fethi Bougares
Loïc Barrault
27
65
0
01 Jun 2017
Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols
Serhii Havrylov
Ivan Titov
LLMAG
41
286
0
31 May 2017
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
15
2,865
0
26 May 2017
Imagination improves Multimodal Translation
Desmond Elliott
Ákos Kádár
29
136
0
11 May 2017
STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset
Yuya Yoshikawa
Yutaro Shigeto
A. Takeuchi
3DV
21
118
0
02 May 2017
Learning to Ask: Neural Question Generation for Reading Comprehension
Xinya Du
Junru Shao
Claire Cardie
3DV
34
658
0
29 Apr 2017
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning
Dipendra Kumar Misra
John Langford
Yoav Artzi
21
247
0
28 Apr 2017
Spatio-temporal Person Retrieval via Natural Language Queries
Masataka Yamaguchi
Kuniaki Saito
Yoshitaka Ushiku
Tatsuya Harada
19
57
0
26 Apr 2017
Multi-Task Video Captioning with Video and Entailment Generation
Ramakanth Pasunuru
Joey Tianyi Zhou
33
116
0
24 Apr 2017
Paying Attention to Descriptions Generated by Image Captioning Models
Hamed R. Tavakoli
Rakshith Shetty
Ali Borji
Jorma T. Laaksonen
23
79
0
24 Apr 2017
An Analysis of Action Recognition Datasets for Language and Vision Tasks
Spandana Gella
Frank Keller
ObjD
24
11
0
24 Apr 2017
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
Wei-Lun Chao
Hexiang Hu
Fei Sha
22
37
0
24 Apr 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li-Jia Li
34
324
0
12 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
27
494
0
11 Apr 2017
Egocentric Video Description based on Temporally-Linked Sequences
Marc Bolaños
Álvaro Peris
F. Casacuberta
Sergi Soler
Petia Radeva
EgoV
26
25
0
07 Apr 2017
Towards a Visual Privacy Advisor: Understanding and Predicting Privacy Risks in Images
Rakshith Shetty
Bernt Schiele
Mario Fritz
35
223
0
30 Mar 2017
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training
Rakshith Shetty
Marcus Rohrbach
Lisa Anne Hendricks
Mario Fritz
Bernt Schiele
19
142
0
30 Mar 2017
Recurrent Topic-Transition GAN for Visual Paragraph Generation
Xiaodan Liang
Zhiting Hu
Huan Zhang
Chuang Gan
Eric P. Xing
GAN
21
200
0
21 Mar 2017
Evolving Deep Neural Networks
Risto Miikkulainen
J. Liang
Elliot Meyerson
Aditya Rawal
Daniel Fink
...
B. Raju
H. Shahrzad
Arshak Navruzyan
Nigel P. Duffy
B. Hodjat
16
884
0
01 Mar 2017
Person Search with Natural Language Description
Shuang Li
Tong Xiao
Hongsheng Li
Bolei Zhou
Dayu Yue
Xiaogang Wang
24
386
0
19 Feb 2017
Learning to Decode for Future Success
Jiwei Li
Will Monroe
Dan Jurafsky
31
58
0
23 Jan 2017
Attention-Based Multimodal Fusion for Video Description
Chiori Hori
Takaaki Hori
Teng-Yok Lee
Kazuhiro Sumi
J. Hershey
Tim K. Marks
41
359
0
11 Jan 2017
Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering
Hao Liu
Yang Yang
Fumin Shen
Lixin Duan
Heng Tao Shen
30
9
0
15 Dec 2016
Attentive Explanations: Justifying Decisions and Pointing to the Evidence
Dong Huk Park
Lisa Anne Hendricks
Zeynep Akata
Bernt Schiele
Trevor Darrell
Marcus Rohrbach
AAML
24
79
0
14 Dec 2016
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering
Marc Bolaños
Álvaro Peris
F. Casacuberta
Petia Radeva
24
6
0
12 Dec 2016
ImageNet pre-trained models with batch normalization
Marcel Simon
E. Rodner
Joachim Denzler
VLM
SSeg
44
165
0
05 Dec 2016
Areas of Attention for Image Captioning
M. Pedersoli
Thomas Lucas
Cordelia Schmid
Jakob Verbeek
33
205
0
03 Dec 2016
Guided Open Vocabulary Image Captioning with Constrained Beam Search
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
21
232
0
02 Dec 2016
Video Captioning with Multi-Faceted Attention
Xiang Long
Chuang Gan
Gerard de Melo
22
88
0
01 Dec 2016
Bidirectional Multirate Reconstruction for Temporal Modeling in Videos
Linchao Zhu
Zhongwen Xu
Yi Yang
27
1
0
28 Nov 2016
A Simple, Fast Diverse Decoding Algorithm for Neural Generation
Jiwei Li
Will Monroe
Dan Jurafsky
33
239
0
25 Nov 2016
Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling
Zhe Gan
Chunyuan Li
Changyou Chen
Yunchen Pu
Qinliang Su
Lawrence Carin
BDL
UQCV
53
41
0
23 Nov 2016
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
44
425
0
23 Nov 2016
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
19
329
0
23 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li-Jia Li
VLM
30
169
0
21 Nov 2016
Recurrent Memory Addressing for describing videos
A. Jain
Abhinav Agarwalla
Kumar Krishna Agrawal
Pabitra Mitra
38
10
0
20 Nov 2016
Multimodal Memory Modelling for Video Captioning
Junbo Wang
Wei Wang
Yan Huang
Liang Wang
Tieniu Tan
32
142
0
17 Nov 2016
Semantic Regularisation for Recurrent Image Annotation
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
29
103
0
16 Nov 2016
A Semi-supervised Framework for Image Captioning
Wenhu Chen
Aurelien Lucchi
Thomas Hofmann
29
9
0
16 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
48
620
0
05 Nov 2016
Spatio-Temporal Attention Models for Grounded Video Captioning
M. Zanfir
Elisabeta Marinoiu
C. Sminchisescu
29
50
0
17 Oct 2016
Generating captions without looking beyond objects
Hendrik Heuer
Christof Monz
A. Smeulders
17
16
0
12 Oct 2016
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Ramprasaath R. Selvaraju
Michael Cogswell
Abhishek Das
Ramakrishna Vedantam
Devi Parikh
Dhruv Batra
FAtt
41
19,576
0
07 Oct 2016
Visual Fashion-Product Search at SK Planet
Taewan Kim
Seyeong Kim
Sangil Na
Hayoon Kim
Moonki Kim
Beyeongki Jeon
9
6
0
26 Sep 2016
Deep Learning for Video Classification and Captioning
Zuxuan Wu
Ting Yao
Yanwei Fu
Yu-Gang Jiang
3DV
VLM
22
123
0
22 Sep 2016
DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images
Wei Shen
Kai Zhao
Yuan Jiang
Yan Wang
X. Bai
Alan Yuille
14
99
0
13 Sep 2016
Measuring Machine Intelligence Through Visual Question Answering
C. L. Zitnick
Aishwarya Agrawal
Stanislaw Antol
Margaret Mitchell
Dhruv Batra
Devi Parikh
21
37
0
31 Aug 2016
Seeing with Humans: Gaze-Assisted Neural Image Captioning
Yusuke Sugano
Andreas Bulling
24
68
0
18 Aug 2016
Learning Joint Representations of Videos and Sentences with Web Image Search
Mayu Otani
Yuta Nakashima
Esa Rahtu
J. Heikkilä
N. Yokoya
18
94
0
08 Aug 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
36
1,884
0
29 Jul 2016