arXiv:1504.00325
Microsoft COCO Captions: Data Collection and Evaluation Server
1 April 2015
Xinlei Chen
Hao Fang
Nayeon Lee
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollar
C. L. Zitnick
Papers citing
"Microsoft COCO Captions: Data Collection and Evaluation Server"
41 / 1,391 papers shown
Visual Question Answering: A Survey of Methods and Datasets
Qi Wu
Damien Teney
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
32
413
0
20 Jul 2016
VideoMCC: a New Benchmark for Video Comprehension
Du Tran
Maksim Bolonkin
Manohar Paluri
Lorenzo Torresani
21
1
0
23 Jun 2016
Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions
F. Carrara
Andrea Esuli
T. Fagni
Fabrizio Falchi
Alejandro Moreo
DiffM
24
31
0
23 Jun 2016
Pragmatic factors in image description: the case of negations
Emiel van Miltenburg
R. Morante
Desmond Elliott
15
18
0
20 Jun 2016
Review Networks for Caption Generation
Zhilin Yang
Ye Yuan
Yuexin Wu
Ruslan Salakhutdinov
William W. Cohen
3DV
32
85
0
25 May 2016
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
32
353
0
12 May 2016
Multi30K: Multilingual English-German Image Descriptions
Desmond Elliott
Stella Frank
K. Simaán
Lucia Specia
VLM
27
580
0
02 May 2016
Video Description using Bidirectional Recurrent Neural Networks
Álvaro Peris
Marc Bolaños
Petia Radeva
F. Casacuberta
20
33
0
12 Apr 2016
Attributes as Semantic Units between Natural Language and Visual Recognition
Marcus Rohrbach
VLM
16
3
0
12 Apr 2016
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Gunnar A. Sigurdsson
Gül Varol
Xinyu Wang
Ali Farhadi
Ivan Laptev
Abhinav Gupta
VGen
20
1,224
0
06 Apr 2016
Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs
Wei Shen
Kai Zhao
Yuan Jiang
Yan Wang
Zhijiang Zhang
X. Bai
12
104
0
31 Mar 2016
Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus
Iulian Serban
Alberto García-Durán
Çağlar Gülçehre
Sungjin Ahn
A. Chandar
Aaron Courville
Yoshua Bengio
LRM
18
287
0
22 Mar 2016
Image Captioning with Semantic Attention
Quanzeng You
Hailin Jin
Zhaowen Wang
Chen Fang
Jiebo Luo
VLM
61
1,652
0
12 Mar 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li-Jia Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
57
5,652
0
23 Feb 2016
Learning Distributed Representations of Sentences from Unlabelled Data
Felix Hill
Kyunghyun Cho
Anna Korhonen
SSL
21
570
0
10 Feb 2016
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
Andreas Veit
Tomas Matera
Lukás Neumann
Jirí Matas
Serge J. Belongie
188
515
0
26 Jan 2016
Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures
Raffaella Bernardi
Ruken Cakici
Desmond Elliott
Aykut Erdem
Erkut Erdem
Nazli Ikizler-Cinbis
Frank Keller
A. Muscat
Barbara Plank
EGVM
VLM
27
363
0
15 Jan 2016
Seeing through the Human Reporting Bias: Visual Classifiers from Noisy Human-Centric Labels
Ishan Misra
C. L. Zitnick
Margaret Mitchell
Ross B. Girshick
NoLa
8
218
0
22 Dec 2015
Neural Self Talk: Image Understanding via Continuous Questioning and Answering
Yezhou Yang
Yi Li
Cornelia Fermuller
Yiannis Aloimonos
16
24
0
10 Dec 2015
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
66
1,159
0
24 Nov 2015
Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources
Qi Wu
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
14
370
0
22 Nov 2015
Delving Deeper into Convolutional Networks for Learning Video Representations
Nicolas Ballas
L. Yao
C. Pal
Aaron Courville
MDE
37
692
0
19 Nov 2015
Oracle performance for visual captioning
L. Yao
Nicolas Ballas
Kyunghyun Cho
John R. Smith
Yoshua Bengio
VLM
36
8
0
14 Nov 2015
Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning
Pingbo Pan
Zhongwen Xu
Yi Yang
Fei Wu
Yueting Zhuang
16
385
0
11 Nov 2015
Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks
Haonan Yu
Jiang Wang
Zhiheng Huang
Yi Yang
Wenyuan Xu
44
560
0
26 Oct 2015
Multilingual Image Description with Neural Sequence Models
Desmond Elliott
Stella Frank
Eva Hasler
VLM
22
75
0
15 Oct 2015
SentiCap: Generating Image Descriptions with Sentiments
A. Mathews
Lexing Xie
Xuming He
26
221
0
06 Oct 2015
Aligning where to see and what to tell: image caption with region-based attention and scene factorization
Junqi Jin
Kun Fu
Runpeng Cui
Fei Sha
Changshui Zhang
28
117
0
20 Jun 2015
The Long-Short Story of Movie Description
Anna Rohrbach
Marcus Rohrbach
Bernt Schiele
VLM
30
110
0
04 Jun 2015
What value do explicit high level concepts have in vision to language problems?
Qi Wu
Chunhua Shen
Lingqiao Liu
A. Dick
Anton Van Den Hengel
24
443
0
03 Jun 2015
Exploring Models and Data for Image Question Answering
Mengye Ren
Ryan Kiros
R. Zemel
32
711
0
08 May 2015
Sequence to Sequence -- Video to Text
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
24
1,416
0
03 May 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
66
5,369
0
03 May 2015
Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images
Junhua Mao
Xu Wei
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
25
154
0
25 Apr 2015
Describing Videos by Exploiting Temporal Structure
L. Yao
Atousa Torabi
Kyunghyun Cho
Nicolas Ballas
C. Pal
Hugo Larochelle
Aaron Courville
46
1,062
0
27 Feb 2015
Salient Object Detection: A Benchmark
Ali Borji
Ming-Ming Cheng
Huaizu Jiang
Jia Li
32
1,719
0
05 Jan 2015
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
VLM
65
1,235
0
20 Dec 2014
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
21
5,556
0
07 Dec 2014
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
73
4,401
0
20 Nov 2014
From Captions to Visual Concepts and Back
Hao Fang
Saurabh Gupta
F. Iandola
R. Srivastava
Li Deng
...
Xiaodong He
Margaret Mitchell
John C. Platt
C. L. Zitnick
Geoffrey Zweig
VLM
27
1,307
0
18 Nov 2014
Show and Tell: A Neural Image Caption Generator
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
53
5,993
0
17 Nov 2014