ResearchTrend.AI
Microsoft COCO Captions: Data Collection and Evaluation Server

1 April 2015
Xinlei Chen
Hao Fang
Nayeon Lee
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollar
C. L. Zitnick

Papers citing "Microsoft COCO Captions: Data Collection and Evaluation Server"

50 / 1,391 papers shown
NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems
Ozan Caglayan
Mercedes García-Martínez
Adrien Bardet
Walid Aransa
Fethi Bougares
Loïc Barrault
27
65
0
01 Jun 2017
Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols
Serhii Havrylov
Ivan Titov
LLMAG
41
286
0
31 May 2017
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
15
2,865
0
26 May 2017
Imagination improves Multimodal Translation
Desmond Elliott
Ákos Kádár
29
136
0
11 May 2017
STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset
Yuya Yoshikawa
Yutaro Shigeto
A. Takeuchi
3DV
21
118
0
02 May 2017
Learning to Ask: Neural Question Generation for Reading Comprehension
Xinya Du
Junru Shao
Claire Cardie
3DV
34
658
0
29 Apr 2017
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning
Dipendra Kumar Misra
John Langford
Yoav Artzi
21
247
0
28 Apr 2017
Spatio-temporal Person Retrieval via Natural Language Queries
Masataka Yamaguchi
Kuniaki Saito
Yoshitaka Ushiku
Tatsuya Harada
19
57
0
26 Apr 2017
Multi-Task Video Captioning with Video and Entailment Generation
Ramakanth Pasunuru
Joey Tianyi Zhou
33
116
0
24 Apr 2017
Paying Attention to Descriptions Generated by Image Captioning Models
Hamed R. Tavakoli
Rakshith Shetty
Ali Borji
Jorma T. Laaksonen
23
79
0
24 Apr 2017
An Analysis of Action Recognition Datasets for Language and Vision Tasks
Spandana Gella
Frank Keller
ObjD
24
11
0
24 Apr 2017
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
Wei-Lun Chao
Hexiang Hu
Fei Sha
22
37
0
24 Apr 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li-Jia Li
34
324
0
12 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
27
494
0
11 Apr 2017
Egocentric Video Description based on Temporally-Linked Sequences
Marc Bolaños
Álvaro Peris
F. Casacuberta
Sergi Soler
Petia Radeva
EgoV
26
25
0
07 Apr 2017
Towards a Visual Privacy Advisor: Understanding and Predicting Privacy Risks in Images
Rakshith Shetty
Bernt Schiele
Mario Fritz
35
223
0
30 Mar 2017
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training
Rakshith Shetty
Marcus Rohrbach
Lisa Anne Hendricks
Mario Fritz
Bernt Schiele
19
142
0
30 Mar 2017
Recurrent Topic-Transition GAN for Visual Paragraph Generation
Xiaodan Liang
Zhiting Hu
Huan Zhang
Chuang Gan
Eric P. Xing
GAN
21
200
0
21 Mar 2017
Evolving Deep Neural Networks
Risto Miikkulainen
J. Liang
Elliot Meyerson
Aditya Rawal
Daniel Fink
...
B. Raju
H. Shahrzad
Arshak Navruzyan
Nigel P. Duffy
B. Hodjat
16
884
0
01 Mar 2017
Person Search with Natural Language Description
Shuang Li
Tong Xiao
Hongsheng Li
Bolei Zhou
Dayu Yue
Xiaogang Wang
24
386
0
19 Feb 2017
Learning to Decode for Future Success
Jiwei Li
Will Monroe
Dan Jurafsky
31
58
0
23 Jan 2017
Attention-Based Multimodal Fusion for Video Description
Chiori Hori
Takaaki Hori
Teng-Yok Lee
Kazuhiro Sumi
J. Hershey
Tim K. Marks
41
359
0
11 Jan 2017
Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering
Hao Liu
Yang Yang
Fumin Shen
Lixin Duan
Heng Tao Shen
30
9
0
15 Dec 2016
Attentive Explanations: Justifying Decisions and Pointing to the Evidence
Dong Huk Park
Lisa Anne Hendricks
Zeynep Akata
Bernt Schiele
Trevor Darrell
Marcus Rohrbach
AAML
24
79
0
14 Dec 2016
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering
Marc Bolaños
Álvaro Peris
F. Casacuberta
Petia Radeva
24
6
0
12 Dec 2016
ImageNet pre-trained models with batch normalization
Marcel Simon
E. Rodner
Joachim Denzler
VLM
SSeg
44
165
0
05 Dec 2016
Areas of Attention for Image Captioning
M. Pedersoli
Thomas Lucas
Cordelia Schmid
Jakob Verbeek
33
205
0
03 Dec 2016
Guided Open Vocabulary Image Captioning with Constrained Beam Search
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
21
232
0
02 Dec 2016
Video Captioning with Multi-Faceted Attention
Xiang Long
Chuang Gan
Gerard de Melo
22
88
0
01 Dec 2016
Bidirectional Multirate Reconstruction for Temporal Modeling in Videos
Linchao Zhu
Zhongwen Xu
Yi Yang
27
1
0
28 Nov 2016
A Simple, Fast Diverse Decoding Algorithm for Neural Generation
Jiwei Li
Will Monroe
Dan Jurafsky
33
239
0
25 Nov 2016
Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling
Zhe Gan
Chunyuan Li
Changyou Chen
Yunchen Pu
Qinliang Su
Lawrence Carin
BDL
UQCV
53
41
0
23 Nov 2016
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
44
425
0
23 Nov 2016
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
19
329
0
23 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li-Jia Li
VLM
30
169
0
21 Nov 2016
Recurrent Memory Addressing for describing videos
A. Jain
Abhinav Agarwalla
Kumar Krishna Agrawal
Pabitra Mitra
38
10
0
20 Nov 2016
Multimodal Memory Modelling for Video Captioning
Junbo Wang
Wei Wang
Yan Huang
Liang Wang
Tieniu Tan
32
142
0
17 Nov 2016
Semantic Regularisation for Recurrent Image Annotation
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
29
103
0
16 Nov 2016
A Semi-supervised Framework for Image Captioning
Wenhu Chen
Aurelien Lucchi
Thomas Hofmann
29
9
0
16 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
48
620
0
05 Nov 2016
Spatio-Temporal Attention Models for Grounded Video Captioning
M. Zanfir
Elisabeta Marinoiu
C. Sminchisescu
29
50
0
17 Oct 2016
Generating captions without looking beyond objects
Hendrik Heuer
Christof Monz
A. Smeulders
17
16
0
12 Oct 2016
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Ramprasaath R. Selvaraju
Michael Cogswell
Abhishek Das
Ramakrishna Vedantam
Devi Parikh
Dhruv Batra
FAtt
41
19,576
0
07 Oct 2016
Visual Fashion-Product Search at SK Planet
Taewan Kim
Seyeong Kim
Sangil Na
Hayoon Kim
Moonki Kim
Beyeongki Jeon
9
6
0
26 Sep 2016
Deep Learning for Video Classification and Captioning
Zuxuan Wu
Ting Yao
Yanwei Fu
Yu-Gang Jiang
3DV
VLM
22
123
0
22 Sep 2016
DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images
Wei Shen
Kai Zhao
Yuan Jiang
Yan Wang
X. Bai
Alan Yuille
14
99
0
13 Sep 2016
Measuring Machine Intelligence Through Visual Question Answering
C. L. Zitnick
Aishwarya Agrawal
Stanislaw Antol
Margaret Mitchell
Dhruv Batra
Devi Parikh
21
37
0
31 Aug 2016
Seeing with Humans: Gaze-Assisted Neural Image Captioning
Yusuke Sugano
Andreas Bulling
24
68
0
18 Aug 2016
Learning Joint Representations of Videos and Sentences with Web Image Search
Mayu Otani
Yuta Nakashima
Esa Rahtu
J. Heikkilä
N. Yokoya
18
94
0
08 Aug 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
36
1,884
0
29 Jul 2016