Microsoft COCO Captions: Data Collection and Evaluation Server

1 April 2015

Piotr Dollár

Papers citing "Microsoft COCO Captions: Data Collection and Evaluation Server"

50 / 1,391 papers shown

NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems Ozan Caglayan Mercedes García-Martínez Adrien Bardet Walid Aransa Fethi Bougares Loïc Barrault 27 65 0 01 Jun 2017
Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols Serhii Havrylov Ivan Titov LLMAG 41 286 0 31 May 2017
Multimodal Machine Learning: A Survey and Taxonomy T. Baltrušaitis Chaitanya Ahuja Louis-Philippe Morency 15 2,865 0 26 May 2017
Imagination improves Multimodal Translation Desmond Elliott Ákos Kádár 29 136 0 11 May 2017
STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset Yuya Yoshikawa Yutaro Shigeto A. Takeuchi 3DV 21 118 0 02 May 2017
Learning to Ask: Neural Question Generation for Reading Comprehension Xinya Du Junru Shao Claire Cardie 3DV 34 658 0 29 Apr 2017
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning Dipendra Kumar Misra John Langford Yoav Artzi 21 247 0 28 Apr 2017
Spatio-temporal Person Retrieval via Natural Language Queries Masataka Yamaguchi Kuniaki Saito Yoshitaka Ushiku Tatsuya Harada 19 57 0 26 Apr 2017
Multi-Task Video Captioning with Video and Entailment Generation Ramakanth Pasunuru Mohit Bansal 33 116 0 24 Apr 2017
Paying Attention to Descriptions Generated by Image Captioning Models Hamed R. Tavakoli Rakshith Shetty Ali Borji Jorma T. Laaksonen 23 79 0 24 Apr 2017
An Analysis of Action Recognition Datasets for Language and Vision Tasks Spandana Gella Frank Keller ObjD 24 11 0 24 Apr 2017
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets Wei-Lun Chao Hexiang Hu Fei Sha 22 37 0 24 Apr 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward Zhou Ren Xiaoyu Wang Ning Zhang Xutao Lv Li-Jia Li 34 324 0 12 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks Liwei Wang Yin Li Jing-ling Huang Svetlana Lazebnik VLM 27 494 0 11 Apr 2017
Egocentric Video Description based on Temporally-Linked Sequences Marc Bolaños Álvaro Peris F. Casacuberta Sergi Soler Petia Radeva EgoV 26 25 0 07 Apr 2017
Towards a Visual Privacy Advisor: Understanding and Predicting Privacy Risks in Images Tribhuvanesh Orekondy Bernt Schiele Mario Fritz 35 223 0 30 Mar 2017
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training Rakshith Shetty Marcus Rohrbach Lisa Anne Hendricks Mario Fritz Bernt Schiele 19 142 0 30 Mar 2017
Recurrent Topic-Transition GAN for Visual Paragraph Generation Xiaodan Liang Zhiting Hu Hao Zhang Chuang Gan Eric P. Xing GAN 21 200 0 21 Mar 2017
Evolving Deep Neural Networks Risto Miikkulainen J. Liang Elliot Meyerson Aditya Rawal Daniel Fink ... B. Raju H. Shahrzad Arshak Navruzyan Nigel P. Duffy B. Hodjat 16 884 0 01 Mar 2017
Person Search with Natural Language Description Shuang Li Tong Xiao Hongsheng Li Bolei Zhou Dayu Yue Xiaogang Wang 24 386 0 19 Feb 2017
Learning to Decode for Future Success Jiwei Li Will Monroe Dan Jurafsky 31 58 0 23 Jan 2017
Attention-Based Multimodal Fusion for Video Description Chiori Hori Takaaki Hori Teng-Yok Lee Kazuhiro Sumi J. Hershey Tim K. Marks 41 359 0 11 Jan 2017
Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering Hao Liu Yang Yang Fumin Shen Lixin Duan Heng Tao Shen 30 9 0 15 Dec 2016
Attentive Explanations: Justifying Decisions and Pointing to the Evidence Dong Huk Park Lisa Anne Hendricks Zeynep Akata Bernt Schiele Trevor Darrell Marcus Rohrbach AAML 24 79 0 14 Dec 2016
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering Marc Bolaños Álvaro Peris F. Casacuberta Petia Radeva 24 6 0 12 Dec 2016
ImageNet pre-trained models with batch normalization Marcel Simon E. Rodner Joachim Denzler VLM SSeg 44 165 0 05 Dec 2016
Areas of Attention for Image Captioning M. Pedersoli Thomas Lucas Cordelia Schmid Jakob Verbeek 33 205 0 03 Dec 2016
Guided Open Vocabulary Image Captioning with Constrained Beam Search Peter Anderson Basura Fernando Mark Johnson Stephen Gould 21 232 0 02 Dec 2016
Video Captioning with Multi-Faceted Attention Xiang Long Chuang Gan Gerard de Melo 22 88 0 01 Dec 2016
Bidirectional Multirate Reconstruction for Temporal Modeling in Videos Linchao Zhu Zhongwen Xu Yi Yang 27 1 0 28 Nov 2016
A Simple, Fast Diverse Decoding Algorithm for Neural Generation Jiwei Li Will Monroe Dan Jurafsky 33 239 0 25 Nov 2016
Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling Zhe Gan Chunyuan Li Changyou Chen Yunchen Pu Qinliang Su Lawrence Carin BDL UQCV 53 41 0 23 Nov 2016
Semantic Compositional Networks for Visual Captioning Zhe Gan Chuang Gan Xiaodong He Yunchen Pu Kenneth Tran Jianfeng Gao Lawrence Carin Li Deng CoGe 44 425 0 23 Nov 2016
Video Captioning with Transferred Semantic Attributes Yingwei Pan Ting Yao Houqiang Li Tao Mei 19 329 0 23 Nov 2016
Dense Captioning with Joint Inference and Visual Context L. Yang K. Tang Jianchao Yang Li-Jia Li VLM 30 169 0 21 Nov 2016
Recurrent Memory Addressing for describing videos A. Jain Abhinav Agarwalla Kumar Krishna Agrawal Pabitra Mitra 38 10 0 20 Nov 2016
Multimodal Memory Modelling for Video Captioning Junbo Wang Wei Wang Yan Huang Liang Wang Tieniu Tan 32 142 0 17 Nov 2016
Semantic Regularisation for Recurrent Image Annotation Feng Liu Tao Xiang Timothy M. Hospedales Wankou Yang Changyin Sun 29 103 0 16 Nov 2016
A Semi-supervised Framework for Image Captioning Wenhu Chen Aurelien Lucchi Thomas Hofmann 29 9 0 16 Nov 2016
Boosting Image Captioning with Attributes Ting Yao Yingwei Pan Yehao Li Zhaofan Qiu Tao Mei VLM 48 620 0 05 Nov 2016
Spatio-Temporal Attention Models for Grounded Video Captioning M. Zanfir Elisabeta Marinoiu C. Sminchisescu 29 50 0 17 Oct 2016
Generating captions without looking beyond objects Hendrik Heuer Christof Monz A. Smeulders 17 16 0 12 Oct 2016
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization Ramprasaath R. Selvaraju Michael Cogswell Abhishek Das Ramakrishna Vedantam Devi Parikh Dhruv Batra FAtt 41 19,576 0 07 Oct 2016
Visual Fashion-Product Search at SK Planet Taewan Kim Seyeong Kim Sangil Na Hayoon Kim Moonki Kim Beyeongki Jeon 9 6 0 26 Sep 2016
Deep Learning for Video Classification and Captioning Zuxuan Wu Ting Yao Yanwei Fu Yu-Gang Jiang 3DV VLM 22 123 0 22 Sep 2016
DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images Wei Shen Kai Zhao Yuan Jiang Yan Wang X. Bai Alan Yuille 14 99 0 13 Sep 2016
Measuring Machine Intelligence Through Visual Question Answering C. L. Zitnick Aishwarya Agrawal Stanislaw Antol Margaret Mitchell Dhruv Batra Devi Parikh 21 37 0 31 Aug 2016
Seeing with Humans: Gaze-Assisted Neural Image Captioning Yusuke Sugano Andreas Bulling 24 68 0 18 Aug 2016
Learning Joint Representations of Videos and Sentences with Web Image Search Mayu Otani Yuta Nakashima Esa Rahtu J. Heikkilä N. Yokoya 18 94 0 08 Aug 2016
SPICE: Semantic Propositional Image Caption Evaluation Peter Anderson Basura Fernando Mark Johnson Stephen Gould EGVM 36 1,884 0 29 Jul 2016