v1, v2 (latest)

Microsoft COCO Captions: Data Collection and Evaluation Server

1 April 2015

Piotr Dollar

Papers citing "Microsoft COCO Captions: Data Collection and Evaluation Server"

21 / 1,421 papers shown

Title | Authors | Tag | Counts | Date
DenseCap: Fully Convolutional Localization Networks for Dense Captioning | Justin Johnson, A. Karpathy, Li Fei-Fei | VLM | 131 1,172 0 | 24 Nov 2015
Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources | Qi Wu, Peng Wang, Chunhua Shen, A. Dick, Anton Van Den Hengel | - | 75 372 0 | 22 Nov 2015
Delving Deeper into Convolutional Networks for Learning Video Representations | Nicolas Ballas, L. Yao, C. Pal, Aaron Courville | MDE | 102 703 0 | 19 Nov 2015
Oracle performance for visual captioning | L. Yao, Nicolas Ballas, Kyunghyun Cho, John R. Smith, Yoshua Bengio | VLM | 103 8 0 | 14 Nov 2015
Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning | Pingbo Pan, Zhongwen Xu, Yi Yang, Leilei Gan, Yueting Zhuang | - | 59 385 0 | 11 Nov 2015
Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks | Haonan Yu, Jiang Wang, Zhiheng Huang, Yi Yang, Wenyuan Xu | - | 100 560 0 | 26 Oct 2015
Multilingual Image Description with Neural Sequence Models | Desmond Elliott, Stella Frank, Eva Hasler | VLM | 131 76 0 | 15 Oct 2015
SentiCap: Generating Image Descriptions with Sentiments | A. Mathews, Lexing Xie, Xuming He | - | 110 222 0 | 06 Oct 2015
Aligning where to see and what to tell: image caption with region-based attention and scene factorization | Junqi Jin, Kun Fu, Runpeng Cui, Fei Sha, Changshui Zhang | - | 93 117 0 | 20 Jun 2015
The Long-Short Story of Movie Description | Anna Rohrbach, Marcus Rohrbach, Bernt Schiele | VLM | 74 111 0 | 04 Jun 2015
What value do explicit high level concepts have in vision to language problems? | Qi Wu, Chunhua Shen, Lingqiao Liu, A. Dick, Anton Van Den Hengel | - | 83 444 0 | 03 Jun 2015
Exploring Models and Data for Image Question Answering | Mengye Ren, Ryan Kiros, R. Zemel | - | 102 720 0 | 08 May 2015
Sequence to Sequence -- Video to Text | Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond J. Mooney, Trevor Darrell, Kate Saenko | - | 150 1,421 0 | 03 May 2015
VQA: Visual Question Answering | Aishwarya Agrawal, Jiasen Lu, Stanislaw Antol, Margaret Mitchell, C. L. Zitnick, Dhruv Batra, Devi Parikh | CoGe | 254 5,519 0 | 03 May 2015
Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images | Junhua Mao, Xu Wei, Yi Yang, Jiang Wang, Zhiheng Huang, Alan Yuille | - | 107 155 0 | 25 Apr 2015
Describing Videos by Exploiting Temporal Structure | L. Yao, Atousa Torabi, Kyunghyun Cho, Nicolas Ballas, C. Pal, Hugo Larochelle, Aaron Courville | - | 158 1,064 0 | 27 Feb 2015
Salient Object Detection: A Benchmark | Ali Borji, Ming-Ming Cheng, Huaizu Jiang, Jia Li | - | 107 1,731 0 | 05 Jan 2015
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) | Junhua Mao, Wenyuan Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan Yuille | VLM | 196 1,241 0 | 20 Dec 2014
Deep Visual-Semantic Alignments for Generating Image Descriptions | A. Karpathy, Li Fei-Fei | - | 156 5,602 0 | 07 Dec 2014
CIDEr: Consensus-based Image Description Evaluation | Ramakrishna Vedantam, C. L. Zitnick, Devi Parikh | - | 315 4,531 0 | 20 Nov 2014
From Captions to Visual Concepts and Back | Hao Fang, Saurabh Gupta, F. Iandola, R. Srivastava, Li Deng, ..., Xiaodong He, Margaret Mitchell, John C. Platt, C. L. Zitnick, Geoffrey Zweig | VLM | 136 1,312 0 | 18 Nov 2014