Show and Tell: A Neural Image Caption Generator

17 November 2014

Papers citing "Show and Tell: A Neural Image Caption Generator"

50 / 2,022 papers shown

| Title | Authors | Tags | | Date |
|---|---|---|---|---|
| Visual Fashion-Product Search at SK Planet | Taewan Kim, Seyeong Kim, Sangil Na, Hayoon Kim, Moonki Kim, Beyeongki Jeon | | 9 6 0 | 26 Sep 2016 |
| Deep Learning for Video Classification and Captioning | Zuxuan Wu, Ting Yao, Yanwei Fu, Yu-Gang Jiang | 3DV VLM | 24 123 0 | 22 Sep 2016 |
| The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA) | Andrew Shin, Yoshitaka Ushiku, Tatsuya Harada | | 49 14 0 | 21 Sep 2016 |
| Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge | Oriol Vinyals, Alexander Toshev, Samy Bengio, D. Erhan | | 27 850 0 | 21 Sep 2016 |
| GeThR-Net: A Generalized Temporally Hybrid Recurrent Neural Network for Multimodal Information Fusion | Ankit Gandhi, Arjun Sharma, Arijit Biswas, Om Deshmukh | AI4TS | 19 12 0 | 17 Sep 2016 |
| Image-to-Markup Generation with Coarse-to-Fine Attention | Yuntian Deng, Anssi Kanervisto, Jeffrey Ling, Alexander M. Rush | | 19 226 0 | 16 Sep 2016 |
| Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories | Mark Harmon, Abdolghani Ebrahimi, P. Lucey, Diego Klabjan | GAN | 19 18 0 | 15 Sep 2016 |
| On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL | 308 2,890 0 | 15 Sep 2016 |
| Multimodal Attention for Neural Machine Translation | Ozan Caglayan, Loïc Barrault, Fethi Bougares | | 31 75 0 | 13 Sep 2016 |
| Sensor-based Gait Parameter Extraction with Deep Convolutional Neural Networks | J. Hannink, T. Kautz, C. Pasluosta, Karl-Gunter Gasmann, J. Klucken, Björn Eskofier | | 27 159 0 | 12 Sep 2016 |
| Learning Action Concept Trees and Semantic Alignment Networks from Image-Description Data | J. Gao, Ram Nevatia | | 11 1 0 | 08 Sep 2016 |
| Hierarchical Multiscale Recurrent Neural Networks | Junyoung Chung, Sungjin Ahn, Yoshua Bengio | BDL | 40 534 0 | 06 Sep 2016 |
| Measuring Machine Intelligence Through Visual Question Answering | C. L. Zitnick, Aishwarya Agrawal, Stanislaw Antol, Margaret Mitchell, Dhruv Batra, Devi Parikh | | 24 37 0 | 31 Aug 2016 |
| Linking Image and Text with 2-Way Nets | Aviv Eisenschtat, Lior Wolf | | 27 176 0 | 29 Aug 2016 |
| Optimizing Recurrent Neural Networks Architectures under Time Constraints | Junqi Jin, Ziang Yan, Kun Fu, Nan Jiang, Changshui Zhang | | 22 2 0 | 29 Aug 2016 |
| Learning to generalize to new compositions in image understanding | Y. Atzmon, Jonathan Berant, Vahid Kezami, Amir Globerson, Gal Chechik | | 26 67 0 | 27 Aug 2016 |
| Title Generation for User Generated Videos | Kuo-Hao Zeng, Tseng-Hung Chen, Juan Carlos Niebles, Min Sun | | 35 69 0 | 25 Aug 2016 |
| phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning | Y. Tan, Chee Seng Chan | VLM | 22 29 0 | 20 Aug 2016 |
| Seeing with Humans: Gaze-Assisted Neural Image Captioning | Yusuke Sugano, Andreas Bulling | | 24 68 0 | 18 Aug 2016 |
| Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation | Rakshith Shetty, Jorma T. Laaksonen | | 13 94 0 | 17 Aug 2016 |
| Modelling Student Behavior using Granular Large Scale Action Data from a MOOC | Steven Tang, Joshua C. Peterson, Z. Pardos | AI4Ed | 14 19 0 | 16 Aug 2016 |
| DeepDiary: Automatic Caption Generation for Lifelogging Image Streams | Chenyou Fan, David J. Crandall | DiffM | 14 5 0 | 12 Aug 2016 |
| Canonical Correlation Inference for Mapping Abstract Scenes to Text | Nikos Papasarantopoulos, Helen Jiang, Shay B. Cohen | | 6 1 0 | 09 Aug 2016 |
| Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task | Ashkan Mokarian, Mateusz Malinowski, Mario Fritz | | 27 5 0 | 09 Aug 2016 |
| Learning Joint Representations of Videos and Sentences with Web Image Search | Mayu Otani, Yuta Nakashima, Esa Rahtu, J. Heikkilä, N. Yokoya | | 20 94 0 | 08 Aug 2016 |
| Learning Online Alignments with Continuous Rewards Policy Gradient | Yuping Luo, Chung-Cheng Chiu, Navdeep Jaitly, Ilya Sutskever | OffRL | 13 46 0 | 03 Aug 2016 |
| RETURNN: The RWTH Extensible Training framework for Universal Recurrent Neural Networks | P. Doetsch, Albert Zeyer, P. Voigtlaender, Ilya Kulikov, Ralf Schluter, Hermann Ney | | 18 74 0 | 02 Aug 2016 |
| Modeling Context Between Objects for Referring Expression Understanding | Varun K. Nagaraja, Vlad I. Morariu, Larry S. Davis | | 31 143 0 | 01 Aug 2016 |
| Modeling Context in Referring Expressions | Licheng Yu, Patrick Poirson, Shan Yang, Alexander C. Berg, Tamara L. Berg | | 30 1,227 0 | 31 Jul 2016 |
| SPICE: Semantic Propositional Image Caption Evaluation | Peter Anderson, Basura Fernando, Mark Johnson, Stephen Gould | EGVM | 36 1,884 0 | 29 Jul 2016 |
| An Actor-Critic Algorithm for Sequence Prediction | Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan J. Lowe, Joelle Pineau, Aaron Courville, Yoshua Bengio | | 57 635 0 | 24 Jul 2016 |
| Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition | Jun Liu, Amir Shahroudy, Dong Xu, Gang Wang | | 49 1,099 0 | 24 Jul 2016 |
| A Comprehensive Survey on Cross-modal Retrieval | Kun Wang, Qiyue Yin, Wei Wang, Shu Wu, Liang Wang | | 42 294 0 | 21 Jul 2016 |
| Constructing a Natural Language Inference Dataset using Generative Neural Networks | Janez Starc, Dunja Mladenić | | 19 7 0 | 20 Jul 2016 |
| Visual Question Answering: A Survey of Methods and Datasets | Qi Wu, Damien Teney, Peng Wang, Chunhua Shen, A. Dick, Anton Van Den Hengel | | 32 413 0 | 20 Jul 2016 |
| An Empirical Evaluation of various Deep Learning Architectures for Bi-Sequence Classification Tasks | Anirban Laha, V. Raykar | SSeg | 17 16 0 | 17 Jul 2016 |
| Neural Discourse Modeling of Conversations | J. Pierre, M. Butler, Jacob Portnoff, Luis Aguilar | | 6 2 0 | 15 Jul 2016 |
| End-to-end training of object class detectors for mean average precision | Paul Henderson, V. Ferrari | ObjD | 33 259 0 | 12 Jul 2016 |
| Domain Adaptation for Neural Networks by Parameter Augmentation | Yusuke Watanabe, Kazuma Hashimoto, Yoshimasa Tsuruoka | OOD | 19 6 0 | 01 Jul 2016 |
| Sequence-Level Knowledge Distillation | Yoon Kim, Alexander M. Rush | | 47 1,098 0 | 25 Jun 2016 |
| Captioning Images with Diverse Objects | Subhashini Venugopalan, Lisa Anne Hendricks, Marcus Rohrbach, Raymond J. Mooney, Trevor Darrell, Kate Saenko | VLM | 27 178 0 | 24 Jun 2016 |
| Is a Picture Worth Ten Thousand Words in a Review Dataset? | Roberto Camacho Barranco, Laura M. Rodriguez, Rebecca Urbina, M. S. Hossain | LMTD | 36 3 0 | 23 Jun 2016 |
| CUNI System for WMT16 Automatic Post-Editing and Multimodal Translation Tasks | Jindrich Libovický, Jindřich Helcl, Marek Tlustý, Pavel Pecina, Ondrej Bojar | | 11 67 0 | 23 Jun 2016 |
| Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions | F. Carrara, Andrea Esuli, T. Fagni, Fabrizio Falchi, Alejandro Moreo | DiffM | 24 31 0 | 23 Jun 2016 |
| LSTM-Based Predictions for Proactive Information Retrieval | Petri Luukkonen, M. Koskela, P. Floréen | RALM HAI | 20 10 0 | 20 Jun 2016 |
| DualNet: Domain-Invariant Network for Visual Question Answering | Kuniaki Saito, Andrew Shin, Yoshitaka Ushiku, Tatsuya Harada | | 29 58 0 | 20 Jun 2016 |
| Smart Reply: Automated Response Suggestion for Email | Anjuli Kannan, Karol Kurach, Sujith Ravi, Tobias Kaufmann, Andrew Tomkins, ..., G. Corrado, László Lukács, Marina Ganea, Peter Young, Vivek Ramavajjala | VLM | 16 309 0 | 15 Jun 2016 |
| A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation | Amrita Saha, Mitesh M. Khapra, A. Chandar, Janarthanan Rajendran, Kyunghyun Cho | | 22 18 0 | 15 Jun 2016 |
| Unsupervised Learning of Predictors from Unpaired Input-Output Samples | Jianshu Chen, Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng | OOD SSL | 26 8 0 | 15 Jun 2016 |
| Bidirectional Long-Short Term Memory for Video Description | Yi Bin, Yang Yang, Zi Huang, Fumin Shen, Xing Xu, Heng Tao Shen | | 39 60 0 | 15 Jun 2016 |