Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

20 December 2014

Yi Yang

Papers citing "Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)"

50 / 417 papers shown

Title
Improving Factor-Based Quantitative Investing by Forecasting Company Fundamentals J. Alberg Zachary Chase Lipton AI4TS 71 48 0 13 Nov 2017
Phrase-based Image Captioning with Hierarchical LSTM Model Y. Tan Chee Seng Chan VLM 31 4 0 11 Nov 2017
A Neural-Symbolic Approach to Design of CAPTCHA Qiuyuan Huang P. Smolensky Xiaodong He Li Deng D. Wu AAML 63 1 0 29 Oct 2017
Learning Social Image Embedding with Deep Multimodal Attention Networks Feiran Huang Xiaoming Zhang Zhoujun Li Tao Mei Yueying He Zhonghua Zhao 59 20 0 18 Oct 2017
Tensor Product Generation Networks for Deep NLP Modeling Qiuyuan Huang P. Smolensky Xiaodong He Li Deng D. Wu 87 3 0 26 Sep 2017
Fooling Vision and Language Models Despite Localization and Attention Mechanism Xiaojun Xu Xinyun Chen Chang-rui Liu Anna Rohrbach Trevor Darrell Basel Alomair AAML 106 41 0 25 Sep 2017
Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning Yang Xian Yingli Tian VLM 59 23 0 15 Sep 2017
Stack-Captioning: Coarse-to-Fine Learning for Image Captioning Jiuxiang Gu Jianfei Cai G. Wang Tsuhan Chen 110 181 0 11 Sep 2017
Predicting Visual Features from Text for Image and Video Caption Retrieval Jianfeng Dong Xirong Li Cees G. M. Snoek 94 226 0 05 Sep 2017
Image2song: Song Retrieval via Bridging Image Content and Lyric Words Xuelong Li Di Hu Xiaoqiang Lu 53 10 0 19 Aug 2017
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation Chuang Gan Yandong Li Haoxiang Li Chen Sun Boqing Gong 106 127 0 15 Aug 2017
Fluency-Guided Cross-Lingual Image Captioning Weiyu Lan Xirong Li Jianfeng Dong 71 95 0 15 Aug 2017
Learning to Disambiguate by Asking Discriminative Questions Yining Li Chen Huang Xiaoou Tang Chen Change Loy 65 22 0 09 Aug 2017
Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval Yuming Shen Li Liu Ling Shao Jingkuan Song 65 49 0 08 Aug 2017
What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator? Marc Tanti Albert Gatt K. Camilleri 48 56 0 07 Aug 2017
Identity-Aware Textual-Visual Matching with Latent Co-attention Shuang Li Tong Xiao Hongsheng Li Wei Yang Xiaogang Wang 103 230 0 07 Aug 2017
Discover and Learn New Objects from Documentaries Kai-xiang Chen Hang Song Chen Change Loy Dahua Lin ObjD 82 20 0 30 Jul 2017
Deep Interactive Region Segmentation and Captioning Ali Sharifi Boroujerdi M. Khanian M. Breuß 55 7 0 26 Jul 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering Peter Anderson Xiaodong He Chris Buehler Damien Teney Mark Johnson Stephen Gould Lei Zhang AIMat 239 4,232 0 25 Jul 2017
Image Pivoting for Learning Multilingual Multimodal Representations Spandana Gella Rico Sennrich Frank Keller Mirella Lapata SSL 90 78 0 24 Jul 2017
OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts Xuwang Yin Vicente Ordonez VLM 100 55 0 22 Jul 2017
Learning Visually Grounded Sentence Representations Douwe Kiela Alexis Conneau Allan Jabri Maximilian Nickel SSL 88 69 0 19 Jul 2017
Order-Free RNN with Visual Attention for Multi-Label Classification Shang-Fu Chen Yi-Chen Chen Chih-Kuan Yeh Y. Wang 121 145 0 18 Jul 2017
MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network Zizhao Zhang Yuanpu Xie Fuyong Xing M. McGough Ling Yang MedIm 68 303 0 08 Jul 2017
Actor-Critic Sequence Training for Image Captioning Li Zhang Flood Sung Feng Liu Tao Xiang S. Gong Yongxin Yang Timothy M. Hospedales 86 111 0 29 Jun 2017
Image Captioning with Object Detection and Localization Zhongliang Yang Yujin Zhang S. Rehman Yongfeng Huang ObjD VLM 50 47 0 08 Jun 2017
Order embeddings and character-level convolutions for multimodal alignment Jonatas Wehrmann Anderson Mattjie Rodrigo C. Barros 37 27 0 03 Jun 2017
Listen, Interact and Talk: Learning to Speak via Interaction Haichao Zhang Haonan Yu Wenyuan Xu 77 13 0 28 May 2017
Multimodal Machine Learning: A Survey and Taxonomy T. Baltrušaitis Chaitanya Ahuja Louis-Philippe Morency 175 2,967 0 26 May 2017
Deep image representations using caption generators Konda Reddy Mopuri Vishal B. Athreya R. Venkatesh Babu VLM SSL 26 1 0 25 May 2017
Attention-based Natural Language Person Retrieval Tao Zhou Muhao Chen Jie Yu Demetri Terzopoulos 39 14 0 24 May 2017
Object-Level Context Modeling For Scene Classification with Context-CNN Syed Ashar Javed A. Nelakanti VLM 81 10 0 11 May 2017
Image Annotation using Multi-Layer Sparse Coding Amara Tariq H. Foroosh 31 2 0 06 May 2017
TALL: Temporal Activity Localization via Language Query J. Gao Chen Sun Zhenheng Yang Ram Nevatia 174 828 0 05 May 2017
STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset Yuya Yoshikawa Yutaro Shigeto A. Takeuchi 3DV 69 118 0 02 May 2017
Spatio-temporal Person Retrieval via Natural Language Queries Masataka Yamaguchi Kuniaki Saito Yoshitaka Ushiku Tatsuya Harada 96 58 0 26 Apr 2017
Inception Recurrent Convolutional Neural Network for Object Recognition Md. Zahangir Alom Mahmudul Hasan C. Yakopcic T. Taha 60 88 0 25 Apr 2017
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition Yufei Wang Zhe Lin Xiaohui Shen Scott D. Cohen G. Cottrell 89 106 0 23 Apr 2017
Spatial Memory for Context Reasoning in Object Detection Xinlei Chen Abhinav Gupta ObjD 101 166 0 13 Apr 2017
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries Y. Zhang Luyao Yuan Yijie Guo Zhiyuan He I-An Huang Honglak Lee ObjD 92 57 0 12 Apr 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward Zhou Ren Xiaoyu Wang Ning Zhang Xutao Lv Li Li 65 324 0 12 Apr 2017
Reformulating Level Sets as Deep Recurrent Neural Network Approach to Semantic Segmentation Ngan Le Kha Gia Quach Khoa Luu Marios Savvides Chenchen Zhu 69 71 0 12 Apr 2017
Creativity: Generating Diverse Questions using Variational Autoencoders Unnat Jain Ziyu Zhang Alex Schwing 72 152 0 11 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks Liwei Wang Yin Li Jing-ling Huang Svetlana Lazebnik VLM 110 498 0 11 Apr 2017
Generating Descriptions with Grounded and Co-Referenced People Anna Rohrbach Marcus Rohrbach Siyu Tang Seong Joon Oh Bernt Schiele 417 72 0 05 Apr 2017
Weakly Supervised Dense Video Captioning Zhiqiang Shen Jianguo Li Zhou Su Minjun Li Yurong Chen Yu-Gang Jiang Xiangyang Xue 78 135 0 05 Apr 2017
AMC: Attention guided Multi-modal Correlation Learning for Image Search Kan Chen Trung Bui Chen Fang Zhaowen Wang Ram Nevatia 69 38 0 03 Apr 2017
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training Rakshith Shetty Marcus Rohrbach Lisa Anne Hendricks Mario Fritz Bernt Schiele 89 144 0 30 Mar 2017
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation Albert Gatt E. Krahmer LM&MA ELM 153 828 0 29 Mar 2017
Where to put the Image in an Image Caption Generator Marc Tanti Albert Gatt K. Camilleri 80 96 0 27 Mar 2017