1410.1090
Cited By
Explain Images with Multimodal Recurrent Neural Networks
4 October 2014
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Alan Yuille
VLM
GAN
Papers citing
"Explain Images with Multimodal Recurrent Neural Networks"
50 / 116 papers shown
Title
Commonly Uncommon: Semantic Sparsity in Situation Recognition
Mark Yatskar
Vicente Ordonez
Luke Zettlemoyer
Ali Farhadi
VLM
17
42
0
03 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
155
3,136
0
02 Dec 2016
Video Captioning with Transferred Semantic Attributes
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
27
329
0
23 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li-Jia Li
VLM
30
169
0
21 Nov 2016
Instance-aware Image and Sentence Matching with Selective Multimodal LSTM
Yan Huang
Wei Wang
Liang Wang
26
222
0
17 Nov 2016
A Semi-supervised Framework for Image Captioning
Wenhu Chen
Aurelien Lucchi
Thomas Hofmann
37
9
0
16 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
48
620
0
05 Nov 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
30
851
0
21 Sep 2016
GeThR-Net: A Generalized Temporally Hybrid Recurrent Neural Network for Multimodal Information Fusion
Ankit Gandhi
Arjun Sharma
Arijit Biswas
Om Deshmukh
AI4TS
21
12
0
17 Sep 2016
Linking Image and Text with 2-Way Nets
Aviv Eisenschtat
Lior Wolf
27
176
0
29 Aug 2016
Learning to generalize to new compositions in image understanding
Yuval Atzmon
Jonathan Berant
Vahid Kazemi
Amir Globerson
Gal Chechik
26
67
0
27 Aug 2016
DeepDiary: Automatic Caption Generation for Lifelogging Image Streams
Chenyou Fan
David J. Crandall
DiffM
14
5
0
12 Aug 2016
Multilingual Visual Sentiment Concept Matching
Nikolaos Pappas
Miriam Redi
Mercan Topkara
Brendan Jou
Hongyi Liu
Tao Chen
Shih-Fu Chang
CVBM
26
14
0
07 Jun 2016
Automated Image Captioning for Rapid Prototyping and Resource Constrained Environments
Karan Sharma
Arun C. S. Kumar
S. Bhandarkar
20
0
0
04 Jun 2016
Annotation Order Matters: Recurrent Image Annotator for Arbitrary Length Image Tagging
Jiren Jin
Hideki Nakayama
3DV
VLM
30
69
0
18 Apr 2016
Generating Visual Explanations
Lisa Anne Hendricks
Zeynep Akata
Marcus Rohrbach
Jeff Donahue
Bernt Schiele
Trevor Darrell
VLM
FAtt
47
618
0
28 Mar 2016
Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation
Hoo-Chang Shin
Kirk Roberts
Le Lu
Dina Demner-Fushman
Jianhua Yao
Ronald M. Summers
24
347
0
28 Mar 2016
Content-based Video Indexing and Retrieval Using Corr-LDA
R. Iyer
Sanjeel Parekh
Vikas Mohandoss
Anush Ramsurat
Bhiksha Raj
Rita Singh
16
22
0
27 Feb 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li-Jia Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
108
5,663
0
23 Feb 2016
Generate Image Descriptions based on Deep RNN and Memory Cells for Images Features
Shijian Tang
Song Han
VLM
20
1
0
05 Feb 2016
Event Specific Multimodal Pattern Mining with Image-Caption Pairs
Hongzhi Li
Joseph G. Ellis
Shih-Fu Chang
6
2
0
31 Dec 2015
RNN Fisher Vectors for Action Recognition and Image Annotation
Guy Lev
Gil Sadeh
Benjamin Klein
Lior Wolf
19
163
0
12 Dec 2015
Neural Self Talk: Image Understanding via Continuous Questioning and Answering
Yezhou Yang
Yi Li
Cornelia Fermuller
Yiannis Aloimonos
19
24
0
10 Dec 2015
Natural Language Understanding with Distributed Representation
Kyunghyun Cho
GNN
BDL
21
55
0
24 Nov 2015
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
74
1,160
0
24 Nov 2015
Where To Look: Focus Regions for Visual Question Answering
Kevin J. Shih
Saurabh Singh
Derek Hoiem
34
456
0
23 Nov 2015
Visual Word2Vec (vis-w2v): Learning Visually Grounded Word Embeddings Using Abstract Scenes
Satwik Kottur
Ramakrishna Vedantam
José M. F. Moura
Devi Parikh
VLM
38
85
0
22 Nov 2015
Asymmetrically Weighted CCA And Hierarchical Kernel Sentence Embedding For Image & Text Retrieval
Youssef Mroueh
E. Marcheret
Vaibhava Goel
21
3
0
19 Nov 2015
Recurrent Neural Networks Hardware Implementation on FPGA
Andre Xian Ming Chang
B. Martini
Eugenio Culurciello
27
126
0
17 Nov 2015
Yin and Yang: Balancing and Answering Binary Visual Questions
Peng Zhang
Yash Goyal
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
37
349
0
16 Nov 2015
From Images to Sentences through Scene Description Graphs using Commonsense Reasoning and Knowledge
Somak Aditya
Yezhou Yang
Chitta Baral
Cornelia Fermuller
Yiannis Aloimonos
3DV
19
69
0
10 Nov 2015
Automatic Concept Discovery from Parallel Text and Visual Corpora
Chen Sun
Chuang Gan
Ram Nevatia
CoGe
12
107
0
24 Sep 2015
Image Representations and New Domains in Neural Image Captioning
Jack Hessel
Nicolas Savva
Michael J. Wilber
VLM
30
16
0
09 Aug 2015
Describing Multimedia Content using Attention-based Encoder-Decoder Networks
Kyunghyun Cho
Aaron Courville
Yoshua Bengio
32
411
0
04 Jul 2015
Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books
Yukun Zhu
Ryan Kiros
R. Zemel
Ruslan Salakhutdinov
R. Urtasun
Antonio Torralba
Sanja Fidler
60
2,517
0
22 Jun 2015
Aligning where to see and what to tell: image caption with region-based attention and scene factorization
Junqi Jin
Kun Fu
Runpeng Cui
Fei Sha
Changshui Zhang
34
117
0
20 Jun 2015
Learning language through pictures
Grzegorz Chrupała
Ákos Kádár
A. Alishahi
VLM
SSL
35
65
0
11 Jun 2015
Learning to Answer Questions From Image Using Convolutional Neural Network
Lin Ma
Zhengdong Lu
Hang Li
27
261
0
01 Jun 2015
A Multi-scale Multiple Instance Video Description Network
Huijuan Xu
Subhashini Venugopalan
Vasili Ramanishka
Marcus Rohrbach
Kate Saenko
40
64
0
21 May 2015
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
Haoyuan Gao
Junhua Mao
Jie Zhou
Zhiheng Huang
Lei Wang
Wenyuan Xu
32
496
0
21 May 2015
Visual Semantic Role Labeling
Saurabh Gupta
Jitendra Malik
29
404
0
17 May 2015
Exploring Nearest Neighbor Approaches for Image Captioning
Jacob Devlin
Saurabh Gupta
Ross B. Girshick
Margaret Mitchell
C. L. Zitnick
27
195
0
17 May 2015
Exploring Models and Data for Image Question Answering
Mengye Ren
Ryan Kiros
R. Zemel
44
711
0
08 May 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
41
535
0
07 May 2015
Language Models for Image Captioning: The Quirks and What Works
Jacob Devlin
Hao Cheng
Hao Fang
Saurabh Gupta
Li Deng
Xiaodong He
Geoffrey Zweig
Margaret Mitchell
32
281
0
07 May 2015
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
96
5,383
0
03 May 2015
Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images
Junhua Mao
Xu Wei
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
25
154
0
25 Apr 2015
Multimodal Convolutional Neural Networks for Matching Image and Sentence
Lin Ma
Zhengdong Lu
Lifeng Shang
Hang Li
38
337
0
23 Apr 2015
Microsoft COCO Captions: Data Collection and Evaluation Server
Xinlei Chen
Hao Fang
Nayeon Lee
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollar
C. L. Zitnick
97
2,434
0
01 Apr 2015
Generating Multi-Sentence Lingual Descriptions of Indoor Scenes
Dahua Lin
Chen Kong
Sanja Fidler
R. Urtasun
3DV
18
27
0
28 Feb 2015