1412.6632
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
20 December 2014
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
VLM
Papers citing
"Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)"
50 / 417 papers shown
Title
Improving Classification by Improving Labelling: Introducing Probabilistic Multi-Label Object Interaction Recognition
Michael Wray
Davide Moltisanti
W. Mayol-Cuevas
Dima Damen
67
2
0
24 Mar 2017
Recurrent Multimodal Interaction for Referring Image Segmentation
Chenxi Liu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Alan Yuille
EgoV
94
241
0
23 Mar 2017
Recurrent Topic-Transition GAN for Visual Paragraph Generation
Xiaodan Liang
Zhiting Hu
Huatian Zhang
Chuang Gan
Eric Xing
GAN
92
203
0
21 Mar 2017
Person Search with Natural Language Description
Shuang Li
Tong Xiao
Hongsheng Li
Bolei Zhou
Dayu Yue
Xiaogang Wang
105
397
0
19 Feb 2017
Auxiliary Multimodal LSTM for Audio-visual Speech Recognition and Lipreading
Chunlin Tian
Weijun Ji
32
7
0
16 Jan 2017
Attention-Based Multimodal Fusion for Video Description
Chiori Hori
Takaaki Hori
Teng-Yok Lee
Kazuhiro Sumi
J. Hershey
Tim K. Marks
95
361
0
11 Jan 2017
Learning Visual N-Grams from Web Data
Ang Li
Allan Jabri
Armand Joulin
Laurens van der Maaten
VLM
85
138
0
29 Dec 2016
Image-Text Multi-Modal Representation Learning by Adversarial Backpropagation
Gwangbeen Park
Woobin Im
GAN
68
25
0
26 Dec 2016
Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task
Nan Ding
Sebastian Goodman
Fei Sha
Radu Soricut
VLM
80
9
0
22 Dec 2016
An Empirical Study of Language CNN for Image Captioning
Jiuxiang Gu
G. Wang
Jianfei Cai
Tsuhan Chen
95
134
0
21 Dec 2016
Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering
Hao Liu
Yang Yang
Fumin Shen
Lixin Duan
Heng Tao Shen
65
9
0
15 Dec 2016
Text-guided Attention Model for Image Captioning
Jonghwan Mun
Minsu Cho
Bohyung Han
VLM
59
93
0
12 Dec 2016
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
142
1,458
0
06 Dec 2016
Areas of Attention for Image Captioning
M. Pedersoli
Thomas Lucas
Cordelia Schmid
Jakob Verbeek
115
206
0
03 Dec 2016
Guided Open Vocabulary Image Captioning with Constrained Beam Search
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
95
237
0
02 Dec 2016
Video Captioning with Multi-Faceted Attention
Xiang Long
Chuang Gan
Gerard de Melo
87
88
0
01 Dec 2016
Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images
Junhua Mao
Jiajing Xu
Yushi Jing
Alan Yuille
48
48
0
24 Nov 2016
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
112
427
0
23 Nov 2016
Learning Generic Sentence Representations Using Convolutional Neural Networks
Zhe Gan
Yunchen Pu
Ricardo Henao
Chunyuan Li
Xiaodong He
Lawrence Carin
SSL
95
98
0
23 Nov 2016
Adaptive Feature Abstraction for Translating Video to Text
Yunchen Pu
Martin Renqiang Min
Zhe Gan
Lawrence Carin
72
14
0
23 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li Li
VLM
103
170
0
21 Nov 2016
A Hierarchical Approach for Generating Descriptive Image Paragraphs
J. Krause
Justin Johnson
Ranjay Krishna
Li Fei-Fei
VLM
106
379
0
20 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Long Chen
Hanwang Zhang
Jun Xiao
Liqiang Nie
Jian Shao
Wei Liu
Tat-Seng Chua
84
1,666
0
17 Nov 2016
Semantic Regularisation for Recurrent Image Annotation
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
107
104
0
16 Nov 2016
Dual Attention Networks for Multimodal Reasoning and Matching
Hyeonseob Nam
Jung-Woo Ha
Jeonghee Kim
134
670
0
02 Nov 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges
Kushal Kafle
Christopher Kanan
OOD
112
244
0
05 Oct 2016
A Survey of Multi-View Representation Learning
Yingming Li
Ming Yang
Zhongfei Zhang
AI4TS
3DV
346
517
0
03 Oct 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
138
856
0
21 Sep 2016
GeThR-Net: A Generalized Temporally Hybrid Recurrent Neural Network for Multimodal Information Fusion
Ankit Gandhi
Arjun Sharma
Arijit Biswas
Om Deshmukh
AI4TS
34
12
0
17 Sep 2016
Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories
Mark Harmon
Abdolghani Ebrahimi
P. Lucey
Diego Klabjan
GAN
78
20
0
15 Sep 2016
Multimodal Attention for Neural Machine Translation
Ozan Caglayan
Loïc Barrault
Fethi Bougares
84
76
0
13 Sep 2016
Measuring Machine Intelligence Through Visual Question Answering
C. L. Zitnick
Aishwarya Agrawal
Stanislaw Antol
Margaret Mitchell
Dhruv Batra
Devi Parikh
83
37
0
31 Aug 2016
Utilizing Large Scale Vision and Text Datasets for Image Segmentation from Referring Expressions
Ronghang Hu
Marcus Rohrbach
Subhashini Venugopalan
Trevor Darrell
VLM
65
18
0
30 Aug 2016
phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning
Y. Tan
Chee Seng Chan
VLM
114
29
0
20 Aug 2016
Detecting Sarcasm in Multimodal Social Platforms
Rossano Schifanella
Paloma de Juan
Joel R. Tetreault
Liangliang Cao
77
170
0
08 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
135
1,281
0
31 Jul 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
162
1,930
0
29 Jul 2016
A Comprehensive Survey on Cross-modal Retrieval
Kun Wang
Qiyue Yin
Wei Wang
Shu Wu
Liang Wang
88
298
0
21 Jul 2016
Visual Question Answering: A Survey of Methods and Datasets
Qi Wu
Damien Teney
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
118
418
0
20 Jul 2016
Captioning Images with Diverse Objects
Subhashini Venugopalan
Lisa Anne Hendricks
Marcus Rohrbach
Raymond J. Mooney
Trevor Darrell
Kate Saenko
VLM
91
178
0
24 Jun 2016
Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions
F. Carrara
Andrea Esuli
T. Fagni
Fabrizio Falchi
Alejandro Moreo
DiffM
56
30
0
23 Jun 2016
Watch What You Just Said: Image Captioning with Text-Conditional Attention
Luowei Zhou
Chenliang Xu
Parker A. Koch
Jason J. Corso
VLM
72
44
0
15 Jun 2016
Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation
Jie Zhou
Ying Cao
Xuguang Wang
Peng Li
Wenyuan Xu
AIMat
92
217
0
14 Jun 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
353
1,471
0
06 Jun 2016
Storytelling of Photo Stream with Bidirectional Multi-thread Recurrent Neural Network
Yu Liu
Jianlong Fu
Tao Mei
C. Chen
51
4
0
02 Jun 2016
Attention Correctness in Neural Image Captioning
Chenxi Liu
Junhua Mao
Fei Sha
Alan Yuille
3DV
102
221
0
31 May 2016
SNN: Stacked Neural Networks
Milad Mohammadi
Subhasis Das
36
15
0
27 May 2016
Generative Adversarial Text to Image Synthesis
Scott E. Reed
Zeynep Akata
Xinchen Yan
Lajanugen Logeswaran
Bernt Schiele
Honglak Lee
GAN
211
3,152
0
17 May 2016
Learning Deep Representations of Fine-grained Visual Descriptions
Scott E. Reed
Zeynep Akata
Bernt Schiele
Honglak Lee
OCL
VLM
211
843
0
17 May 2016
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
86
361
0
12 May 2016