ResearchTrend.AI
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

20 December 2014
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
    VLM

Papers citing "Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)"

50 / 417 papers shown
Improving Classification by Improving Labelling: Introducing Probabilistic Multi-Label Object Interaction Recognition
Michael Wray
Davide Moltisanti
W. Mayol-Cuevas
Dima Damen
30
2
0
24 Mar 2017
Recurrent Multimodal Interaction for Referring Image Segmentation
Chenxi Liu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Alan Yuille
EgoV
36
234
0
23 Mar 2017
Recurrent Topic-Transition GAN for Visual Paragraph Generation
Xiaodan Liang
Zhiting Hu
Huatian Zhang
Chuang Gan
Eric Xing
GAN
27
200
0
21 Mar 2017
Person Search with Natural Language Description
Shuang Li
Tong Xiao
Hongsheng Li
Bolei Zhou
Dayu Yue
Xiaogang Wang
24
386
0
19 Feb 2017
Auxiliary Multimodal LSTM for Audio-visual Speech Recognition and Lipreading
Chunlin Tian
Weijun Ji
27
7
0
16 Jan 2017
Attention-Based Multimodal Fusion for Video Description
Chiori Hori
Takaaki Hori
Teng-Yok Lee
Kazuhiro Sumi
J. Hershey
Tim K. Marks
41
359
0
11 Jan 2017
Learning Visual N-Grams from Web Data
Ang Li
Allan Jabri
Armand Joulin
Laurens van der Maaten
VLM
20
136
0
29 Dec 2016
Image-Text Multi-Modal Representation Learning by Adversarial Backpropagation
Gwangbeen Park
Woobin Im
GAN
16
25
0
26 Dec 2016
Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task
Nan Ding
Sebastian Goodman
Fei Sha
Radu Soricut
VLM
27
9
0
22 Dec 2016
An Empirical Study of Language CNN for Image Captioning
Jiuxiang Gu
G. Wang
Jianfei Cai
Tsuhan Chen
31
132
0
21 Dec 2016
Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering
Hao Liu
Yang Yang
Fumin Shen
Lixin Duan
Heng Tao Shen
38
9
0
15 Dec 2016
Text-guided Attention Model for Image Captioning
Jonghwan Mun
Minsu Cho
Bohyung Han
VLM
15
92
0
12 Dec 2016
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
85
1,442
0
06 Dec 2016
Areas of Attention for Image Captioning
M. Pedersoli
Thomas Lucas
Cordelia Schmid
Jakob Verbeek
33
205
0
03 Dec 2016
Guided Open Vocabulary Image Captioning with Constrained Beam Search
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
23
232
0
02 Dec 2016
Video Captioning with Multi-Faceted Attention
Xiang Long
Chuang Gan
Gerard de Melo
24
88
0
01 Dec 2016
Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images
Junhua Mao
Jiajing Xu
Yushi Jing
Alan Yuille
13
48
0
24 Nov 2016
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
53
425
0
23 Nov 2016
Learning Generic Sentence Representations Using Convolutional Neural Networks
Zhe Gan
Yunchen Pu
Ricardo Henao
Chunyuan Li
Xiaodong He
Lawrence Carin
SSL
34
98
0
23 Nov 2016
Adaptive Feature Abstraction for Translating Video to Text
Yunchen Pu
Martin Renqiang Min
Zhe Gan
Lawrence Carin
41
14
0
23 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li-Jia Li
VLM
30
169
0
21 Nov 2016
A Hierarchical Approach for Generating Descriptive Image Paragraphs
J. Krause
Justin Johnson
Ranjay Krishna
Li Fei-Fei
VLM
36
373
0
20 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Long Chen
Hanwang Zhang
Jun Xiao
Liqiang Nie
Jian Shao
Wei Liu
Tat-Seng Chua
27
1,650
0
17 Nov 2016
Semantic Regularisation for Recurrent Image Annotation
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
37
103
0
16 Nov 2016
Dual Attention Networks for Multimodal Reasoning and Matching
Hyeonseob Nam
Jung-Woo Ha
Jeonghee Kim
36
664
0
02 Nov 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges
Kushal Kafle
Christopher Kanan
OOD
27
235
0
05 Oct 2016
A Survey of Multi-View Representation Learning
Yingming Li
Ming Yang
Zhongfei Zhang
AI4TS
3DV
37
509
0
03 Oct 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
30
850
0
21 Sep 2016
GeThR-Net: A Generalized Temporally Hybrid Recurrent Neural Network for Multimodal Information Fusion
Ankit Gandhi
Arjun Sharma
Arijit Biswas
Om Deshmukh
AI4TS
21
12
0
17 Sep 2016
Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories
Mark Harmon
Abdolghani Ebrahimi
P. Lucey
Diego Klabjan
GAN
25
18
0
15 Sep 2016
Multimodal Attention for Neural Machine Translation
Ozan Caglayan
Loïc Barrault
Fethi Bougares
31
75
0
13 Sep 2016
Measuring Machine Intelligence Through Visual Question Answering
C. L. Zitnick
Aishwarya Agrawal
Stanislaw Antol
Margaret Mitchell
Dhruv Batra
Devi Parikh
27
37
0
31 Aug 2016
Utilizing Large Scale Vision and Text Datasets for Image Segmentation from Referring Expressions
Ronghang Hu
Marcus Rohrbach
Subhashini Venugopalan
Trevor Darrell
VLM
17
18
0
30 Aug 2016
phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning
Y. Tan
Chee Seng Chan
VLM
22
29
0
20 Aug 2016
Detecting Sarcasm in Multimodal Social Platforms
Rossano Schifanella
Paloma de Juan
Joel R. Tetreault
Liangliang Cao
23
167
0
08 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
32
1,230
0
31 Jul 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
36
1,885
0
29 Jul 2016
A Comprehensive Survey on Cross-modal Retrieval
Kun Wang
Qiyue Yin
Wei Wang
Shu Wu
Liang Wang
42
294
0
21 Jul 2016
Visual Question Answering: A Survey of Methods and Datasets
Qi Wu
Damien Teney
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
32
413
0
20 Jul 2016
Captioning Images with Diverse Objects
Subhashini Venugopalan
Lisa Anne Hendricks
Marcus Rohrbach
Raymond J. Mooney
Trevor Darrell
Kate Saenko
VLM
27
178
0
24 Jun 2016
Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions
F. Carrara
Andrea Esuli
T. Fagni
Fabrizio Falchi
Alejandro Moreo
DiffM
24
31
0
23 Jun 2016
Watch What You Just Said: Image Captioning with Text-Conditional Attention
Luowei Zhou
Chenliang Xu
Parker A. Koch
Jason J. Corso
VLM
22
44
0
15 Jun 2016
Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation
Jie Zhou
Ying Cao
Xuguang Wang
Peng Li
Wenyuan Xu
AIMat
19
215
0
14 Jun 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
167
1,465
0
06 Jun 2016
Storytelling of Photo Stream with Bidirectional Multi-thread Recurrent Neural Network
Yu Liu
Jianlong Fu
Tao Mei
C. Chen
13
4
0
02 Jun 2016
Attention Correctness in Neural Image Captioning
Chenxi Liu
Junhua Mao
Fei Sha
Alan Yuille
3DV
38
220
0
31 May 2016
SNN: Stacked Neural Networks
Milad Mohammadi
Subhasis Das
13
15
0
27 May 2016
Generative Adversarial Text to Image Synthesis
Scott E. Reed
Zeynep Akata
Xinchen Yan
Lajanugen Logeswaran
Bernt Schiele
Honglak Lee
GAN
52
3,126
0
17 May 2016
Learning Deep Representations of Fine-grained Visual Descriptions
Scott E. Reed
Zeynep Akata
Bernt Schiele
Honglak Lee
OCL
VLM
176
840
0
17 May 2016
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
32
353
0
12 May 2016