ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1412.6632
  4. Cited By
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
v1v2v3v4v5 (latest)

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

20 December 2014
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
    VLM
ArXiv (abs)PDFHTML

Papers citing "Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)"

50 / 417 papers shown
Title
Improving Factor-Based Quantitative Investing by Forecasting Company
  Fundamentals
Improving Factor-Based Quantitative Investing by Forecasting Company Fundamentals
J. Alberg
Zachary Chase Lipton
AI4TS
71
48
0
13 Nov 2017
Phrase-based Image Captioning with Hierarchical LSTM Model
Phrase-based Image Captioning with Hierarchical LSTM Model
Y. Tan
Chee Seng Chan
VLM
31
4
0
11 Nov 2017
A Neural-Symbolic Approach to Design of CAPTCHA
A Neural-Symbolic Approach to Design of CAPTCHA
Qiuyuan Huang
P. Smolensky
Xiaodong He
Li Deng
D. Wu
AAML
63
1
0
29 Oct 2017
Learning Social Image Embedding with Deep Multimodal Attention Networks
Learning Social Image Embedding with Deep Multimodal Attention Networks
Feiran Huang
Xiaoming Zhang
Zhoujun Li
Tao Mei
Yueying He
Zhonghua Zhao
59
20
0
18 Oct 2017
Tensor Product Generation Networks for Deep NLP Modeling
Tensor Product Generation Networks for Deep NLP Modeling
Qiuyuan Huang
P. Smolensky
Xiaodong He
Li Deng
D. Wu
87
3
0
26 Sep 2017
Fooling Vision and Language Models Despite Localization and Attention
  Mechanism
Fooling Vision and Language Models Despite Localization and Attention Mechanism
Xiaojun Xu
Xinyun Chen
Chang-rui Liu
Anna Rohrbach
Trevor Darrell
Basel Alomair
AAML
106
41
0
25 Sep 2017
Self-Guiding Multimodal LSTM - when we do not have a perfect training
  dataset for image captioning
Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning
Yang Xian
Yingli Tian
VLM
59
23
0
15 Sep 2017
Stack-Captioning: Coarse-to-Fine Learning for Image Captioning
Stack-Captioning: Coarse-to-Fine Learning for Image Captioning
Jiuxiang Gu
Jianfei Cai
G. Wang
Tsuhan Chen
110
181
0
11 Sep 2017
Predicting Visual Features from Text for Image and Video Caption
  Retrieval
Predicting Visual Features from Text for Image and Video Caption Retrieval
Jianfeng Dong
Xirong Li
Cees G. M. Snoek
94
226
0
05 Sep 2017
Image2song: Song Retrieval via Bridging Image Content and Lyric Words
Image2song: Song Retrieval via Bridging Image Content and Lyric Words
Xuelong Li
Di Hu
Xiaoqiang Lu
53
10
0
19 Aug 2017
VQS: Linking Segmentations to Questions and Answers for Supervised
  Attention in VQA and Question-Focused Semantic Segmentation
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
Chuang Gan
Yandong Li
Haoxiang Li
Chen Sun
Boqing Gong
106
127
0
15 Aug 2017
Fluency-Guided Cross-Lingual Image Captioning
Fluency-Guided Cross-Lingual Image Captioning
Weiyu Lan
Xirong Li
Jianfeng Dong
71
95
0
15 Aug 2017
Learning to Disambiguate by Asking Discriminative Questions
Learning to Disambiguate by Asking Discriminative Questions
Yining Li
Chen Huang
Xiaoou Tang
Chen Change Loy
65
22
0
09 Aug 2017
Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual
  Cross Retrieval
Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval
Yuming Shen
Li Liu
Ling Shao
Jingkuan Song
65
49
0
08 Aug 2017
What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption
  Generator?
What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator?
Marc Tanti
Albert Gatt
K. Camilleri
48
56
0
07 Aug 2017
Identity-Aware Textual-Visual Matching with Latent Co-attention
Identity-Aware Textual-Visual Matching with Latent Co-attention
Shuang Li
Tong Xiao
Hongsheng Li
Wei Yang
Xiaogang Wang
103
230
0
07 Aug 2017
Discover and Learn New Objects from Documentaries
Discover and Learn New Objects from Documentaries
Kai-xiang Chen
Hang Song
Chen Change Loy
Dahua Lin
ObjD
82
20
0
30 Jul 2017
Deep Interactive Region Segmentation and Captioning
Deep Interactive Region Segmentation and Captioning
Ali Sharifi Boroujerdi
M. Khanian
M. Breuß
55
7
0
26 Jul 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual
  Question Answering
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
239
4,232
0
25 Jul 2017
Image Pivoting for Learning Multilingual Multimodal Representations
Image Pivoting for Learning Multilingual Multimodal Representations
Spandana Gella
Rico Sennrich
Frank Keller
Mirella Lapata
SSL
90
78
0
24 Jul 2017
OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts
OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts
Xuwang Yin
Vicente Ordonez
VLM
100
55
0
22 Jul 2017
Learning Visually Grounded Sentence Representations
Learning Visually Grounded Sentence Representations
Douwe Kiela
Alexis Conneau
Allan Jabri
Maximilian Nickel
SSL
88
69
0
19 Jul 2017
Order-Free RNN with Visual Attention for Multi-Label Classification
Order-Free RNN with Visual Attention for Multi-Label Classification
Shang-Fu Chen
Yi-Chen Chen
Chih-Kuan Yeh
Y. Wang
121
145
0
18 Jul 2017
MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis
  Network
MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
Zizhao Zhang
Yuanpu Xie
Fuyong Xing
M. McGough
Ling Yang
MedIm
68
303
0
08 Jul 2017
Actor-Critic Sequence Training for Image Captioning
Actor-Critic Sequence Training for Image Captioning
Li Zhang
Flood Sung
Feng Liu
Tao Xiang
S. Gong
Yongxin Yang
Timothy M. Hospedales
86
111
0
29 Jun 2017
Image Captioning with Object Detection and Localization
Image Captioning with Object Detection and Localization
Zhongliang Yang
Yujin Zhang
S. Rehman
Yongfeng Huang
ObjDVLM
50
47
0
08 Jun 2017
Order embeddings and character-level convolutions for multimodal
  alignment
Order embeddings and character-level convolutions for multimodal alignment
Jonatas Wehrmann
Anderson Mattjie
Rodrigo C. Barros
37
27
0
03 Jun 2017
Listen, Interact and Talk: Learning to Speak via Interaction
Listen, Interact and Talk: Learning to Speak via Interaction
Haichao Zhang
Haonan Yu
Wenyuan Xu
77
13
0
28 May 2017
Multimodal Machine Learning: A Survey and Taxonomy
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
175
2,967
0
26 May 2017
Deep image representations using caption generators
Deep image representations using caption generators
Konda Reddy Mopuri
Vishal B. Athreya
R. Venkatesh Babu
VLMSSL
26
1
0
25 May 2017
Attention-based Natural Language Person Retrieval
Attention-based Natural Language Person Retrieval
Tao Zhou
Muhao Chen
Jie Yu
Demetri Terzopoulos
39
14
0
24 May 2017
Object-Level Context Modeling For Scene Classification with Context-CNN
Object-Level Context Modeling For Scene Classification with Context-CNN
Syed Ashar Javed
A. Nelakanti
VLM
81
10
0
11 May 2017
Image Annotation using Multi-Layer Sparse Coding
Image Annotation using Multi-Layer Sparse Coding
Amara Tariq
H. Foroosh
31
2
0
06 May 2017
TALL: Temporal Activity Localization via Language Query
TALL: Temporal Activity Localization via Language Query
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia
174
828
0
05 May 2017
STAIR Captions: Constructing a Large-Scale Japanese Image Caption
  Dataset
STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset
Yuya Yoshikawa
Yutaro Shigeto
A. Takeuchi
3DV
69
118
0
02 May 2017
Spatio-temporal Person Retrieval via Natural Language Queries
Spatio-temporal Person Retrieval via Natural Language Queries
Masataka Yamaguchi
Kuniaki Saito
Yoshitaka Ushiku
Tatsuya Harada
96
58
0
26 Apr 2017
Inception Recurrent Convolutional Neural Network for Object Recognition
Inception Recurrent Convolutional Neural Network for Object Recognition
Md. Zahangir Alom
Mahmudul Hasan
C. Yakopcic
T. Taha
60
88
0
25 Apr 2017
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition
Yufei Wang
Zhe Lin
Xiaohui Shen
Scott D. Cohen
G. Cottrell
89
106
0
23 Apr 2017
Spatial Memory for Context Reasoning in Object Detection
Spatial Memory for Context Reasoning in Object Detection
Xinlei Chen
Abhinav Gupta
ObjD
101
166
0
13 Apr 2017
Discriminative Bimodal Networks for Visual Localization and Detection
  with Natural Language Queries
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Y. Zhang
Luyao Yuan
Yijie Guo
Zhiyuan He
I-An Huang
Honglak Lee
ObjD
92
57
0
12 Apr 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li Li
65
324
0
12 Apr 2017
Reformulating Level Sets as Deep Recurrent Neural Network Approach to
  Semantic Segmentation
Reformulating Level Sets as Deep Recurrent Neural Network Approach to Semantic Segmentation
Ngan Le
Kha Gia Quach
Khoa Luu
Marios Savvides
Chenchen Zhu
69
71
0
12 Apr 2017
Creativity: Generating Diverse Questions using Variational Autoencoders
Creativity: Generating Diverse Questions using Variational Autoencoders
Unnat Jain
Ziyu Zhang
Alex Schwing
72
152
0
11 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
110
498
0
11 Apr 2017
Generating Descriptions with Grounded and Co-Referenced People
Generating Descriptions with Grounded and Co-Referenced People
Anna Rohrbach
Marcus Rohrbach
Siyu Tang
Seong Joon Oh
Bernt Schiele
417
72
0
05 Apr 2017
Weakly Supervised Dense Video Captioning
Weakly Supervised Dense Video Captioning
Zhiqiang Shen
Jianguo Li
Zhou Su
Minjun Li
Yurong Chen
Yu-Gang Jiang
Xiangyang Xue
78
135
0
05 Apr 2017
AMC: Attention guided Multi-modal Correlation Learning for Image Search
AMC: Attention guided Multi-modal Correlation Learning for Image Search
Kan Chen
Trung Bui
Chen Fang
Zhaowen Wang
Ram Nevatia
69
38
0
03 Apr 2017
Speaking the Same Language: Matching Machine to Human Captions by
  Adversarial Training
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training
Rakshith Shetty
Marcus Rohrbach
Lisa Anne Hendricks
Mario Fritz
Bernt Schiele
89
144
0
30 Mar 2017
Survey of the State of the Art in Natural Language Generation: Core
  tasks, applications and evaluation
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt
E. Krahmer
LM&MAELM
153
828
0
29 Mar 2017
Where to put the Image in an Image Caption Generator
Where to put the Image in an Image Caption Generator
Marc Tanti
Albert Gatt
K. Camilleri
80
96
0
27 Mar 2017
Previous
123456789
Next