Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

20 December 2014
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
    VLM

Papers citing "Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)"

50 / 417 papers shown
Automatic Rule Induction for Interpretable Semi-Supervised Learning
Reid Pryzant
Ziyi Yang
Yichong Xu
Chenguang Zhu
Michael Zeng
81
10
0
18 May 2022
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Chia-Wen Kuo
Z. Kira
97
55
0
09 May 2022
Diverse Image Captioning with Grounded Style
Franz Klein
Shweta Mahajan
S. Roth
72
7
0
03 May 2022
X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks
Zhaowei Cai
Gukyeong Kwon
Avinash Ravichandran
Erhan Bas
Zhuowen Tu
Rahul Bhotika
Stefano Soatto
ObjD, MLLM, VLM
67
50
0
12 Apr 2022
On Distinctive Image Captioning via Comparing and Reweighting
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
87
16
0
08 Apr 2022
Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition
Peipei Zhu
Tianlin Li
Yong Luo
Zhenglong Sun
Wei-Shi Zheng
Yaowei Wang
Chen Chen
102
12
0
07 Mar 2022
Vision-Language Intelligence: Tasks, Representation Learning, and Large Models
Feng Li
Hao Zhang
Yi-Fan Zhang
Shixuan Liu
Jian Guo
L. Ni
Pengchuan Zhang
Lei Zhang
AI4TS, VLM
79
37
0
03 Mar 2022
Inference of captions from histopathological patches
M. Tsuneki
F. Kanavati
84
32
0
07 Feb 2022
EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators
Lois Orosa
Skanda Koppula
Yaman Umuroglu
Konstantinos Kanellopoulos
Juan Gómez Luna
Michaela Blott
K. Vissers
O. Mutlu
82
4
0
04 Feb 2022
Multi-Label Classification on Remote-Sensing Images
A. Singh
B. Uma Shankar
38
0
0
06 Jan 2022
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Yoad Tewel
Yoav Shalev
Idan Schwartz
Lior Wolf
VLM
122
197
0
29 Nov 2021
Contrastive Learning of Visual-Semantic Embeddings
Anurag Jain
Yashaswi Verma
SSL
66
1
0
17 Oct 2021
Geometry Attention Transformer with Position-aware LSTMs for Image Captioning
Chi-Yin Wang
Yulin Shen
Luping Ji
ViT
106
53
0
01 Oct 2021
Cross Modification Attention Based Deliberation Model for Image Captioning
Zheng Lian
Yanan Zhang
Haichang Li
Rui Wang
Xiaohui Hu
64
5
0
17 Sep 2021
Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal Attention
Katsuyuki Nakamura
Hiroki Ohashi
Mitsuhiro Okada
EgoV
94
13
0
07 Sep 2021
Group-based Distinctive Image Captioning with Memory Attention
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
100
18
0
20 Aug 2021
Caption Generation on Scenes with Seen and Unseen Object Categories
B. Demirel
R. G. Cinbis
VLM
115
1
0
13 Aug 2021
A Better Loss for Visual-Textual Grounding
Davide Rigoni
Luciano Serafini
A. Sperduti
ObjD
60
3
0
11 Aug 2021
Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning
Bryan Wang
Gang Li
Xin Zhou
Zhourong Chen
Tovi Grossman
Yang Li
207
160
0
07 Aug 2021
Structured Multi-modal Feature Embedding and Alignment for Image-Sentence Retrieval
Xuri Ge
Fuhai Chen
J. Jose
Zhilong Ji
Zhongqin Wu
Xiao-Chang Liu
72
57
0
05 Aug 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV, VLM, MLLM
153
270
0
14 Jul 2021
A comparison of LSTM and GRU networks for learning symbolic sequences
Roberto Cahuantzi
Xinye Chen
S. Güttel
96
143
0
05 Jul 2021
Parts2Words: Learning Joint Embedding of Point Clouds and Texts by Bidirectional Matching between Parts and Words
Chuan Tang
Xi Yang
Bojian Wu
Zhizhong Han
Yi Chang
3DPC
91
13
0
05 Jul 2021
Case Relation Transformer: A Crossmodal Language Generation Model for Fetching Instructions
Motonari Kambara
K. Sugiura
ViT
62
6
0
02 Jul 2021
OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation
Jing Liu
Xinxin Zhu
Fei Liu
Longteng Guo
Zijia Zhao
...
Weining Wang
Hanqing Lu
Shiyu Zhou
Jiajun Zhang
Jinqiao Wang
82
38
0
01 Jul 2021
New Encoder Learning for Captioning Heavy Rain Images via Semantic Visual Feature Matching
Chang-Hwan Son
Pung-Hwi Ye
123
3
0
28 May 2021
Writing by Memorizing: Hierarchical Retrieval-based Medical Report Generation
Xingyi Yang
Muchao Ye
Quanzeng You
Fenglong Ma
MedIm
57
38
0
25 May 2021
Survey of Visual-Semantic Embedding Methods for Zero-Shot Image Retrieval
K. Ueki
45
4
0
16 May 2021
End-to-End Attention-based Image Captioning
Carola Sundaramoorthy
Lin Ziwen Kelvin
Mahak Sarin
Shubham Gupta
ViT
57
6
0
30 Apr 2021
Multi-view Deep One-class Classification: A Systematic Exploration
Siqi Wang
Jiyuan Liu
Guang Yu
Xinwang Liu
Sihang Zhou
En Zhu
Yuexiang Yang
Jianping Yin
24
1
0
27 Apr 2021
Towards Open-World Text-Guided Face Image Generation and Manipulation
Weihao Xia
Yujiu Yang
Jing-Hao Xue
Baoyuan Wu
DiffM
69
42
0
18 Apr 2021
Integrating Information Theory and Adversarial Learning for Cross-modal Retrieval
Wei Chen
Yu Liu
E. Bakker
M. Lew
GAN
41
27
0
11 Apr 2021
A Comprehensive Review of the Video-to-Text Problem
Jesus Perez-Martin
B. Bustos
S. Guimarães
I. Sipiran
Jorge A. Pérez
Grethel Coello Said
71
17
0
27 Mar 2021
Sequential Learning on Liver Tumor Boundary Semantics and Prognostic Biomarker Mining
Jieneng Chen
K. Yan
Yu-Dong Zhang
Youbao Tang
Xun Xu
...
Lingyun Huang
Jing Xiao
Alan Yuille
Ya Zhang
Le Lu
30
2
0
09 Mar 2021
Analysis of Convolutional Decoder for Image Caption Generation
Sulabh Katiyar
S. Borgohain
52
0
0
08 Mar 2021
A Universal Model for Cross Modality Mapping by Relational Reasoning
Zun Li
Congyan Lang
Liqian Liang
Tao Wang
Songhe Feng
Jun Wu
Yidong Li
56
2
0
26 Feb 2021
Comparative evaluation of CNN architectures for Image Caption Generation
Sulabh Katiyar
S. Borgohain
74
24
0
23 Feb 2021
Image Captioning using Deep Stacked LSTMs, Contextual Word Embeddings and Data Augmentation
Sulabh Katiyar
S. Borgohain
VLM
59
14
0
22 Feb 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Wei-Ning Hsu
David Harwath
Christopher Song
James R. Glass
CLIP
90
67
0
31 Dec 2020
SubICap: Towards Subword-informed Image Captioning
Naeha Sharif
Bennamoun
Wei Liu
Syed Afaq Ali Shah
45
2
0
24 Dec 2020
AutoCaption: Image Captioning with Neural Architecture Search
Xinxin Zhu
Weining Wang
Longteng Guo
Jing Liu
102
9
0
16 Dec 2020
StacMR: Scene-Text Aware Cross-Modal Retrieval
Andrés Mafla
Rafael Sampaio de Rezende
Lluís Gómez
Diane Larlus
Dimosthenis Karatzas
3DV
102
14
0
08 Dec 2020
TediGAN: Text-Guided Diverse Face Image Generation and Manipulation
Weihao Xia
Yujiu Yang
Jing-Hao Xue
Baoyuan Wu
DiffM
118
23
0
06 Dec 2020
Robust Image Captioning
Daniel Yarnell
Xian Wang
46
0
0
06 Dec 2020
Understanding Guided Image Captioning Performance across Domains
Edwin G. Ng
Bo Pang
P. Sharma
Radu Soricut
118
25
0
04 Dec 2020
BERT-hLSTMs: BERT and Hierarchical LSTMs for Visual Storytelling
Jing Su
Qingyun Dai
Frank Guerin
Mian Zhou
70
24
0
03 Dec 2020
Diverse Image Captioning with Context-Object Split Latent Spaces
Shweta Mahajan
Stefan Roth
64
42
0
02 Nov 2020
Personalized Multimodal Feedback Generation in Education
Haochen Liu
Zitao Liu
Zhongqin Wu
Jiliang Tang
54
9
0
31 Oct 2020
DialogueTRM: Exploring the Intra- and Inter-Modal Emotional Behaviors in the Conversation
Yuzhao Mao
Qi Sun
Guang Liu
Xiaojie Wang
Weiguo Gao
Xuan Li
Jianping Shen
75
26
0
15 Oct 2020
Spatial Attention as an Interface for Image Captioning Models
P. Sadler
53
0
0
29 Sep 2020