1410.1090
Cited By
Explain Images with Multimodal Recurrent Neural Networks
4 October 2014
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Alan Yuille
VLM
GAN
Papers citing
"Explain Images with Multimodal Recurrent Neural Networks"
50 / 116 papers shown
Title
User-Aware Prefix-Tuning is a Good Learner for Personalized Image Captioning
Xuan Wang
Guanhong Wang
Wenhao Chai
Jiayu Zhou
Gaoang Wang
37
4
0
08 Dec 2023
Object Recognition as Next Token Prediction
Kaiyu Yue
Borchun Chen
Jonas Geiping
Hengduo Li
Tom Goldstein
Ser-Nam Lim
40
9
0
04 Dec 2023
A Survey on Image-text Multimodal Models
Ruifeng Guo
Jingxuan Wei
Linzhuang Sun
Khai Le-Duc
Guiyong Chang
Dawei Liu
Sibo Zhang
Zhengbing Yao
Mingjun Xu
Liping Bu
VLM
31
5
0
23 Sep 2023
A Comprehensive Analysis of Real-World Image Captioning and Scene Identification
Sai Suprabhanu Nallapaneni
Subrahmanyam Konakanchi
30
2
0
05 Aug 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
31
4
0
08 Feb 2023
Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors
R. Burgert
Kanchana Ranasinghe
Xiang Li
Michael S. Ryoo
DiffM
VLM
34
37
0
23 Nov 2022
Language Models Can See: Plugging Visual Controls in Text Generation
Yixuan Su
Tian Lan
Yahui Liu
Fangyu Liu
Dani Yogatama
Yan Wang
Lingpeng Kong
Nigel Collier
VLM
MLLM
46
97
0
05 May 2022
Knowledge-enriched Attention Network with Group-wise Semantic for Visual Storytelling
Tengpeng Li
Hanli Wang
Bin He
Changan Chen
DiffM
21
9
0
10 Mar 2022
Geometry-Entangled Visual Semantic Transformer for Image Captioning
Ling Cheng
Wei Wei
Feida Zhu
Yong-jin Liu
Chunyan Miao
ViT
21
3
0
29 Sep 2021
Heterogeneous Contrastive Learning
Lecheng Zheng
Jinjun Xiong
Yada Zhu
Jingrui He
40
21
0
19 May 2021
Characterization and recognition of handwritten digits using Julia
Md Asifuzzaman Jishan
M. Alam
A. Islam
I. R. Mazumder
K. Mahmud
A. K. Azad
19
0
0
24 Feb 2021
Image to Bengali Caption Generation Using Deep CNN and Bidirectional Gated Recurrent Unit
Albay Faruk
Hasan Al Faraby
M. M. Azad
Md. Riduyan Fedous
Md. Kishor Morol
17
15
0
22 Dec 2020
Intrinsic Image Captioning Evaluation
Chao Zeng
Sam Kwong
21
0
0
14 Dec 2020
Watch and Learn: Mapping Language and Noisy Real-world Videos with Self-supervision
Yujie Zhong
Linhai Xie
Sen Wang
Lucia Specia
Yishu Miao
SSL
11
0
0
19 Nov 2020
TextMage: The Automated Bangla Caption Generator Based On Deep Learning
Abrar Hasin Kamal
Md Asifuzzaman Jishan
N. Mansoor
VLM
8
17
0
15 Oct 2020
X-Linear Attention Networks for Image Captioning
Yingwei Pan
Ting Yao
Yehao Li
Tao Mei
20
509
0
31 Mar 2020
Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis
Alexander Schindler
15
8
0
01 Feb 2020
On Architectures for Including Visual Information in Neural Language Models for Image Description
Marc Tanti
Albert Gatt
K. Camilleri
VLM
30
2
0
09 Nov 2019
Exploring Overall Contextual Information for Image Captioning in Human-Like Cognitive Style
Hongwei Ge
Zehang Yan
Kai Zhang
Mingde Zhao
Liang Sun
30
24
0
15 Oct 2019
Visually Grounded Generation of Entailments from Premises
Somayeh Jafaritazehjani
Albert Gatt
Marc Tanti
LRM
27
1
0
21 Sep 2019
Using Clinical Notes with Time Series Data for ICU Management
Swaraj Khadanga
Karan Aggarwal
Chenyu You
Jaideep Srivastava
8
55
0
12 Sep 2019
Conditional Text Generation for Harmonious Human-Machine Interaction
Bin Guo
Hao Wang
Yasan Ding
Wei Wu
Shaoyang Hao
Yueqi Sun
Zhiwen Yu
21
4
0
08 Sep 2019
Stack-VS: Stacked Visual-Semantic Attention for Image Caption Generation
Wei Wei
Ling Cheng
Xian-Ling Mao
Guangyou Zhou
Feida Zhu
DiffM
22
19
0
05 Sep 2019
A Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling
Haoran Chen
Ke Lin
A. Maye
Jianmin Li
Xiaoling Hu
25
47
0
31 Aug 2019
Modeling question asking using neural program generation
ZiYun Wang
Brenden M. Lake
16
7
0
23 Jul 2019
Kite: Automatic speech recognition for unmanned aerial vehicles
Dan Oneaţă
H. Cucu
21
13
0
02 Jul 2019
AI-Powered Text Generation for Harmonious Human-Machine Interaction: Current State and Future Directions
Qiuyun Zhang
Bin Guo
Hao Wang
Yunji Liang
Shaoyang Hao
Zhiwen Yu
15
6
0
01 May 2019
Improving Image Captioning by Leveraging Knowledge Graphs
Yimin Zhou
Yiwei Sun
Vasant Honavar
VLM
14
54
0
25 Jan 2019
Transfer learning from language models to image caption generators: Better models may not transfer better
Marc Tanti
Albert Gatt
K. Camilleri
VLM
23
3
0
01 Jan 2019
A Comprehensive Survey of Deep Learning for Image Captioning
Md Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
45
760
0
06 Oct 2018
LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts
Shuming Ma
Lei Cui
Damai Dai
Furu Wei
Xu Sun
VGen
23
61
0
13 Sep 2018
Diverse and Coherent Paragraph Generation from Images
Moitreya Chatterjee
A. Schwing
19
66
0
03 Sep 2018
Live Video Comment Generation Based on Surrounding Frames and Live Comments
Damai Dai
VGen
8
0
0
13 Aug 2018
Doubly Attentive Transformer Machine Translation
Hasan Sait Arslan
Mark Fishel
G. Anbarjafari
32
13
0
30 Jul 2018
Improving Image Captioning with Conditional Generative Adversarial Nets
Chen Chen
Shuai Mu
Wanpeng Xiao
Zexiong Ye
Liesi Wu
Qi Ju
GAN
29
90
0
18 May 2018
Video Object Detection with an Aligned Spatial-Temporal Memory
Fanyi Xiao
Yong Jae Lee
49
189
0
18 Dec 2017
Learning Semantic Concepts and Order for Image and Sentence Matching
Yan Huang
Qi Wu
Liang Wang
VLM
8
302
0
06 Dec 2017
Learning Functional Causal Models with Generative Neural Networks
Hugo Jair Escalante
Sergio Escalera
Xavier Baro
Isabelle M Guyon
Umut Güçlü
Marcel van Gerven
CML
BDL
20
107
0
15 Sep 2017
What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator?
Marc Tanti
Albert Gatt
K. Camilleri
21
56
0
07 Aug 2017
Identity-Aware Textual-Visual Matching with Latent Co-attention
Shuang Li
Tong Xiao
Hongsheng Li
Wei Yang
Xiaogang Wang
22
227
0
07 Aug 2017
Deep Interactive Region Segmentation and Captioning
Ali Sharifi Boroujerdi
M. Khanian
M. Breuß
24
7
0
26 Jul 2017
Multimedia Semantic Integrity Assessment Using Joint Embedding Of Images And Text
Ayush Jaiswal
Ekraam Sabir
Wael Abd-Almageed
Premkumar Natarajan
16
44
0
06 Jul 2017
MirBot: A collaborative object recognition system for smartphones using convolutional neural networks
A. Pertusa
Antonio Javier Gallego
Marisa Bernabeu
ObjD
14
13
0
09 Jun 2017
Query-adaptive Video Summarization via Quality-aware Relevance Estimation
A. Vasudevan
Michael Gygli
Anna Volokitin
Luc Van Gool
32
93
0
01 May 2017
Where to put the Image in an Image Caption Generator
Marc Tanti
Albert Gatt
K. Camilleri
47
96
0
27 Mar 2017
A New Evaluation Protocol and Benchmarking Results for Extendable Cross-media Retrieval
Ruoyu Liu
Yao Zhao
Liang Zheng
Shikui Wei
Yi Yang
25
12
0
10 Mar 2017
Gated Multimodal Units for Information Fusion
John Arevalo
Thamar Solorio
Manuel Montes-y-Gómez
Fabio Gonzalez
33
371
0
07 Feb 2017
Doubly-Attentive Decoder for Multi-modal Neural Machine Translation
Iacer Calixto
Qun Liu
N. Campbell
40
179
0
04 Feb 2017
Incorporating Global Visual Features into Attention-Based Neural Machine Translation
Iacer Calixto
Qun Liu
Nick Campbell
32
154
0
23 Jan 2017
Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering
Hao Liu
Yang Yang
Fumin Shen
Lixin Duan
Heng Tao Shen
30
9
0
15 Dec 2016