Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown
Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information
Zhongjie Ye
Helin Wang
Dongchao Yang
Yuexian Zou
106
28
0
12 Oct 2021
Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos
Zongmeng Zhang
Xianjing Han
Xuemeng Song
Yan Yan
Liqiang Nie
120
37
0
12 Oct 2021
Topic Scene Graph Generation by Attention Distillation from Caption
Wenbin Wang
R. Wang
X. Chen
DiffM
94
14
0
12 Oct 2021
Reason induced visual attention for explainable autonomous driving
Sikai Chen
Jiqian Dong
Runjia Du
Yujie Li
Samuel Labi
68
1
0
11 Oct 2021
Semi-Autoregressive Image Captioning
Xu Yan
Zhengcong Fei
Zekang Li
Shuhui Wang
Qingming Huang
Qi Tian
91
25
0
11 Oct 2021
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
Yangguang Li
Feng Liang
Lichen Zhao
Yufeng Cui
Wanli Ouyang
Jing Shao
F. Yu
Junjie Yan
VLM
CLIP
167
458
0
11 Oct 2021
Recurrent Attention Models with Object-centric Capsule Representation for Multi-object Recognition
Hossein Adeli
Seoyoung Ahn
G. Zelinsky
OCL
58
3
0
11 Oct 2021
Accessible Visualization via Natural Language Descriptions: A Four-Level Model of Semantic Content
Alan Lundgard
Arvind Satyanarayan
59
136
0
08 Oct 2021
End-to-End Supermask Pruning: Learning to Prune Image Captioning Models
J. Tan
C. Chan
Joon Huang Chuah
VLM
132
16
0
07 Oct 2021
Attentive Walk-Aggregating Graph Neural Networks
M. F. Demirel
Shengchao Liu
Siddhant Garg
Zhenmei Shi
Yingyu Liang
133
10
0
06 Oct 2021
Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning
Ali Furkan Biten
L. G. I. Bigorda
Dimosthenis Karatzas
168
63
0
04 Oct 2021
Trustworthy AI: From Principles to Practices
Yue Liu
Peng Qi
Bo Liu
Shuai Di
Jingen Liu
Jiquan Pei
Jinfeng Yi
Bowen Zhou
213
384
0
04 Oct 2021
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images
Zhuowan Li
Elias Stengel-Eskin
Yixiao Zhang
Cihang Xie
Q. Tran
Benjamin Van Durme
Alan Yuille
VLM
73
15
0
01 Oct 2021
Geometry Attention Transformer with Position-aware LSTMs for Image Captioning
Chi-Yin Wang
Yulin Shen
Luping Ji
ViT
113
53
0
01 Oct 2021
Multi-granular Legal Topic Classification on Greek Legislation
C. Papaloukas
Ilias Chalkidis
Konstantinos Athinaios
D. Pantazi
Manolis Koubarakis
AILaw
80
25
0
30 Sep 2021
Google Neural Network Models for Edge Devices: Analyzing and Mitigating Machine Learning Inference Bottlenecks
Amirali Boroumand
Saugata Ghose
Berkin Akin
Ravi Narayanaswami
Geraldo F. Oliveira
Xiaoyu Ma
Eric Shiu
O. Mutlu
80
86
0
29 Sep 2021
Geometry-Entangled Visual Semantic Transformer for Image Captioning
Ling Cheng
Wei Wei
Feida Zhu
Yong Liu
Chunyan Miao
ViT
47
3
0
29 Sep 2021
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Florian Strohm
Prajit Dhar
Andreas Bulling
67
19
0
27 Sep 2021
Optimising for Interpretability: Convolutional Dynamic Alignment Networks
Moritz D Boehle
Mario Fritz
Bernt Schiele
39
3
0
27 Sep 2021
The JDDC 2.0 Corpus: A Large-Scale Multimodal Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service
Nan Zhao
Haoran Li
Youzheng Wu
Xiaodong He
Bowen Zhou
50
9
0
27 Sep 2021
Weakly Supervised Contrastive Learning for Chest X-Ray Report Generation
An Yan
Zexue He
Xing Lu
Jingfeng Du
E. Chang
Amilcare Gentili
Julian McAuley
Chun-Nan Hsu
MedIm
183
65
0
25 Sep 2021
Scene Graph Generation for Better Image Captioning?
Maximilian Mozes
Martin Schmitt
Vladimir Golkov
Hinrich Schütze
Zorah Lähner
GNN
76
3
0
23 Sep 2021
Cross-Modal Coherence for Text-to-Image Retrieval
Malihe Alikhani
Fangda Han
Hareesh Ravi
Mubbasir Kapadia
Vladimir Pavlovic
Matthew Stone
72
9
0
22 Sep 2021
Pix2seq: A Language Modeling Framework for Object Detection
Ting-Li Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
307
351
0
22 Sep 2021
Caption Enriched Samples for Improving Hateful Memes Detection
Efrat Blaier
Itzik Malkiel
Lior Wolf
VLM
96
24
0
22 Sep 2021
Latexify Math: Mathematical Formula Markup Revision to Assist Collaborative Editing in Math Q&A Sites
Suyu Ma
Chunyang Chen
Hourieh Khalajzadeh
J. Grundy
HAI
AIMat
36
5
0
20 Sep 2021
Multimodal Incremental Transformer with Visual Grounding for Visual Dialogue Generation
Feilong Chen
Fandong Meng
Xiuyi Chen
Peng Li
Jie Zhou
102
23
0
17 Sep 2021
GoG: Relation-aware Graph-over-Graph Network for Visual Dialog
Feilong Chen
Xiuyi Chen
Fandong Meng
Peng Li
Jie Zhou
145
35
0
17 Sep 2021
Cross Modification Attention Based Deliberation Model for Image Captioning
Zheng Lian
Yanan Zhang
Haichang Li
Rui Wang
Xiaohui Hu
69
5
0
17 Sep 2021
Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning
Shikha Dubey
Farrukh Olimov
M. Rafique
Joonmo Kim
M. Jeon
ViT
84
43
0
16 Sep 2021
SafeAccess+: An Intelligent System to make Smart Home Safer and Americans with Disability Act Compliant
Shahinur Alam
40
2
0
14 Sep 2021
DAFNe: A One-Stage Anchor-Free Approach for Oriented Object Detection
Steven Lang
Fabrizio G. Ventola
Kristian Kersting
88
15
0
13 Sep 2021
Learning to Ground Visual Objects for Visual Dialog
Feilong Chen
Xiuyi Chen
Can Xu
Daxin Jiang
OOD
94
18
0
13 Sep 2021
Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation
Zechen Bai
Yuta Nakashima
Noa Garcia
118
44
0
13 Sep 2021
Bornon: Bengali Image Captioning with Transformer-based Deep learning approach
Faisal Muhammad Shah
Mayeesha Humaira
Md Abidur Rahman Khan Jim
Amit Saha Ami
Shimul Paul
55
19
0
11 Sep 2021
We went to look for meaning and all we got were these lousy representations: aspects of meaning representation for computational semantics
Simon Dobnik
R. Cooper
Adam Ek
Bill Noble
Staffan Larsson
N. Ilinykh
Vladislav Maraev
Vidya Somashekarappa
66
0
0
10 Sep 2021
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng
Meng-Hao Guo
Hongxu Chen
Xia Li
Ke Wei
Zhouchen Lin
125
142
0
09 Sep 2021
Dynamic Modeling of Hand-Object Interactions via Tactile Sensing
Qiang Zhang
Yunzhu Li
Yiyue Luo
Wan Shou
Michael Foshey
Junchi Yan
J. Tenenbaum
Wojciech Matusik
Antonio Torralba
69
18
0
09 Sep 2021
Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal Attention
Katsuyuki Nakamura
Hiroki Ohashi
Mitsuhiro Okada
EgoV
94
13
0
07 Sep 2021
Journalistic Guidelines Aware News Image Captioning
Xuewen Yang
Svebor Karaman
Joel R. Tetreault
Alex Jaimes
90
27
0
07 Sep 2021
Ultra-high Resolution Image Segmentation via Locality-aware Context Fusion and Alternating Local Enhancement
Wenxi Liu
Qi Li
Xin Lin
Weixiang Yang
Shengfeng He
Yuanlong Yu
78
8
0
06 Sep 2021
LAViTeR: Learning Aligned Visual and Textual Representations Assisted by Image and Caption Generation
Mohammad Abuzar Shaikh
Zhanghexuan Ji
Dana Moukheiber
Yan Shen
S. Srihari
Mingchen Gao
VLM
65
1
0
04 Sep 2021
Attentive Neural Controlled Differential Equations for Time-series Classification and Forecasting
Sheo Yon Jhin
H. Shin
Seoyoung Hong
Solhee Park
Noseong Park
AI4TS
66
24
0
04 Sep 2021
IMG2SMI: Translating Molecular Structure Images to Simplified Molecular-input Line-entry System
Daniel Fernando Campos
Heng Ji
66
12
0
03 Sep 2021
Sequence-to-Sequence Learning with Latent Neural Grammars
Yoon Kim
168
40
0
02 Sep 2021
Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond
Amir Feder
Katherine A. Keith
Emaad A. Manzoor
Reid Pryzant
Dhanya Sridhar
...
Roi Reichart
Margaret E. Roberts
Brandon M Stewart
Victor Veitch
Diyi Yang
CML
123
246
0
02 Sep 2021
Working Memory Connections for LSTM
Federico Landi
Lorenzo Baraldi
Marcella Cornia
Rita Cucchiara
KELM
74
173
0
31 Aug 2021
Automated Generation of Accurate & Fluent Medical X-ray Reports
Hoang T.N. Nguyen
Dong Nie
Taivanbat Badamdorj
Yujie Liu
Yingying Zhu
J. Truong
Li Cheng
MedIm
LM&MA
73
40
0
27 Aug 2021
Similar Scenes arouse Similar Emotions: Parallel Data Augmentation for Stylized Image Captioning
Guodun Li
Yuchen Zhai
Zehao Lin
Yin Zhang
114
21
0
26 Aug 2021
Glimpse-Attend-and-Explore: Self-Attention for Active Visual Exploration
Soroush Seifi
Abhishek Jha
Tinne Tuytelaars
44
10
0
26 Aug 2021