Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown
Title
The Number of Steps Needed for Nonconvex Optimization of a Deep Learning Optimizer is a Rational Function of Batch Size
Hideaki Iiduka
81
2
0
26 Aug 2021
Monocular Depth Estimation Primed by Salient Point Detection and Normalized Hessian Loss
Lam Huynh
Matteo Pedone
Phong H. Nguyen
Jiří Matas
Esa Rahtu
J. Heikkilä
MDE, 3DPC
83
3
0
25 Aug 2021
Auto-Parsing Network for Image Captioning and Visual Question Answering
Xu Yang
Chongyang Gao
Hanwang Zhang
Jianfei Cai
117
37
0
24 Aug 2021
MG-GAN: A Multi-Generator Model Preventing Out-of-Distribution Samples in Pedestrian Trajectory Prediction
Patrick Dendorfer
Sven Elflein
Laura Leal-Taixé
62
100
0
20 Aug 2021
Group-based Distinctive Image Captioning with Memory Attention
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
105
18
0
20 Aug 2021
Single Image Defocus Deblurring Using Kernel-Sharing Parallel Atrous Convolutions
Hyeongseok Son
Junyong Lee
Sunghyun Cho
Seungyong Lee
SupR
64
95
0
20 Aug 2021
Deep Sequence Modeling: Development and Applications in Asset Pricing
Lingbo Cong
Ke Tang
Jingyuan Wang
Yang Zhang
44
15
0
20 Aug 2021
Causal Attention for Unbiased Visual Recognition
Tan Wang
Chan Zhou
Qianru Sun
Hanwang Zhang
OOD, CML
110
114
0
19 Aug 2021
Social Fabric: Tubelet Compositions for Video Relation Detection
Shuo Chen
Zenglin Shi
Pascal Mettes
Cees G. M. Snoek
ViT
83
21
0
18 Aug 2021
X-modaler: A Versatile and High-performance Codebase for Cross-modal Analytics
Yehao Li
Yingwei Pan
Jingwen Chen
Ting Yao
Tao Mei
VLM
88
31
0
18 Aug 2021
An Attention Module for Convolutional Neural Networks
Zhu Baozhou
P. Hofstee
Jinho Lee
Zaid Al-Ars
110
25
0
18 Aug 2021
Target Adaptive Context Aggregation for Video Scene Graph Generation
Yao Teng
Limin Wang
Zhifeng Li
Gangshan Wu
96
64
0
18 Aug 2021
VisBuddy -- A Smart Wearable Assistant for the Visually Challenged
Ishwarya Sivakumar
Nishaali Meenakshisundaram
Ishwarya Ramesh
Shiloah Elizabeth
Sunil Retmin
34
0
0
17 Aug 2021
Challenges for cognitive decoding using deep learning methods
A. Thomas
Christopher Ré
R. Poldrack
AI4CE
63
6
0
16 Aug 2021
Cross-Modal Graph with Meta Concepts for Video Captioning
Hao Wang
Guosheng Lin
Steven C. H. Hoi
Chunyan Miao
67
7
0
14 Aug 2021
Caption Generation on Scenes with Seen and Unseen Object Categories
B. Demirel
R. G. Cinbis
VLM
115
1
0
13 Aug 2021
Memory-based Semantic Segmentation for Off-road Unstructured Natural Environments
Youngsaeng Jin
D. Han
Hanseok Ko
33
13
0
12 Aug 2021
Medical-VLBERT: Medical Visual Language BERT for COVID-19 CT Report Generation With Alternate Learning
Guangyi Liu
Yinghong Liao
Fuyu Wang
Bin Zhang
Lu Zhang
...
Xiang Wan
Shaolin Li
Zhen Li
Shuixing Zhang
Shuguang Cui
114
59
0
11 Aug 2021
Adaptive Multi-Resolution Attention with Linear Complexity
Yao Zhang
Yunpu Ma
T. Seidl
Volker Tresp
37
1
0
10 Aug 2021
Understanding Character Recognition using Visual Explanations Derived from the Human Visual System and Deep Networks
Chetan Ralekar
Shubham Choudhary
Tapan K. Gandhi
S. Chaudhury
FAtt
26
1
0
10 Aug 2021
An Interpretable Approach to Hateful Meme Detection
Tanvi Deshpande
Nitya Mani
VLM
67
15
0
09 Aug 2021
Pose is all you need: The pose only group activity recognition system (POGARS)
Haritha Thilakarathne
Aiden Nibali
Zhen He
Stuart Morgan
53
28
0
09 Aug 2021
Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning
Bryan Wang
Gang Li
Xin Zhou
Zhourong Chen
Tovi Grossman
Yang Li
223
160
0
07 Aug 2021
Information Bottleneck Approach to Spatial Attention Learning
Qiuxia Lai
Yu Li
Ailing Zeng
Minhao Liu
Hanqiu Sun
Qiang Xu
98
9
0
07 Aug 2021
Neural Twins Talk & Alternative Calculations
Zanyar Zohourianshahzadi
Jugal Kalita
57
0
0
05 Aug 2021
Fast Convergence of DETR with Spatially Modulated Co-Attention
Peng Gao
Minghang Zheng
Xiaogang Wang
Jifeng Dai
Hongsheng Li
ViT
95
308
0
05 Aug 2021
Dual Graph Convolutional Networks with Transformer and Curriculum Learning for Image Captioning
Xinzhi Dong
Chengjiang Long
Wenju Xu
Chunxia Xiao
ViT
152
68
0
05 Aug 2021
O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Bang-ju Yang
Shen Ge
Yuexian Zou
Xu Sun
83
32
0
05 Aug 2021
Ordered Attention for Coherent Visual Storytelling
Tom Braude
Idan Schwartz
Alex Schwing
Ariel Shamir
69
9
0
04 Aug 2021
Question-controlled Text-aware Image Captioning
Anwen Hu
Shizhe Chen
Qin Jin
76
15
0
04 Aug 2021
ICECAP: Information Concentrated Entity-aware Image Captioning
Anwen Hu
Shizhe Chen
Qin Jin
61
20
0
04 Aug 2021
Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection
Chen Zhang
Runmin Cong
Qin Lin
Lin Ma
Feng Li
Yao-Min Zhao
Sam Kwong
77
108
0
04 Aug 2021
Vision Transformer with Progressive Sampling
Xiaoyu Yue
Shuyang Sun
Zhanghui Kuang
Meng Wei
Philip Torr
Wayne Zhang
Dahua Lin
ViT
89
85
0
03 Aug 2021
RAIN: Reinforced Hybrid Attention Inference Network for Motion Forecasting
Jiachen Li
Fan Yang
Hengbo Ma
Srikanth Malla
Masayoshi Tomizuka
Chiho Choi
91
42
0
03 Aug 2021
GTNet:Guided Transformer Network for Detecting Human-Object Interactions
A S M Iftekhar
Satish Kumar
R. McEver
Suya You
B. S. Manjunath
ViT
180
13
0
02 Aug 2021
Chest ImaGenome Dataset for Clinical Reasoning
Joy T. Wu
Nkechinyere N. Agu
Ismini Lourentzou
Arjun Sharma
J. Paguio
...
William Mitchell
Satyananda Kashyap
Andrea Giovannini
Leo Anthony Celi
Mehdi Moradi
58
67
0
31 Jul 2021
Enhancing Social Relation Inference with Concise Interaction Graph and Discriminative Scene Representation
Xiaotian Yu
Hanling Yi
Yi Yu
Ling Xing
Shiliang Zhang
Xiaoyu Wang
GNN
110
0
0
30 Jul 2021
ReFormer: The Relational Transformer for Image Captioning
Xuewen Yang
Yingru Liu
Xin Wang
ViT
103
57
0
29 Jul 2021
CI-Net: Contextual Information for Joint Semantic Segmentation and Depth Estimation
Tianxiao Gao
Wu Wei
Zhongbin Cai
Zhun Fan
Shane Xie
Xinmei Wang
Qiuda Yu
72
7
0
29 Jul 2021
Multimodal Co-learning: Challenges, Applications with Datasets, Recent Advances and Future Directions
Anil Rahate
Rahee Walambe
S. Ramanna
K. Kotecha
116
143
0
29 Jul 2021
The Who in XAI: How AI Background Shapes Perceptions of AI Explanations
Upol Ehsan
Samir Passi
Q. V. Liao
Larry Chan
I-Hsiang Lee
Michael J. Muller
Mark O. Riedl
97
96
0
28 Jul 2021
Predicting the Future from First Person (Egocentric) Vision: A Survey
Ivan Rodin
Antonino Furnari
Dimitrios Mavroeidis
G. Farinella
EgoV
104
44
0
28 Jul 2021
Experimenting with Self-Supervision using Rotation Prediction for Image Captioning
Ahmed Elhagry
Karima Kadaoui
SSL
37
0
0
28 Jul 2021
Transfer Learning in Electronic Health Records through Clinical Concept Embedding
J. R. A. Solares
Yajie Zhu
A. Hassaine
Shishir Rao
Yikuan Li
M. Mamouei
D. Canoy
K. Rahimi
G. Salimi-Khorshidi
126
6
0
27 Jul 2021
Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework
Miao Yin
Yang Sui
Siyu Liao
Bo Yuan
60
81
0
26 Jul 2021
Spatial-Temporal Transformer for Dynamic Scene Graph Generation
Yuren Cong
Wentong Liao
H. Ackermann
Bodo Rosenhahn
M. Yang
ViT
72
129
0
26 Jul 2021
Towards Unbiased Visual Emotion Recognition via Causal Intervention
Yuedong Chen
Xu Yang
Tat-Jen Cham
Jianfei Cai
OOD, CML
79
19
0
26 Jul 2021
Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph
Wentian Zhao
Yao Hu
Heda Wang
Xinxiao Wu
Jiebo Luo
55
49
0
26 Jul 2021
Will Multi-modal Data Improves Few-shot Learning?
Zilun Zhang
Shihao Ma
Yichun Zhang
34
2
0
25 Jul 2021
Multi-Label Image Classification with Contrastive Learning
Son D. Dao
Ethan Zhao
D.Q. Phung
Jianfei Cai
SSL
66
26
0
24 Jul 2021