Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown
Causal Attention for Interpretable and Generalizable Graph Classification
Yongduo Sui
Xiang Wang
Jiancan Wu
Min Lin
Xiangnan He
Tat-Seng Chua
CML OOD
90
161
0
30 Dec 2021
Radiology Report Generation with a Learned Knowledge Base and Multi-modal Alignment
Shuxin Yang
Xian Wu
Shen Ge
S.Kevin Zhou
Li Xiao
MedIm
70
97
0
30 Dec 2021
Knowledge Matters: Radiology Report Generation with General and Specific Knowledge
Shuxin Yang
Xian Wu
Shen Ge
S.Kevin Zhou
Li Xiao
MedIm
91
120
0
30 Dec 2021
An empirical user-study of text-based nonverbal annotation systems for human-human conversations
Joshua Y. Kim
K. Yacef
48
1
0
30 Dec 2021
Learning Spatially-Adaptive Squeeze-Excitation Networks for Image Synthesis and Image Recognition
Jianghao Shen
Tianfu Wu
ViT
49
0
0
29 Dec 2021
Synchronized Audio-Visual Frames with Fractional Positional Encoding for Transformers in Video-to-Text Translation
Philipp Harzig
Moritz Einfalt
Rainer Lienhart
ViT
68
2
0
28 Dec 2021
Associative Adversarial Learning Based on Selective Attack
Runqi Wang
Xiaoyue Duan
Baochang Zhang
Shenjun Xue
Wentao Zhu
David Doermann
G. Guo
AAML
83
0
0
28 Dec 2021
Adaptive Beam Search to Enhance On-device Abstractive Summarization
S. HarichandanaB.S.
Sumit Kumar
35
1
0
22 Dec 2021
Comparing radiologists' gaze and saliency maps generated by interpretability methods for chest x-rays
Ricardo Bigolin Lanfredi
Ambuj Arora
Trafton Drew
Joyce D. Schroeder
Tolga Tasdizen
MedIm
32
9
0
22 Dec 2021
Fusion of medical imaging and electronic health records with attention and multi-head machanisms
Cheng Jiang
Yihao Chen
Jianbo Chang
M. Feng
Renzhi Wang
Jianhua Yao
49
8
0
22 Dec 2021
Toward Explainable AI for Regression Models
S. Letzgus
Patrick Wagner
Jonas Lederer
Wojciech Samek
Klaus-Robert Muller
G. Montavon
XAI
97
67
0
21 Dec 2021
Continual Learning with Knowledge Transfer for Sentiment Classification
Zixuan Ke
Bing-Quan Liu
Hao Wang
Lei Shu
CLL
95
31
0
18 Dec 2021
Inherently Explainable Reinforcement Learning in Natural Language
Xiangyu Peng
Mark O. Riedl
Prithviraj Ammanabrolu
LRM
64
21
0
16 Dec 2021
Positional Encoding Augmented GAN for the Assessment of Wind Flow for Pedestrian Comfort in Urban Areas
Henrik Hoiness
Kristoffer Gjerde
L. Oggiano
K. E. Giljarhus
M. Ruocco
DiffM AI4CE
28
5
0
15 Dec 2021
Towards Controllable Agent in MOBA Games with Generative Modeling
Shubao Zhang
68
0
0
15 Dec 2021
Minimization of Stochastic First-order Oracle Complexity of Adaptive Methods for Nonconvex Optimization
Hideaki Iiduka
48
0
0
14 Dec 2021
Hybrid Graph Neural Networks for Few-Shot Learning
Tianyuan Yu
Sen He
Yi-Zhe Song
Tao Xiang
60
64
0
13 Dec 2021
PartGlot: Learning Shape Part Segmentation from Language Reference Games
Juil Koo
Ian Huang
Panos Achlioptas
Leonidas Guibas
Minhyuk Sung
3DPC
121
30
0
13 Dec 2021
Towards More Efficient Insertion Transformer with Fractional Positional Encoding
Zhisong Zhang
Yizhe Zhang
W. Dolan
101
0
0
12 Dec 2021
Neural Attention Models in Deep Learning: Survey and Taxonomy
Alana de Santana Correia
Esther Colombini
MLAU
50
19
0
11 Dec 2021
Quality-Aware Multimodal Biometric Recognition
Sobhan Soleymani
Ali Dabouei
Fariborz Taherkhani
Seyed Mehdi Iranmanesh
J. Dawson
Nasser M. Nasrabadi
CVBM
89
3
0
10 Dec 2021
VUT: Versatile UI Transformer for Multi-Modal Multi-Task User Interface Modeling
Yang Li
Gang Li
Xin Zhou
Mostafa Dehghani
A. Gritsenko
MLLM
94
36
0
10 Dec 2021
Injecting Semantic Concepts into End-to-End Image Captioning
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lin Liang
Zhe Gan
Lijuan Wang
Yezhou Yang
Zicheng Liu
ViT VLM
86
91
0
09 Dec 2021
Self-Supervised Image-to-Text and Text-to-Image Synthesis
Anindya Sundar Das
S. Saha
SSL
28
5
0
09 Dec 2021
Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection
Jiaqi Tang
Zhaoyang Liu
Chao Qian
Wayne Wu
Limin Wang
100
18
0
09 Dec 2021
Trajectory-Constrained Deep Latent Visual Attention for Improved Local Planning in Presence of Heterogeneous Terrain
Stefan Wapnick
Travis Manderson
David Meger
Gregory Dudek
94
5
0
09 Dec 2021
Relating Blindsight and AI: A Review
Joshua Bensemann
Qiming Bao
Gaël Gendron
Tim Hartill
Michael Witbrock
107
2
0
09 Dec 2021
Forecasting Brain Activity Based on Models of Spatio-Temporal Brain Dynamics: A Comparison of Graph Neural Network Architectures
S. Wein
Alina Schüller
A. Tomé
W. Malloni
M. Greenlee
E. Lang
AI4CE
86
15
0
08 Dec 2021
BA-Net: Bridge Attention for Deep Convolutional Neural Networks
Yue Zhao
Junzhou Chen
Zirui Zhang
Ronghui Zhang
89
17
0
08 Dec 2021
Active Sensing for Communications by Learning
Foad Sohrabi
Tao Jiang
Wei Cui
Wei Yu
111
56
0
08 Dec 2021
CMA-CLIP: Cross-Modality Attention CLIP for Image-Text Classification
Huidong Liu
Shaoyuan Xu
Jinmiao Fu
Yang Liu
Ning Xie
Chien Wang
Bryan Wang
Yi Sun
CLIP VLM
72
29
0
07 Dec 2021
Protecting Intellectual Property of Language Generation APIs with Lexical Watermark
Xuanli He
Xingliang Yuan
Lingjuan Lyu
Fangzhao Wu
Chenguang Wang
WaLM
249
98
0
05 Dec 2021
Explainable Deep Learning in Healthcare: A Methodological Survey from an Attribution View
Di Jin
Elena Sergeeva
W. Weng
Geeticka Chauhan
Peter Szolovits
OOD
120
58
0
05 Dec 2021
VT-CLIP: Enhancing Vision-Language Models with Visual-guided Texts
Longtian Qiu
Renrui Zhang
Ziyu Guo
Wei Zhang
Zilu Guo
Ziyao Zeng
Guangnan Zhang
VLM CLIP
80
45
0
04 Dec 2021
BAANet: Learning Bi-directional Adaptive Attention Gates for Multispectral Pedestrian Detection
Xiaoxiao Yang
Yeqian Qiang
Huijie Zhu
Chunxiang Wang
Ming Yang
65
35
0
04 Dec 2021
D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
Dave Zhenyu Chen
Qirui Wu
Matthias Nießner
Angel X. Chang
81
32
0
02 Dec 2021
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Yongming Rao
Wenliang Zhao
Guangyi Chen
Yansong Tang
Zheng Zhu
Guan Huang
Jie Zhou
Jiwen Lu
VLM CLIP
232
584
0
02 Dec 2021
SCNet: A Generalized Attention-based Model for Crack Fault Segmentation
Hrishikesh Sharma
Pandaba Pradhan
P. Balamuralidhar
72
6
0
02 Dec 2021
Attention based Occlusion Removal for Hybrid Telepresence Systems
Surabhi Gupta
Ashwath Shetty
Avinash Sharma
CVBM 3DH
49
2
0
02 Dec 2021
N-ImageNet: Towards Robust, Fine-Grained Object Recognition with Event Cameras
Junho Kim
Jaehyeok Bae
Gang-Ryeong Park
Dongsu Zhang
Y. Kim
ObjD
103
87
0
02 Dec 2021
Consensus Graph Representation Learning for Better Grounded Image Captioning
Wenqiao Zhang
Haochen Shi
Siliang Tang
Jun Xiao
Qiang Yu
Yueting Zhuang
81
56
0
02 Dec 2021
Object-Centric Unsupervised Image Captioning
Zihang Meng
David Yang
Xuefei Cao
Ashish Shah
Ser-Nam Lim
OCL VLM
80
12
0
02 Dec 2021
Visual-Semantic Transformer for Scene Text Recognition
Xin Tang
Yongquan Lai
Ying Liu
Yuanyuan Fu
Rui Fang
ViT
66
9
0
02 Dec 2021
Transformer-based Network for RGB-D Saliency Detection
Yue Wang
Xu Jia
Lu Zhang
Yuke Li
J. Elder
Huchuan Lu
ViT
113
5
0
01 Dec 2021
Weakly-Supervised Video Object Grounding via Causal Intervention
Wei Wang
Junyu Gao
Changsheng Xu
CML
104
22
0
01 Dec 2021
Dyadic Human Motion Prediction
Isinsu Katircioglu
C. Georgantas
Mathieu Salzmann
Pascal Fua
126
11
0
01 Dec 2021
ZZ-Net: A Universal Rotation Equivariant Architecture for 2D Point Clouds
Georg Bökman
Fredrik Kahl
Axel Flinth
3DPC
69
20
0
30 Nov 2021
Neural Attention for Image Captioning: Review of Outstanding Methods
Zanyar Zohourianshahzadi
Jugal Kalita
VLM
95
47
0
29 Nov 2021
LiVLR: A Lightweight Visual-Linguistic Reasoning Framework for Video Question Answering
Jingjing Jiang
Zi-yi Liu
N. Zheng
87
14
0
29 Nov 2021
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Yoad Tewel
Yoav Shalev
Idan Schwartz
Lior Wolf
VLM
122
197
0
29 Nov 2021