ResearchTrend.AI
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown
Learning Multi-Attention Context Graph for Group-Based Re-Identification
Yichao Yan
Jie Qin
Bingbing Ni
Jiaxin Chen
Li Liu
Fan Zhu
Weishi Zheng
Xiaokang Yang
Ling Shao
122
45
0
29 Apr 2021
Exploring Relational Context for Multi-Task Dense Prediction
David Brüggemann
Menelaos Kanakis
Anton Obukhov
Stamatios Georgoulis
Luc Van Gool
128
77
0
28 Apr 2021
Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning
Ukyo Honda
Yoshitaka Ushiku
Atsushi Hashimoto
Taro Watanabe
Yuji Matsumoto
73
23
0
28 Apr 2021
CAGAN: Text-To-Image Generation with Combined Attention GANs
Henning Schulze
Dogucan Yaman
Alexander Waibel
GAN
42
3
0
26 Apr 2021
MusCaps: Generating Captions for Music Audio
Ilaria Manco
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
116
37
0
24 Apr 2021
EXplainable Neural-Symbolic Learning (X-NeSyL) methodology to fuse deep learning representations with expert knowledge graphs: the MonuMAI cultural heritage use case
Natalia Díaz Rodríguez
Alberto Lamas
Jules Sanchez
Gianni Franchi
Ivan Donadello
Siham Tabik
David Filliat
P. Cruz
Rosana Montes
Francisco Herrera
140
78
0
24 Apr 2021
AttWalk: Attentive Cross-Walks for Deep Mesh Analysis
Ran Ben Izhak
Alon Lahav
A. Tal
3DV
109
10
0
23 Apr 2021
Towards Accurate Text-based Image Captioning with Content Diversity Exploration
Guanghui Xu
Shuaicheng Niu
Mingkui Tan
Yucheng Luo
Qing Du
Qi Wu
DiffM
86
58
0
23 Apr 2021
Multi-task Learning with Attention for End-to-end Autonomous Driving
Keishi Ishihara
Anssi Kanervisto
J. Miura
Ville Hautamaki
99
65
0
21 Apr 2021
Discrete-continuous Action Space Policy Gradient-based Attention for Image-Text Matching
Shiyang Yan
Li Yu
Yuan Xie
91
34
0
21 Apr 2021
Improving Weakly-supervised Object Localization via Causal Intervention
Feifei Shao
Yawei Luo
Li Zhang
Lu Ye
Siliang Tang
Yi Yang
Jun Xiao
WSOL
78
25
0
21 Apr 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM, ALM
94
25
0
20 Apr 2021
Visual Navigation with Spatial Attention
Bar Mayo
Tamir Hazan
A. Tal
EgoV
84
76
0
20 Apr 2021
Attention in Attention Network for Image Super-Resolution
Haoyu Chen
Jinjin Gu
Zhi-Li Zhang
SupR
74
70
0
19 Apr 2021
Surrogate Gradient Field for Latent Space Manipulation
Minjun Li
Yanghua Jin
Huachun Zhu
GAN
67
18
0
19 Apr 2021
Concadia: Towards Image-Based Text Generation with a Purpose
Elisa Kreiss
Fei Fang
Noah D. Goodman
Christopher Potts
139
23
0
16 Apr 2021
Robust Open-Vocabulary Translation from Visual Text Representations
Elizabeth Salesky
David Etter
Matt Post
VLM
84
42
0
16 Apr 2021
Pose Recognition with Cascade Transformers
Ke Li
Shijie Wang
Xiang Zhang
Yifan Xu
Weijian Xu
Zhuowen Tu
ViT
76
214
0
14 Apr 2021
Autonomous Vehicles Drive into Shared Spaces: eHMI Design Concept Focusing on Vulnerable Road Users
Yang Li
Hao Cheng
Zhe Zeng
Hailong Liu
Monika Sester
82
30
0
14 Apr 2021
Revisiting the Onsets and Frames Model with Additive Attention
K. Cheuk
Yin-Jyun Luo
Emmanouil Benetos
Dorien Herremans
55
20
0
14 Apr 2021
Co-Scale Conv-Attentional Image Transformers
Weijian Xu
Yifan Xu
Tyler A. Chang
Zhuowen Tu
ViT
61
377
0
13 Apr 2021
A State-of-the-art Survey of Artificial Neural Networks for Whole-slide Image Analysis: from Popular Convolutional Neural Networks to Potential Visual Transformers
Xintong Li
Xirong Li
Chen Li
M. Rahaman
Jian Wu
Xiaoqi Li
Yudong Yao
M. Grzegorzek
ViT, MedIm
84
45
0
13 Apr 2021
Automatic Generation of Descriptive Titles for Video Clips Using Deep Learning
Soheyla Amirian
Khaled Rasheed
T. Taha
H. Arabnia
VLM, VGen
54
23
0
07 Apr 2021
Differentiable Patch Selection for Image Recognition
Jean-Baptiste Cordonnier
Aravindh Mahendran
Alexey Dosovitskiy
Dirk Weissenborn
Jakob Uszkoreit
Thomas Unterthiner
112
96
0
07 Apr 2021
Multimodal Continuous Visual Attention Mechanisms
António Farinhas
André F. T. Martins
P. Aguiar
69
7
0
07 Apr 2021
Compressing Visual-linguistic Model via Knowledge Distillation
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lijuan Wang
Yezhou Yang
Zicheng Liu
VLM
129
99
0
05 Apr 2021
FixMyPose: Pose Correctional Captioning and Retrieval
Hyounghun Kim
Abhaysinh Zala
Graham Burri
Joey Tianyi Zhou
66
16
0
04 Apr 2021
Influencing Reinforcement Learning through Natural Language Guidance
Tasmia Tasrin
Md Sultan al Nahian
Habarakadage Perera
Brent Harrison
59
6
0
04 Apr 2021
M3L: Language-based Video Editing via Multi-Modal Multi-Level Transformers
Tsu-Jui Fu
Xinze Wang
Scott T. Grafton
Miguel P. Eckstein
Wenjie Wang
122
9
0
02 Apr 2021
The Spatially-Correlative Loss for Various Image Translation Tasks
Chuanxia Zheng
Tat-Jen Cham
Jianfei Cai
99
121
0
02 Apr 2021
Towards General Purpose Vision Systems
Tanmay Gupta
Amita Kamath
Aniruddha Kembhavi
Derek Hoiem
105
53
0
01 Apr 2021
DF^2AM: Dual-level Feature Fusion and Affinity Modeling for RGB-Infrared Cross-modality Person Re-identification
Junhui Yin
Zhanyu Ma
Jiyang Xie
Shibo Nie
Kongming Liang
Jun Guo
67
2
0
01 Apr 2021
Qualitative Planning in Imperfect Information Games with Active Sensing and Reactive Sensor Attacks: Cost of Unawareness
A. Kulkarni
Shuo Han
Nandi O. Leslie
Charles A. Kamhoua
Jie Fu
54
2
0
01 Apr 2021
NetAdaptV2: Efficient Neural Architecture Search with Fast Super-Network Training and Architecture Optimization
Tien-Ju Yang
Yi-Lun Liao
Vivienne Sze
118
57
0
31 Mar 2021
FANet: A Feedback Attention Network for Improved Biomedical Image Segmentation
Nikhil Kumar Tomar
Debesh Jha
Michael A. Riegler
Haavard D. Johansen
Dag Johansen
J. Rittscher
Pål Halvorsen
Sharib Ali
MedIm
88
154
0
31 Mar 2021
Data in context: How digital transformation can support human reasoning in cyber-physical production systems
Romy Müller
F. Kessler
David W. Humphrey
Julian Rahm
20
7
0
31 Mar 2021
Channel-Based Attention for LCC Using Sentinel-2 Time Series
Hermann Courteille
A. Benoît
N. Méger
A. Atto
Dino Ienco
AI4TS
31
1
0
31 Mar 2021
Attention, please! A survey of Neural Attention Models in Deep Learning
Alana de Santana Correia
Esther Luna Colombini
HAI
132
198
0
31 Mar 2021
Dual Contrastive Loss and Attention for GANs
Ning Yu
Guilin Liu
Aysegül Dündar
Andrew Tao
Bryan Catanzaro
Larry S. Davis
Mario Fritz
GAN
135
61
0
31 Mar 2021
A study of latent monotonic attention variants
Albert Zeyer
Ralf Schluter
Hermann Ney
75
5
0
30 Mar 2021
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Linbo Jin
Ben Chen
Hao Zhou
Minghui Qiu
Ling Shao
VLM
103
121
0
30 Mar 2021
Self-supervised Image-text Pre-training With Mixed Data In Chest X-rays
Xiaosong Wang
Ziyue Xu
Leo K. Tam
Dong Yang
Daguang Xu
ViT, MedIm
73
24
0
30 Mar 2021
Embedding API Dependency Graph for Neural Code Generation
Chen Lyu
Ruyun Wang
Hongyu Zhang
Hanwen Zhang
Songlin Hu
GNN
62
20
0
29 Mar 2021
SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences
Shun-cheng Wu
Johanna Wald
Keisuke Tateno
Nassir Navab
Federico Tombari
3DPC
87
161
0
27 Mar 2021
Dodrio: Exploring Transformer Models with Interactive Visualization
Zijie J. Wang
Robert Turko
Duen Horng Chau
85
36
0
26 Mar 2021
Understanding Robustness of Transformers for Image Classification
Srinadh Bhojanapalli
Ayan Chakrabarti
Daniel Glasner
Daliang Li
Thomas Unterthiner
Andreas Veit
ViT
139
392
0
26 Mar 2021
Deep EHR Spotlight: a Framework and Mechanism to Highlight Events in Electronic Health Records for Explainable Predictions
Thanh Nguyen-Duc
N. Mulligan
G. Mannu
Joao H. Bettencourt-Silva
BDL
19
6
0
25 Mar 2021
Describing and Localizing Multiple Changes with Transformers
Yue Qiu
Shintaro Yamamoto
Kodai Nakashima
Ryota Suzuki
K. Iwata
Hirokatsu Kataoka
Y. Satoh
93
59
0
25 Mar 2021
AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting
Ye Yuan
Xinshuo Weng
Yanglan Ou
Kris Kitani
AI4TS
117
461
0
25 Mar 2021
More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval
A. Bhunia
Pinaki Nath Chowdhury
Aneeshan Sain
Yongxin Yang
Tao Xiang
Yi-Zhe Song
GAN, SSL
91
62
0
25 Mar 2021