1502.03044
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
Papers citing
"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
50 / 3,520 papers shown
Title
Learning Multi-Attention Context Graph for Group-Based Re-Identification
Yichao Yan
Jie Qin
Bingbing Ni
Jiaxin Chen
Li Liu
Fan Zhu
Weishi Zheng
Xiaokang Yang
Ling Shao
122
45
0
29 Apr 2021
Exploring Relational Context for Multi-Task Dense Prediction
David Brüggemann
Menelaos Kanakis
Anton Obukhov
Stamatios Georgoulis
Luc Van Gool
128
77
0
28 Apr 2021
Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning
Ukyo Honda
Yoshitaka Ushiku
Atsushi Hashimoto
Taro Watanabe
Yuji Matsumoto
73
23
0
28 Apr 2021
CAGAN: Text-To-Image Generation with Combined Attention GANs
Henning Schulze
Dogucan Yaman
Alexander Waibel
GAN
42
3
0
26 Apr 2021
MusCaps: Generating Captions for Music Audio
Ilaria Manco
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
116
37
0
24 Apr 2021
EXplainable Neural-Symbolic Learning (X-NeSyL) methodology to fuse deep learning representations with expert knowledge graphs: the MonuMAI cultural heritage use case
Natalia Díaz Rodríguez
Alberto Lamas
Jules Sanchez
Gianni Franchi
Ivan Donadello
Siham Tabik
David Filliat
P. Cruz
Rosana Montes
Francisco Herrera
140
78
0
24 Apr 2021
AttWalk: Attentive Cross-Walks for Deep Mesh Analysis
Ran Ben Izhak
Alon Lahav
A. Tal
3DV
109
10
0
23 Apr 2021
Towards Accurate Text-based Image Captioning with Content Diversity Exploration
Guanghui Xu
Shuaicheng Niu
Mingkui Tan
Yucheng Luo
Qing Du
Qi Wu
DiffM
86
58
0
23 Apr 2021
Multi-task Learning with Attention for End-to-end Autonomous Driving
Keishi Ishihara
Anssi Kanervisto
J. Miura
Ville Hautamaki
99
65
0
21 Apr 2021
Discrete-continuous Action Space Policy Gradient-based Attention for Image-Text Matching
Shiyang Yan
Li Yu
Yuan Xie
91
34
0
21 Apr 2021
Improving Weakly-supervised Object Localization via Causal Intervention
Feifei Shao
Yawei Luo
Li Zhang
Lu Ye
Siliang Tang
Yi Yang
Jun Xiao
WSOL
78
25
0
21 Apr 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
94
25
0
20 Apr 2021
Visual Navigation with Spatial Attention
Bar Mayo
Tamir Hazan
A. Tal
EgoV
84
76
0
20 Apr 2021
Attention in Attention Network for Image Super-Resolution
Haoyu Chen
Jinjin Gu
Zhi-Li Zhang
SupR
74
70
0
19 Apr 2021
Surrogate Gradient Field for Latent Space Manipulation
Minjun Li
Yanghua Jin
Huachun Zhu
GAN
67
18
0
19 Apr 2021
Concadia: Towards Image-Based Text Generation with a Purpose
Elisa Kreiss
Fei Fang
Noah D. Goodman
Christopher Potts
139
23
0
16 Apr 2021
Robust Open-Vocabulary Translation from Visual Text Representations
Elizabeth Salesky
David Etter
Matt Post
VLM
84
42
0
16 Apr 2021
Pose Recognition with Cascade Transformers
Ke Li
Shijie Wang
Xiang Zhang
Yifan Xu
Weijian Xu
Zhuowen Tu
ViT
76
214
0
14 Apr 2021
Autonomous Vehicles Drive into Shared Spaces: eHMI Design Concept Focusing on Vulnerable Road Users
Yang Li
Hao Cheng
Zhe Zeng
Hailong Liu
Monika Sester
82
30
0
14 Apr 2021
Revisiting the Onsets and Frames Model with Additive Attention
K. Cheuk
Yin-Jyun Luo
Emmanouil Benetos
Dorien Herremans
55
20
0
14 Apr 2021
Co-Scale Conv-Attentional Image Transformers
Weijian Xu
Yifan Xu
Tyler A. Chang
Zhuowen Tu
ViT
61
377
0
13 Apr 2021
A State-of-the-art Survey of Artificial Neural Networks for Whole-slide Image Analysis: from Popular Convolutional Neural Networks to Potential Visual Transformers
Xintong Li
Xirong Li
Chen Li
M. Rahaman
Jian Wu
Xiaoqi Li
Yudong Yao
M. Grzegorzek
ViT
MedIm
84
45
0
13 Apr 2021
Automatic Generation of Descriptive Titles for Video Clips Using Deep Learning
Soheyla Amirian
Khaled Rasheed
T. Taha
H. Arabnia
VLM
VGen
54
23
0
07 Apr 2021
Differentiable Patch Selection for Image Recognition
Jean-Baptiste Cordonnier
Aravindh Mahendran
Alexey Dosovitskiy
Dirk Weissenborn
Jakob Uszkoreit
Thomas Unterthiner
112
96
0
07 Apr 2021
Multimodal Continuous Visual Attention Mechanisms
António Farinhas
André F. T. Martins
P. Aguiar
69
7
0
07 Apr 2021
Compressing Visual-linguistic Model via Knowledge Distillation
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lijuan Wang
Yezhou Yang
Zicheng Liu
VLM
129
99
0
05 Apr 2021
FixMyPose: Pose Correctional Captioning and Retrieval
Hyounghun Kim
Abhaysinh Zala
Graham Burri
Joey Tianyi Zhou
66
16
0
04 Apr 2021
Influencing Reinforcement Learning through Natural Language Guidance
Tasmia Tasrin
Md Sultan al Nahian
Habarakadage Perera
Brent Harrison
59
6
0
04 Apr 2021
M3L: Language-based Video Editing via Multi-Modal Multi-Level Transformers
Tsu-Jui Fu
Xinze Wang
Scott T. Grafton
Miguel P. Eckstein
Wenjie Wang
122
9
0
02 Apr 2021
The Spatially-Correlative Loss for Various Image Translation Tasks
Chuanxia Zheng
Tat-Jen Cham
Jianfei Cai
99
121
0
02 Apr 2021
Towards General Purpose Vision Systems
Tanmay Gupta
Amita Kamath
Aniruddha Kembhavi
Derek Hoiem
105
53
0
01 Apr 2021
DF^2AM: Dual-level Feature Fusion and Affinity Modeling for RGB-Infrared Cross-modality Person Re-identification
Junhui Yin
Zhanyu Ma
Jiyang Xie
Shibo Nie
Kongming Liang
Jun Guo
67
2
0
01 Apr 2021
Qualitative Planning in Imperfect Information Games with Active Sensing and Reactive Sensor Attacks: Cost of Unawareness
A. Kulkarni
Shuo Han
Nandi O. Leslie
Charles A. Kamhoua
Jie Fu
54
2
0
01 Apr 2021
NetAdaptV2: Efficient Neural Architecture Search with Fast Super-Network Training and Architecture Optimization
Tien-Ju Yang
Yi-Lun Liao
Vivienne Sze
118
57
0
31 Mar 2021
FANet: A Feedback Attention Network for Improved Biomedical Image Segmentation
Nikhil Kumar Tomar
Debesh Jha
Michael A. Riegler
Haavard D. Johansen
Dag Johansen
J. Rittscher
Pål Halvorsen
Sharib Ali
MedIm
88
154
0
31 Mar 2021
Data in context: How digital transformation can support human reasoning in cyber-physical production systems
Romy Müller
F. Kessler
David W. Humphrey
Julian Rahm
20
7
0
31 Mar 2021
Channel-Based Attention for LCC Using Sentinel-2 Time Series
Hermann Courteille
A. Benoît
N. Méger
A. Atto
Dino Ienco
AI4TS
31
1
0
31 Mar 2021
Attention, please! A survey of Neural Attention Models in Deep Learning
Alana de Santana Correia
Esther Luna Colombini
HAI
132
198
0
31 Mar 2021
Dual Contrastive Loss and Attention for GANs
Ning Yu
Guilin Liu
Aysegül Dündar
Andrew Tao
Bryan Catanzaro
Larry S. Davis
Mario Fritz
GAN
135
61
0
31 Mar 2021
A study of latent monotonic attention variants
Albert Zeyer
Ralf Schluter
Hermann Ney
75
5
0
30 Mar 2021
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Linbo Jin
Ben Chen
Hao Zhou
Minghui Qiu
Ling Shao
VLM
103
121
0
30 Mar 2021
Self-supervised Image-text Pre-training With Mixed Data In Chest X-rays
Xiaosong Wang
Ziyue Xu
Leo K. Tam
Dong Yang
Daguang Xu
ViT
MedIm
73
24
0
30 Mar 2021
Embedding API Dependency Graph for Neural Code Generation
Chen Lyu
Ruyun Wang
Hongyu Zhang
Hanwen Zhang
Songlin Hu
GNN
62
20
0
29 Mar 2021
SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences
Shun-cheng Wu
Johanna Wald
Keisuke Tateno
Nassir Navab
Federico Tombari
3DPC
87
161
0
27 Mar 2021
Dodrio: Exploring Transformer Models with Interactive Visualization
Zijie J. Wang
Robert Turko
Duen Horng Chau
85
36
0
26 Mar 2021
Understanding Robustness of Transformers for Image Classification
Srinadh Bhojanapalli
Ayan Chakrabarti
Daniel Glasner
Daliang Li
Thomas Unterthiner
Andreas Veit
ViT
139
392
0
26 Mar 2021
Deep EHR Spotlight: a Framework and Mechanism to Highlight Events in Electronic Health Records for Explainable Predictions
Thanh Nguyen-Duc
N. Mulligan
G. Mannu
Joao H. Bettencourt-Silva
BDL
19
6
0
25 Mar 2021
Describing and Localizing Multiple Changes with Transformers
Yue Qiu
Shintaro Yamamoto
Kodai Nakashima
Ryota Suzuki
K. Iwata
Hirokatsu Kataoka
Y. Satoh
93
59
0
25 Mar 2021
AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting
Ye Yuan
Xinshuo Weng
Yanglan Ou
Kris Kitani
AI4TS
117
461
0
25 Mar 2021
More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval
A. Bhunia
Pinaki Nath Chowdhury
Aneeshan Sain
Yongxin Yang
Tao Xiang
Yi-Zhe Song
GAN
SSL
91
62
0
25 Mar 2021