1502.03044
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
Papers citing
"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
50 / 3,520 papers shown
Title
Multimodal Learning for Hateful Memes Detection
Yi Zhou
Zhenhao Chen
102
61
0
25 Nov 2020
Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for 3D Reconstruction
Anzhu Yu
Wenyue Guo
Bing Liu
Xin Chen
Xin Eric Wang
Xuefeng Cao
Bingchuan Jiang
3DV
85
64
0
25 Nov 2020
Bringing AI To Edge: From Deep Learning's Perspective
Di Liu
Hao Kong
Xiangzhong Luo
Weichen Liu
Ravi Subramaniam
116
125
0
25 Nov 2020
Energy-Based Models for Continual Learning
Shuang Li
Yilun Du
Gido M. van de Ven
Igor Mordatch
91
42
0
24 Nov 2020
Gender bias in magazines oriented to men and women: a computational approach
Diego Kozlowski
Gabriela Lozano
Carla M Felcher
F. González
Edgar Altszyler
14
1
0
24 Nov 2020
Exploring Alternatives to Softmax Function
K. Banerjee
Vishak C.
R. Gupta
Kartik Vyas
Anushree H.
Biswajit Mishra
57
50
0
23 Nov 2020
PS-DeVCEM: Pathology-sensitive deep learning model for video capsule endoscopy based on weakly labeled data
A. Mohammed
I. Farup
Marius Pedersen
Sule YAYILGAN YILDIRIM
Ø. Hovde
77
18
0
22 Nov 2020
Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning
Weixia Zhang
Chao Ma
Qi Wu
Xiaokang Yang
109
46
0
22 Nov 2020
On the Dynamics of Training Attention Models
Haoye Lu
Yongyi Mao
A. Nayak
45
8
0
19 Nov 2020
Using Text to Teach Image Retrieval
Haoyu Dong
Ze Wang
Qiang Qiu
Guillermo Sapiro
3DV
75
4
0
19 Nov 2020
Unmixing Convolutional Features for Crisp Edge Detection
Linxi Huan
Nan Xue
Xianwei Zheng
Wei He
J. Gong
Guisong Xia
117
71
0
19 Nov 2020
Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language
Hassan Akbari
Hamid Palangi
Jianwei Yang
Sudha Rao
Asli Celikyilmaz
Roland Fernandez
P. Smolensky
Jianfeng Gao
Shih-Fu Chang
106
3
0
18 Nov 2020
End-to-End Object Detection with Adaptive Clustering Transformer
Minghang Zheng
Peng Gao
Renrui Zhang
Kunchang Li
Xiaogang Wang
Hongsheng Li
Hao Dong
ViT
211
199
0
18 Nov 2020
Master Thesis: Neural Sign Language Translation by Learning Tokenization
Alptekin Orbay
SLR
26
0
0
18 Nov 2020
Sequence-Level Mixed Sample Data Augmentation
Demi Guo
Yoon Kim
Alexander M. Rush
78
102
0
18 Nov 2020
Improving Calibration in Deep Metric Learning With Cross-Example Softmax
Andreas Veit
Kimberly Wilber
26
2
0
17 Nov 2020
Structural and Functional Decomposition for Personality Image Captioning in a Communication Game
Minh-Thu Nguyen
Duy Phung
Minh Hoai
Thien Huu Nguyen
65
4
0
17 Nov 2020
A Survey on the Explainability of Supervised Machine Learning
Nadia Burkart
Marco F. Huber
FaML
XAI
77
784
0
16 Nov 2020
DORB: Dynamically Optimizing Multiple Rewards with Bandits
Ramakanth Pasunuru
Han Guo
Joey Tianyi Zhou
OffRL
72
7
0
15 Nov 2020
G-RCN: Optimizing the Gap between Classification and Localization Tasks for Object Detection
Yufan Luo
Li Xiao
ObjD
64
2
0
14 Nov 2020
Domain-Level Explainability -- A Challenge for Creating Trust in Superhuman AI Strategies
Jonas Andrulis
Ole Meyer
Grégory Schott
Samuel Weinbach
V. Gruhn
39
4
0
12 Nov 2020
SVAM: Saliency-guided Visual Attention Modeling by Autonomous Underwater Robots
M. Islam
R. Wang
Junaed Sattar
78
47
0
12 Nov 2020
Improving Multimodal Accuracy Through Modality Pre-training and Attention
Aya Abdelsalam Ismail
Mahmudul Hasan
F. Ishtiaq
72
17
0
11 Nov 2020
DoLFIn: Distributions over Latent Features for Interpretability
Phong Le
Willem H. Zuidema
FAtt
30
0
0
10 Nov 2020
Temporal Stochastic Softmax for 3D CNNs: An Application in Facial Expression Recognition
T. Ayral
M. Pedersoli
Simon L Bacon
Eric Granger
CVBM
3DH
53
11
0
10 Nov 2020
Generating Image Descriptions via Sequential Cross-Modal Alignment Guided by Human Gaze
Ece Takmaz
Sandro Pezzelle
Lisa Beinborn
Raquel Fernández
85
24
0
09 Nov 2020
MLAS: Metric Learning on Attributed Sequences
Zhongfang Zhuang
Xiangnan Kong
Elke A. Rundensteiner
Jihane Zouaoui
Aditya Arora
46
1
0
08 Nov 2020
Integrating Human Gaze into Attention for Egocentric Activity Recognition
Kyle Min
Jason J. Corso
68
43
0
08 Nov 2020
Channel Pruning Guided by Spatial and Channel Attention for DNNs in Intelligent Edge Computing
Mengran Liu
Weiwei Fang
Xiaodong Ma
Wenyuan Xu
N. Xiong
Qiankun Li
AAML
100
21
0
08 Nov 2020
A Multi-Channel Temporal Attention Convolutional Neural Network Model for Environmental Sound Classification
You Wang
Chuyao Feng
David V. Anderson
48
17
0
04 Nov 2020
An Improved Attention for Visual Question Answering
Tanzila Rahman
Shih-Han Chou
Leonid Sigal
Giuseppe Carenini
55
45
0
04 Nov 2020
Augmenting Images for ASR and TTS through Single-loop and Dual-loop Multimodal Chain Framework
Johanes Effendi
Andros Tjandra
S. Sakti
Satoshi Nakamura
30
3
0
04 Nov 2020
Attention Beam: An Image Captioning Approach
Anubhav Shrimal
Tanmoy Chakraborty
3DV
25
4
0
03 Nov 2020
Dual Attention on Pyramid Feature Maps for Image Captioning
Litao Yu
Jian Zhang
Qiang Wu
110
50
0
02 Nov 2020
Perceive, Attend, and Drive: Learning Spatial Attention for Safe Self-Driving
Bob Wei
Mengye Ren
Wenyuan Zeng
Ming Liang
Bin Yang
R. Urtasun
3DPC
105
44
0
02 Nov 2020
Diverse Image Captioning with Context-Object Split Latent Spaces
Shweta Mahajan
Stefan Roth
64
42
0
02 Nov 2020
Boost Image Captioning with Knowledge Reasoning
Feicheng Huang
Zhixin Li
Haiyang Wei
Canlong Zhang
Huifang Ma
48
25
0
02 Nov 2020
Multimodal Continuous Emotion Recognition using Deep Multi-Task Learning with Correlation Loss
Berkay Köprü
E. Erzin
CVBM
50
5
0
02 Nov 2020
Personalized Multimodal Feedback Generation in Education
Haochen Liu
Zitao Liu
Zhongqin Wu
Jiliang Tang
59
9
0
31 Oct 2020
Generating Radiology Reports via Memory-driven Transformer
Zhihong Chen
Yan Song
Tsung-Hui Chang
Xiang Wan
MedIm
84
486
0
30 Oct 2020
Interpretable Representation Learning for Speech and Audio Signals Based on Relevance Weighting
Purvi Agrawal
Sriram Ganapathy
45
21
0
29 Oct 2020
Robust Raw Waveform Speech Recognition Using Relevance Weighted Representations
Purvi Agrawal
Sriram Ganapathy
23
2
0
29 Oct 2020
Fusion Models for Improved Visual Captioning
M. Kalimuthu
Aditya Mogadala
Marius Mosbach
Dietrich Klakow
VLM
60
0
0
28 Oct 2020
Curious Case of Language Generation Evaluation Metrics: A Cautionary Tale
Ozan Caglayan
Pranava Madhyastha
Lucia Specia
ELM
105
36
0
26 Oct 2020
Where to Look and How to Describe: Fashion Image Retrieval with an Attentional Heterogeneous Bilinear Network
Haibo Su
Peng Wang
Lingqiao Liu
Hui Li
Zhuguo Li
Yanning Zhang
75
27
0
26 Oct 2020
Learning Multi-Agent Coordination for Enhancing Target Coverage in Directional Sensor Networks
Jing Xu
Fangwei Zhong
Yizhou Wang
83
50
0
25 Oct 2020
Beyond VQA: Generating Multi-word Answer and Rationale to Visual Questions
Radhika Dua
Sai Srinivas Kancheti
V. Balasubramanian
LRM
88
22
0
24 Oct 2020
Attention-Guided Network for Iris Presentation Attack Detection
Cunjian Chen
Arun Ross
AAML
23
0
0
23 Oct 2020
FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations
Lukas Lange
Heike Adel
Jannik Strötgen
Dietrich Klakow
49
7
0
23 Oct 2020
Show and Speak: Directly Synthesize Spoken Description of Images
Xinsheng Wang
Siyuan Feng
Jihua Zhu
M. Hasegawa-Johnson
O. Scharenborg
164
4
0
23 Oct 2020