1502.03044
Cited By
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
50 / 3,510 papers shown
Title
Code-DKT: A Code-based Knowledge Tracing Model for Programming Tasks
Yang Shi
Min Chi
Tiffany Barnes
T. Price
AI4Ed
37
24
0
07 Jun 2022
Improving Image Captioning with Control Signal of Sentence Quality
Zhangzi Zhu
Hong Qu
15
0
0
07 Jun 2022
Invariant Grounding for Video Question Answering
Yicong Li
Xiang Wang
Junbin Xiao
Wei Ji
Tat-Seng Chua
OOD
23
95
0
06 Jun 2022
Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Jun Li
Junyu Chen
Yucheng Tang
Ce Wang
Bennett A. Landman
S. K. Zhou
ViT
OOD
MedIm
23
24
0
02 Jun 2022
CLIP4IDC: CLIP for Image Difference Captioning
Zixin Guo
T. Wang
Jorma T. Laaksonen
VLM
29
27
0
01 Jun 2022
BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset
Mohammad Faiyaz Khan
S. M. S. Shifath
Md. Saiful Islam
16
6
0
28 May 2022
Personalized PageRank Graph Attention Networks
Julie Choi
GNN
13
5
0
27 May 2022
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Tianlin Li
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
Chen Chen
VLM
27
31
0
26 May 2022
Fine-grained Image Captioning with CLIP Reward
Jaemin Cho
Seunghyun Yoon
Ajinkya Kale
Franck Dernoncourt
Trung Bui
Joey Tianyi Zhou
CLIP
134
77
0
26 May 2022
Multimodal Knowledge Alignment with Reinforcement Learning
Youngjae Yu
Jiwan Chung
Heeseung Yun
Jack Hessel
Jinho Park
...
Prithviraj Ammanabrolu
Rowan Zellers
Ronan Le Bras
Gunhee Kim
Yejin Choi
VLM
123
36
0
25 May 2022
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
Gi-Cheon Kang
Sungdong Kim
Jin-Hwa Kim
Donghyun Kwak
Byoung-Tak Zhang
34
10
0
25 May 2022
Face2Text revisited: Improved data set and baseline results
Marc Tanti
Shaun Abdilla
A. Muscat
Claudia Borg
R. Farrugia
Albert Gatt
CVBM
19
3
0
24 May 2022
SelfReformer: Self-Refined Network with Transformer for Salient Object Detection
Y. Yun
Weisi Lin
ViT
60
28
0
23 May 2022
Explanatory machine learning for sequential human teaching
L. Ai
Johannes Langer
Stephen Muggleton
Ute Schmid
31
5
0
20 May 2022
Explainable Supervised Domain Adaptation
V. Kamakshi
N. C. Krishnan
40
2
0
20 May 2022
Let's Talk! Striking Up Conversations via Conversational Visual Question Generation
Shih-Han Chan
Tsai-Lun Yang
Yun-Wei Chu
Chi-Yang Hsu
Ting-Hao 'Kenneth' Huang
Yu-Shian Chiu
Lun-Wei Ku
21
1
0
19 May 2022
Support-set based Multi-modal Representation Enhancement for Video Captioning
Xiaoya Chen
Jingkuan Song
Pengpeng Zeng
Lianli Gao
Hengtao Shen
24
4
0
19 May 2022
It Isn't Sh!tposting, It's My CAT Posting
Parthsarthi Rawat
Sayan Das
Jorge Aguirre
Akhil Daphara
ViT
25
0
0
18 May 2022
Measuring Alignment Bias in Neural Seq2Seq Semantic Parsers
Davide Locatelli
A. Quattoni
25
2
0
17 May 2022
Promoting Saliency From Depth: Deep Unsupervised RGB-D Saliency Detection
Wei Ji
Jingjing Li
Qi Bi
Chuan Guo
Jieyan Liu
Li Cheng
36
45
0
15 May 2022
Video-based assessment of intraoperative surgical skill
Sanchit Hira
Digvijay Singh
Tae Soo Kim
Shobhit Gupta
Gregory D. Hager
S. Sikder
S. Vedula
30
16
0
13 May 2022
Explainable Deep Learning Methods in Medical Image Classification: A Survey
Cristiano Patrício
João C. Neves
Luís F. Teixeira
XAI
24
53
0
10 May 2022
A Song of (Dis)agreement: Evaluating the Evaluation of Explainable Artificial Intelligence in Natural Language Processing
Michael Neely
Stefan F. Schouten
Maurits J. R. Bleeker
Ana Lucic
XAI
27
16
0
09 May 2022
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Chia-Wen Kuo
Z. Kira
27
52
0
09 May 2022
Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information
Zhipeng Zhang
Xinglin Hou
K. Niu
Zhongzhen Huang
T. Ge
Yuning Jiang
Qi Wu
Peifeng Wang
31
4
0
07 May 2022
From Easy to Hard: Learning Language-guided Curriculum for Visual Question Answering on Remote Sensing Data
Zhenghang Yuan
Lichao Mou
Q. Wang
Xiao Xiang Zhu
27
60
0
06 May 2022
Unsupervised Mismatch Localization in Cross-Modal Sequential Data with Application to Mispronunciations Localization
Wei Wei
Hengguan Huang
Xiangming Gu
Hao Wang
Ye Wang
BDL
27
0
0
05 May 2022
Language Models Can See: Plugging Visual Controls in Text Generation
Yixuan Su
Tian Lan
Yahui Liu
Fangyu Liu
Dani Yogatama
Yan Wang
Lingpeng Kong
Nigel Collier
VLM
MLLM
56
97
0
05 May 2022
Diverse Image Captioning with Grounded Style
Franz Klein
Shweta Mahajan
S. Roth
22
7
0
03 May 2022
Inducing and Using Alignments for Transition-based AMR Parsing
Andrew Drozdov
Jiawei Zhou
Radu Florian
Andrew McCallum
Tahira Naseem
Yoon Kim
Ramón Fernández Astudillo
39
27
0
03 May 2022
Cracking White-box DNN Watermarks via Invariant Neuron Transforms
Yifan Yan
Xudong Pan
Yining Wang
Mi Zhang
Min Yang
AAML
27
14
0
30 Apr 2022
Controllable Image Captioning
Luka Maxwell
35
0
0
28 Apr 2022
Cross-modal Memory Networks for Radiology Report Generation
Zhihong Chen
Yaling Shen
Yan Song
Xiang Wan
MedIm
47
248
0
28 Apr 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
46
150
0
27 Apr 2022
A survey on attention mechanisms for medical applications: are we moving towards better algorithms?
Tiago Gonçalves
Isabel Rio-Torto
Luís F. Teixeira
J. S. Cardoso
OOD
MedIm
37
36
0
26 Apr 2022
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
Pinaki Nath Chowdhury
A. Bhunia
Aneeshan Sain
Subhadeep Koley
Tao Xiang
Yi-Zhe Song
40
29
0
25 Apr 2022
Recurrent Affine Transformation for Text-to-image Synthesis
Senmao Ye
Fei Liu
Mingkui Tan
25
26
0
22 Apr 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
29
132
0
21 Apr 2022
Attention in Reasoning: Dataset, Analysis, and Modeling
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
36
3
0
20 Apr 2022
Behind the Machine's Gaze: Neural Networks with Biologically-inspired Constraints Exhibit Human-like Visual Attention
Leo Schwinn
Doina Precup
Bjoern M. Eskofier
Dario Zanca
20
7
0
19 Apr 2022
Missingness Bias in Model Debugging
Saachi Jain
Hadi Salman
E. Wong
Pengchuan Zhang
Vibhav Vineet
Sai H. Vemprala
A. Madry
27
37
0
19 Apr 2022
Causal Intervention for Subject-Deconfounded Facial Action Unit Recognition
Yingjie Chen
Diqi Chen
Tao Wang
Yizhou Wang
Yun Liang
CVBM
CML
30
28
0
17 Apr 2022
Visual Attention Methods in Deep Learning: An In-Depth Survey
Mohammed Hassanin
Saeed Anwar
Ibrahim Radwan
Fahad Shahbaz Khan
Ajmal Mian
34
145
0
16 Apr 2022
It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection
Youssef Mohamed
Faizan Farooq Khan
Kilichbek Haydarov
Mohamed Elhoseiny
20
30
0
15 Apr 2022
Guiding Attention using Partial-Order Relationships for Image Captioning
Murad Popattia
Muhammad Rafi
Rizwan Qureshi
Shah Nawaz
21
4
0
15 Apr 2022
Image Captioning In the Transformer Age
Yangliu Xu
Li Li
Haiyang Xu
Songfang Huang
Fei Huang
Jianfei Cai
ViT
27
5
0
15 Apr 2022
Improving Cross-Modal Understanding in Visual Dialog via Contrastive Learning
Feilong Chen
Xiuyi Chen
Shuang Xu
Bo Xu
VLM
40
18
0
15 Apr 2022
Interpretability of Machine Learning Methods Applied to Neuroimaging
Elina Thibeau-Sutre
S. Collin
Ninon Burgos
O. Colliot
16
4
0
14 Apr 2022
A Review of Machine Learning Methods Applied to Structural Dynamics and Vibroacoustic
Barbara Z Cunha
C. Droz
A. Zine
Stéphane Foulard
M. Ichchou
AI4CE
37
84
0
13 Apr 2022
Video Captioning: a comparative review of where we are and which could be the route
Daniela Moctezuma
Tania A. Ramirez-delreal
Guillermo Ruiz
Othón González-Chávez
27
11
0
12 Apr 2022