ResearchTrend.AI
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown
Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings
A. Bhunia
Subhadeep Koley
Amandeep Kumar
Aneeshan Sain
Pinaki Nath Chowdhury
Tao Xiang
Yi-Zhe Song
143
20
0
20 Mar 2023
Multi-modal reward for visual relationships-based image captioning
Ali Abedi
Hossein Karshenas
Peyman Adibi
131
2
0
19 Mar 2023
Blind Multimodal Quality Assessment of Low-light Images
Miaohui Wang
Zhuowei Xu
Mai Xu
Weisi Lin
85
2
0
18 Mar 2023
GNNFormer: A Graph-based Framework for Cytopathology Report Generation
Yangqiaoyu Zhou
Kai-Lang Yao
Wusuo Li
MedIm
51
1
0
17 Mar 2023
Rethinking White-Box Watermarks on Deep Learning Models under Neural Structural Obfuscation
Yifan Yan
Xudong Pan
Mi Zhang
Min Yang
AAML
153
17
0
17 Mar 2023
Cross-Modal Causal Intervention for Medical Report Generation
Weixing Chen
Yang-Yang Liu
Ce Wang
Jiarui Zhu
Shen Zhao
Guanbin Li
Cheng-Lin Liu
82
7
0
16 Mar 2023
PR-MCS: Perturbation Robust Metric for MultiLingual Image Captioning
Yongil Kim
Yerin Hwang
Hyeongu Yun
Seunghyun Yoon
Trung Bui
Kyomin Jung
70
6
0
15 Mar 2023
ViperGPT: Visual Inference via Python Execution for Reasoning
Dídac Surís
Sachit Menon
Carl Vondrick
MLLM LRM ReLM
136
469
0
14 Mar 2023
Interventional Bag Multi-Instance Learning On Whole-Slide Pathological Images
Tiancheng Lin
Zhimiao Yu
Hongyu Hu
Yi Xu
Changyi Chen
121
88
0
13 Mar 2023
Focus on Change: Mood Prediction by Learning Emotion Changes via Spatio-Temporal Attention
S. Narayana
Subramanian Ramanathan
Ibrahim Radwan
Roland Göcke
65
2
0
12 Mar 2023
ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation
Bang-ju Yang
Fenglin Liu
Yuexian Zou
Xian Wu
Yaowei Wang
David Clifton
88
9
0
11 Mar 2023
Learning Combinatorial Prompts for Universal Controllable Image Captioning
Zhen Wang
Jun Xiao
Yueting Zhuang
Fei Gao
Jian Shao
Long Chen
112
5
0
11 Mar 2023
Comparative study of Transformer and LSTM Network with attention mechanism on Image Captioning
Pranav Dandwate
Chaitanya Shahane
V. Jagtap
Shridevi C. Karande
101
9
0
05 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDL DiffM
61
33
0
04 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
Paria Mehrani
John K. Tsotsos
92
25
0
02 Mar 2023
Inseq: An Interpretability Toolkit for Sequence Generation Models
Gabriele Sarti
Nils Feldhus
Ludwig Sickert
Oskar van der Wal
Malvina Nissim
Arianna Bisazza
123
70
0
27 Feb 2023
Understanding Social Media Cross-Modality Discourse in Linguistic Space
Chunpu Xu
Hanzhuo Tan
Jing Li
Piji Li
84
8
0
26 Feb 2023
Parallel Sentence-Level Explanation Generation for Real-World Low-Resource Scenarios
Yang Liu
Xiaokang Chen
Qianwen Dai
LRM
51
4
0
21 Feb 2023
Retrieval-augmented Image Captioning
R. Ramos
Desmond Elliott
Bruno Martins
VLM
80
29
0
16 Feb 2023
Large Scale Multi-Lingual Multi-Modal Summarization Dataset
Yash Verma
Anubhav Jangra
Raghvendra Kumar
S. Saha
32
14
0
13 Feb 2023
Towards Local Visual Modeling for Image Captioning
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Rongrong Ji
ViT
100
79
0
13 Feb 2023
See Your Heart: Psychological states Interpretation through Visual Creations
Likun Yang
Xiaokun Feng
Xiaotang Chen
Shiyu Zhang
Kaiqi Huang
20
0
0
11 Feb 2023
Sketch Less Face Image Retrieval: A New Challenge
Dawei Dai
Yutang Li
Liang Wang
Shiyu Fu
Shuyin Xia
Guo-Zhen Wang
3DH CVBM
68
7
0
11 Feb 2023
Long-Tailed Partial Label Learning via Dynamic Rebalancing
Feng Hong
Jiangchao Yao
Zhihan Zhou
Ya Zhang
Yanfeng Wang
71
27
0
10 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
86
4
0
08 Feb 2023
KENGIC: KEyword-driven and N-Gram Graph based Image Captioning
Brandon Birmingham
A. Muscat
54
1
0
07 Feb 2023
Transform, Contrast and Tell: Coherent Entity-Aware Multi-Image Captioning
Jingqiang Chen
71
4
0
04 Feb 2023
Style-Aware Contrastive Learning for Multi-Style Image Captioning
Yucheng Zhou
Guodong Long
68
23
0
26 Jan 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
115
2
0
26 Jan 2023
Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data
Dong-Jin Kim
Tae-Hyun Oh
Jinsoo Choi
In So Kweon
SSL VLM
45
4
0
26 Jan 2023
A two stages Deep Learning Architecture for Model Reduction of Parametric Time-Dependent Problems
Isabella Carla Gonnella
M. Hess
G. Stabile
G. Rozza
AI4CE
82
2
0
24 Jan 2023
Explaining Deep Learning Hidden Neuron Activations using Concept Induction
Abhilekha Dalal
Md Kamruzzaman Sarker
Adrita Barua
Pascal Hitzler
FAtt
24
2
0
23 Jan 2023
HRVQA: A Visual Question Answering Benchmark for High-Resolution Aerial Images
Kun Li
G. Vosselman
M. Yang
85
7
0
23 Jan 2023
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation
Razvan-George Pasca
Alexey Gavryushin
Muhammad Hamza
Yen-Ling Kuo
Kaichun Mo
Luc Van Gool
Otmar Hilliges
Xi Wang
169
14
0
22 Jan 2023
Joint Representation Learning for Text and 3D Point Cloud
Rui Huang
Xuran Pan
Henry Zheng
Haojun Jiang
Zhifeng Xie
S. Song
Gao Huang
93
21
0
18 Jan 2023
Embodied Agents for Efficient Exploration and Smart Scene Description
Roberto Bigazzi
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
LM&Ro
71
7
0
17 Jan 2023
UATVR: Uncertainty-Adaptive Text-Video Retrieval
Bo Fang
Wenhao Wu
Chang-rui Liu
Yu Zhou
Yuxin Song
Weiping Wang
Min Yang
Xiang Ji
Jingdong Wang
107
57
0
16 Jan 2023
A Novel Improved Mask RCNN for Multiple Targets Detection in the Indoor Complex Scenes
Zongmin Liu
Jirui Wang
Jie Li
Peng Liu
Kai Ren
37
2
0
07 Jan 2023
An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU)
Rana Adnan Ahmad
Muhammad Azhar
Hina Sattar
119
10
0
06 Jan 2023
An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation
Kevin Moran
Ali Yachnes
George Purnell
Juanyed Mahmud
Michele Tufano
Carlos Bernal-Cárdenas
Denys Poshyvanyk
Zach H’Doubler
85
11
0
03 Jan 2023
Knowledge-guided Causal Intervention for Weakly-supervised Object Localization
Feifei Shao
Yawei Luo
Fei Gao
Yezhou Yang
Jun Xiao
WSOL
129
4
0
03 Jan 2023
Unpacking the "Black Box" of AI in Education
Nabeel Gillani
R. Eynon
Catherine Chiabaut
Kelsey Finkel
76
59
0
31 Dec 2022
On the Interpretability of Attention Networks
L. N. Pandey
Rahul Vashisht
H. G. Ramaswamy
73
5
0
30 Dec 2022
Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning
Woohyun Kang
Jonghwan Mun
Sungjun Lee
Byungseok Roh
VLM
97
20
0
27 Dec 2022
Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image
Runmin Cong
Ke Huang
Jianjun Lei
Yao-Min Zhao
Qingming Huang
Sam Kwong
71
13
0
23 Dec 2022
Do DALL-E and Flamingo Understand Each Other?
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
82
12
0
23 Dec 2022
Towards Cooperative Flight Control Using Visual-Attention
Lianhao Yin
Makram Chahine
Tsun-Hsuan Wang
Tim Seyde
Chao Liu
Mathias Lechner
Ramin Hasani
Daniela Rus
95
5
0
21 Dec 2022
Does CLIP Bind Concepts? Probing Compositionality in Large Image Models
Martha Lewis
Nihal V. Nayak
Peilin Yu
Qinan Yu
Jack Merullo
Stephen H. Bach
Ellie Pavlick
VLM OCL CoGe
134
68
0
20 Dec 2022
A Survey of Deep Learning for Mathematical Reasoning
Pan Lu
Liang Qiu
Wenhao Yu
Sean Welleck
Kai-Wei Chang
ReLM LRM
133
150
0
20 Dec 2022
Design-time Fashion Popularity Forecasting in VR Environments
Stefanos-Iordanis Papadopoulos
C. Koutlis
Anastasios Papazoglou-Chalikias
Symeon Papadopoulos
S. Nikolopoulos
61
0
0
14 Dec 2022