ResearchTrend.AI
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,509 papers shown
Learning Combinatorial Prompts for Universal Controllable Image Captioning
Zhen Wang
Jun Xiao
Yueting Zhuang
Fei Gao
Jian Shao
Long Chen
60
5
0
11 Mar 2023
Comparative study of Transformer and LSTM Network with attention mechanism on Image Captioning
Pranav Dandwate
Chaitanya Shahane
V. Jagtap
Shridevi C. Karande
14
8
0
05 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDL
DiffM
19
33
0
04 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
Paria Mehrani
John K. Tsotsos
25
24
0
02 Mar 2023
Inseq: An Interpretability Toolkit for Sequence Generation Models
Gabriele Sarti
Nils Feldhus
Ludwig Sickert
Oskar van der Wal
Malvina Nissim
Arianna Bisazza
32
64
0
27 Feb 2023
Understanding Social Media Cross-Modality Discourse in Linguistic Space
Chunpu Xu
Hanzhuo Tan
Jing Li
Piji Li
24
5
0
26 Feb 2023
Parallel Sentence-Level Explanation Generation for Real-World Low-Resource Scenarios
Yong-Jin Liu
Xiaokang Chen
Qianwen Dai
LRM
22
4
0
21 Feb 2023
Retrieval-augmented Image Captioning
R. Ramos
Desmond Elliott
Bruno Martins
VLM
32
29
0
16 Feb 2023
Large Scale Multi-Lingual Multi-Modal Summarization Dataset
Yash Verma
Anubhav Jangra
Raghvendra Kumar
S. Saha
17
13
0
13 Feb 2023
Towards Local Visual Modeling for Image Captioning
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Rongrong Ji
ViT
21
71
0
13 Feb 2023
See Your Heart: Psychological states Interpretation through Visual Creations
Likun Yang
Xiaokun Feng
Xiaotang Chen
Shiyu Zhang
Kaiqi Huang
13
0
0
11 Feb 2023
Sketch Less Face Image Retrieval: A New Challenge
Dawei Dai
Yutang Li
Liang Wang
Shiyu Fu
Shuyin Xia
Guo-Zhen Wang
3DH
CVBM
18
6
0
11 Feb 2023
Long-Tailed Partial Label Learning via Dynamic Rebalancing
Feng Hong
Jiangchao Yao
Zhihan Zhou
Ya Zhang
Yanfeng Wang
26
26
0
10 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
31
4
0
08 Feb 2023
KENGIC: KEyword-driven and N-Gram Graph based Image Captioning
Brandon Birmingham
A. Muscat
27
1
0
07 Feb 2023
Transform, Contrast and Tell: Coherent Entity-Aware Multi-Image Captioning
Jingqiang Chen
25
3
0
04 Feb 2023
Style-Aware Contrastive Learning for Multi-Style Image Captioning
Yucheng Zhou
Guodong Long
25
22
0
26 Jan 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
42
2
0
26 Jan 2023
Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data
Dong-Jin Kim
Tae-Hyun Oh
Jinsoo Choi
In So Kweon
SSL
VLM
27
4
0
26 Jan 2023
A two stages Deep Learning Architecture for Model Reduction of Parametric Time-Dependent Problems
Isabella Carla Gonnella
M. Hess
G. Stabile
G. Rozza
AI4CE
32
2
0
24 Jan 2023
Explaining Deep Learning Hidden Neuron Activations using Concept Induction
Abhilekha Dalal
Md Kamruzzaman Sarker
Adrita Barua
Pascal Hitzler
FAtt
6
2
0
23 Jan 2023
HRVQA: A Visual Question Answering Benchmark for High-Resolution Aerial Images
Kun Li
G. Vosselman
M. Yang
31
5
0
23 Jan 2023
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation
Razvan-George Pasca
Alexey Gavryushin
Muhammad Hamza
Yen-Ling Kuo
Kaichun Mo
Luc Van Gool
Otmar Hilliges
Xi Wang
33
14
0
22 Jan 2023
Joint Representation Learning for Text and 3D Point Cloud
Rui Huang
Xuran Pan
Henry Zheng
Haojun Jiang
Zhifeng Xie
S. Song
Gao Huang
33
19
0
18 Jan 2023
Embodied Agents for Efficient Exploration and Smart Scene Description
Roberto Bigazzi
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
LM&Ro
12
7
0
17 Jan 2023
UATVR: Uncertainty-Adaptive Text-Video Retrieval
Bo Fang
Wenhao Wu
Chang-rui Liu
Yu Zhou
Yuxin Song
Weiping Wang
Min Yang
Xiang Ji
Jingdong Wang
26
46
0
16 Jan 2023
A Novel Improved Mask RCNN for Multiple Targets Detection in the Indoor Complex Scenes
Zongmin Liu
Jirui Wang
Jie Li
Peng Liu
Kai Ren
23
2
0
07 Jan 2023
An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU)
Rana Adnan Ahmad
Muhammad Azhar
Hina Sattar
26
10
0
06 Jan 2023
An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation
Kevin Moran
Ali Yachnes
George Purnell
Juanyed Mahmud
Michele Tufano
Carlos Bernal-Cárdenas
Denys Poshyvanyk
Zach H’Doubler
39
10
0
03 Jan 2023
Knowledge-guided Causal Intervention for Weakly-supervised Object Localization
Feifei Shao
Yawei Luo
Fei Gao
Yezhou Yang
Jun Xiao
WSOL
39
4
0
03 Jan 2023
Unpacking the "Black Box" of AI in Education
Nabeel Gillani
R. Eynon
Catherine Chiabaut
Kelsey Finkel
36
58
0
31 Dec 2022
On the Interpretability of Attention Networks
L. N. Pandey
Rahul Vashisht
H. G. Ramaswamy
19
4
0
30 Dec 2022
Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning
Woohyun Kang
Jonghwan Mun
Sungjun Lee
Byungseok Roh
VLM
14
18
0
27 Dec 2022
Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image
Runmin Cong
Ke Huang
Jianjun Lei
Yao-Min Zhao
Qingming Huang
Sam Kwong
42
12
0
23 Dec 2022
Do DALL-E and Flamingo Understand Each Other?
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
21
12
0
23 Dec 2022
Towards Cooperative Flight Control Using Visual-Attention
Lianhao Yin
Makram Chahine
Tsun-Hsuan Wang
Tim Seyde
Chao Liu
Mathias Lechner
Ramin Hasani
Daniela Rus
22
5
0
21 Dec 2022
Does CLIP Bind Concepts? Probing Compositionality in Large Image Models
Martha Lewis
Nihal V. Nayak
Peilin Yu
Qinan Yu
Jack Merullo
Stephen H. Bach
Ellie Pavlick
VLM
OCL
CoGe
23
59
0
20 Dec 2022
A Survey of Deep Learning for Mathematical Reasoning
Pan Lu
Liang Qiu
Wenhao Yu
Sean Welleck
Kai-Wei Chang
ReLM
LRM
50
139
0
20 Dec 2022
Design-time Fashion Popularity Forecasting in VR Environments
Stefanos-Iordanis Papadopoulos
C. Koutlis
Anastasios Papazoglou-Chalikias
Symeon Papadopoulos
S. Nikolopoulos
24
0
0
14 Dec 2022
ScanEnts3D: Exploiting Phrase-to-3D-Object Correspondences for Improved Visio-Linguistic Models in 3D Scenes
Ahmed Abdelreheem
Kyle Olszewski
Hsin-Ying Lee
Peter Wonka
Panos Achlioptas
3DPC
22
28
0
12 Dec 2022
CLIP-TSA: CLIP-Assisted Temporal Self-Attention for Weakly-Supervised Video Anomaly Detection
Kevin Hyekang Joo
Khoa T. Vo
Kashu Yamazaki
Ngan Le
27
39
0
09 Dec 2022
SLAM for Visually Impaired People: a Survey
Banafshe Marziyeh Bamdad
Davide Scaramuzza
Alireza Darvishy
10
8
0
09 Dec 2022
Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation
Yifan Zhou
Shubham D. Sonawani
Mariano Phielipp
Simon Stepputtis
H. B. Amor
LM&Ro
33
27
0
08 Dec 2022
A Flexible Nadaraya-Watson Head Can Offer Explainable and Calibrated Classification
Alan Q. Wang
M. Sabuncu
30
5
0
07 Dec 2022
Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning
Ukyo Honda
Taro Watanabe
Yuji Matsumoto
13
9
0
06 Dec 2022
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
30
62
0
06 Dec 2022
Document-Level Abstractive Summarization
Gonçalo Raposo
Afonso Raposo
Ana Sofia Carmo
27
1
0
06 Dec 2022
Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Representation
En Yu
Songtao Liu
Zhuoling Li
Jinrong Yang
Zeming Li
Shoudong Han
Wenbing Tao
29
12
0
03 Dec 2022
Focus! Relevant and Sufficient Context Selection for News Image Captioning
Mingyang Zhou
Grace Luo
Anna Rohrbach
Zhou Yu
CLIP
27
13
0
01 Dec 2022
Convolution, aggregation and attention based deep neural networks for accelerating simulations in mechanics
Saurabh Deshpande
Raúl I. Sosa
Stéphane P. A. Bordas
J. Lengiewicz
AI4CE
36
18
0
01 Dec 2022