ResearchTrend.AI

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown
Title
TI-JEPA: An Innovative Energy-based Joint Embedding Strategy for Text-Image Multimodal Systems
Khang H. N. Vo
D. Q. Nguyen
T. Nguyen
Tho Quan
129
1
0
09 Mar 2025
MSConv: Multiplicative and Subtractive Convolution for Face Recognition
Si Zhou
Yain-Whar Si
Xiaochen Yuan
Xiaofan Li
Xiaoxiang Liu
Xinyuan Zhang
Cong Lin
Xueyuan Gong
CVBM
154
0
0
08 Mar 2025
Extracting Symbolic Sequences from Visual Representations via Self-Supervised Learning
Victor Sebastian Martinez Pozos
Ivan Vladimir Meza Ruiz
69
0
0
06 Mar 2025
Cross-modal Causal Relation Alignment for Video Question Grounding
Weixing Chen
Yang Liu
Binglin Chen
Jiandong Su
Yongsen Zheng
Liang Lin
BDL, VGen, CML
126
2
0
05 Mar 2025
AC-Lite : A Lightweight Image Captioning Model for Low-Resource Assamese Language
Pankaj Choudhury
Yogesh Aggarwal
Prabhanjan Jadhav
Prithwijit Guha
Sukumar Nandi
203
0
0
03 Mar 2025
Abn-BLIP: Abnormality-aligned Bootstrapping Language-Image Pre-training for Pulmonary Embolism Diagnosis and Report Generation from CTPA
Z. Zhong
Yuli Wang
Lulu Bi
Zhuoqi Ma
S. H. Ahn
...
Webster Stayman
Todd M. Kolb
I. Kamel
Harrison X. Bai
Zhicheng Jiao
LM&MA
97
0
0
03 Mar 2025
A Survey of Link Prediction in Temporal Networks
Jiafeng Xiong
Ahmad Zareie
Rizos Sakellariou
AI4TS, AI4CE
83
2
0
28 Feb 2025
Grad-ECLIP: Gradient-based Visual and Textual Explanations for CLIP
Chenyang Zhao
Kun Wang
J. H. Hsiao
Antoni B. Chan
CLIP
110
0
0
26 Feb 2025
Beyond RNNs: Benchmarking Attention-Based Image Captioning Models
Hemanth Teja Yanambakkam
Rahul Chinthala
53
0
0
26 Feb 2025
Omni-SILA: Towards Omni-scene Driven Visual Sentiment Identifying, Locating and Attributing in Videos
Jiamin Luo
Jingjing Wang
Junxiao Ma
Yujie Jin
Shoushan Li
Guodong Zhou
92
0
0
26 Feb 2025
Good Representation, Better Explanation: Role of Convolutional Neural Networks in Transformer-Based Remote Sensing Image Captioning
Swadhin Das
Saarthak Gupta
Kamal Kumar
Raksha Sharma
52
1
0
22 Feb 2025
A Comprehensive Survey on Composed Image Retrieval
Xuemeng Song
Haoqiang Lin
Haokun Wen
Bohan Hou
Mingzhu Xu
Liqiang Nie
131
3
0
19 Feb 2025
CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness
Zhihang Liu
Chen-Wei Xie
Bin Wen
Feiwu Yu
Jixuan Chen
...
Pandeng Li
Yinglu Li
Zuan Gao
Yun Zheng
Hongtao Xie
VLM, CoGe
178
0
0
19 Feb 2025
Performance Analysis of Traditional VQA Models Under Limited Computational Resources
Jihao Gu
155
0
0
09 Feb 2025
Using Large Language Models for education managements in Vietnamese with low resources
Duc Do Minh
Vinh Nguyen Van
Thang Dam Cong
102
1
0
28 Jan 2025
A Study of the Plausibility of Attention between RNN Encoders in Natural Language Inference
Duc Hau Nguyen
Pascale Sébillot
128
5
0
23 Jan 2025
The Quest for Visual Understanding: A Journey Through the Evolution of Visual Question Answering
Anupam Pandey
Deepjyoti Bodo
Arpan Phukan
Asif Ekbal
150
0
0
13 Jan 2025
H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving
Tian Jin
Yuxiao Luo
Yue Ma
Yu Qiao
Yali Wang
Mamba
120
1
0
08 Jan 2025
GIT-CXR: End-to-End Transformer for Chest X-Ray Report Generation
Iustin Sîrbu
Iulia-Renata Sîrbu
Jasmina Bogojeska
Traian Rebedea
MedIm, ViT, LM&MA
72
1
0
05 Jan 2025
Classifier-Guided Captioning Across Modalities
Ariel Shaulov
Tal Shaharabany
E. Shaar
Gal Chechik
Lior Wolf
94
0
0
03 Jan 2025
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo
Jingwen Chen
Yehao Li
Yingwei Pan
Jianlin Feng
Hongyang Chao
Ting Yao
DiffM, VLM
139
0
0
03 Jan 2025
Real-time Bangla Sign Language Translator
Rotan Hawlader Pranto
Shahnewaz Siddique
SLR
88
0
0
21 Dec 2024
Reframing Image Difference Captioning with BLIP2IDC and Synthetic Augmentation
Gautier Evennou
Antoine Chaffin
Vivien Chappelier
Ewa Kijak
DiffM
125
0
0
20 Dec 2024
Automated Image Captioning with CNNs and Transformers
Joshua Adrian Cahyono
Jeremy Nathan Jusuf
VLM, ViT
92
0
0
13 Dec 2024
Advancing Attribution-Based Neural Network Explainability through Relative Absolute Magnitude Layer-Wise Relevance Propagation and Multi-Component Evaluation
Davor Vukadin
Petar Afrić
Marin Šilić
Goran Delač
FAtt
137
2
0
12 Dec 2024
Automated Medical Report Generation for ECG Data: Bridging Medical Text and Signal Processing with Deep Learning
Amnon Bleich
A. Linnemann
B. Diem
Tim Conrad
MedIm
124
3
0
05 Dec 2024
Who Brings the Frisbee: Probing Hidden Hallucination Factors in Large Vision-Language Model via Causality Analysis
Po-Hsuan Huang
Jeng-Lin Li
Chin-Po Chen
Ming-Ching Chang
Wei-Chao Chen
LRM
142
1
0
04 Dec 2024
Was that Sarcasm?: A Literature Survey on Sarcasm Detection
Harleen Kaur Bagga
Jasmine Bernard
Sahil Shaheen
Sarthak Arora
86
0
0
30 Nov 2024
VLM-HOI: Vision Language Models for Interpretable Human-Object Interaction Analysis
Donggoo Kang
Dasol Jeong
Hyunmin Lee
Sangwoo Park
Hasil Park
Sunkyu Kwon
Yeongjoon Kim
Joonki Paik
MLLM, VLM
150
0
0
27 Nov 2024
GeoFormer: A Multi-Polygon Segmentation Transformer
Maxim Khomiakov
Michael Riis Andersen
J. Frellsen
112
1
0
25 Nov 2024
Can Reasons Help Improve Pedestrian Intent Estimation? A Cross-Modal Approach
Vaishnavi Khindkar
V. Balasubramanian
Chetan Arora
A. Subramanian
C. V. Jawahar
116
0
0
20 Nov 2024
CUE-M: Contextual Understanding and Enhanced Search with Multimodal Large Language Model
Dongyoung Go
Taesun Whang
Chanhee Lee
Hwayeon Kim
Sunghoon Park
Seunghwan Ji
Dongchan Kim
Young-Bum Kim
LRM
529
1
0
19 Nov 2024
Anatomy-Guided Radiology Report Generation with Pathology-Aware Regional Prompts
Yijian Gao
D. C. Marshall
Xiaodan Xing
Junzhi Ning
G. Papanastasiou
G. Yang
M. Komorowski
MedIm
58
0
0
16 Nov 2024
SASE: A Searching Architecture for Squeeze and Excitation Operations
Hanming Wang
Yunlong Li
Zijun Wu
Huifen Wang
Yuan Zhang
3DPC
64
0
0
13 Nov 2024
Multi-Modal interpretable automatic video captioning
Antoine Hanna-Asaad
Decky Aspandi
Titus Zaharia
72
0
0
11 Nov 2024
Extended multi-stream temporal-attention module for skeleton-based human action recognition (HAR)
Faisal Mehmood
Xin Guo
Enqing Chen
Muhammad Azeem Akbar
A. Khan
Sami Ullah
88
4
0
10 Nov 2024
Generalization and Risk Bounds for Recurrent Neural Networks
Xuewei Cheng
Ke Huang
Shujie Ma
127
1
0
05 Nov 2024
FactorizePhys: Matrix Factorization for Multidimensional Attention in Remote Physiological Sensing
Jitesh Joshi
Sos S. Agaian
Youngjun Cho
AI4TS
75
2
0
03 Nov 2024
Semi-supervised Chinese Poem-to-Painting Generation via Cycle-consistent Adversarial Networks
Zhengyang Lu
Tianhao Guo
Feng Wang
GAN
53
1
0
25 Oct 2024
Anomaly Resilient Temporal QoS Prediction using Hypergraph Convoluted Transformer Network
Suraj Kumar
S. Chattopadhyay
Chandranath Adak
31
0
0
23 Oct 2024
PromptExp: Multi-granularity Prompt Explanation of Large Language Models
Ximing Dong
Shaowei Wang
Dayi Lin
Gopi Krishnan Rajbahadur
Boquan Zhou
Shichao Liu
Ahmed E. Hassan
AAML, LRM
84
1
0
16 Oct 2024
HASN: Hybrid Attention Separable Network for Efficient Image Super-resolution
Weifeng Cao
Xiaoyan Lei
Jun Shi
Wanyong Liang
Jie Liu
Zongfei Bai
SupR
90
1
0
13 Oct 2024
Multimodal Clickbait Detection by De-confounding Biases Using Causal Representation Inference
Jianxing Yu
Shiqi Wang
Han Yin
Zhenlong Sun
Ruobing Xie
Bo Zhang
Yanghui Rao
CML
67
0
0
10 Oct 2024
Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training
Sara Sarto
Nicholas Moratelli
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
80
4
0
09 Oct 2024
Demonstration Based Explainable AI for Learning from Demonstration Methods
Morris Gu
Elizabeth Croft
Dana Kulic
40
0
0
08 Oct 2024
CoVLM: Leveraging Consensus from Vision-Language Models for Semi-supervised Multi-modal Fake News Detection
Devank
Jayateja Kalla
Soma Biswas
72
2
0
06 Oct 2024
BadCM: Invisible Backdoor Attack Against Cross-Modal Learning
Zheng Zhang
Xu Yuan
Lei Zhu
Jingkuan Song
Liqiang Nie
AAML
85
12
0
03 Oct 2024
Facial Action Unit Detection by Adaptively Constraining Self-Attention and Causally Deconfounding Sample
Zhiwen Shao
Hancheng Zhu
Yong Zhou
Xiang Xiang
Bing-Quan Liu
Rui Yao
Lizhuang Ma
CML
61
3
0
02 Oct 2024
Softmax is not Enough (for Sharp Size Generalisation)
Petar Velickovic
Christos Perivolaropoulos
Federico Barbero
Razvan Pascanu
114
17
0
01 Oct 2024
DreamStruct: Understanding Slides and User Interfaces via Synthetic Data Generation
Yi-Hao Peng
Faria Huq
Yue Jiang
Jason Wu
Amanda Li
Jeffrey P. Bigham
Amy Pavel
DiffM
86
5
0
30 Sep 2024