Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015

Jimmy Ba

Aaron Courville

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown

Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings A. Bhunia Subhadeep Koley Amandeep Kumar Aneeshan Sain Pinaki Nath Chowdhury Tao Xiang Yi-Zhe Song 143 20 0 20 Mar 2023
Multi-modal reward for visual relationships-based image captioning Ali Abedi Hossein Karshenas Peyman Adibi 131 2 0 19 Mar 2023
Blind Multimodal Quality Assessment of Low-light Images Miaohui Wang Zhuowei Xu Mai Xu Weisi Lin 85 2 0 18 Mar 2023
GNNFormer: A Graph-based Framework for Cytopathology Report Generation Yangqiaoyu Zhou Kai-Lang Yao Wusuo Li MedIm 51 1 0 17 Mar 2023
Rethinking White-Box Watermarks on Deep Learning Models under Neural Structural Obfuscation Yifan Yan Xudong Pan Mi Zhang Min Yang AAML 153 17 0 17 Mar 2023
Cross-Modal Causal Intervention for Medical Report Generation Weixing Chen Yang-Yang Liu Ce Wang Jiarui Zhu Shen Zhao Guanbin Li Cheng-Lin Liu 82 7 0 16 Mar 2023
PR-MCS: Perturbation Robust Metric for MultiLingual Image Captioning Yongil Kim Yerin Hwang Hyeongu Yun Seunghyun Yoon Trung Bui Kyomin Jung 70 6 0 15 Mar 2023
ViperGPT: Visual Inference via Python Execution for Reasoning Dídac Surís Sachit Menon Carl Vondrick MLLM LRM ReLM 136 469 0 14 Mar 2023
Interventional Bag Multi-Instance Learning On Whole-Slide Pathological Images Tiancheng Lin Zhimiao Yu Hongyu Hu Yi Xu Changyi Chen 121 88 0 13 Mar 2023
Focus on Change: Mood Prediction by Learning Emotion Changes via Spatio-Temporal Attention S. Narayana Subramanian Ramanathan Ibrahim Radwan Roland Göcke 65 2 0 12 Mar 2023
ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation Bang-ju Yang Fenglin Liu Yuexian Zou Xian Wu Yaowei Wang David Clifton 88 9 0 11 Mar 2023
Learning Combinatorial Prompts for Universal Controllable Image Captioning Zhen Wang Jun Xiao Yueting Zhuang Fei Gao Jian Shao Long Chen 112 5 0 11 Mar 2023
Comparative study of Transformer and LSTM Network with attention mechanism on Image Captioning Pranav Dandwate Chaitanya Shahane V. Jagtap Shridevi C. Karande 101 9 0 05 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing Zequn Zeng Hao Zhang Zhengjue Wang Ruiying Lu Dongsheng Wang Bo Chen BDL DiffM 61 33 0 04 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention Paria Mehrani John K. Tsotsos 92 25 0 02 Mar 2023
Inseq: An Interpretability Toolkit for Sequence Generation Models Gabriele Sarti Nils Feldhus Ludwig Sickert Oskar van der Wal Malvina Nissim Arianna Bisazza 123 70 0 27 Feb 2023
Understanding Social Media Cross-Modality Discourse in Linguistic Space Chunpu Xu Hanzhuo Tan Jing Li Piji Li 84 8 0 26 Feb 2023
Parallel Sentence-Level Explanation Generation for Real-World Low-Resource Scenarios Yang Liu Xiaokang Chen Qianwen Dai LRM 51 4 0 21 Feb 2023
Retrieval-augmented Image Captioning R. Ramos Desmond Elliott Bruno Martins VLM 80 29 0 16 Feb 2023
Large Scale Multi-Lingual Multi-Modal Summarization Dataset Yash Verma Anubhav Jangra Raghvendra Kumar S. Saha 32 14 0 13 Feb 2023
Towards Local Visual Modeling for Image Captioning Yiwei Ma Jiayi Ji Xiaoshuai Sun Yiyi Zhou Rongrong Ji ViT 100 79 0 13 Feb 2023
See Your Heart: Psychological states Interpretation through Visual Creations Likun Yang Xiaokun Feng Xiaotang Chen Shiyu Zhang Kaiqi Huang 20 0 0 11 Feb 2023
Sketch Less Face Image Retrieval: A New Challenge Dawei Dai Yutang Li Liang Wang Shiyu Fu Shuyin Xia Guo-Zhen Wang 3DH CVBM 68 7 0 11 Feb 2023
Long-Tailed Partial Label Learning via Dynamic Rebalancing Feng Hong Jiangchao Yao Zhihan Zhou Ya Zhang Yanfeng Wang 71 27 0 10 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning Mozhgan Pourkeshavarz Shahabedin Nabavi Mohsen Moghaddam M. Shamsfard 86 4 0 08 Feb 2023
KENGIC: KEyword-driven and N-Gram Graph based Image Captioning Brandon Birmingham A. Muscat 54 1 0 07 Feb 2023
Transform, Contrast and Tell: Coherent Entity-Aware Multi-Image Captioning Jingqiang Chen 71 4 0 04 Feb 2023
Style-Aware Contrastive Learning for Multi-Style Image Captioning Yucheng Zhou Guodong Long 68 23 0 26 Jan 2023
Open Problems in Applied Deep Learning M. Raissi AI4CE 115 2 0 26 Jan 2023
Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data Dong-Jin Kim Tae-Hyun Oh Jinsoo Choi In So Kweon SSL VLM 45 4 0 26 Jan 2023
A two stages Deep Learning Architecture for Model Reduction of Parametric Time-Dependent Problems Isabella Carla Gonnella M. Hess G. Stabile G. Rozza AI4CE 82 2 0 24 Jan 2023
Explaining Deep Learning Hidden Neuron Activations using Concept Induction Abhilekha Dalal Md Kamruzzaman Sarker Adrita Barua Pascal Hitzler FAtt 24 2 0 23 Jan 2023
HRVQA: A Visual Question Answering Benchmark for High-Resolution Aerial Images Kun Li G. Vosselman M. Yang 85 7 0 23 Jan 2023
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation Razvan-George Pasca Alexey Gavryushin Muhammad Hamza Yen-Ling Kuo Kaichun Mo Luc Van Gool Otmar Hilliges Xi Wang 169 14 0 22 Jan 2023
Joint Representation Learning for Text and 3D Point Cloud Rui Huang Xuran Pan Henry Zheng Haojun Jiang Zhifeng Xie S. Song Gao Huang 93 21 0 18 Jan 2023
Embodied Agents for Efficient Exploration and Smart Scene Description Roberto Bigazzi Marcella Cornia S. Cascianelli Lorenzo Baraldi Rita Cucchiara LM&Ro 71 7 0 17 Jan 2023
UATVR: Uncertainty-Adaptive Text-Video Retrieval Bo Fang Wenhao Wu Chang-rui Liu Yu Zhou Yuxin Song Weiping Wang Min Yang Xiang Ji Jingdong Wang 107 57 0 16 Jan 2023
A Novel Improved Mask RCNN for Multiple Targets Detection in the Indoor Complex Scenes Zongmin Liu Jirui Wang Jie Li Peng Liu Kai Ren 37 2 0 07 Jan 2023
An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU) Rana Adnan Ahmad Muhammad Azhar Hina Sattar 119 10 0 06 Jan 2023
An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation Kevin Moran Ali Yachnes George Purnell Juanyed Mahmud Michele Tufano Carlos Bernal-Cárdenas Denys Poshyvanyk Zach H’Doubler 85 11 0 03 Jan 2023
Knowledge-guided Causal Intervention for Weakly-supervised Object Localization Feifei Shao Yawei Luo Fei Gao Yezhou Yang Jun Xiao WSOL 129 4 0 03 Jan 2023
Unpacking the "Black Box" of AI in Education Nabeel Gillani R. Eynon Catherine Chiabaut Kelsey Finkel 76 59 0 31 Dec 2022
On the Interpretability of Attention Networks L. N. Pandey Rahul Vashisht H. G. Ramaswamy 73 5 0 30 Dec 2022
Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning Woohyun Kang Jonghwan Mun Sungjun Lee Byungseok Roh VLM 97 20 0 27 Dec 2022
Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image Runmin Cong Ke Huang Jianjun Lei Yao-Min Zhao Qingming Huang Sam Kwong 71 13 0 23 Dec 2022
Do DALL-E and Flamingo Understand Each Other? Hang Li Jindong Gu Rajat Koner Sahand Sharifzadeh Volker Tresp MLLM 82 12 0 23 Dec 2022
Towards Cooperative Flight Control Using Visual-Attention Lianhao Yin Makram Chahine Tsun-Hsuan Wang Tim Seyde Chao Liu Mathias Lechner Ramin Hasani Daniela Rus 95 5 0 21 Dec 2022
Does CLIP Bind Concepts? Probing Compositionality in Large Image Models Martha Lewis Nihal V. Nayak Peilin Yu Qinan Yu Jack Merullo Stephen H. Bach Ellie Pavlick VLM OCL CoGe 134 68 0 20 Dec 2022
A Survey of Deep Learning for Mathematical Reasoning Pan Lu Liang Qiu Wenhao Yu Sean Welleck Kai-Wei Chang ReLM LRM 133 150 0 20 Dec 2022
Design-time Fashion Popularity Forecasting in VR Environments Stefanos-Iordanis Papadopoulos C. Koutlis Anastasios Papazoglou-Chalikias Symeon Papadopoulos S. Nikolopoulos 61 0 0 14 Dec 2022