Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015

Jimmy Ba

Aaron Courville

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown

Title
VLGrammar: Grounded Grammar Induction of Vision and Language Yining Hong Qing Li Song-Chun Zhu Siyuan Huang VLM 89 25 0 24 Mar 2021
Scaling Local Self-Attention for Parameter Efficient Visual Backbones Ashish Vaswani Prajit Ramachandran A. Srinivas Niki Parmar Blake A. Hechtman Jonathon Shlens 137 404 0 23 Mar 2021
SelfExplain: A Self-Explaining Architecture for Neural Text Classifiers Dheeraj Rajagopal Vidhisha Balachandran Eduard H. Hovy Yulia Tsvetkov MILM SSL FAtt AI4TS 100 68 0 23 Mar 2021
Human-like Controllable Image Captioning with Verb-specific Semantic Roles Long Chen Zhihong Jiang Jun Xiao Wei Liu 104 77 0 22 Mar 2021
Handling Missing Observations with an RNN-based Prediction-Update Cycle S. Becker Ronny Hug Wolfgang Hubner Michael Arens B. Morris 57 1 0 22 Mar 2021
3M: Multi-style image caption generation using Multi-modality features under Multi-UPDOWN model Chengxi Li Brent Harrison 132 6 0 20 Mar 2021
Local Interpretations for Explainable Natural Language Processing: A Survey Siwen Luo Hamish Ivison S. Han Josiah Poon MILM 120 52 0 20 Mar 2021
Let Your Heart Speak in its Mother Tongue: Multilingual Captioning of Cardiac Signals Dani Kiyasseh T. Zhu David Clifton 124 0 0 19 Mar 2021
Decoupled Spatial Temporal Graphs for Generic Visual Grounding Qi Feng Yunchao Wei Mingming Cheng Yi Yang 64 5 0 18 Mar 2021
Which to Match? Selecting Consistent GT-Proposal Assignment for Pedestrian Detection Yan Luo Chongyang Zhang Muming Zhao Hao Zhou Jun Sun 51 0 0 18 Mar 2021
Set-to-Sequence Methods in Machine Learning: a Review Mateusz Jurewicz Leon Derczynski BDL 65 10 0 17 Mar 2021
CACTUS: Detecting and Resolving Conflicts in Objective Functions Subhajit Das Alex Endert 55 0 0 13 Mar 2021
Dual Attention-in-Attention Model for Joint Rain Streak and Raindrop Removal Kaihao Zhang Dongxu Li Wenhan Luo Wenqi Ren 85 78 0 12 Mar 2021
Full Page Handwriting Recognition via Image to Sequence Extraction Sumeet S. Singh Sergey Karayev 81 55 0 11 Mar 2021
Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning Mingjie Sun Jimin Xiao Eng Gee Lim ObjD 84 35 0 09 Mar 2021
Analysis of Convolutional Decoder for Image Caption Generation Sulabh Katiyar S. Borgohain 57 0 0 08 Mar 2021
Lipschitz Normalization for Self-Attention Layers with Application to Graph Neural Networks George Dasoulas Kevin Scaman Aladin Virmaux GNN 85 39 0 08 Mar 2021
Relationship-based Neural Baby Talk Fan Fu Tingting Xie Ioannis Patras Sepehr Jalali 34 0 0 08 Mar 2021
Contextual Dropout: An Efficient Sample-Dependent Dropout Module Xinjie Fan Shujian Zhang Korawat Tanwisuth Xiaoning Qian Mingyuan Zhou OOD BDL UQCV 92 31 0 06 Mar 2021
Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision Andrew Shin Masato Ishii T. Narihira 142 39 0 06 Mar 2021
Causal Attention for Vision-Language Tasks Xu Yang Hanwang Zhang Guojun Qi Jianfei Cai CML 105 158 0 05 Mar 2021
Enhanced 3D Human Pose Estimation from Videos by using Attention-Based Neural Network with Dilated Convolutions Ruixu Liu Ju Shen He Wang Chong Chen S. Cheung V. Asari 3DH 72 31 0 04 Mar 2021
Coordinate Attention for Efficient Mobile Network Design Qibin Hou Daquan Zhou Jiashi Feng 114 3,140 0 04 Mar 2021
End-to-end acoustic modelling for phone recognition of young readers Lucile Gelin Morgane Daniel J. Pinquier Thomas Pellegrini 60 13 0 04 Mar 2021
Video Sentiment Analysis with Bimodal Information-augmented Multi-Head Attention Ting-Wei Wu Jun-jie Peng Wenqiang Zhang Huiran Zhang Chuan Ma Yansong Huang 77 88 0 03 Mar 2021
Dual Reinforcement-Based Specification Generation for Image De-Rendering Ramakanth Pasunuru David B. Rosenberg Gideon Mann Joey Tianyi Zhou 107 0 0 02 Mar 2021
Deep Learning Based Decision Support for Medicine -- A Case Study on Skin Cancer Diagnosis Adriano Lucieri Andreas Dengel Sheraz Ahmed 122 7 0 02 Mar 2021
Listening to the city, attentively: A Spatio-Temporal Attention Boosted Autoencoder for the Short-Term Flow Prediction Problem Stefano Fiorini Michele Ciavotta A. Maurino 34 9 0 01 Mar 2021
Generalization Through Hand-Eye Coordination: An Action Space for Learning Spatially-Invariant Visuomotor Control Chen Wang Rui Wang Ajay Mandlekar Li Fei-Fei Silvio Savarese Danfei Xu 92 31 0 28 Feb 2021
A Universal Model for Cross Modality Mapping by Relational Reasoning Zun Li Congyan Lang Liqian Liang Tao Wang Songhe Feng Jun Wu Yidong Li 66 2 0 26 Feb 2021
Benchmarking and Survey of Explanation Methods for Black Box Models F. Bodria F. Giannotti Riccardo Guidotti Francesca Naretto D. Pedreschi S. Rinzivillo XAI 129 234 0 25 Feb 2021
Retrieval Augmentation for Deep Neural Networks R. Ramos Patrícia Pereira Helena Moniz Joao Paulo Carvalho Bruno Martins VLM 32 0 0 25 Feb 2021
Multichannel LSTM-CNN for Telugu Technical Domain Identification Sunil Gundapu R. Mamidi 25 7 0 24 Feb 2021
Characterization and recognition of handwritten digits using Julia Md Asifuzzaman Jishan M. Alam A. Islam I. R. Mazumder K. Mahmud A. K. Azad 31 0 0 24 Feb 2021
Enhanced Modality Transition for Image Captioning Ziwei Wang Yadan Luo Zi Huang 30 0 0 23 Feb 2021
Comparative evaluation of CNN architectures for Image Caption Generation Sulabh Katiyar S. Borgohain 81 24 0 23 Feb 2021
Model-Attentive Ensemble Learning for Sequence Modeling Victor D. Bourgin Ioana Bica M. Schaar AI4TS 45 0 0 23 Feb 2021
Image Captioning using Deep Stacked LSTMs, Contextual Word Embeddings and Data Augmentation Sulabh Katiyar S. Borgohain VLM 64 14 0 22 Feb 2021
A Hierarchical Conditional Random Field-based Attention Mechanism Approach for Gastric Histopathology Image Classification Yixin Li Xinran Wu Chen Li Changhao Sun M. Rahaman Hao Chen Yudong Yao Xiaoyan Li Yong Zhang Tao Jiang 87 25 0 21 Feb 2021
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning Jun Chen Han Guo Kai Yi Boyang Albert Li Mohamed Elhoseiny VLM 166 227 0 20 Feb 2021
Hard-Attention for Scalable Image Classification Athanasios Papadopoulos Pawel Korus N. Memon 125 25 0 20 Feb 2021
Progressive Transformer-Based Generation of Radiology Reports Farhad Nooralahzadeh Nicolas Andres Perez Gonzalez T. Frauenfelder Koji Fujimoto Michael Krauthammer ViT MedIm 117 89 0 19 Feb 2021
Trends in Vehicle Re-identification Past, Present, and Future: A Comprehensive Review Zakria Jianhua Deng Muhammad Saddam Khokhar Muhammad Umar Aftab Jingye Cai Rajesh Kumar Jay Kumar 64 34 0 19 Feb 2021
I Want This Product but Different: Multimodal Retrieval with Synthetic Query Expansion Ivona Tautkute Tomasz Trzciński 77 4 0 17 Feb 2021
LambdaNetworks: Modeling Long-Range Interactions Without Attention Irwan Bello 359 181 0 17 Feb 2021
A Context-Enhanced De-identification System Kahyun Lee M. Kayaalp Sam Henry Özlem Uzuner 68 3 0 17 Feb 2021
Learning Intra-Batch Connections for Deep Metric Learning Jenny Seidenschwarz Ismail Elezi Laura Leal-Taixé FedML 81 54 0 15 Feb 2021
A Gated Fusion Network for Dynamic Saliency Prediction Aysun Kocak Erkut Erdem Aykut Erdem 50 7 0 15 Feb 2021
Improved Bengali Image Captioning via deep convolutional neural network based encoder-decoder model Mohammad Faiyaz Khan S. M. S. Shifath Md. Saiful Islam VLM 65 21 0 14 Feb 2021
Image Captioning using Multiple Transformers for Self-Attention Mechanism Farrukh Olimov Shikha Dubey Labina Shrestha Tran Trung Tin M. Jeon ViT 48 2 0 14 Feb 2021