Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1411.4555
Cited By
Show and Tell: A Neural Image Caption Generator
17 November 2014
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Show and Tell: A Neural Image Caption Generator"
50 / 2,022 papers shown
Title
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
85
1,442
0
06 Dec 2016
Multi-Label Image Classification with Regional Latent Semantic Dependencies
Junjie Zhang
Qi Wu
Chunhua Shen
Jian Zhang
Jianfeng Lu
25
165
0
04 Dec 2016
Areas of Attention for Image Captioning
M. Pedersoli
Thomas Lucas
Cordelia Schmid
Jakob Verbeek
33
205
0
03 Dec 2016
Short-term traffic flow forecasting with spatial-temporal correlation in a hybrid deep learning framework
Yuankai Wu
Huachun Tan
AI4TS
34
248
0
03 Dec 2016
Commonly Uncommon: Semantic Sparsity in Situation Recognition
Mark Yatskar
Vicente Ordonez
Luke Zettlemoyer
Ali Farhadi
VLM
9
42
0
03 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
125
3,126
0
02 Dec 2016
Guided Open Vocabulary Image Captioning with Constrained Beam Search
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
21
232
0
02 Dec 2016
Self-critical Sequence Training for Image Captioning
Steven J. Rennie
E. Marcheret
Youssef Mroueh
Jerret Ross
Vaibhava Goel
11
1,877
0
02 Dec 2016
Improved Image Captioning via Policy Gradient optimization of SPIDEr
Siqi Liu
Zhenhai Zhu
Ning Ye
S. Guadarrama
Kevin Patrick Murphy
39
440
0
01 Dec 2016
Video Captioning with Multi-Faceted Attention
Xiang Long
Chuang Gan
Gerard de Melo
24
88
0
01 Dec 2016
Towards Robust Deep Neural Networks with BANG
Andras Rozsa
Manuel Günther
Terrance E. Boult
AAML
OOD
19
76
0
01 Dec 2016
Sequential Person Recognition in Photo Albums with a Recurrent Network
Yao Li
Guosheng Lin
Bohan Zhuang
Lingqiao Liu
Chunhua Shen
Anton Van Den Hengel
24
29
0
30 Nov 2016
Getting Closer to the Essence of Music: The Con Espressione Manifesto
Gerhard Widmer
13
14
0
29 Nov 2016
Deep Quantization: Encoding Convolutional Activations with Deep Generative Model
Zhaofan Qiu
Ting Yao
Tao Mei
DRL
MQ
26
58
0
29 Nov 2016
Hierarchical Boundary-Aware Neural Encoder for Video Captioning
Lorenzo Baraldi
C. Grana
Rita Cucchiara
28
191
0
28 Nov 2016
Image Based Appraisal of Real Estate Properties
Quanzeng You
Ran Pang
Liangliang Cao
Jiebo Luo
18
68
0
28 Nov 2016
Visual Dialog
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra
69
990
0
26 Nov 2016
Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images
Junhua Mao
Jiajing Xu
Yushi Jing
Alan Yuille
11
48
0
24 Nov 2016
Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling
Zhe Gan
Chunyuan Li
Changyou Chen
Yunchen Pu
Qinliang Su
Lawrence Carin
BDL
UQCV
53
41
0
23 Nov 2016
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
47
425
0
23 Nov 2016
GuessWhat?! Visual object discovery through multi-modal dialogue
H. D. Vries
Florian Strub
A. Chandar
Olivier Pietquin
Hugo Larochelle
Aaron Courville
VLM
32
426
0
23 Nov 2016
Adaptive Feature Abstraction for Translating Video to Text
Yunchen Pu
Martin Renqiang Min
Zhe Gan
Lawrence Carin
41
14
0
23 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li-Jia Li
VLM
30
169
0
21 Nov 2016
A Hierarchical Approach for Generating Descriptive Image Paragraphs
J. Krause
Justin Johnson
Ranjay Krishna
Li Fei-Fei
VLM
36
373
0
20 Nov 2016
Recurrent Memory Addressing for describing videos
A. Jain
Abhinav Agarwalla
Kumar Krishna Agrawal
Pabitra Mitra
38
10
0
20 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Long Chen
Hanwang Zhang
Jun Xiao
Liqiang Nie
Jian Shao
Wei Liu
Tat-Seng Chua
24
1,650
0
17 Nov 2016
Multimodal Memory Modelling for Video Captioning
Junbo Wang
Wei Wang
Yan Huang
Liang Wang
Tieniu Tan
32
142
0
17 Nov 2016
Instance-aware Image and Sentence Matching with Selective Multimodal LSTM
Yan Huang
Wei Wang
Liang Wang
26
222
0
17 Nov 2016
Semantic Regularisation for Recurrent Image Annotation
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
34
103
0
16 Nov 2016
A Semi-supervised Framework for Image Captioning
Wenhu Chen
Aurelien Lucchi
Thomas Hofmann
34
9
0
16 Nov 2016
The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives
Mohit Iyyer
Varun Manjunatha
Anupam Guha
Yogarshi Vyas
Jordan L. Boyd-Graber
Hal Daumé
L. Davis
30
95
0
16 Nov 2016
Zero-resource Machine Translation by Multimodal Encoder-decoder Network with Multimedia Pivot
Hideki Nakayama
Noriki Nishida
29
62
0
14 Nov 2016
Leveraging Video Descriptions to Learn Video Question Answering
Kuo-Hao Zeng
Tseng-Hung Chen
Ching-Yao Chuang
Yuan-Hong Liao
Juan Carlos Niebles
Min Sun
32
175
0
12 Nov 2016
Unsupervised Pretraining for Sequence to Sequence Learning
Prajit Ramachandran
Peter J. Liu
Quoc V. Le
SSL
AIMat
32
281
0
08 Nov 2016
Dependency Sensitive Convolutional Neural Networks for Modeling Sentences and Documents
Rui Zhang
Honglak Lee
Dragomir R. Radev
16
123
0
08 Nov 2016
Memory-augmented Attention Modelling for Videos
Rasool Fakoor
Abdel-rahman Mohamed
Margaret Mitchell
S. B. Kang
Pushmeet Kohli
48
20
0
07 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
48
620
0
05 Nov 2016
Dual Attention Networks for Multimodal Reasoning and Matching
Hyeonseob Nam
Jung-Woo Ha
Jeonghee Kim
34
664
0
02 Nov 2016
Inference Compilation and Universal Probabilistic Programming
T. Le
A. G. Baydin
Frank Wood
UQCV
52
142
0
31 Oct 2016
Clinical Text Prediction with Numerically Grounded Conditional Language Models
Georgios P. Spithourakis
S. Petersen
Sebastian Riedel
30
7
0
20 Oct 2016
Spatio-Temporal Attention Models for Grounded Video Captioning
M. Zanfir
Elisabeta Marinoiu
C. Sminchisescu
35
50
0
17 Oct 2016
Generating captions without looking beyond objects
Hendrik Heuer
Christof Monz
A. Smeulders
22
16
0
12 Oct 2016
Neural Paraphrase Generation with Stacked Residual LSTM Networks
Aaditya (Adi) Prakash
Sadid A. Hasan
Kathy Lee
Vivek Datla
Ashequl Qadir
Joey Liu
Oladimeji Farri
8
264
0
10 Oct 2016
Latent Sequence Decompositions
William Chan
Yu Zhang
Quoc V. Le
Navdeep Jaitly
16
62
0
10 Oct 2016
Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models
Ashwin K. Vijayakumar
Michael Cogswell
Ramprasaath R. Selvaraju
Q. Sun
Stefan Lee
David J. Crandall
Dhruv Batra
17
541
0
07 Oct 2016
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Ramprasaath R. Selvaraju
Michael Cogswell
Abhishek Das
Ramakrishna Vedantam
Devi Parikh
Dhruv Batra
FAtt
50
19,576
0
07 Oct 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges
Kushal Kafle
Christopher Kanan
OOD
27
235
0
05 Oct 2016
Controlling Output Length in Neural Encoder-Decoders
Yuta Kikuchi
Graham Neubig
Ryohei Sasano
Hiroya Takamura
Manabu Okumura
19
242
0
30 Sep 2016
Variational Autoencoder for Deep Learning of Images, Labels and Captions
Yunchen Pu
Zhe Gan
Ricardo Henao
Xin Yuan
Chunyuan Li
Andrew Stevens
Lawrence Carin
BDL
CoGe
30
746
0
28 Sep 2016
Learning Language-Visual Embedding for Movie Understanding with Natural-Language
Atousa Torabi
Niket Tandon
Leonid Sigal
22
97
0
26 Sep 2016
Previous
1
2
3
...
34
35
36
...
39
40
41
Next