1411.4555
Show and Tell: A Neural Image Caption Generator
17 November 2014
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
Papers citing "Show and Tell: A Neural Image Caption Generator"
50 / 2,022 papers shown
Title
An End-to-End Approach to Natural Language Object Retrieval via Context-Aware Deep Reinforcement Learning
Fan Wu
Zhongwen Xu
Yi Yang
ObjD
34
11
0
22 Mar 2017
The Use of Autoencoders for Discovering Patient Phenotypes
Harini Suresh
Peter Szolovits
Marzyeh Ghassemi
DRL
16
28
0
20 Mar 2017
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning
Abhishek Das
Satwik Kottur
J. M. F. Moura
Stefan Lee
Dhruv Batra
OffRL
31
423
0
20 Mar 2017
VQABQ: Visual Question Answering by Basic Questions
Jia-Hong Huang
Modar Alfadly
Guohao Li
27
24
0
19 Mar 2017
Recurrent Models for Situation Recognition
Arun Mallya
Svetlana Lazebnik
20
30
0
18 Mar 2017
Towards Diverse and Natural Image Descriptions via a Conditional GAN
Bo Dai
Sanja Fidler
R. Urtasun
Dahua Lin
GAN
22
450
0
17 Mar 2017
Learning Robust Visual-Semantic Embeddings
Yao-Hung Hubert Tsai
Liang-Kang Huang
Ruslan Salakhutdinov
SSL
AI4TS
27
166
0
17 Mar 2017
Massive Exploration of Neural Machine Translation Architectures
D. Britz
Anna Goldie
Minh-Thang Luong
Quoc V. Le
29
516
0
11 Mar 2017
Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos
De-An Huang
Joseph J. Lim
Li Fei-Fei
Juan Carlos Niebles
24
56
0
07 Mar 2017
Neural Machine Translation and Sequence-to-sequence Models: A Tutorial
Graham Neubig
AIMat
37
171
0
05 Mar 2017
Machine Learning on Sequential Data Using a Recurrent Weighted Average
Jared Ostmeyer
L. Cowell
22
32
0
03 Mar 2017
Toward Controlled Generation of Text
Zhiting Hu
Zichao Yang
Xiaodan Liang
Ruslan Salakhutdinov
Eric Xing
61
984
0
02 Mar 2017
Using Synthetic Data to Train Neural Networks is Model-Based Reasoning
T. Le
A. G. Baydin
R. Zinkov
Frank Wood
SyDa
OOD
25
89
0
02 Mar 2017
Evolving Deep Neural Networks
Risto Miikkulainen
J. Liang
Elliot Meyerson
Aditya Rawal
Daniel Fink
...
B. Raju
H. Shahrzad
Arshak Navruzyan
Nigel P. Duffy
B. Hodjat
21
884
0
01 Mar 2017
Asymmetric Tri-training for Unsupervised Domain Adaptation
Kuniaki Saito
Yoshitaka Ushiku
Tatsuya Harada
49
582
0
27 Feb 2017
Person Search with Natural Language Description
Shuang Li
Tong Xiao
Hongsheng Li
Bolei Zhou
Dayu Yue
Xiaogang Wang
24
386
0
19 Feb 2017
MAT: A Multimodal Attentive Translator for Image Captioning
Chang Liu
F. Sun
Changhu Wang
Feng Wang
Alan Yuille
20
58
0
18 Feb 2017
Dataset Augmentation in Feature Space
Terrance Devries
Graham W. Taylor
23
423
0
17 Feb 2017
End-to-End Interpretation of the French Street Name Signs Dataset
Raymond W. Smith
Chunhui Gu
Dar-Shyang Lee
Huiyi Hu
Ranjith Unnikrishnan
Julian Ibarz
Sacha Arnoud
Sophia Lin
11
42
0
13 Feb 2017
Parallel Long Short-Term Memory for Multi-stream Classification
Mohamed Bouaziz
Mohamed Morchid
Richard Dufour
G. Linarès
R. Mori
12
11
0
11 Feb 2017
A Hybrid Convolutional Variational Autoencoder for Text Generation
Stanislau Semeniuta
Aliaksei Severyn
Erhardt Barth
26
251
0
08 Feb 2017
Gated Multimodal Units for Information Fusion
John Arevalo
Thamar Solorio
Manuel Montes-y-Gómez
Fabio Gonzalez
33
371
0
07 Feb 2017
Toward Abstraction from Multi-modal Data: Empirical Studies on Multiple Time-scale Recurrent Models
Junpei Zhong
Angelo Cangelosi
T. Ogata
14
14
0
07 Feb 2017
Doubly-Attentive Decoder for Multi-modal Neural Machine Translation
Iacer Calixto
Qun Liu
N. Campbell
40
179
0
04 Feb 2017
YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video
Esteban Real
Jonathon Shlens
S. Mazzocchi
Xin Pan
Vincent Vanhoucke
VOS
ObjD
40
534
0
02 Feb 2017
Symbolic, Distributed and Distributional Representations for Natural Language Processing in the Era of Deep Learning: a Survey
L. Ferrone
Fabio Massimo Zanzotto
39
37
0
02 Feb 2017
Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation
N. Mostafazadeh
Chris Brockett
W. Dolan
Michel Galley
Jianfeng Gao
Georgios P. Spithourakis
Lucy Vanderwende
21
181
0
28 Jan 2017
Learning Word-Like Units from Joint Audio-Visual Analysis
David Harwath
James R. Glass
32
106
0
25 Jan 2017
Incorporating Global Visual Features into Attention-Based Neural Machine Translation
Iacer Calixto
Qun Liu
Nick Campbell
32
154
0
23 Jan 2017
dna2vec: Consistent vector representations of variable-length k-mers
Patrick Ng
32
173
0
23 Jan 2017
Comprehension-guided referring expressions
Ruotian Luo
Gregory Shakhnarovich
ObjD
29
171
0
12 Jan 2017
Attention-Based Multimodal Fusion for Video Description
Chiori Hori
Takaaki Hori
Teng-Yok Lee
Kazuhiro Sumi
J. Hershey
Tim K. Marks
41
359
0
11 Jan 2017
Context-aware Captions from Context-agnostic Supervision
Ramakrishna Vedantam
Samy Bengio
Kevin Patrick Murphy
Devi Parikh
Gal Chechik
22
152
0
11 Jan 2017
Learning From Noisy Large-Scale Datasets With Minimal Supervision
Andreas Veit
N. Alldrin
Gal Chechik
Ivan Krasin
Abhinav Gupta
Serge J. Belongie
34
476
0
06 Jan 2017
End-to-End Attention based Text-Dependent Speaker Verification
Shi-Xiong Zhang
Zhuo Chen
Yong Zhao
Jinyu Li
Jiawei Liu
18
177
0
03 Jan 2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions
Licheng Yu
Hao Tan
Joey Tianyi Zhou
Tamara L. Berg
ObjD
46
273
0
30 Dec 2016
Learning Visual N-Grams from Web Data
Ang Li
Allan Jabri
Armand Joulin
L. van der Maaten
VLM
20
136
0
29 Dec 2016
Image-Text Multi-Modal Representation Learning by Adversarial Backpropagation
Gwangbeen Park
Woobin Im
GAN
16
25
0
26 Dec 2016
Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task
Nan Ding
Sebastian Goodman
Fei Sha
Radu Soricut
VLM
27
9
0
22 Dec 2016
Structured Sequence Modeling with Graph Convolutional Recurrent Networks
Youngjoo Seo
M. Defferrard
P. Vandergheynst
Xavier Bresson
GNN
36
758
0
22 Dec 2016
Re-evaluating Automatic Metrics for Image Captioning
Mert Kilickaya
Aykut Erdem
Nazli Ikizler-Cinbis
Erkut Erdem
17
180
0
22 Dec 2016
Top-down Visual Saliency Guided by Captions
Vasili Ramanishka
Abir Das
Jianming Zhang
Kate Saenko
21
142
0
21 Dec 2016
An Empirical Study of Language CNN for Image Captioning
Jiuxiang Gu
G. Wang
Jianfei Cai
Tsuhan Chen
31
132
0
21 Dec 2016
Temporal Tessellation: A Unified Approach for Video Analysis
Dotan Kaufman
Gil Levi
Tal Hassner
Lior Wolf
19
16
0
21 Dec 2016
Automatic Generation of Grounded Visual Questions
Shijie Zhang
Lizhen Qu
Shaodi You
Zhenglu Yang
Jiawan Zhang
OOD
27
79
0
20 Dec 2016
Few-Shot Object Recognition from Machine-Labeled Web Images
Zhongwen Xu
Linchao Zhu
Yi Yang
VLM
18
66
0
19 Dec 2016
Beyond Holistic Object Recognition: Enriching Image Understanding with Part States
Cewu Lu
Hao Su
Yongyi Lu
L. Yi
Chi-Keung Tang
Leonidas J. Guibas
15
33
0
15 Dec 2016
Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering
Hao Liu
Yang Yang
Fumin Shen
Lixin Duan
Heng Tao Shen
38
9
0
15 Dec 2016
Learning to Hash-tag Videos with Tag2Vec
A. Singh
Saurabh Saini
R. Shah
P. J. Narayanan
22
1
0
13 Dec 2016
Text-guided Attention Model for Image Captioning
Jonghwan Mun
Minsu Cho
Bohyung Han
VLM
15
92
0
12 Dec 2016