1411.4555
Show and Tell: A Neural Image Caption Generator
17 November 2014
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
Papers citing "Show and Tell: A Neural Image Caption Generator"
50 / 2,023 papers shown
Semantic Refinement GRU-based Neural Language Generation for Spoken Dialogue Systems
Van-Khanh Tran
Le-Minh Nguyen
28
20
0
01 Jun 2017
Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols
Serhii Havrylov
Ivan Titov
LLMAG
41
286
0
31 May 2017
Listen, Interact and Talk: Learning to Speak via Interaction
Haichao Zhang
Haonan Yu
Wenyuan Xu
31
13
0
28 May 2017
Human Trajectory Prediction using Spatially aware Deep Attention Models
Daksh Varshneya
G. Srinivasaraghavan
HAI
40
91
0
26 May 2017
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
15
2,867
0
26 May 2017
Who Will Share My Image? Predicting the Content Diffusion Path in Online Social Networks
Wenjian Hu
Krishna Kumar Singh
Fanyi Xiao
Jinyoung Han
Chen-Nee Chuah
Yong Jae Lee
GNN
DiffM
19
1
0
25 May 2017
Neural Attribute Machines for Program Generation
Matthew Amodio
Swarat Chaudhuri
Thomas W. Reps
19
35
0
25 May 2017
Deep image representations using caption generators
Konda Reddy Mopuri
Vishal B. Athreya
R. Venkatesh Babu
VLM
SSL
21
1
0
25 May 2017
How a General-Purpose Commonsense Ontology can Improve Performance of Learning-Based Image Retrieval
Rodrigo Toro Icarte
Jorge A. Baier
Cristian Ruz
Á. Soto
9
24
0
24 May 2017
Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning
Q. Sun
Stefan Lee
Dhruv Batra
BDL
33
43
0
24 May 2017
Better Text Understanding Through Image-To-Text Transfer
Karol Kurach
Sylvain Gelly
M. Jastrzebski
Philip Häusser
O. Teytaud
Damien Vincent
Olivier Bousquet
VLM
17
6
0
23 May 2017
pix2code: Generating Code from a Graphical User Interface Screenshot
Tony Beltramelli
33
267
0
22 May 2017
MUTAN: Multimodal Tucker Fusion for Visual Question Answering
H. Ben-younes
Rémi Cadène
Matthieu Cord
Nicolas Thome
67
578
0
18 May 2017
Learning a bidirectional mapping between human whole-body motion and natural language using deep recurrent neural networks
Matthias Plappert
Christian Mandery
Tamim Asfour
3DH
32
129
0
18 May 2017
Re3 : Real-Time Recurrent Regression Networks for Visual Tracking of Generic Objects
Daniel Gordon
Ali Farhadi
Dieter Fox
VOT
21
48
0
17 May 2017
Object-Level Context Modeling For Scene Classification with Context-CNN
Syed Ashar Javed
A. Nelakanti
VLM
30
10
0
11 May 2017
You said that?
Joon Son Chung
A. Jamaludin
Andrew Zisserman
CVBM
23
258
0
08 May 2017
Spatiotemporal Recurrent Convolutional Networks for Traffic Prediction in Transportation Networks
Haiyang Yu
Zhihai Wu
Shuqin Wang
Yunpeng Wang
Xiaolei Ma
AI4TS
GNN
30
540
0
07 May 2017
Image Annotation using Multi-Layer Sparse Coding
Amara Tariq
H. Foroosh
14
2
0
06 May 2017
ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases
Xiaosong Wang
Yifan Peng
Le Lu
Zhiyong Lu
M. Bagheri
Ronald M. Summers
LM&MA
66
2,474
0
05 May 2017
FOIL it! Find One mismatch between Image and Language caption
Ravi Shekhar
Sandro Pezzelle
Yauhen Klimovich
Aurélie Herbelot
Moin Nabi
E. Sangineto
Raffaella Bernardi
25
137
0
03 May 2017
Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner
Tseng-Hung Chen
Yuan-Hong Liao
Ching-Yao Chuang
W. Hsu
Jianlong Fu
Min Sun
31
141
0
02 May 2017
STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset
Yuya Yoshikawa
Yutaro Shigeto
A. Takeuchi
3DV
30
118
0
02 May 2017
Speech-Based Visual Question Answering
Ted Zhang
Dengxin Dai
Tinne Tuytelaars
Marie-Francine Moens
Luc Van Gool
40
24
0
01 May 2017
Punny Captions: Witty Wordplay in Image Descriptions
Arjun Chandrasekaran
Devi Parikh
Joey Tianyi Zhou
13
13
0
26 Apr 2017
Paying Attention to Descriptions Generated by Image Captioning Models
Hamed R. Tavakoli
Rakshith Shetty
Ali Borji
Jorma T. Laaksonen
29
79
0
24 Apr 2017
Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition
Yufei Wang
Zhe-nan Lin
Xiaohui Shen
Scott D. Cohen
G. Cottrell
21
105
0
23 Apr 2017
Affect-LM: A Neural Language Model for Customizable Affective Text Generation
Sayan Ghosh
Mathieu Chollet
Eugene Laksana
Louis-Philippe Morency
Stefan Scherer
KELM
CVBM
24
190
0
22 Apr 2017
AnchorNet: A Weakly Supervised Network to Learn Geometry-sensitive Features For Semantic Matching
David Novotny
Diane Larlus
Andrea Vedaldi
3DPC
31
65
0
16 Apr 2017
Video Fill In the Blank using LR/RL LSTMs with Spatial-Temporal Attentions
Amir Mazaheri
Dong-Ming Zhang
M. Shah
17
12
0
15 Apr 2017
Spatial Memory for Context Reasoning in Object Detection
Xinlei Chen
Abhinav Gupta
ObjD
25
164
0
13 Apr 2017
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Y. Zhang
Luyao Yuan
Yijie Guo
Zhiyuan He
I-An Huang
Honglak Lee
ObjD
28
57
0
12 Apr 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li-Jia Li
34
324
0
12 Apr 2017
What's in a Question: Using Visual Questions as a Form of Supervision
Siddha Ganju
Olga Russakovsky
Abhinav Gupta
19
16
0
12 Apr 2017
Creativity: Generating Diverse Questions using Variational Autoencoders
Unnat Jain
Ziyu Zhang
Alex Schwing
25
152
0
11 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
27
494
0
11 Apr 2017
Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering
V. Kazemi
Ali Elqursh
OOD
28
183
0
11 Apr 2017
Action Unit Detection with Region Adaptation, Multi-labeling Learning and Optimal Temporal Fusing
Wei Li
Farnaz Abtahi
Zhigang Zhu
8
155
0
10 Apr 2017
Learning Human Motion Models for Long-term Predictions
Partha Ghosh
Mingli Song
Emre Aksan
Otmar Hilliges
3DH
28
239
0
10 Apr 2017
Generating Descriptions with Grounded and Co-Referenced People
Anna Rohrbach
Marcus Rohrbach
Siyu Tang
Seong Joon Oh
Bernt Schiele
330
72
0
05 Apr 2017
Weakly Supervised Dense Video Captioning
Zhiqiang Shen
Jianguo Li
Zhou Su
Minjun Li
Yurong Chen
Yu-Gang Jiang
Xiangyang Xue
32
134
0
05 Apr 2017
A Genetic Programming Approach to Designing Convolutional Neural Network Architectures
Masanori Suganuma
Shinichi Shirakawa
T. Nagao
27
587
0
03 Apr 2017
Towards Building Large Scale Multimodal Domain-Aware Conversation Systems
Amrita Saha
Mitesh Khapra
Karthik Sankaranarayanan
26
8
0
01 Apr 2017
Towards a Visual Privacy Advisor: Understanding and Predicting Privacy Risks in Images
Rakshith Shetty
Bernt Schiele
Mario Fritz
35
223
0
30 Mar 2017
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training
Rakshith Shetty
Marcus Rohrbach
Lisa Anne Hendricks
Mario Fritz
Bernt Schiele
19
142
0
30 Mar 2017
Colors in Context: A Pragmatic Neural Model for Grounded Language Understanding
Will Monroe
Robert D. Hawkins
Noah D. Goodman
Christopher Potts
37
122
0
29 Mar 2017
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt
E. Krahmer
LM&MA
ELM
27
810
0
29 Mar 2017
Towards Automatic Learning of Procedures from Web Instructional Videos
Luowei Zhou
Chenliang Xu
Jason J. Corso
EgoV
36
804
0
28 Mar 2017
Where to put the Image in an Image Caption Generator
Marc Tanti
Albert Gatt
K. Camilleri
47
96
0
27 Mar 2017
Sequence-to-Sequence Models Can Directly Translate Foreign Speech
Ron J. Weiss
J. Chorowski
Navdeep Jaitly
Yonghui Wu
Zhehuai Chen
33
341
0
24 Mar 2017