arXiv:1406.5679
Deep Fragment Embeddings for Bidirectional Image Sentence Mapping
22 June 2014
A. Karpathy
Armand Joulin
Li Fei-Fei
VLM
Papers citing
"Deep Fragment Embeddings for Bidirectional Image Sentence Mapping"
50 / 153 papers shown
Title
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li-Jia Li
34
324
0
12 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
27
494
0
11 Apr 2017
Detecting Visual Relationships with Deep Relational Networks
Bo Dai
Yuqi Zhang
Dahua Lin
GNN
59
500
0
11 Apr 2017
Learning Cross-Modal Deep Representations for Robust Pedestrian Detection
Dan Xu
Wanli Ouyang
Elisa Ricci
Xiaogang Wang
N. Sebe
30
191
0
08 Apr 2017
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt
E. Krahmer
LM&MA
ELM
27
810
0
29 Mar 2017
Perception Driven Texture Generation
Yanhai Gan
Huifang Chi
Ying Gao
Jun Liu
G. Zhong
Junyu Dong
OOD
31
12
0
24 Mar 2017
Recurrent Topic-Transition GAN for Visual Paragraph Generation
Xiaodan Liang
Zhiting Hu
Huatian Zhang
Chuang Gan
Eric Xing
GAN
27
200
0
21 Mar 2017
I2T2I: Learning Text to Image Synthesis with Textual Data Augmentation
Hao Dong
Jingqing Zhang
Douglas McIlwraith
Yike Guo
35
58
0
20 Mar 2017
Learning Visual N-Grams from Web Data
Ang Li
Allan Jabri
Armand Joulin
Laurens van der Maaten
VLM
20
136
0
29 Dec 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li-Jia Li
VLM
30
169
0
21 Nov 2016
A Hierarchical Approach for Generating Descriptive Image Paragraphs
J. Krause
Justin Johnson
Ranjay Krishna
Li Fei-Fei
VLM
36
373
0
20 Nov 2016
Instance-aware Image and Sentence Matching with Selective Multimodal LSTM
Yan Huang
Wei Wang
Liang Wang
26
222
0
17 Nov 2016
Dual Attention Networks for Multimodal Reasoning and Matching
Hyeonseob Nam
Jung-Woo Ha
Jeonghee Kim
45
664
0
02 Nov 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
30
851
0
21 Sep 2016
phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning
Y. Tan
Chee Seng Chan
VLM
22
29
0
20 Aug 2016
Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task
Ashkan Mokarian
Mateusz Malinowski
Mario Fritz
27
5
0
09 Aug 2016
A Comprehensive Survey on Cross-modal Retrieval
Kun Wang
Qiyue Yin
Wei Wang
Shu Wu
Liang Wang
42
294
0
21 Jul 2016
Multilingual Visual Sentiment Concept Matching
Nikolaos Pappas
Miriam Redi
Mercan Topkara
Brendan Jou
Hongyi Liu
Tao Chen
Shih-Fu Chang
CVBM
29
14
0
07 Jun 2016
Deep Image Retrieval: Learning global representations for image search
Albert Gordo
Jon Almazán
Jérôme Revaud
Diane Larlus
23
802
0
05 Apr 2016
Multi-Cue Zero-Shot Learning with Strong Supervision
Zeynep Akata
Mateusz Malinowski
Mario Fritz
Bernt Schiele
40
148
0
29 Mar 2016
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge
Qi Wu
Chunhua Shen
Anton Van Den Hengel
Peng Wang
A. Dick
27
360
0
09 Mar 2016
Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures
Raffaella Bernardi
Ruken Cakici
Desmond Elliott
Aykut Erdem
Erkut Erdem
Nazli Ikizler-Cinbis
Frank Keller
A. Muscat
Barbara Plank
EGVM
VLM
27
363
0
15 Jan 2016
Multi-Instance Visual-Semantic Embedding
Zhou Ren
Hailin Jin
Zhe-nan Lin
Chen Fang
Alan Yuille
VLM
27
38
0
22 Dec 2015
Analyzing Classifiers: Fisher Vectors and Deep Neural Networks
Sebastian Bach
Alexander Binder
G. Montavon
K. Müller
Wojciech Samek
35
198
0
01 Dec 2015
Images Don't Lie: Transferring Deep Visual Semantic Features to Large-Scale Multimodal Learning to Rank
Corey Lynch
Kamelia Aryafar
Josh Attenberg
31
45
0
20 Nov 2015
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
Huijuan Xu
Kate Saenko
36
761
0
17 Nov 2015
Sherlock: Scalable Fact Learning in Images
Mohamed Elhoseiny
Scott D. Cohen
W. Chang
Brian L. Price
Ahmed Elgammal
19
26
0
16 Nov 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
46
551
0
13 Nov 2015
Deep Multimodal Semantic Embeddings for Speech and Images
David Harwath
James R. Glass
18
155
0
11 Nov 2015
Visual7W: Grounded Question Answering in Images
Yuke Zhu
Oliver Groth
Michael S. Bernstein
Li Fei-Fei
44
875
0
11 Nov 2015
Neural Module Networks
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Dan Klein
CoGe
48
1,062
0
09 Nov 2015
Learning Visual Features from Large Weakly Supervised Data
Armand Joulin
Laurens van der Maaten
Allan Jabri
Nicolas Vasilache
SSL
24
406
0
06 Nov 2015
Guiding Long-Short Term Memory for Image Caption Generation
Xu Jia
E. Gavves
Basura Fernando
Tinne Tuytelaars
VLM
22
101
0
16 Sep 2015
What value do explicit high level concepts have in vision to language problems?
Qi Wu
Chunhua Shen
Lingqiao Liu
A. Dick
Anton Van Den Hengel
33
443
0
03 Jun 2015
Learning to Answer Questions From Image Using Convolutional Neural Network
Lin Ma
Zhengdong Lu
Hang Li
27
261
0
01 Jun 2015
Text to 3D Scene Generation with Rich Lexical Grounding
Angel X. Chang
Will Monroe
Manolis Savva
Christopher Potts
Christopher D. Manning
3DV
15
106
0
23 May 2015
Weakly-Supervised Alignment of Video With Text
Piotr Bojanowski
Rémi Lajugie
Edouard Grave
Francis R. Bach
Ivan Laptev
Jean Ponce
Cordelia Schmid
41
134
0
22 May 2015
Exploring Nearest Neighbor Approaches for Image Captioning
Jacob Devlin
Saurabh Gupta
Ross B. Girshick
Margaret Mitchell
C. L. Zitnick
27
195
0
17 May 2015
Ask Your Neurons: A Neural-based Approach to Answering Questions about Images
Mateusz Malinowski
Marcus Rohrbach
Mario Fritz
41
596
0
05 May 2015
Learning Temporal Embeddings for Complex Video Analysis
Vignesh Ramanathan
K. Tang
Greg Mori
Li Fei-Fei
34
71
0
02 May 2015
Multimodal Convolutional Neural Networks for Matching Image and Sentence
Lin Ma
Zhengdong Lu
Lifeng Shang
Hang Li
38
337
0
23 Apr 2015
Microsoft COCO Captions: Data Collection and Evaluation Server
Xinlei Chen
Hao Fang
Nayeon Lee
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollar
C. L. Zitnick
97
2,434
0
01 Apr 2015
Text Understanding from Scratch
Xiang Zhang
Yann LeCun
46
557
0
05 Feb 2015
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
VLM
86
1,235
0
20 Dec 2014
Translating Videos to Natural Language Using Deep Recurrent Neural Networks
Subhashini Venugopalan
Huijuan Xu
Jeff Donahue
Marcus Rohrbach
Raymond J. Mooney
Kate Saenko
47
951
0
15 Dec 2014
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
24
5,559
0
07 Dec 2014
Cross-Modal Learning via Pairwise Constraints
Ran He
Man Zhang
Liang Wang
Ye Ji
Qiyue Yin
24
60
0
28 Nov 2014
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
103
4,412
0
20 Nov 2014
Learning a Recurrent Visual Representation for Image Caption Generation
Xinlei Chen
C. L. Zitnick
SSL
GAN
35
195
0
20 Nov 2014
Show and Tell: A Neural Image Caption Generator
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
92
5,996
0
17 Nov 2014