1611.05588
Instance-aware Image and Sentence Matching with Selective Multimodal LSTM
17 November 2016
Yan Huang
Wei Wang
Liang Wang
Papers citing
"Instance-aware Image and Sentence Matching with Selective Multimodal LSTM"
21 / 21 papers shown
ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval
Guanqi Zhan
Yuanpei Liu
Kai Han
Weidi Xie
Andrew Zisserman
VLM
518
0
0
21 Feb 2025
Contextual LSTM (CLSTM) models for Large scale NLP tasks
Shalini Ghosh
Oriol Vinyals
B. Strope
Scott Roy
Tom Dean
Larry Heck
70
213
0
19 Feb 2016
RNN Fisher Vectors for Action Recognition and Image Annotation
Guy Lev
Gil Sadeh
Benjamin Klein
Lior Wolf
55
164
0
12 Dec 2015
Order-Embeddings of Images and Language
Ivan Vendrov
Ryan Kiros
Sanja Fidler
R. Urtasun
120
548
0
19 Nov 2015
Learning Deep Structure-Preserving Image-Text Embeddings
Liwei Wang
Yin Li
Svetlana Lazebnik
86
783
0
19 Nov 2015
Skip-Thought Vectors
Ryan Kiros
Yukun Zhu
Ruslan Salakhutdinov
R. Zemel
Antonio Torralba
R. Urtasun
Sanja Fidler
SSL
228
2,412
0
22 Jun 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
222
2,074
0
19 May 2015
Multimodal Convolutional Neural Networks for Matching Image and Sentence
Lin Ma
Zhengdong Lu
Lifeng Shang
Hang Li
121
337
0
23 Apr 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
352
10,091
0
10 Feb 2015
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
154
5,599
0
07 Dec 2014
From Captions to Visual Concepts and Back
Hao Fang
Saurabh Gupta
F. Iandola
R. Srivastava
Li Deng
...
Xiaodong He
Margaret Mitchell
John C. Platt
C. L. Zitnick
Geoffrey Zweig
VLM
134
1,312
0
18 Nov 2014
Show and Tell: A Neural Image Caption Generator
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
3DV
270
6,042
0
17 Nov 2014
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Jeff Donahue
Lisa Anne Hendricks
Marcus Rohrbach
Subhashini Venugopalan
S. Guadarrama
Kate Saenko
Trevor Darrell
VLM
173
6,060
0
17 Nov 2014
Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models
Ryan Kiros
Ruslan Salakhutdinov
R. Zemel
VLM
135
1,401
0
10 Nov 2014
Explain Images with Multimodal Recurrent Neural Networks
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Alan Yuille
VLM
GAN
118
385
0
04 Oct 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.7K
100,575
0
04 Sep 2014
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
AIMat
589
27,345
0
01 Sep 2014
Deep Fragment Embeddings for Bidirectional Image Sentence Mapping
A. Karpathy
Armand Joulin
Li Fei-Fei
VLM
116
937
0
22 Jun 2014
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
444
43,875
0
01 May 2014
Rich feature hierarchies for accurate object detection and semantic segmentation
Ross B. Girshick
Jeff Donahue
Trevor Darrell
Jitendra Malik
ObjD
311
26,247
0
11 Nov 2013
Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
712
31,571
0
16 Jan 2013