Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1803.11185
Cited By
Unsupervised Textual Grounding: Linking Words to Image Concepts
29 March 2018
Raymond A. Yeh
Minh Do
Alex Schwing
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Unsupervised Textual Grounding: Linking Words to Image Concepts"
38 / 38 papers shown
Title
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
205
4
0
31 Dec 2024
Interpretable and Globally Optimal Prediction for Textual Grounding using Image Concepts
Raymond A. Yeh
Jinjun Xiong
Wen-mei W. Hwu
Minh Do
Alex Schwing
47
57
0
29 Mar 2018
High-Order Attention Models for Visual Question Answering
Idan Schwartz
Alex Schwing
Tamir Hazan
43
102
0
12 Nov 2017
Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering
V. Kazemi
Ali Elqursh
OOD
79
184
0
11 Apr 2017
Structured Attention Networks
Yoon Kim
Carl Denton
Luong Hoang
Alexander M. Rush
106
463
0
03 Feb 2017
YOLO9000: Better, Faster, Stronger
Joseph Redmon
Ali Farhadi
VLM
ObjD
181
15,616
0
25 Dec 2016
Dual Attention Networks for Multimodal Reasoning and Matching
Hyeonseob Nam
Jung-Woo Ha
Jeonghee Kim
93
666
0
02 Nov 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
296
1,465
0
06 Jun 2016
Adversarial Feature Learning
Jiasen Lu
Philipp Krahenbuhl
Trevor Darrell
GAN
109
1,609
0
31 May 2016
Dynamic Memory Networks for Visual and Textual Question Answering
Caiming Xiong
Stephen Merity
R. Socher
74
755
0
04 Mar 2016
Learning to Compose Neural Networks for Question Answering
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Dan Klein
NAI
KELM
BDL
CoGe
99
566
0
07 Jan 2016
Where To Look: Focus Regions for Visual Question Answering
Kevin J. Shih
Saurabh Singh
Derek Hoiem
71
460
0
23 Nov 2015
Learning Deep Structure-Preserving Image-Text Embeddings
Liwei Wang
Yin Li
Svetlana Lazebnik
81
781
0
19 Nov 2015
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
Huijuan Xu
Kate Saenko
71
763
0
17 Nov 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
94
553
0
13 Nov 2015
Grounding of Textual Phrases in Images by Reconstruction
Anna Rohrbach
Marcus Rohrbach
Ronghang Hu
Trevor Darrell
Bernt Schiele
75
497
0
12 Nov 2015
Visual7W: Grounded Question Answering in Images
Yuke Zhu
Oliver Groth
Michael S. Bernstein
Li Fei-Fei
91
883
0
11 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
118
1,345
0
07 Nov 2015
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
107
1,882
0
07 Nov 2015
Learning to Answer Questions From Image Using Convolutional Neural Network
Lin Ma
Zhengdong Lu
Hang Li
80
262
0
01 Jun 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
196
2,056
0
19 May 2015
Ask Your Neurons: A Neural-based Approach to Answering Questions about Images
Mateusz Malinowski
Marcus Rohrbach
Mario Fritz
106
600
0
05 May 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
337
10,069
0
10 Feb 2015
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
Liang-Chieh Chen
George Papandreou
Iasonas Kokkinos
Kevin Patrick Murphy
Alan Yuille
SSeg
176
4,892
0
22 Dec 2014
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.8K
150,039
0
22 Dec 2014
Object Detectors Emerge in Deep Scene CNNs
Bolei Zhou
A. Khosla
Àgata Lapedriza
A. Oliva
Antonio Torralba
ObjD
137
1,283
0
22 Dec 2014
Striving for Simplicity: The All Convolutional Net
Jost Tobias Springenberg
Alexey Dosovitskiy
Thomas Brox
Martin Riedmiller
FAtt
248
4,667
0
21 Dec 2014
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
Junhua Mao
Wenyuan Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
VLM
168
1,240
0
20 Dec 2014
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
124
5,584
0
07 Dec 2014
Fisher Vectors Derived from Hybrid Gaussian-Laplacian Mixture Models for Image Annotation
Benjamin Klein
Guy Lev
Gil Sadeh
Lior Wolf
84
102
0
26 Nov 2014
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Jeff Donahue
Lisa Anne Hendricks
Marcus Rohrbach
Subhashini Venugopalan
S. Guadarrama
Kate Saenko
Trevor Darrell
VLM
162
6,052
0
17 Nov 2014
Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models
Ryan Kiros
Ruslan Salakhutdinov
R. Zemel
VLM
125
1,399
0
10 Nov 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.6K
100,348
0
04 Sep 2014
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.7K
39,525
0
01 Sep 2014
Deep Fragment Embeddings for Bidirectional Image Sentence Mapping
A. Karpathy
Armand Joulin
Li Fei-Fei
VLM
101
937
0
22 Jun 2014
Microsoft COCO: Common Objects in Context
Nayeon Lee
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
413
43,638
0
01 May 2014
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov
Ilya Sutskever
Kai Chen
G. Corrado
J. Dean
NAI
OCL
394
33,521
0
16 Oct 2013
A Joint Model of Language and Perception for Grounded Attribute Learning
Cynthia Matuszek
Nicholas FitzGerald
Luke Zettlemoyer
Liefeng Bo
Dieter Fox
LM&Ro
78
316
0
27 Jun 2012
1