Unsupervised Textual Grounding: Linking Words to Image Concepts

29 March 2018

Papers citing "Unsupervised Textual Grounding: Linking Words to Image Concepts"

38 / 38 papers shown

Title
Towards Visual Grounding: A Survey Linhui Xiao Xiaoshan Yang X. Lan Yaowei Wang Changsheng Xu ObjD 205 4 0 31 Dec 2024
Interpretable and Globally Optimal Prediction for Textual Grounding using Image Concepts Raymond A. Yeh Jinjun Xiong Wen-mei W. Hwu Minh Do Alex Schwing 47 57 0 29 Mar 2018
High-Order Attention Models for Visual Question Answering Idan Schwartz Alex Schwing Tamir Hazan 43 102 0 12 Nov 2017
Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering V. Kazemi Ali Elqursh OOD 79 184 0 11 Apr 2017
Structured Attention Networks Yoon Kim Carl Denton Luong Hoang Alexander M. Rush 106 463 0 03 Feb 2017
YOLO9000: Better, Faster, Stronger Joseph Redmon Ali Farhadi VLM ObjD 181 15,616 0 25 Dec 2016
Dual Attention Networks for Multimodal Reasoning and Matching Hyeonseob Nam Jung-Woo Ha Jeonghee Kim 93 666 0 02 Nov 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding Akira Fukui Dong Huk Park Daylen Yang Anna Rohrbach Trevor Darrell Marcus Rohrbach 296 1,465 0 06 Jun 2016
Adversarial Feature Learning Jiasen Lu Philipp Krahenbuhl Trevor Darrell GAN 109 1,609 0 31 May 2016
Dynamic Memory Networks for Visual and Textual Question Answering Caiming Xiong Stephen Merity R. Socher 74 755 0 04 Mar 2016
Learning to Compose Neural Networks for Question Answering Jacob Andreas Marcus Rohrbach Trevor Darrell Dan Klein NAI KELM BDL CoGe 99 566 0 07 Jan 2016
Where To Look: Focus Regions for Visual Question Answering Kevin J. Shih Saurabh Singh Derek Hoiem 71 460 0 23 Nov 2015
Learning Deep Structure-Preserving Image-Text Embeddings Liwei Wang Yin Li Svetlana Lazebnik 81 781 0 19 Nov 2015
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering Huijuan Xu Kate Saenko 71 763 0 17 Nov 2015
Natural Language Object Retrieval Ronghang Hu Huazhe Xu Marcus Rohrbach Jiashi Feng Kate Saenko Trevor Darrell ObjD 94 553 0 13 Nov 2015
Grounding of Textual Phrases in Images by Reconstruction Anna Rohrbach Marcus Rohrbach Ronghang Hu Trevor Darrell Bernt Schiele 75 497 0 12 Nov 2015
Visual7W: Grounded Question Answering in Images Yuke Zhu Oliver Groth Michael S. Bernstein Li Fei-Fei 91 883 0 11 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions Junhua Mao Jonathan Huang Alexander Toshev Oana-Maria Camburu Alan Yuille Kevin Patrick Murphy ObjD 118 1,345 0 07 Nov 2015
Stacked Attention Networks for Image Question Answering Zichao Yang Xiaodong He Jianfeng Gao Li Deng Alex Smola BDL 107 1,882 0 07 Nov 2015
Learning to Answer Questions From Image Using Convolutional Neural Network Lin Ma Zhengdong Lu Hang Li 80 262 0 01 Jun 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models Bryan A. Plummer Liwei Wang Christopher M. Cervantes Juan C. Caicedo Julia Hockenmaier Svetlana Lazebnik 196 2,056 0 19 May 2015
Ask Your Neurons: A Neural-based Approach to Answering Questions about Images Mateusz Malinowski Marcus Rohrbach Mario Fritz 106 600 0 05 May 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention Ke Xu Jimmy Ba Ryan Kiros Kyunghyun Cho Aaron Courville Ruslan Salakhutdinov R. Zemel Yoshua Bengio DiffM 337 10,069 0 10 Feb 2015
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs Liang-Chieh Chen George Papandreou Iasonas Kokkinos Kevin Patrick Murphy Alan Yuille SSeg 176 4,892 0 22 Dec 2014
Adam: A Method for Stochastic Optimization Diederik P. Kingma Jimmy Ba ODL 1.8K 150,039 0 22 Dec 2014
Object Detectors Emerge in Deep Scene CNNs Bolei Zhou A. Khosla Àgata Lapedriza A. Oliva Antonio Torralba ObjD 137 1,283 0 22 Dec 2014
Striving for Simplicity: The All Convolutional Net Jost Tobias Springenberg Alexey Dosovitskiy Thomas Brox Martin Riedmiller FAtt 248 4,667 0 21 Dec 2014
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) Junhua Mao Wenyuan Xu Yi Yang Jiang Wang Zhiheng Huang Alan Yuille VLM 168 1,240 0 20 Dec 2014
Deep Visual-Semantic Alignments for Generating Image Descriptions A. Karpathy Li Fei-Fei 124 5,584 0 07 Dec 2014
Fisher Vectors Derived from Hybrid Gaussian-Laplacian Mixture Models for Image Annotation Benjamin Klein Guy Lev Gil Sadeh Lior Wolf 84 102 0 26 Nov 2014
Long-term Recurrent Convolutional Networks for Visual Recognition and Description Jeff Donahue Lisa Anne Hendricks Marcus Rohrbach Subhashini Venugopalan S. Guadarrama Kate Saenko Trevor Darrell VLM 162 6,052 0 17 Nov 2014
Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models Ryan Kiros Ruslan Salakhutdinov R. Zemel VLM 125 1,399 0 10 Nov 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition Karen Simonyan Andrew Zisserman FAtt MDE 1.6K 100,348 0 04 Sep 2014
ImageNet Large Scale Visual Recognition Challenge Olga Russakovsky Jia Deng Hao Su J. Krause S. Satheesh ... A. Karpathy A. Khosla Michael S. Bernstein Alexander C. Berg Li Fei-Fei VLM ObjD 1.7K 39,525 0 01 Sep 2014
Deep Fragment Embeddings for Bidirectional Image Sentence Mapping A. Karpathy Armand Joulin Li Fei-Fei VLM 101 937 0 22 Jun 2014
Microsoft COCO: Common Objects in Context Nayeon Lee Michael Maire Serge J. Belongie Lubomir Bourdev Ross B. Girshick James Hays Pietro Perona Deva Ramanan C. L. Zitnick Piotr Dollár ObjD 413 43,638 0 01 May 2014
Distributed Representations of Words and Phrases and their Compositionality Tomas Mikolov Ilya Sutskever Kai Chen G. Corrado J. Dean NAI OCL 394 33,521 0 16 Oct 2013
A Joint Model of Language and Perception for Grounded Attribute Learning Cynthia Matuszek Nicholas FitzGerald Luke Zettlemoyer Liefeng Bo Dieter Fox LM&Ro 78 316 0 27 Jun 2012