What is Where by Looking: Weakly-Supervised Open-World Phrase-Grounding without Text Inputs

19 June 2022

Tal Shaharabany

Yoad Tewel

Lior Wolf

ObjD

Papers citing "What is Where by Looking: Weakly-Supervised Open-World Phrase-Grounding without Text Inputs"

22 / 72 papers shown

Title
Densely Connected Convolutional Networks Gao Huang Zhuang Liu Laurens van der Maaten Kilian Q. Weinberger PINN 3DV 775 36,881 0 25 Aug 2016
Top-down Neural Attention by Excitation Backprop Jianming Zhang Zhe Lin Jonathan Brandt Xiaohui Shen Stan Sclaroff 92 948 0 01 Aug 2016
Not Just a Black Box: Learning Important Features Through Propagating Activation Differences Avanti Shrikumar Peyton Greenside A. Shcherbina A. Kundaje FAtt 85 791 0 05 May 2016
The Cityscapes Dataset for Semantic Urban Scene Understanding Marius Cordts Mohamed Omran Sebastian Ramos Timo Rehfeld Markus Enzweiler Rodrigo Benenson Uwe Franke Stefan Roth Bernt Schiele 1.1K 11,641 0 06 Apr 2016
Layer-wise Relevance Propagation for Neural Networks with Local Renormalization Layers Alexander Binder G. Montavon Sebastian Lapuschkin K. Müller Wojciech Samek FAtt 77 462 0 04 Apr 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations Ranjay Krishna Yuke Zhu Oliver Groth Justin Johnson Kenji Hata ... Yannis Kalantidis Li Li David A. Shamma Michael S. Bernstein Fei-Fei Li 225 5,762 0 23 Feb 2016
Learning Deep Features for Discriminative Localization Bolei Zhou A. Khosla Àgata Lapedriza A. Oliva Antonio Torralba SSL SSeg FAtt 253 9,338 0 14 Dec 2015
Deep Residual Learning for Image Recognition Kaiming He Xiangyu Zhang Shaoqing Ren Jian Sun MedIm 2.2K 194,426 0 10 Dec 2015
SSD: Single Shot MultiBox Detector Wei Liu Dragomir Anguelov D. Erhan Christian Szegedy Scott E. Reed Cheng-Yang Fu Alexander C. Berg ObjD BDL 244 29,859 0 08 Dec 2015
Rethinking the Inception Architecture for Computer Vision Christian Szegedy Vincent Vanhoucke Sergey Ioffe Jonathon Shlens Z. Wojna 3DV BDL 886 27,412 0 02 Dec 2015
Named Entity Recognition with Bidirectional LSTM-CNNs Jason P. C. Chiu Eric Nichols 89 1,900 0 26 Nov 2015
Towards Open Set Deep Networks Abhijit Bendale Terrance Boult BDL EDL 104 1,433 0 19 Nov 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren Kaiming He Ross B. Girshick Jian Sun AIMat ObjD 525 62,360 0 04 Jun 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models Bryan A. Plummer Liwei Wang Christopher M. Cervantes Juan C. Caicedo Julia Hockenmaier Svetlana Lazebnik 202 2,072 0 19 May 2015
U-Net: Convolutional Networks for Biomedical Image Segmentation Olaf Ronneberger Philipp Fischer Thomas Brox SSeg 3DV 1.9K 77,341 0 18 May 2015
Fast R-CNN Ross B. Girshick ObjD 309 25,081 0 30 Apr 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention Kelvin Xu Jimmy Ba Ryan Kiros Kyunghyun Cho Aaron Courville Ruslan Salakhutdinov R. Zemel Yoshua Bengio DiffM 348 10,079 0 10 Feb 2015
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) Junhua Mao Wenyuan Xu Yi Yang Jiang Wang Zhiheng Huang Alan Yuille VLM 176 1,240 0 20 Dec 2014
Fisher Vectors Derived from Hybrid Gaussian-Laplacian Mixture Models for Image Annotation Benjamin Klein Guy Lev Gil Sadeh Lior Wolf 87 102 0 26 Nov 2014
From Captions to Visual Concepts and Back Hao Fang Saurabh Gupta F. Iandola R. Srivastava Li Deng ... Xiaodong He Margaret Mitchell John C. Platt C. L. Zitnick Geoffrey Zweig VLM 116 1,312 0 18 Nov 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition Karen Simonyan Andrew Zisserman FAtt MDE 1.7K 100,508 0 04 Sep 2014
Microsoft COCO: Common Objects in Context Tsung-Yi Lin Michael Maire Serge J. Belongie Lubomir Bourdev Ross B. Girshick James Hays Pietro Perona Deva Ramanan C. L. Zitnick Piotr Dollár ObjD 424 43,814 0 01 May 2014