Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2211.07912
Cited By
YORO -- Lightweight End to End Visual Grounding
15 November 2022
Chih-Hui Ho
Srikar Appalaraju
Bhavan A. Jasani
R. Manmatha
Nuno Vasconcelos
ObjD
Re-assign community
ArXiv
PDF
HTML
Papers citing
"YORO -- Lightweight End to End Visual Grounding"
27 / 77 papers shown
Title
A Corpus for Reasoning About Natural Language Grounded in Photographs
Alane Suhr
Stephanie Zhou
Ally Zhang
Iris Zhang
Huajun Bai
Yoav Artzi
LRM
74
596
0
01 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
882
93,936
0
11 Oct 2018
Deep Learning for Generic Object Detection: A Survey
Li Liu
Wanli Ouyang
Xiaogang Wang
Paul Fieguth
Jie Chen
Xinwang Liu
M. Pietikäinen
ObjD
VLM
OOD
143
2,438
0
06 Sep 2018
Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding
Zhou Yu
Jun-chen Yu
Chenchao Xiang
Zhou Zhao
Q. Tian
Dacheng Tao
ObjD
37
139
0
09 May 2018
YOLOv3: An Incremental Improvement
Joseph Redmon
Ali Farhadi
ObjD
83
21,306
0
08 Apr 2018
Stacked Cross Attention for Image-Text Matching
Kuang-Huei Lee
Xi Chen
G. Hua
Houdong Hu
Xiaodong He
59
1,148
0
21 Mar 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
85
822
0
24 Jan 2018
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler
Andrew G. Howard
Menglong Zhu
A. Zhmoginov
Liang-Chieh Chen
129
19,124
0
13 Jan 2018
Grounding Referring Expressions in Images by Variational Context
Hanwang Zhang
Yulei Niu
Shih-Fu Chang
BDL
ObjD
42
220
0
05 Dec 2017
Conditional Image-Text Embedding Networks
Bryan A. Plummer
Paige Kordas
M. Kiapour
Shuai Zheng
Robinson Piramuthu
Svetlana Lazebnik
37
118
0
22 Nov 2017
Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries
Bohan Zhuang
Qi Wu
Chunhua Shen
Ian Reid
Anton Van Den Hengel
ObjD
43
134
0
17 Nov 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
427
129,831
0
12 Jun 2017
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
1.0K
20,692
0
17 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
51
496
0
11 Apr 2017
Comprehension-guided referring expressions
Ruotian Luo
Gregory Shakhnarovich
ObjD
70
171
0
12 Jan 2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions
Licheng Yu
Hao Tan
Joey Tianyi Zhou
Tamara L. Berg
ObjD
73
275
0
30 Dec 2016
YOLO9000: Better, Faster, Stronger
Joseph Redmon
Ali Farhadi
VLM
ObjD
147
15,535
0
25 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
285
3,187
0
02 Dec 2016
Speed/accuracy trade-offs for modern convolutional object detectors
Jonathan Huang
V. Rathod
Chen Sun
Menglong Zhu
Anoop Korattikara Balan
...
Ian S. Fischer
Z. Wojna
Yang Song
S. Guadarrama
Kevin Patrick Murphy
3DH
3DV
80
2,567
0
30 Nov 2016
Modeling Relationships in Referential Expressions with Compositional Modular Networks
Ronghang Hu
Marcus Rohrbach
Jacob Andreas
Trevor Darrell
Kate Saenko
55
402
0
30 Nov 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
95
1,250
0
31 Jul 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
256
1,466
0
06 Jun 2016
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
65
552
0
13 Nov 2015
Grounding of Textual Phrases in Images by Reconstruction
Anna Rohrbach
Marcus Rohrbach
Ronghang Hu
Trevor Darrell
Bernt Schiele
53
497
0
12 Nov 2015
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon
S. Divvala
Ross B. Girshick
Ali Farhadi
ObjD
543
36,643
0
08 Jun 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
396
61,900
0
04 Jun 2015
Microsoft COCO: Common Objects in Context
Nayeon Lee
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
237
43,290
0
01 May 2014
Previous
1
2