Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2206.06619
Cited By
TransVG++: End-to-End Visual Grounding with Language Conditioned Vision Transformer
14 June 2022
Jiajun Deng
Zhengyuan Yang
Daqing Liu
Tianlang Chen
Wen-gang Zhou
Yanyong Zhang
Houqiang Li
Wanli Ouyang
ViT
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"TransVG++: End-to-End Visual Grounding with Language Conditioned Vision Transformer"
30 / 80 papers shown
Title
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,114
0
11 Oct 2018
Universal Transformers
Mostafa Dehghani
Stephan Gouws
Oriol Vinyals
Jakob Uszkoreit
Lukasz Kaiser
87
755
0
10 Jul 2018
Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding
Zhou Yu
Jun-chen Yu
Chenchao Xiang
Zhou Zhao
Q. Tian
Dacheng Tao
ObjD
60
139
0
09 May 2018
YOLOv3: An Incremental Improvement
Joseph Redmon
Ali Farhadi
ObjD
126
21,470
0
08 Apr 2018
Image Transformer
Niki Parmar
Ashish Vaswani
Jakob Uszkoreit
Lukasz Kaiser
Noam M. Shazeer
Alexander Ku
Dustin Tran
ViT
138
1,684
0
15 Feb 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
109
831
0
24 Jan 2018
Grounding Referring Expressions in Images by Variational Context
Hanwang Zhang
Yulei Niu
Shih-Fu Chang
BDL
ObjD
61
222
0
05 Dec 2017
Conditional Image-Text Embedding Networks
Bryan A. Plummer
Paige Kordas
M. Kiapour
Shuai Zheng
Robinson Piramuthu
Svetlana Lazebnik
52
118
0
22 Nov 2017
Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries
Bohan Zhuang
Qi Wu
Chunhua Shen
Ian Reid
Anton Van Den Hengel
ObjD
57
134
0
17 Nov 2017
Query-guided Regression Network with Context Policy for Phrase Grounding
Kan Chen
Rama Kovvuri
Ram Nevatia
65
142
0
04 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
121
4,220
0
25 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
728
132,199
0
12 Jun 2017
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Y. Zhang
Luyao Yuan
Yijie Guo
Zhiyuan He
I-An Huang
Honglak Lee
ObjD
64
57
0
12 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
73
498
0
11 Apr 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
352
27,230
0
20 Mar 2017
Modeling Relationships in Referential Expressions with Compositional Modular Networks
Ronghang Hu
Marcus Rohrbach
Jacob Andreas
Trevor Darrell
Kate Saenko
82
406
0
30 Nov 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
129
1,273
0
31 Jul 2016
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
416
10,526
0
21 Jul 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,322
0
10 Dec 2015
Learning Deep Structure-Preserving Image-Text Embeddings
Liwei Wang
Yin Li
Svetlana Lazebnik
83
782
0
19 Nov 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
97
554
0
13 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
128
1,357
0
07 Nov 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
520
62,360
0
04 Jun 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
202
2,071
0
19 May 2015
Fast R-CNN
Ross B. Girshick
ObjD
309
25,081
0
30 Apr 2015
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
Kai Sheng Tai
R. Socher
Christopher D. Manning
AIMat
142
3,122
0
28 Feb 2015
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
Liang-Chieh Chen
George Papandreou
Iasonas Kokkinos
Kevin Patrick Murphy
Alan Yuille
SSeg
196
4,896
0
22 Dec 2014
Microsoft COCO: Common Objects in Context
Nayeon Lee
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
419
43,777
0
01 May 2014
High-Speed Tracking with Kernelized Correlation Filters
João F. Henriques
Rui Caseiro
P. Martins
Jorge Batista
89
5,138
0
30 Apr 2014
Rich feature hierarchies for accurate object detection and semantic segmentation
Ross B. Girshick
Jeff Donahue
Trevor Darrell
Jitendra Malik
ObjD
289
26,211
0
11 Nov 2013
Previous
1
2