Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2103.10191
Cited By
Decoupled Spatial Temporal Graphs for Generic Visual Grounding
18 March 2021
Qi Feng
Yunchao Wei
Mingming Cheng
Yi Yang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Decoupled Spatial Temporal Graphs for Generic Visual Grounding"
12 / 12 papers shown
Title
Linguistic Structure Guided Context Modeling for Referring Image Segmentation
Tianrui Hui
Si Liu
Shaofei Huang
Guanbin Li
Sansi Yu
Faxi Zhang
Jizhong Han
58
150
0
01 Oct 2020
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
113
12,007
0
13 Nov 2019
Dynamic Graph Attention for Referring Expression Comprehension
Sibei Yang
Guanbin Li
Yizhou Yu
OCL
54
216
0
18 Sep 2019
Localizing Moments in Video with Natural Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
91
940
0
04 Aug 2017
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
Chunhui Gu
Chen Sun
David A. Ross
Carl Vondrick
C. Pantofaru
...
G. Toderici
Susanna Ricco
Rahul Sukthankar
Cordelia Schmid
Jitendra Malik
VGen
87
1,021
0
23 May 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
199
7,961
0
22 May 2017
TALL: Temporal Activity Localization via Language Query
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia
105
813
0
05 May 2017
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Gunnar Sigurdsson
Gül Varol
Xinyu Wang
Ali Farhadi
Ivan Laptev
Abhinav Gupta
VGen
77
1,238
0
06 Apr 2016
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
Djork-Arné Clevert
Thomas Unterthiner
Sepp Hochreiter
211
5,502
0
23 Nov 2015
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
93
1,875
0
07 Nov 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
177
2,033
0
19 May 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
286
10,034
0
10 Feb 2015
1