arXiv:2007.08814
Visual Relation Grounding in Videos
17 July 2020
Junbin Xiao
Xindi Shang
Xun Yang
Sheng Tang
Tat-Seng Chua
Papers citing "Visual Relation Grounding in Videos"
35 / 35 papers shown
Title
Scene-Text Grounding for Text-Based Video Question Answering
Sheng Zhou
Junbin Xiao
Xun Yang
Peipei Song
Dan Guo
Angela Yao
Meng Wang
Tat-Seng Chua
199
1
0
22 Sep 2024
Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences
Zhu Zhang
Zhou Zhao
Yang Zhao
Qi Wang
Huasheng Liu
Lianli Gao
60
115
0
19 Jan 2020
Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding
Xuejing Liu
Liang Li
Shuhui Wang
Zhengjun Zha
Dechao Meng
Qingming Huang
ObjD
41
78
0
28 Aug 2019
Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video
Zhenfang Chen
Lin Ma
Wenhan Luo
Kwan-Yee K. Wong
78
103
0
06 Jun 2019
Video Relationship Reasoning using Gated Spatio-Temporal Energy Graph
Yao-Hung Hubert Tsai
S. Divvala
Louis-Philippe Morency
Ruslan Salakhutdinov
Ali Farhadi
51
103
0
25 Mar 2019
Videos as Space-Time Region Graphs
Xiaolong Wang
Abhinav Gupta
83
755
0
05 Jun 2018
R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
Pan Lu
Lei Ji
Wei Zhang
Nan Duan
M. Zhou
Jianyong Wang
CoGe
51
79
0
24 May 2018
Weakly-Supervised Video Object Grounding from Text by Loss Weighting and Object Interaction
Luowei Zhou
Nathan Louis
Jason J. Corso
77
94
0
08 May 2018
Large-Scale Visual Relationship Understanding
Ji Zhang
Yannis Kalantidis
Marcus Rohrbach
Manohar Paluri
Ahmed Elgammal
Mohamed Elhoseiny
45
168
0
27 Apr 2018
Referring Relationships
Ranjay Krishna
Ines Chami
Michael S. Bernstein
Li Fei-Fei
59
94
0
28 Mar 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
97
825
0
24 Jan 2018
Object Referring in Videos with Language and Human Gaze
A. Vasudevan
Dengxin Dai
Luc Van Gool
VOS
53
75
0
04 Jan 2018
Grounding Referring Expressions in Images by Variational Context
Hanwang Zhang
Yulei Niu
Shih-Fu Chang
BDL
ObjD
53
220
0
05 Dec 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
219
7,989
0
22 May 2017
Spatio-temporal Person Retrieval via Natural Language Queries
Masataka Yamaguchi
Kuniaki Saito
Yoshitaka Ushiku
Tatsuya Harada
64
58
0
26 Apr 2017
Modeling Relationships in Referential Expressions with Compositional Modular Networks
Ronghang Hu
Marcus Rohrbach
Jacob Andreas
Trevor Darrell
Kate Saenko
73
406
0
30 Nov 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
125
1,261
0
31 Jul 2016
Visual Relationship Detection with Language Priors
Cewu Lu
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
VLM
73
1,138
0
31 Jul 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li-Jia Li
David A. Shamma
Michael S. Bernstein
Li Fei-Fei
194
5,726
0
23 Feb 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
1.9K
193,426
0
10 Dec 2015
Structural-RNN: Deep Learning on Spatio-Temporal Graphs
Ashesh Jain
Amir Zamir
Silvio Savarese
Ashutosh Saxena
GNN
128
1,090
0
17 Nov 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
87
552
0
13 Nov 2015
Grounding of Textual Phrases in Images by Reconstruction
Anna Rohrbach
Marcus Rohrbach
Ronghang Hu
Trevor Darrell
Bernt Schiele
73
496
0
12 Nov 2015
Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning
Pingbo Pan
Zhongwen Xu
Yi Yang
Fei Wu
Yueting Zhuang
43
385
0
11 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
112
1,345
0
07 Nov 2015
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
467
62,122
0
04 Jun 2015
Sequence to Sequence -- Video to Text
Subhashini Venugopalan
Marcus Rohrbach
Jeff Donahue
Raymond J. Mooney
Trevor Darrell
Kate Saenko
116
1,417
0
03 May 2015
Beyond Short Snippets: Deep Networks for Video Classification
Joe Yue-Hei Ng
Matthew J. Hausknecht
Sudheendra Vijayanarasimhan
Oriol Vinyals
R. Monga
G. Toderici
127
2,336
0
31 Mar 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.5K
149,842
0
22 Dec 2014
Deep Visual-Semantic Alignments for Generating Image Descriptions
A. Karpathy
Li Fei-Fei
95
5,578
0
07 Dec 2014
Finding Action Tubes
Georgia Gkioxari
Jitendra Malik
54
598
0
21 Nov 2014
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.5K
39,472
0
01 Sep 2014
Deep Fragment Embeddings for Bidirectional Image Sentence Mapping
A. Karpathy
Armand Joulin
Li Fei-Fei
VLM
85
936
0
22 Jun 2014
Two-Stream Convolutional Networks for Action Recognition in Videos
Karen Simonyan
Andrew Zisserman
237
7,526
0
09 Jun 2014
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
373
43,524
0
01 May 2014