arXiv:1602.07332
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
23 February 2016
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
Joshua Kravitz
Stephanie Chen
Yannis Kalantidis
Li-Jia Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
Papers citing "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations"
50 of 1,644 papers shown
Situation Recognition with Graph Neural Networks
Ruiyu Li
Makarand Tapaswi
Renjie Liao
Jiaya Jia
R. Urtasun
Sanja Fidler
GNN
70
132
0
14 Aug 2017
Deep Object-Centric Representations for Generalizable Robot Learning
Coline Devin
Pieter Abbeel
Trevor Darrell
Sergey Levine
SSL
OCL
111
108
0
14 Aug 2017
Beyond Bilinear: Generalized Multimodal Factorized High-order Pooling for Visual Question Answering
Zhou Yu
Jun-chen Yu
Chenchao Xiang
Jianping Fan
Dacheng Tao
101
462
0
10 Aug 2017
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
Damien Teney
Peter Anderson
Xiaodong He
Anton Van Den Hengel
132
383
0
09 Aug 2017
Structured Attentions for Visual Question Answering
Chen Zhu
Yanpeng Zhao
Shuaiyi Huang
Kewei Tu
Yi-An Ma
FAtt
87
107
0
07 Aug 2017
Identity-Aware Textual-Visual Matching with Latent Co-attention
Shuang Li
Tong Xiao
Hongsheng Li
Wei Yang
Xiaogang Wang
103
230
0
07 Aug 2017
PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN
Hanwang Zhang
Zawlin Kyaw
Jinyang Yu
Shih-Fu Chang
67
141
0
07 Aug 2017
Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering
Zhou Yu
Jun-chen Yu
Jianping Fan
Dacheng Tao
82
669
0
04 Aug 2017
Dual-Glance Model for Deciphering Social Relationships
Junnan Li
Yongkang Wong
Qi Zhao
Mohan Kankanhalli
62
81
0
02 Aug 2017
Scene Graph Generation from Objects, Phrases and Region Captions
Yikang Li
Wanli Ouyang
Bolei Zhou
Kun Wang
Xiaogang Wang
116
505
0
31 Jul 2017
Weakly-supervised learning of visual relations
Julia Peyre
Ivan Laptev
Cordelia Schmid
Josef Sivic
80
195
0
29 Jul 2017
Visual Relationship Detection with Internal and External Linguistic Knowledge Distillation
Ruichi Yu
Ang Li
Vlad I. Morariu
L. Davis
64
312
0
28 Jul 2017
Video Highlight Prediction Using Audience Chat Reactions
Cheng-Yang Fu
Joon Lee
Joey Tianyi Zhou
Alexander C. Berg
62
37
0
26 Jul 2017
SPEECH-COCO: 600k Visually Grounded Spoken Captions Aligned to MSCOCO Data Set
William N. Havard
Laurent Besacier
O. Rosec
87
28
0
26 Jul 2017
Deep Interactive Region Segmentation and Captioning
Ali Sharifi Boroujerdi
M. Khanian
M. Breuß
55
7
0
26 Jul 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
230
4,231
0
25 Jul 2017
OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts
Xuwang Yin
Vicente Ordonez
VLM
100
55
0
22 Jul 2017
Few-Example Object Detection with Model Communication
Xuanyi Dong
Liang Zheng
Fan Ma
Yi Yang
Deyu Meng
ObjD
VLM
266
160
0
26 Jun 2017
Pixels to Graphs by Associative Embedding
Alejandro Newell
Jia Deng
GNN
VOS
101
232
0
22 Jun 2017
Where and Who? Automatic Semantic-Aware Person Composition
Fuwen Tan
Crispin Bernier
Benjamin Cohen
Vicente Ordonez
Connelly Barnes
3DH
85
51
0
04 Jun 2017
See, Hear, and Read: Deep Aligned Representations
Y. Aytar
Carl Vondrick
Antonio Torralba
VLM
AI4TS
105
136
0
03 Jun 2017
Care about you: towards large-scale human-centric visual relationship detection
Bohan Zhuang
Qi Wu
Chunhua Shen
Ian Reid
Anton Van Den Hengel
52
21
0
28 May 2017
Logic Tensor Networks for Semantic Image Interpretation
Ivan Donadello
Luciano Serafini
Artur Garcez
110
211
0
24 May 2017
Learning Convolutional Text Representations for Visual Question Answering
Zhengyang Wang
Shuiwang Ji
FAtt
71
15
0
18 May 2017
MUTAN: Multimodal Tucker Fusion for Visual Question Answering
H. Ben-younes
Rémi Cadène
Matthieu Cord
Nicolas Thome
171
584
0
18 May 2017
Inferring and Executing Programs for Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Judy Hoffman
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
NAI
124
545
0
10 May 2017
Weakly-supervised Visual Grounding of Phrases with Linguistic Structures
Fanyi Xiao
Leonid Sigal
Yong Jae Lee
87
139
0
03 May 2017
STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset
Yuya Yoshikawa
Yutaro Shigeto
A. Takeuchi
3DV
69
118
0
02 May 2017
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
200
1,257
0
02 May 2017
The Promise of Premise: Harnessing Question Premises in Visual Question Answering
Aroma Mahendru
Viraj Prabhu
Akrit Mohapatra
Dhruv Batra
Stefan Lee
NAI
108
38
0
01 May 2017
C-VQA: A Compositional Split of the Visual Question Answering (VQA) v1.0 Dataset
Aishwarya Agrawal
Aniruddha Kembhavi
Dhruv Batra
Devi Parikh
CoGe
70
80
0
26 Apr 2017
An Analysis of Action Recognition Datasets for Language and Vision Tasks
Spandana Gella
Frank Keller
ObjD
48
11
0
24 Apr 2017
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
Wei-Lun Chao
Hexiang Hu
Fei Sha
89
37
0
24 Apr 2017
Spatial Memory for Context Reasoning in Object Detection
Xinlei Chen
Abhinav Gupta
ObjD
101
166
0
13 Apr 2017
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Y. Zhang
Luyao Yuan
Yijie Guo
Zhiyuan He
I-An Huang
Honglak Lee
ObjD
92
57
0
12 Apr 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li-Jia Li
65
324
0
12 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
Liwei Wang
Yin Li
Jing-ling Huang
Svetlana Lazebnik
VLM
110
498
0
11 Apr 2017
Detecting Visual Relationships with Deep Relational Networks
Bo Dai
Yuqi Zhang
Dahua Lin
GNN
104
504
0
11 Apr 2017
An Analysis of Visual Question Answering Algorithms
Kushal Kafle
Christopher Kanan
87
234
0
28 Mar 2017
Recurrent Topic-Transition GAN for Visual Paragraph Generation
Xiaodan Liang
Zhiting Hu
Huatian Zhang
Chuang Gan
Eric Xing
GAN
85
203
0
21 Mar 2017
Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection
Xiaodan Liang
Lisa Lee
Eric Xing
88
252
0
08 Mar 2017
Visual Translation Embedding Network for Visual Relation Detection
Hanwang Zhang
Zawlin Kyaw
Shih-Fu Chang
Tat-Seng Chua
ViT
249
563
0
27 Feb 2017
On the Origin of Deep Learning
Haohan Wang
Bhiksha Raj
MedIm
3DV
VLM
145
225
0
24 Feb 2017
ViP-CNN: Visual Phrase Guided Convolutional Neural Network
Yikang Li
Wanli Ouyang
Xiaogang Wang
Xiaoou Tang
ObjD
69
48
0
23 Feb 2017
Person Search with Natural Language Description
Shuang Li
Tong Xiao
Hongsheng Li
Bolei Zhou
Dayu Yue
Xiaogang Wang
105
396
0
19 Feb 2017
Learning to Detect Human-Object Interactions
Yu-Wei Chao
Yunfan Liu
Michael Xieyang Liu
Huayi Zeng
Jia Deng
78
512
0
17 Feb 2017
Scene Graph Generation by Iterative Message Passing
Danfei Xu
Yuke Zhu
Chris Choy
Li Fei-Fei
GNN
3DV
150
1,228
0
10 Jan 2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions
Licheng Yu
Hao Tan
Joey Tianyi Zhou
Tamara L. Berg
ObjD
98
275
0
30 Dec 2016
Learning Visual N-Grams from Web Data
Ang Li
Allan Jabri
Armand Joulin
Laurens van der Maaten
VLM
85
138
0
29 Dec 2016
Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task
Nan Ding
Sebastian Goodman
Fei Sha
Radu Soricut
VLM
80
9
0
22 Dec 2016