1602.07332
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
23 February 2016
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
Joshua Kravitz
Stephanie Chen
Yannis Kalantidis
Li-Jia Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
Papers citing
"Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations"
50 / 1,644 papers shown
Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering
Duy-Kien Nguyen
Takayuki Okatani
84
282
0
03 Apr 2018
Iterative Visual Reasoning Beyond Convolutions
Xinlei Chen
Li-Jia Li
Li Fei-Fei
Abhinav Gupta
LRM
GNN
213
216
0
29 Mar 2018
Referring Relationships
Ranjay Krishna
Ines Chami
Michael S. Bernstein
Li Fei-Fei
102
95
0
28 Mar 2018
Scene Graph Parsing as Dependency Parsing
Yu-Siang Wang
Chenxi Liu
Fangyin Wei
Alan Yuille
GNN
3DV
52
53
0
25 Mar 2018
Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data
Xihui Liu
Hongsheng Li
Jing Shao
Dapeng Chen
Xiaogang Wang
93
133
0
22 Mar 2018
Stacked Cross Attention for Image-Text Matching
Kuang-Huei Lee
Xi Chen
G. Hua
Houdong Hu
Xiaodong He
122
1,163
0
21 Mar 2018
Video Object Segmentation with Language Referring Expressions
Anna Khoreva
Anna Rohrbach
Bernt Schiele
VOS
74
197
0
21 Mar 2018
Learning Unsupervised Visual Grounding Through Semantic Self-Supervision
Syed Ashar Javed
Shreyas Saxena
Vineet Gandhi
SSL
67
25
0
17 Mar 2018
Learning to Segment via Cut-and-Paste
Tal Remez
Jonathan Huang
Matthew A. Brown
89
99
0
16 Mar 2018
Discriminability objective for training descriptive captions
Ruotian Luo
Brian L. Price
Scott D. Cohen
Gregory Shakhnarovich
134
203
0
12 Mar 2018
Neural Aesthetic Image Reviewer
Wenshan Wang
Su Yang
Weishan Zhang
Jiulong Zhang
58
39
0
28 Feb 2018
VizWiz Grand Challenge: Answering Visual Questions from Blind People
Danna Gurari
Qing Li
Abigale Stangl
Anhong Guo
Chi Lin
Kristen Grauman
Jiebo Luo
Jeffrey P. Bigham
CoGe
171
864
0
22 Feb 2018
Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction
Roei Herzig
Moshiko Raboh
Gal Chechik
Jonathan Berant
Amir Globerson
GNN
OCL
101
135
0
15 Feb 2018
Generating Triples with Adversarial Networks for Scene Graph Construction
Matthew Klawonn
Eric Heim
GAN
GNN
63
22
0
07 Feb 2018
VISER: Visual Self-Regularization
Hamid Izadinia
Pierre Garrigues
SSL
75
4
0
07 Feb 2018
Dual Recurrent Attention Units for Visual Question Answering
Ahmed Osman
Wojciech Samek
51
32
0
01 Feb 2018
From BoW to CNN: Two Decades of Texture Representation for Texture Classification
Li Liu
Jie Chen
Paul Fieguth
Guoying Zhao
Rama Chellappa
M. Pietikäinen
3DV
120
335
0
31 Jan 2018
Object-based reasoning in VQA
Mikyas T. Desta
Larry Chen
Tomasz Kornuta
67
33
0
29 Jan 2018
DVQA: Understanding Data Visualizations via Question Answering
Kushal Kafle
Brian L. Price
Scott D. Cohen
Christopher Kanan
AIMat
116
397
0
24 Jan 2018
Grounded Language Understanding for Manipulation Instructions Using GAN-Based Classification
K. Sugiura
Hisashi Kawai
45
7
0
16 Jan 2018
TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays
Xiaosong Wang
Yifan Peng
Le Lu
Zhiyong Lu
Ronald M. Summers
MedIm
76
469
0
12 Jan 2018
Interpretable Counting for Visual Question Answering
Alexander R. Trott
Caiming Xiong
R. Socher
106
71
0
23 Dec 2017
CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication
Jin-Hwa Kim
Nikita Kitaev
Xinlei Chen
Marcus Rohrbach
Byoung-Tak Zhang
Yuandong Tian
Dhruv Batra
Devi Parikh
DiffM
VGen
91
25
0
15 Dec 2017
IQA: Visual Question Answering in Interactive Environments
Daniel Gordon
Aniruddha Kembhavi
Mohammad Rastegari
Joseph Redmon
Dieter Fox
Ali Farhadi
LM&Ro
129
391
0
09 Dec 2017
Incorporating External Knowledge to Answer Open-Domain Visual Questions with Dynamic Memory Networks
Guohao Li
Hang Su
Wenwu Zhu
100
46
0
03 Dec 2017
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
Aishwarya Agrawal
Dhruv Batra
Devi Parikh
Aniruddha Kembhavi
OOD
182
587
0
01 Dec 2017
Relation Networks for Object Detection
Han Hu
Jiayuan Gu
Zheng Zhang
Jifeng Dai
Yichen Wei
ObjD
150
1,230
0
30 Nov 2017
DOTA: A Large-scale Dataset for Object Detection in Aerial Images
Gui-Song Xia
X. Bai
Jian Ding
Zhen Zhu
Serge J. Belongie
Jiebo Luo
Mihai Datcu
Marcello Pelillo
Liangpei Zhang
ObjD
145
2,203
0
28 Nov 2017
Learning to Segment Every Thing
Ronghang Hu
Piotr Dollár
Kaiming He
Trevor Darrell
Ross B. Girshick
ISeg
VLM
95
296
0
28 Nov 2017
Conditional Image-Text Embedding Networks
Bryan A. Plummer
Paige Kordas
M. Kiapour
Shuai Zheng
Robinson Piramuthu
Svetlana Lazebnik
100
118
0
22 Nov 2017
Acquiring Common Sense Spatial Knowledge through Implicit Spatial Templates
Guillem Collell
Luc Van Gool
Marie-Francine Moens
55
42
0
18 Nov 2017
ADVISE: Symbolism and External Knowledge for Decoding Advertisements
Keren Ye
Adriana Kovashka
79
51
0
17 Nov 2017
Neural Motifs: Scene Graph Parsing with Global Context
Rowan Zellers
Mark Yatskar
Sam Thomson
Yejin Choi
GNN
135
1,003
0
17 Nov 2017
Grounding Visual Explanations (Extended Abstract)
Lisa Anne Hendricks
Ronghang Hu
Trevor Darrell
Zeynep Akata
FAtt
59
3
0
17 Nov 2017
Natural Language Guided Visual Relationship Detection
Wentong Liao
Shuai Lin
Bodo Rosenhahn
M. Yang
92
63
0
16 Nov 2017
Investigating Inner Properties of Multimodal Representation and Semantic Compositionality with Brain-based Componential Semantics
Shaonan Wang
Jiajun Zhang
Nan Lin
Chengqing Zong
76
8
0
15 Nov 2017
Learning Multi-Modal Word Representation Grounded in Visual Context
Éloi Zablocki
Benjamin Piwowarski
Laure Soulier
Patrick Gallinari
SSL
74
30
0
09 Nov 2017
Deep Learning from Noisy Image Labels with Quality Embedding
Jiangchao Yao
Jiajie Wang
Ivor Tsang
Ya Zhang
Jun-wei Sun
Chengqi Zhang
Rui Zhang
NoLa
100
121
0
02 Nov 2017
Semantic Image Retrieval via Active Grounding of Visual Situations
Max H. Quinn
E. Conser
Jordan M. Witte
Melanie Mitchell
69
9
0
31 Oct 2017
Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions
Jun Hatori
Yuta Kikuchi
Sosuke Kobayashi
K. Takahashi
Yuta Tsuboi
Y. Unno
W. Ko
Jethro Tan
78
161
0
17 Oct 2017
Natural Language Inference from Multiple Premises
Alice Lai
Yonatan Bisk
Julia Hockenmaier
LRM
83
59
0
09 Oct 2017
Joint Detection and Recounting of Abnormal Events by Learning Deep Generic Knowledge
Ryota Hinami
Tao Mei
Shin'ichi Satoh
258
229
0
26 Sep 2017
Region-Based Image Retrieval Revisited
Ryota Hinami
Yusuke Matsui
Shin'ichi Satoh
23
24
0
26 Sep 2017
Fooling Vision and Language Models Despite Localization and Attention Mechanism
Xiaojun Xu
Xinyun Chen
Chang-rui Liu
Anna Rohrbach
Trevor Darrell
Basel Alomair
AAML
99
41
0
25 Sep 2017
Inferring Generative Model Structure with Static Analysis
P. Varma
Bryan D. He
Payal Bajaj
Imon Banerjee
Nishith Khandwala
D. Rubin
Christopher Ré
83
58
0
07 Sep 2017
Fine-grained Recognition in the Wild: A Multi-Task Domain Adaptation Approach
Timnit Gebru
Judy Hoffman
Li Fei-Fei
98
157
0
07 Sep 2017
Answering Visual-Relational Queries in Web-Extracted Knowledge Graphs
Daniel Oñoro-Rubio
Mathias Niepert
Alberto García-Durán
Roberto Gonzalez
Roberto J. López-Sastre
98
15
0
07 Sep 2017
Automatic Dataset Augmentation
Yalong Bai
Kuiyuan Yang
Tao Mei
Wei-Ying Ma
Tiejun Zhao
3DV
37
2
0
28 Aug 2017
Dynamic Input Structure and Network Assembly for Few-Shot Learning
Nathan Hilliard
Nathan Oken Hodas
Court D. Corley
67
5
0
22 Aug 2017
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
Chuang Gan
Yandong Li
Haoxiang Li
Chen Sun
Boqing Gong
106
127
0
15 Aug 2017