1602.07332
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
23 February 2016
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
Joshua Kravitz
Stephanie Chen
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
Papers citing "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations"
50 / 1,650 papers shown
Title
Answer Them All! Toward Universal Visual Question Answering Models
Robik Shrestha
Kushal Kafle
Christopher Kanan
88
83
0
01 Mar 2019
Differentiable Scene Graphs
Moshiko Raboh
Roei Herzig
Gal Chechik
Jonathan Berant
Amir Globerson
OCL
99
34
0
26 Feb 2019
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering
Drew A. Hudson
Christopher D. Manning
CoGe
NAI
87
138
0
25 Feb 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
88
277
0
25 Feb 2019
Dual Attention Networks for Visual Reference Resolution in Visual Dialog
Gi-Cheon Kang
Jaeseo Lim
Byoung-Tak Zhang
56
73
0
25 Feb 2019
Deeply Supervised Multimodal Attentional Translation Embeddings for Visual Relationship Detection
N. Gkanatsios
Vassilis Pitsikalis
Petros Koutras
Athanasia Zlatintsi
Petros Maragos
74
18
0
15 Feb 2019
Cycle-Consistency for Robust Visual Question Answering
Meet Shah
Xinlei Chen
Marcus Rohrbach
Devi Parikh
OOD
85
190
0
15 Feb 2019
Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog
Zhe Gan
Yu Cheng
Ahmed El Kholy
Linjie Li
Jingjing Liu
Jianfeng Gao
108
105
0
01 Feb 2019
VrR-VG: Refocusing Visually-Relevant Relationships
Yuanzhi Liang
Yalong Bai
Wei Zhang
Xueming Qian
Li Zhu
Tao Mei
3DH
136
8
0
01 Feb 2019
BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection
H. Ben-younes
Rémi Cadène
Nicolas Thome
Matthieu Cord
64
218
0
31 Jan 2019
Adversarial Adaptation of Scene Graph Models for Understanding Civic Issues
Shanu Kumar
Shubham Atreja
Anjali Singh
Mohit Jain
60
12
0
29 Jan 2019
Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey
W. Zhang
Quan Z. Sheng
A. Alhazmi
Chenliang Li
AAML
125
57
0
21 Jan 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
127
327
0
20 Jan 2019
Evaluating Text-to-Image Matching using Binary Image Selection (BISON)
Hexiang Hu
Ishan Misra
Laurens van der Maaten
89
22
0
19 Jan 2019
Red blood cell image generation for data augmentation using Conditional Generative Adversarial Networks
Oleksandr Bailo
D. Ham
Y. Shin
GAN
MedIm
80
60
0
18 Jan 2019
Using Scene Graph Context to Improve Image Generation
Subarna Tripathi
Anahita Bhiwandiwalla
A. Bastidas
Hanlin Tang
GNN
102
32
0
11 Jan 2019
Hierarchical LSTMs with Adaptive Attention for Visual Captioning
Jingkuan Song
Xiangpeng Li
Lianli Gao
Heng Tao Shen
104
223
0
26 Dec 2018
Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering
Zhuoqian Yang
Zengchang Qin
Jing Yu
Yue Hu
GNN
80
16
0
23 Dec 2018
An Empirical Analysis of Deep Audio-Visual Models for Speech Recognition
Devesh Walawalkar
Yihui He
R. Pillai
50
1
0
21 Dec 2018
nocaps: novel object captioning at scale
Harsh Agrawal
Karan Desai
Yufei Wang
Xinlei Chen
Rishabh Jain
Mark Johnson
Dhruv Batra
Devi Parikh
Stefan Lee
Peter Anderson
VLM
148
488
0
20 Dec 2018
Grounded Video Description
Luowei Zhou
Yannis Kalantidis
Xinlei Chen
Jason J. Corso
Marcus Rohrbach
92
193
0
17 Dec 2018
Detecting unseen visual relations using analogies
Julia Peyre
Ivan Laptev
Cordelia Schmid
Josef Sivic
58
18
0
13 Dec 2018
Adversarial Inference for Multi-Sentence Video Description
J. S. Park
Marcus Rohrbach
Trevor Darrell
Anna Rohrbach
81
80
0
13 Dec 2018
Visual Social Relationship Recognition
Junnan Li
Yongkang Wong
Qi Zhao
Mohan Kankanhalli
57
27
0
13 Dec 2018
Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering
Peng Gao
Zhengkai Jiang
Haoxuan You
Pan Lu
Steven C. H. Hoi
Xiaogang Wang
Hongsheng Li
AIMat
106
368
0
13 Dec 2018
Long-Term Feature Banks for Detailed Video Understanding
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
239
481
0
12 Dec 2018
Attend More Times for Image Captioning
Jiajun Du
Yu Qin
Hongtao Lu
Yonghua Zhang
VLM
66
5
0
08 Dec 2018
Recursive Visual Attention in Visual Dialog
Yulei Niu
Hanwang Zhang
Manli Zhang
Jianhong Zhang
Zhiwu Lu
Ji-Rong Wen
109
119
0
06 Dec 2018
Auto-Encoding Scene Graphs for Image Captioning
Xu Yang
Kaihua Tang
Hanwang Zhang
Jianfei Cai
182
704
0
06 Dec 2018
Counterfactual Critic Multi-Agent Training for Scene Graph Generation
Long Chen
Hanwang Zhang
Jun Xiao
Xiangnan He
Shiliang Pu
Shih-Fu Chang
100
159
0
06 Dec 2018
Learning to Compose Dynamic Tree Structures for Visual Contexts
Kaihua Tang
Hanwang Zhang
Baoyuan Wu
Wenhan Luo
Wen Liu
108
505
0
05 Dec 2018
Visual Question Answering as Reading Comprehension
Hui Li
Peng Wang
Chunhua Shen
Anton Van Den Hengel
62
41
0
29 Nov 2018
Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding
Hassan Akbari
Svebor Karaman
Surabhi Bhargava
Brian Chen
Carl Vondrick
Shih-Fu Chang
66
83
0
28 Nov 2018
Image Generation from Layout
Bo Zhao
Lili Meng
Weidong Yin
Leonid Sigal
106
210
0
28 Nov 2018
From Recognition to Cognition: Visual Commonsense Reasoning
Rowan Zellers
Yonatan Bisk
Ali Farhadi
Yejin Choi
LRM
BDL
OCL
ReLM
231
885
0
27 Nov 2018
Attentive Relational Networks for Mapping Images to Scene Graphs
Mengshi Qi
Weijian Li
Zhengyuan Yang
Yunhong Wang
Jiebo Luo
3DPC
3DH
GNN
84
169
0
26 Nov 2018
Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-to-Image Translation
Matteo Tomei
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
DiffM
155
77
0
26 Nov 2018
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
DiffM
109
176
0
26 Nov 2018
What and Where: A Context-based Recommendation System for Object Insertion
Song-Hai Zhang
Zhengping Zhou
Bin Liu
Xin Dong
Dun Liang
P. Hall
Shimin Hu
VLM
79
23
0
24 Nov 2018
An Interpretable Model for Scene Graph Generation
Ji Zhang
Kevin J. Shih
Andrew Tao
Bryan Catanzaro
Ahmed Elgammal
GNN
66
22
0
21 Nov 2018
VQA with no questions-answers training
B. Vatashsky
S. Ullman
108
13
0
20 Nov 2018
Scene Graph Generation via Conditional Random Fields
Weilin Cong
Wenjie Wang
Wang-Chien Lee
GNN
79
22
0
20 Nov 2018
Explicit Bias Discovery in Visual Question Answering Models
Varun Manjunatha
Nirat Saini
L. Davis
CML
FAtt
69
93
0
19 Nov 2018
Intention Oriented Image Captions with Guiding Objects
Yue Zheng
Yali Li
Shengjin Wang
62
55
0
19 Nov 2018
SEIGAN: Towards Compositional Image Generation by Simultaneously Learning to Segment, Enhance, and Inpaint
Pavel Ostyakov
Roman Suvorov
Elizaveta Logacheva
Oleg Khomenko
Sergey I. Nikolenko
GAN
71
23
0
19 Nov 2018
Integrating domain knowledge: using hierarchies to improve deep classifiers
C. Brust
Joachim Denzler
65
39
0
17 Nov 2018
Not just a matter of semantics: the relationship between visual similarity and semantic similarity
C. Brust
Joachim Denzler
50
9
0
17 Nov 2018
Exploiting Class Learnability in Noisy Data
Matthew Klawonn
Eric Heim
James A. Hendler
NoLa
61
7
0
15 Nov 2018
LinkNet: Relational Embedding for Scene Graph
Sanghyun Woo
Dahun Kim
Donghyeon Cho
In So Kweon
GNN
66
147
0
15 Nov 2018
No-Frills Human-Object Interaction Detection: Factorization, Layout Encodings, and Training Techniques
Tanmay Gupta
Alex Schwing
Derek Hoiem
62
24
0
14 Nov 2018