1602.07332
Cited By
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
23 February 2016
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
Joshua Kravitz
Stephanie Chen
Yannis Kalantidis
Li-Jia Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
Papers citing "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations"
50 / 1,650 papers shown
Title
Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation
Wenbin Wang
Ruiping Wang
Shiguang Shan
Xilin Chen
3DH
102
53
0
17 Jul 2020
Detecting Human-Object Interactions with Action Co-occurrence Priors
Dong-Jin Kim
Xiao Sun
Jinsoo Choi
Stephen Lin
In So Kweon
76
125
0
17 Jul 2020
RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval
Hung-Yu Tseng
Hsin-Ying Lee
Lu Jiang
Ming-Hsuan Yang
Weilong Yang
DiffM
3DV
159
54
0
16 Jul 2020
Explore and Explain: Self-supervised Navigation and Recounting
Roberto Bigazzi
Federico Landi
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
EgoV
LM&Ro
78
17
0
14 Jul 2020
Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder
K. Gouthaman
Anurag Mittal
98
79
0
13 Jul 2020
Generative Compositional Augmentations for Scene Graph Prediction
Boris Knyazev
H. D. Vries
Cătălina Cangea
Graham W. Taylor
Aaron Courville
Eugene Belilovsky
106
26
0
11 Jul 2020
Image Captioning with Compositional Neural Module Networks
Junjiao Tian
Jean Oh
44
11
0
10 Jul 2020
IQ-VQA: Intelligent Visual Question Answering
Vatsal Goel
Mohit Chandak
A. Anand
Prithwijit Guha
64
5
0
08 Jul 2020
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers
Shijie Geng
Peng Gao
Moitreya Chatterjee
Chiori Hori
Jonathan Le Roux
Yongfeng Zhang
Hongsheng Li
A. Cherian
101
11
0
08 Jul 2020
Modality Shifting Attention Network for Multi-modal Video Question Answering
Junyeong Kim
Minuk Ma
T. Pham
Kyungsu Kim
Chang D. Yoo
84
72
0
04 Jul 2020
Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation
Liwei Wang
Jing-ling Huang
Yin Li
Kun Xu
Zhengyuan Yang
Dong Yu
ObjD
74
84
0
03 Jul 2020
DocVQA: A Dataset for VQA on Document Images
Minesh Mathew
Dimosthenis Karatzas
C. V. Jawahar
169
748
0
01 Jul 2020
ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph
Fei Yu
Jiji Tang
Weichong Yin
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
128
382
0
30 Jun 2020
Learning Physical Graph Representations from Visual Scenes
Daniel M. Bear
Chaofei Fan
Damian Mrowca
Yunzhu Li
S. Alter
...
Jeremy Schwartz
Li Fei-Fei
Jiajun Wu
J. Tenenbaum
Daniel L. K. Yamins
SSL
GNN
SSeg
AI4CE
108
79
0
22 Jun 2020
Improving Image Captioning with Better Use of Captions
Zhan Shi
Xu Zhou
Xipeng Qiu
Xiao-Dan Zhu
66
128
0
21 Jun 2020
Learning Visual Commonsense for Robust Scene Graph Generation
Alireza Zareian
Zhecan Wang
Haoxuan You
Shih-Fu Chang
102
311
0
17 Jun 2020
Modeling Graph Structure via Relative Position for Text Generation from Knowledge Graphs
Martin Schmitt
Leonardo F. R. Ribeiro
Philipp Dufter
Iryna Gurevych
Hinrich Schütze
GNN
50
8
0
16 Jun 2020
Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
Zihao Zhu
Jiahao Yu
Yujing Wang
Yajing Sun
Yue Hu
Qi Wu
103
129
0
16 Jun 2020
Exploiting Visual Semantic Reasoning for Video-Text Retrieval
Zerun Feng
Zhimin Zeng
Caili Guo
Zheng Li
79
36
0
16 Jun 2020
Generative 3D Part Assembly via Dynamic Graph Learning
Jialei Huang
Guanqi Zhan
Qingnan Fan
Kaichun Mo
Lin Shao
Baoquan Chen
Leonidas Guibas
Hao Dong
126
89
0
14 Jun 2020
Learning from the Scene and Borrowing from the Rich: Tackling the Long Tail in Scene Graph Generation
Tao He
Lianli Gao
Jingkuan Song
Jianfei Cai
Yuan-Fang Li
80
31
0
13 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
173
437
0
11 Jun 2020
Exploring Weaknesses of VQA Models through Attribution Driven Insights
Shaunak Halbe
37
2
0
11 Jun 2020
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
Zhe Gan
Yen-Chun Chen
Linjie Li
Chen Zhu
Yu Cheng
Jingjing Liu
ObjD
VLM
133
501
0
11 Jun 2020
Roses Are Red, Violets Are Blue... but Should VQA Expect Them To?
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
OOD
88
90
0
09 Jun 2020
Counterfactual VQA: A Cause-Effect Look at Language Bias
Yulei Niu
Kaihua Tang
Hanwang Zhang
Zhiwu Lu
Xiansheng Hua
Ji-Rong Wen
CML
147
403
0
08 Jun 2020
Pick-Object-Attack: Type-Specific Adversarial Attack for Object Detection
Omid Mohamad Nezami
Akshay Chaturvedi
Mark Dras
Utpal Garain
AAML
ObjD
61
19
0
05 Jun 2020
Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge
Peng Wang
Dongyang Liu
Hui Li
Qi Wu
ObjD
70
19
0
02 Jun 2020
Structured Multimodal Attentions for TextVQA
Chenyu Gao
Qi Zhu
Peng Wang
Hui Li
Yuliang Liu
Anton Van Den Hengel
Qi Wu
99
60
0
01 Jun 2020
FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval
D. Gao
Linbo Jin
Ben Chen
Minghui Qiu
Peng Li
Yi Wei
Yitao Hu
Haozhe Jasper Wang
OOD
84
134
0
20 May 2020
Graph Density-Aware Losses for Novel Compositions in Scene Graph Generation
Boris Knyazev
H. D. Vries
Cătălina Cangea
Graham W. Taylor
Aaron Courville
Eugene Belilovsky
74
56
0
17 May 2020
Visual Relationship Detection using Scene Graphs: A Survey
Aniket Agarwal
Ayush Mangal
Vipul
GNN
70
21
0
16 May 2020
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
Jize Cao
Zhe Gan
Yu Cheng
Licheng Yu
Yen-Chun Chen
Jingjing Liu
VLM
115
130
0
15 May 2020
Cross-Modality Relevance for Reasoning on Language and Vision
Chen Zheng
Quan Guo
Parisa Kordjamshidi
LRM
88
36
0
12 May 2020
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
Douwe Kiela
Hamed Firooz
Aravind Mohan
Vedanuj Goswami
Amanpreet Singh
Pratik Ringshia
Davide Testuggine
109
612
0
10 May 2020
History for Visual Dialog: Do we really need it?
Shubham Agarwal
Trung Bui
Joon-Young Lee
Ioannis Konstas
Verena Rieser
VLM
38
71
0
08 May 2020
Diagnosing the Environment Bias in Vision-and-Language Navigation
Yubo Zhang
Hao Tan
Joey Tianyi Zhou
73
57
0
06 May 2020
What are the Goals of Distributional Semantics?
Guy Edward Toh Emerson
91
26
0
06 May 2020
Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions
Arjun Reddy Akula
Spandana Gella
Yaser Al-Onaizan
Song-Chun Zhu
Siva Reddy
ObjD
69
52
0
04 May 2020
Visually Grounded Continual Learning of Compositional Phrases
Xisen Jin
Junyi Du
Arka Sadhu
Ram Nevatia
Xiang Ren
CLL
61
4
0
02 May 2020
Obtaining Faithful Interpretations from Compositional Neural Networks
Sanjay Subramanian
Ben Bogin
Nitish Gupta
Tomer Wolfson
Sameer Singh
Jonathan Berant
Matt Gardner
75
42
0
02 May 2020
Probing Contextual Language Models for Common Ground with Visual Representations
Gabriel Ilharco
Rowan Zellers
Ali Farhadi
Hannaneh Hajishirzi
118
14
0
01 May 2020
Improving Vision-and-Language Navigation with Image-Text Pairs from the Web
Arjun Majumdar
Ayush Shrivastava
Stefan Lee
Peter Anderson
Devi Parikh
Dhruv Batra
LM&Ro
196
236
0
30 Apr 2020
Image Captioning through Image Transformer
Sen He
Wentong Liao
Hamed R. Tavakoli
M. Yang
Bodo Rosenhahn
N. Pugeault
ViT
95
94
0
29 Apr 2020
Unifying Neural Learning and Symbolic Reasoning for Spinal Medical Report Generation
Zhongyi Han
B. Wei
Yilong Yin
Shuo Li
MedIm
59
42
0
28 Apr 2020
VD-BERT: A Unified Vision and Dialog Transformer with BERT
Yue Wang
Shafiq Joty
Michael R. Lyu
Irwin King
Caiming Xiong
Guosheng Lin
114
104
0
28 Apr 2020
Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset
Menglin Jia
Mengyun Shi
Mikhail Sirotenko
Huayu Chen
Claire Cardie
B. Hariharan
Hartwig Adam
Serge J. Belongie
93
97
0
26 Apr 2020
Deep Multimodal Neural Architecture Search
Zhou Yu
Yuhao Cui
Jun-chen Yu
Meng Wang
Dacheng Tao
Qi Tian
70
100
0
25 Apr 2020
MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond
Duy-Kien Nguyen
Vedanuj Goswami
Xinlei Chen
71
23
0
24 Apr 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
124
361
0
21 Apr 2020