arXiv:1602.07332
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
23 February 2016
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
Joshua Kravitz
Stephanie Chen
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
Papers citing
"Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations"
50 / 1,647 papers shown
X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers
Jaemin Cho
Jiasen Lu
Dustin Schwenk
Hannaneh Hajishirzi
Aniruddha Kembhavi
VLM
MLLM
95
102
0
23 Sep 2020
Multiple interaction learning with question-type prior knowledge for constraining answer search space in visual question answering
Tuong Khanh Long Do
Binh X. Nguyen
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
Thanh-Toan Do
40
2
0
23 Sep 2020
ALICE: Active Learning with Contrastive Natural Language Explanations
Weixin Liang
James Zou
Zhou Yu
VLM
110
51
0
22 Sep 2020
Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval
Andrés Mafla
S. Dey
Ali Furkan Biten
Lluís Gómez
Dimosthenis Karatzas
80
25
0
21 Sep 2020
Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition
Tianshui Chen
Liang Lin
Riquan Chen
X. Hui
Hefeng Wu
96
157
0
20 Sep 2020
CLEVR Parser: A Graph Parser Library for Geometric Learning on Language Grounded Image Scenes
Raeid Saqur
Ameet Deshpande
GNN
NAI
22
0
0
19 Sep 2020
CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation
Jiahao Yu
Yuan Chai
Yujing Wang
Yue Hu
Qi Wu
CML
117
114
0
16 Sep 2020
Simultaneous Machine Translation with Visual Context
Ozan Caglayan
Julia Ive
Veneta Haralampieva
Pranava Madhyastha
Loïc Barrault
Lucia Specia
45
30
0
15 Sep 2020
Exploring the Hierarchy in Relation Labels for Scene Graph Generation
Yi Zhou
Shuyang Sun
Chao Zhang
Yikang Li
Wanli Ouyang
66
7
0
12 Sep 2020
Denoising Large-Scale Image Captioning from Alt-text Data using Content Selection Models
Khyathi Chandu
Piyush Sharma
Soravit Changpinyo
Ashish V. Thapliyal
Radu Soricut
DiffM
VLM
84
3
0
10 Sep 2020
Multi-Task Learning with Deep Neural Networks: A Survey
M. Crawshaw
CVBM
220
627
0
10 Sep 2020
Visual Relationship Detection with Visual-Linguistic Knowledge from Multimodal Representations
Meng-Jiun Chiou
Roger Zimmermann
Jiashi Feng
109
1
0
10 Sep 2020
Towards Unique and Informative Captioning of Images
Zeyu Wang
Berthy Feng
Karthik Narasimhan
Olga Russakovsky
69
37
0
08 Sep 2020
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports
Yikuan Li
Hanyin Wang
Yuan Luo
70
67
0
03 Sep 2020
PCPL: Predicate-Correlation Perception Learning for Unbiased Scene Graph Generation
Shaotian Yan
Chen Shen
Zhongming Jin
Jianqiang Huang
Rongxin Jiang
Yao-wu Chen
Xiansheng Hua
109
134
0
02 Sep 2020
Multimodal Aggregation Approach for Memory Vision-Voice Indoor Navigation with Meta-Learning
Liqi Yan
Dongfang Liu
Yaoxian Song
Changbin (Brad) Yu
68
14
0
01 Sep 2020
Practical Cross-modal Manifold Alignment for Grounded Language
A. Nguyen
Luke E. Richards
Gaoussou Youssouf Kebe
Edward Raff
Kasra Darvish
Frank Ferraro
Cynthia Matuszek
20
4
0
01 Sep 2020
Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering
Jiahao Yu
Zihao Zhu
Yujing Wang
Weifeng Zhang
Yue Hu
Jianlong Tan
74
100
0
31 Aug 2020
A Survey of Deep Active Learning
Pengzhen Ren
Yun Xiao
Xiaojun Chang
Po-Yao (Bernie) Huang
Zhihui Li
Brij B. Gupta
Xiaojiang Chen
Xin Wang
147
1,161
0
30 Aug 2020
Person-in-Context Synthesis with Compositional Structural Space
Weidong Yin
Ziwei Liu
Leonid Sigal
36
2
0
28 Aug 2020
Visual Question Answering on Image Sets
Ankan Bansal
Yuting Zhang
Rama Chellappa
CoGe
154
44
0
27 Aug 2020
VisualSem: A High-quality Knowledge Graph for Vision and Language
Houda Alberts
Teresa Huang
Y. Deshpande
Yibo Liu
Kyunghyun Cho
Clara Vania
Iacer Calixto
VLM
58
46
0
20 Aug 2020
Commonsense Knowledge in Wikidata
Filip Ilievski
Pedro A. Szekely
D. Schwabe
CML
KELM
55
18
0
18 Aug 2020
Linguistically-aware Attention for Reducing the Semantic-Gap in Vision-Language Tasks
K. Gouthaman
Athira M. Nambiar
K. Srinivas
Anurag Mittal
VLM
63
13
0
18 Aug 2020
Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models
Tong Wang
Selen Pehlivan
Jorma T. Laaksonen
107
34
0
18 Aug 2020
Retargetable AR: Context-aware Augmented Reality in Indoor Scenes based on 3D Scene Graph
Tomu Tahara
Takashi Seno
Gaku Narita
T. Ishikawa
85
48
0
18 Aug 2020
DeVLBert: Learning Deconfounded Visio-Linguistic Representations
Shengyu Zhang
Tan Jiang
Tan Wang
Kun Kuang
Zhou Zhao
Jianke Zhu
Jin Yu
Hongxia Yang
Leilei Gan
OOD
81
88
0
16 Aug 2020
HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation
Meng Wei
C. Yuan
Xiaoyu Yue
Kuo Zhong
119
18
0
12 Aug 2020
Assisting Scene Graph Generation with Self-Supervision
Sandeep Inuganti
V. Balasubramanian
SSL
49
7
0
08 Aug 2020
Polysemy Deciphering Network for Robust Human-Object Interaction Detection
Xubin Zhong
Changxing Ding
X. Qu
Dacheng Tao
124
59
0
07 Aug 2020
Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards
Xuewen Yang
Heming Zhang
Di Jin
Yingru Liu
Chi-Hao Wu
Jianchao Tan
Dongliang Xie
Jue Wang
Xin Wang
100
68
0
06 Aug 2020
Learning Visual Representations with Caption Annotations
Mert Bulent Sariyildiz
J. Perez
Diane Larlus
VLM
SSL
119
162
0
04 Aug 2020
PhraseCut: Language-based Image Segmentation in the Wild
Chenyun Wu
Zhe Lin
Scott D. Cohen
Trung Bui
Subhransu Maji
VLM
70
115
0
03 Aug 2020
Presentation and Analysis of a Multimodal Dataset for Grounded Language Learning
Patrick Jenkins
Rishabh Sachdeva
Gaoussou Youssouf Kebe
Padraig Higgins
Kasra Darvish
Edward Raff
Don Engel
J. Winder
Francis Ferraro
Cynthia Matuszek
30
5
0
29 Jul 2020
AiR: Attention with Reasoning Capability
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
53
36
0
28 Jul 2020
Representation Learning with Video Deep InfoMax
R. Devon Hjelm
Philip Bachman
SSL
MDE
107
28
0
27 Jul 2020
Contrastive Visual-Linguistic Pretraining
Lei Shi
Kai Shuang
Shijie Geng
Peng Su
Zhengkai Jiang
Peng Gao
Zuohui Fu
Gerard de Melo
Sen Su
VLM
SSL
CLIP
105
29
0
26 Jul 2020
Spatially Aware Multimodal Transformers for TextVQA
Yash Kant
Dhruv Batra
Peter Anderson
Alex Schwing
Devi Parikh
Jiasen Lu
Harsh Agrawal
100
86
0
23 Jul 2020
The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation
Tao Wang
Yu Li
Bingyi Kang
Junnan Li
Jun Hao Liew
Sheng Tang
Guosheng Lin
Jiashi Feng
ISeg
117
182
0
23 Jul 2020
Comprehensive Image Captioning via Scene Graph Decomposition
Yiwu Zhong
Liwei Wang
Jianshu Chen
Dong Yu
Yin Li
135
128
0
23 Jul 2020
Fine-Grained Image Captioning with Global-Local Discriminative Objective
Jie Wu
Tianshui Chen
Hefeng Wu
Zhi Yang
Guangchun Luo
Liang Lin
70
59
0
21 Jul 2020
Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation
Medhini Narasimhan
Erik Wijmans
Xinlei Chen
Trevor Darrell
Dhruv Batra
Devi Parikh
Amanpreet Singh
71
56
0
20 Jul 2020
Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering
Ruixue Tang
Chao Ma
W. Zhang
Qi Wu
Xiaokang Yang
OOD
72
49
0
19 Jul 2020
Length-Controllable Image Captioning
Chaorui Deng
Ning Ding
Mingkui Tan
Qi Wu
VLM
81
57
0
19 Jul 2020
Understanding Spatial Relations through Multiple Modalities
Soham Dan
Hangfeng He
Dan Roth
28
6
0
19 Jul 2020
AABO: Adaptive Anchor Box Optimization for Object Detection via Bayesian Sub-sampling
Wenshuo Ma
Tingzhong Tian
Hang Xu
Yimin Huang
Zhenguo Li
60
16
0
18 Jul 2020
Visual Relation Grounding in Videos
Junbin Xiao
Xindi Shang
Xun Yang
Sheng Tang
Tat-Seng Chua
80
40
0
17 Jul 2020
Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation
Wenbin Wang
Ruiping Wang
Shiguang Shan
Xilin Chen
3DH
102
53
0
17 Jul 2020
Detecting Human-Object Interactions with Action Co-occurrence Priors
Dong-Jin Kim
Xiao Sun
Jinsoo Choi
Stephen Lin
In So Kweon
76
125
0
17 Jul 2020
RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval
Hung-Yu Tseng
Hsin-Ying Lee
Lu Jiang
Ming-Hsuan Yang
Weilong Yang
DiffM
3DV
159
54
0
16 Jul 2020