Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

23 February 2016
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
Joshua Kravitz
Stephanie Chen
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li

Papers citing "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations"

50 / 1,648 papers shown
Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases
Gerhard Weikum
Luna Dong
Simon Razniewski
Fabian M. Suchanek
144
128
0
24 Sep 2020
X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers
Jaemin Cho
Jiasen Lu
Dustin Schwenk
Hannaneh Hajishirzi
Aniruddha Kembhavi
VLM MLLM
95
102
0
23 Sep 2020
Multiple interaction learning with question-type prior knowledge for constraining answer search space in visual question answering
Tuong Khanh Long Do
Binh X. Nguyen
Huy Tran
Erman Tjiputra
Quang-Dieu Tran
Thanh-Toan Do
40
2
0
23 Sep 2020
ALICE: Active Learning with Contrastive Natural Language Explanations
Weixin Liang
James Zou
Zhou Yu
VLM
110
51
0
22 Sep 2020
Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval
Andrés Mafla
S. Dey
Ali Furkan Biten
Lluís Gómez
Dimosthenis Karatzas
80
25
0
21 Sep 2020
Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition
Tianshui Chen
Liang Lin
Riquan Chen
X. Hui
Hefeng Wu
96
157
0
20 Sep 2020
CLEVR Parser: A Graph Parser Library for Geometric Learning on Language Grounded Image Scenes
Raeid Saqur
Ameet Deshpande
GNN NAI
22
0
0
19 Sep 2020
CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation
Jiahao Yu
Yuan Chai
Yujing Wang
Yue Hu
Qi Wu
CML
117
114
0
16 Sep 2020
Simultaneous Machine Translation with Visual Context
Ozan Caglayan
Julia Ive
Veneta Haralampieva
Pranava Madhyastha
Loïc Barrault
Lucia Specia
45
30
0
15 Sep 2020
Exploring the Hierarchy in Relation Labels for Scene Graph Generation
Yi Zhou
Shuyang Sun
Chao Zhang
Yikang Li
Wanli Ouyang
66
7
0
12 Sep 2020
Denoising Large-Scale Image Captioning from Alt-text Data using Content Selection Models
Khyathi Chandu
Piyush Sharma
Soravit Changpinyo
Ashish V. Thapliyal
Radu Soricut
DiffM VLM
84
3
0
10 Sep 2020
Multi-Task Learning with Deep Neural Networks: A Survey
M. Crawshaw
CVBM
220
627
0
10 Sep 2020
Visual Relationship Detection with Visual-Linguistic Knowledge from Multimodal Representations
Meng-Jiun Chiou
Roger Zimmermann
Jiashi Feng
111
1
0
10 Sep 2020
Towards Unique and Informative Captioning of Images
Zeyu Wang
Berthy Feng
Karthik Narasimhan
Olga Russakovsky
69
37
0
08 Sep 2020
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports
Yikuan Li
Hanyin Wang
Yuan Luo
70
67
0
03 Sep 2020
PCPL: Predicate-Correlation Perception Learning for Unbiased Scene Graph Generation
Shaotian Yan
Chen Shen
Zhongming Jin
Jianqiang Huang
Rongxin Jiang
Yao-wu Chen
Xiansheng Hua
109
134
0
02 Sep 2020
Multimodal Aggregation Approach for Memory Vision-Voice Indoor Navigation with Meta-Learning
Liqi Yan
Dongfang Liu
Yaoxian Song
Changbin (Brad) Yu
68
14
0
01 Sep 2020
Practical Cross-modal Manifold Alignment for Grounded Language
A. Nguyen
Luke E. Richards
Gaoussou Youssouf Kebe
Edward Raff
Kasra Darvish
Frank Ferraro
Cynthia Matuszek
22
4
0
01 Sep 2020
Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering
Jiahao Yu
Zihao Zhu
Yujing Wang
Weifeng Zhang
Yue Hu
Jianlong Tan
74
100
0
31 Aug 2020
A Survey of Deep Active Learning
Pengzhen Ren
Yun Xiao
Xiaojun Chang
Po-Yao (Bernie) Huang
Zhihui Li
Brij B. Gupta
Xiaojiang Chen
Xin Wang
153
1,161
0
30 Aug 2020
Person-in-Context Synthesis with Compositional Structural Space
Weidong Yin
Ziwei Liu
Leonid Sigal
36
2
0
28 Aug 2020
Visual Question Answering on Image Sets
Ankan Bansal
Yuting Zhang
Rama Chellappa
CoGe
154
44
0
27 Aug 2020
VisualSem: A High-quality Knowledge Graph for Vision and Language
Houda Alberts
Teresa Huang
Y. Deshpande
Yibo Liu
Kyunghyun Cho
Clara Vania
Iacer Calixto
VLM
58
46
0
20 Aug 2020
Commonsense Knowledge in Wikidata
Filip Ilievski
Pedro A. Szekely
D. Schwabe
CML KELM
55
18
0
18 Aug 2020
Linguistically-aware Attention for Reducing the Semantic-Gap in Vision-Language Tasks
K. Gouthaman
Athira M. Nambiar
K. Srinivas
Anurag Mittal
VLM
63
13
0
18 Aug 2020
Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models
Tong Wang
Selen Pehlivan
Jorma T. Laaksonen
107
34
0
18 Aug 2020
Retargetable AR: Context-aware Augmented Reality in Indoor Scenes based on 3D Scene Graph
Tomu Tahara
Takashi Seno
Gaku Narita
T. Ishikawa
85
48
0
18 Aug 2020
DeVLBert: Learning Deconfounded Visio-Linguistic Representations
Shengyu Zhang
Tan Jiang
Tan Wang
Kun Kuang
Zhou Zhao
Jianke Zhu
Jin Yu
Hongxia Yang
Leilei Gan
OOD
81
88
0
16 Aug 2020
HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation
Meng Wei
C. Yuan
Xiaoyu Yue
Kuo Zhong
119
18
0
12 Aug 2020
Assisting Scene Graph Generation with Self-Supervision
Sandeep Inuganti
V. Balasubramanian
SSL
49
7
0
08 Aug 2020
Polysemy Deciphering Network for Robust Human-Object Interaction Detection
Xubin Zhong
Changxing Ding
X. Qu
Dacheng Tao
124
59
0
07 Aug 2020
Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards
Xuewen Yang
Heming Zhang
Di Jin
Yingru Liu
Chi-Hao Wu
Jianchao Tan
Dongliang Xie
Jue Wang
Xin Wang
100
68
0
06 Aug 2020
Learning Visual Representations with Caption Annotations
Mert Bulent Sariyildiz
J. Perez
Diane Larlus
VLM SSL
119
162
0
04 Aug 2020
PhraseCut: Language-based Image Segmentation in the Wild
Chenyun Wu
Zhe Lin
Scott D. Cohen
Trung Bui
Subhransu Maji
VLM
70
115
0
03 Aug 2020
Presentation and Analysis of a Multimodal Dataset for Grounded Language Learning
Patrick Jenkins
Rishabh Sachdeva
Gaoussou Youssouf Kebe
Padraig Higgins
Kasra Darvish
Edward Raff
Don Engel
J. Winder
Francis Ferraro
Cynthia Matuszek
30
5
0
29 Jul 2020
AiR: Attention with Reasoning Capability
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
53
36
0
28 Jul 2020
Representation Learning with Video Deep InfoMax
R. Devon Hjelm
Philip Bachman
SSL MDE
107
28
0
27 Jul 2020
Contrastive Visual-Linguistic Pretraining
Lei Shi
Kai Shuang
Shijie Geng
Peng Su
Zhengkai Jiang
Peng Gao
Zuohui Fu
Gerard de Melo
Sen Su
VLM SSL CLIP
105
29
0
26 Jul 2020
Spatially Aware Multimodal Transformers for TextVQA
Yash Kant
Dhruv Batra
Peter Anderson
Alex Schwing
Devi Parikh
Jiasen Lu
Harsh Agrawal
100
86
0
23 Jul 2020
The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation
Tao Wang
Yu Li
Bingyi Kang
Junnan Li
Jun Hao Liew
Sheng Tang
Guosheng Lin
Jiashi Feng
ISeg
117
182
0
23 Jul 2020
Comprehensive Image Captioning via Scene Graph Decomposition
Yiwu Zhong
Liwei Wang
Jianshu Chen
Dong Yu
Yin Li
135
128
0
23 Jul 2020
Fine-Grained Image Captioning with Global-Local Discriminative Objective
Jie Wu
Tianshui Chen
Hefeng Wu
Zhi Yang
Guangchun Luo
Liang Lin
70
59
0
21 Jul 2020
Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation
Medhini Narasimhan
Erik Wijmans
Xinlei Chen
Trevor Darrell
Dhruv Batra
Devi Parikh
Amanpreet Singh
71
56
0
20 Jul 2020
Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering
Ruixue Tang
Chao Ma
W. Zhang
Qi Wu
Xiaokang Yang
OOD
72
49
0
19 Jul 2020
Length-Controllable Image Captioning
Chaorui Deng
Ning Ding
Mingkui Tan
Qi Wu
VLM
81
57
0
19 Jul 2020
Understanding Spatial Relations through Multiple Modalities
Soham Dan
Hangfeng He
Dan Roth
28
6
0
19 Jul 2020
AABO: Adaptive Anchor Box Optimization for Object Detection via Bayesian Sub-sampling
Wenshuo Ma
Tingzhong Tian
Hang Xu
Yimin Huang
Zhenguo Li
60
16
0
18 Jul 2020
Visual Relation Grounding in Videos
Junbin Xiao
Xindi Shang
Xun Yang
Sheng Tang
Tat-Seng Chua
80
40
0
17 Jul 2020
Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation
Wenbin Wang
Ruiping Wang
Shiguang Shan
Xilin Chen
3DH
102
53
0
17 Jul 2020
Detecting Human-Object Interactions with Action Co-occurrence Priors
Dong-Jin Kim
Xiao Sun
Jinsoo Choi
Stephen Lin
In So Kweon
76
125
0
17 Jul 2020