Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

23 February 2016
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
Joshua Kravitz
Stephanie Chen
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li

Papers citing "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations"

50 / 1,654 papers shown
Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences
Shizhe Chen
Bei Liu
Jianlong Fu
Ruihua Song
Qin Jin
Pingping Lin
Xiaoyu Qi
Chunting Wang
Jin Zhou
DiffM
75
33
0
24 Nov 2019
CRUR: Coupled-Recurrent Unit for Unification, Conceptualization and Context Capture for Language Representation -- A Generalization of Bi Directional LSTM
C. Sur
BDL
49
6
0
22 Nov 2019
Visual Relationship Detection with Low Rank Non-Negative Tensor Decomposition
Mohammed Haroon Dupty
Zhen Zhang
Wee Sun Lee
ViT
59
8
0
22 Nov 2019
Temporal Reasoning via Audio Question Answering
Haytham M. Fayek
Justin Johnson
65
54
0
21 Nov 2019
Learning Cross-modal Context Graph for Visual Grounding
Yongfei Liu
Bo Wan
Xiao-Dan Zhu
Xuming He
94
91
0
20 Nov 2019
SOGNet: Scene Overlap Graph Network for Panoptic Segmentation
Yibo Yang
Hongyang Li
Xia Li
Qijie Zhao
Jianlong Wu
Zhouchen Lin
ISeg
59
63
0
18 Nov 2019
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Ronghang Hu
Amanpreet Singh
Trevor Darrell
Marcus Rohrbach
94
197
0
14 Nov 2019
Open-Ended Visual Question Answering by Multi-Modal Domain Adaptation
Yiming Xu
Lin Chen
Zhongwei Cheng
Lixin Duan
Jiebo Luo
OOD
86
24
0
11 Nov 2019
Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries
Fuwen Tan
Paola Cascante-Bonilla
Xiaoxiao Guo
Hui Wu
Song Feng
Vicente Ordonez
66
30
0
10 Nov 2019
Visual Relationship Detection with Relative Location Mining
Hao Zhou
Chongyang Zhang
Chuanping Hu
ObjD
134
16
0
02 Nov 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
Alex Schwing
LRM
ReLM
102
9
0
31 Oct 2019
Hidden State Guidance: Improving Image Captioning using An Image Conditioned Autoencoder
Jialin Wu
Raymond J. Mooney
55
0
0
31 Oct 2019
Identifying Unknown Instances for Autonomous Driving
K. Wong
Shenlong Wang
Mengye Ren
Ming Liang
R. Urtasun
139
112
0
24 Oct 2019
KnowIT VQA: Answering Knowledge-Based Questions about Videos
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
152
80
0
23 Oct 2019
Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks
Yihui He
Jianing Qian
Jianren Wang
Cindy X. Le
Congrui Hetang
Qi Lyu
Wenping Wang
Tianwei Yue
99
11
0
21 Oct 2019
Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval
Sijin Wang
Ruiping Wang
Ziwei Yao
Shiguang Shan
Xilin Chen
3DV
88
213
0
11 Oct 2019
SMArT: Training Shallow Memory-aware Transformers for Robotic Explainability
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
162
29
0
07 Oct 2019
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera
Iro Armeni
Zhi-Yang He
JunYoung Gwak
Amir Zamir
Martin Fischer
Jitendra Malik
Silvio Savarese
3DV
3DPC
129
350
0
06 Oct 2019
Learn to Explain Efficiently via Neural Logic Inductive Learning
Yu’an Yang
Le Song
NAI
97
77
0
06 Oct 2019
SMP Challenge: An Overview of Social Media Prediction Challenge 2019
M. Menictas
Wen-Huang Cheng
Gioia Di Credico
Bei Liu
Zhaoyang Zeng
Jiebo Luo
53
37
0
04 Oct 2019
Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation
Ivan Donadello
Luciano Serafini
76
25
0
01 Oct 2019
Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations
Po-Yao (Bernie) Huang
Xiaojun Chang
Alexander G. Hauptmann
138
25
0
30 Sep 2019
UNITER: UNiversal Image-TExt Representation Learning
Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
VLM
OT
134
449
0
25 Sep 2019
Synthetic Data for Deep Learning
Sergey I. Nikolenko
149
358
0
25 Sep 2019
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
365
948
0
24 Sep 2019
Explainable High-order Visual Question Reasoning: A New Benchmark and Knowledge-routed Network
Qingxing Cao
Bailin Li
Xiaodan Liang
Liang Lin
57
13
0
23 Sep 2019
Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators
Kuang-Huei Lee
Hamid Palangi
Xi Chen
Houdong Hu
Jianfeng Gao
VLM
67
37
0
22 Sep 2019
Visuallly Grounded Generation of Entailments from Premises
Somayeh Jafaritazehjani
Albert Gatt
Marc Tanti
LRM
48
1
0
21 Sep 2019
Triplet-Aware Scene Graph Embeddings
Brigit Schroeder
Subarna Tripathi
Hanlin Tang
3DPC
75
16
0
19 Sep 2019
Pose-aware Multi-level Feature Network for Human Object Interaction Detection
Bo Wan
Desen Zhou
Yongfei Liu
Rongjie Li
Xuming He
76
200
0
18 Sep 2019
Inverse Visual Question Answering with Multi-Level Attentions
Yaser Alwatter
Yuhong Guo
BDL
35
1
0
17 Sep 2019
Scaling Object Detection by Transferring Classification Weights
Jason Kuen
Federico Perazzi
Zhe Lin
Jianming Zhang
Yap-Peng Tan
ViT
80
18
0
15 Sep 2019
Scene Graph Parsing by Attention Graph
Martin Andrews
Yew Ken Chia
Sam Witteveen
GNN
48
12
0
13 Sep 2019
Specifying Object Attributes and Relations in Interactive Scene Generation
Oron Ashual
Lior Wolf
168
180
0
11 Sep 2019
Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation
Arijit Ray
Karan Sikka
Ajay Divakaran
Stefan Lee
Giedrius Burachas
83
65
0
10 Sep 2019
Hierarchy Parsing for Image Captioning
Ting Yao
Yingwei Pan
Yehao Li
Tao Mei
VLM
96
166
0
09 Sep 2019
Visual Semantic Reasoning for Image-Text Matching
Kunpeng Li
Yulun Zhang
Keqin Li
Yuanyuan Li
Y. Fu
VLM
115
508
0
06 Sep 2019
Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach
Dong-Jin Kim
Jinsoo Choi
Tae-Hyun Oh
In So Kweon
SSL
VLM
89
56
0
05 Sep 2019
Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering
Soravit Changpinyo
Bo Pang
Piyush Sharma
Radu Soricut
ObjD
60
20
0
04 Sep 2019
Reflective Decoding Network for Image Captioning
Lei Ke
Wenjie Pei
Ruiyu Li
Xiaoyong Shen
Yu-Wing Tai
ObjD
60
94
0
30 Aug 2019
Explainable Video Action Reasoning via Prior Knowledge and State Transitions
Tao Zhuo
Zhiyong Cheng
Peng Zhang
Yongkang Wong
Mohan Kankanhalli
FAtt
83
62
0
28 Aug 2019
Towards Unsupervised Image Captioning with Shared Multimodal Embeddings
Iro Laina
Christian Rupprecht
Nassir Navab
SSL
76
103
0
25 Aug 2019
Situational Fusion of Visual Representation for Visual Navigation
Bokui (William) Shen
Danfei Xu
Yuke Zhu
Leonidas Guibas
Fei-Fei Li
Silvio Savarese
SSL
95
62
0
24 Aug 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
315
1,672
0
22 Aug 2019
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets
Mor Geva
Yoav Goldberg
Jonathan Berant
356
326
0
21 Aug 2019
Phrase Localization Without Paired Training Examples
Josiah Wang
Lucia Specia
83
44
0
20 Aug 2019
Image Synthesis From Reconfigurable Layout and Style
Wei Sun
Tianfu Wu
124
143
0
20 Aug 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Tan
Mohit Bansal
VLM
MLLM
254
2,498
0
20 Aug 2019
Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition
Tianshui Chen
Muxi Xu
X. Hui
Hefeng Wu
Liang Lin
94
289
0
20 Aug 2019
Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention
Cristian Rodriguez-Opazo
Edison Marrese-Taylor
F. Saleh
Hongdong Li
Stephen Gould
94
147
0
20 Aug 2019