ResearchTrend.AI
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

23 February 2016
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
Joshua Kravitz
Stephanie Chen
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
arXiv:1602.07332

Papers citing "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations"

50 / 1,650 papers shown
Title
Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation
Ronghang Hu
Daniel Fried
Anna Rohrbach
Dan Klein
Trevor Darrell
Kate Saenko
80
98
0
02 Jun 2019
Learning to Generate Grounded Visual Captions without Localization Supervision
Chih-Yao Ma
Yannis Kalantidis
Ghassan AlRegib
Peter Vajda
Marcus Rohrbach
Z. Kira
SSL
43
10
0
01 Jun 2019
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
Kenneth Marino
Mohammad Rastegari
Ali Farhadi
Roozbeh Mottaghi
192
1,095
0
31 May 2019
Scene Text Visual Question Answering
Ali Furkan Biten
Rubèn Pérez Tito
Andrés Mafla
Lluís Gómez
Marçal Rusiñol
Ernest Valveny
C. V. Jawahar
Dimosthenis Karatzas
150
361
0
31 May 2019
Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation
Zih-Siou Hung
Arun Mallya
Svetlana Lazebnik
ViT
82
15
0
28 May 2019
Gaining Extra Supervision via Multi-task learning for Multi-Modal Video Question Answering
Junyeong Kim
Minuk Ma
Kyungsu Kim
Sungjin Kim
Chang D. Yoo
62
27
0
28 May 2019
FAN: Focused Attention Networks
Chu Wang
Babak Samari
Vladimir G. Kim
S. Chaudhuri
Kaleem Siddiqi
40
1
0
27 May 2019
Self-Critical Reasoning for Robust Visual Question Answering
Jialin Wu
Raymond J. Mooney
OOD, NAI
77
161
0
24 May 2019
Image Captioning based on Deep Learning Methods: A Survey
Yiyu Wang
Jungang Xu
Yingfei Sun
Xianpei Han
VLM
34
7
0
20 May 2019
Multimodal Transformer with Multi-View Visual Representation for Image Captioning
Jun-chen Yu
Jing Li
Zhou Yu
Qingming Huang
ViT
65
387
0
20 May 2019
One-Shot Texture Retrieval with Global Context Metric
Kai Zhu
Wei Zhai
Zhengjun Zha
Yang Cao
3DV
122
6
0
16 May 2019
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations
Fenglin Liu
Yuanxin Liu
Xuancheng Ren
Xiaodong He
Xu Sun
VLM
71
82
0
15 May 2019
Quantifying and Alleviating the Language Prior Problem in Visual Question Answering
Yangyang Guo
Zhiyong Cheng
Liqiang Nie
Yebin Liu
Yinglong Wang
Mohan Kankanhalli
57
37
0
13 May 2019
Language-Conditioned Graph Networks for Relational Reasoning
Ronghang Hu
Anna Rohrbach
Trevor Darrell
Kate Saenko
85
175
0
10 May 2019
PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph
Yikang Li
Tao Ma
Yeqi Bai
Nan Duan
Sining Wei
Xiaogang Wang
140
96
0
05 May 2019
On Exploring Undetermined Relationships for Visual Relationship Detection
Yibing Zhan
Jun-chen Yu
Ting Yu
Dacheng Tao
79
83
0
05 May 2019
Scene Graph Prediction with Limited Labels
V. Chen
P. Varma
Ranjay Krishna
Michael S. Bernstein
Christopher Ré
Li Fei-Fei
98
87
0
25 Apr 2019
TVQA+: Spatio-Temporal Grounding for Video Question Answering
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
75
230
0
25 Apr 2019
Context-Aware Zero-Shot Learning for Object Recognition
Éloi Zablocki
Patrick Bordes
Benjamin Piwowarski
Laure Soulier
Patrick Gallinari
VLM
60
29
0
24 Apr 2019
Deep Metric Learning Beyond Binary Supervision
Sungyeon Kim
Minkyo Seo
Ivan Laptev
Minsu Cho
Suha Kwak
SSL
74
96
0
21 Apr 2019
Context-Aware Zero-Shot Recognition
Ruotian Luo
Ning Zhang
Bohyung Han
L. Yang
80
31
0
19 Apr 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
179
1,257
0
18 Apr 2019
Learning to Collocate Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Jianfei Cai
66
78
0
18 Apr 2019
Unsupervised Discovery of Multimodal Links in Multi-image, Multi-sentence Documents
Jack Hessel
Lillian Lee
David M. Mimno
72
30
0
16 Apr 2019
Visual Relationship Detection with Language prior and Softmax
Jaewon Jung
Jongyoul Park
50
9
0
16 Apr 2019
Natural Language Semantics With Pictures: Some Language & Vision Datasets and Potential Uses for Computational Semantics
David Schlangen
67
6
0
15 Apr 2019
Learning to Generate Unambiguous Spatial Referring Expressions for Real-World Environments
Fethiye Irmak Dogan
Sinan Kalkan
Iolanda Leite
74
19
0
15 Apr 2019
Context-Aware Embeddings for Automatic Art Analysis
Noa Garcia
B. Renoust
Yuta Nakashima
42
52
0
10 Apr 2019
Sketchforme: Composing Sketched Scenes from Text Descriptions for Interactive Applications
Forrest Huang
John F. Canny
43
25
0
08 Apr 2019
Revisiting EmbodiedQA: A Simple Baseline and Beyond
Yuehua Wu
Lu Jiang
Yi Yang
LM&Ro
86
30
0
08 Apr 2019
Referring to Objects in Videos using Spatio-Temporal Identifying Descriptions
Peratham Wiriyathammabhum
Abhinav Shrivastava
Vlad I. Morariu
L. Davis
60
5
0
08 Apr 2019
ShapeMask: Learning to Segment Novel Objects by Refining Shape Priors
Weicheng Kuo
A. Angelova
Jitendra Malik
Nayeon Lee
3DPC, ISeg
95
120
0
05 Apr 2019
Detecting Human-Object Interactions via Functional Generalization
Ankan Bansal
Sai Saketh Rambhatla
Abhinav Shrivastava
Rama Chellappa
110
118
0
05 Apr 2019
Target-Tailored Source-Transformation for Scene Graph Generation
Wentong Liao
Cuiling Lan
Wenjun Zeng
M. Yang
Bodo Rosenhahn
48
5
0
03 Apr 2019
Context and Attribute Grounded Dense Captioning
Guojun Yin
Lu Sheng
Bin Liu
Nenghai Yu
Xiaogang Wang
Jing Shao
66
76
0
02 Apr 2019
Scene Graph Generation with External Knowledge and Image Reconstruction
Jiuxiang Gu
Handong Zhao
Zhe Lin
Sheng Li
Jianfei Cai
Mingyang Ling
89
294
0
01 Apr 2019
Relation-Aware Graph Attention Network for Visual Question Answering
Linjie Li
Zhe Gan
Yu Cheng
Jingjing Liu
GNN
196
347
0
29 Mar 2019
Align2Ground: Weakly Supervised Phrase Grounding Guided by Image-Caption Alignment
Samyak Datta
Karan Sikka
Anirban Roy
Karuna Ahuja
Devi Parikh
Ajay Divakaran
104
104
0
27 Mar 2019
Information Maximizing Visual Question Generation
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
131
95
0
27 Mar 2019
Unpaired Image Captioning via Scene Graph Alignments
Jiuxiang Gu
Shafiq Joty
Jianfei Cai
Handong Zhao
Xu Yang
G. Wang
GNN
104
176
0
26 Mar 2019
An End-to-End Network for Generating Social Relationship Graphs
A. Goel
K. Ma
Cheston Tan
GNN
110
40
0
23 Mar 2019
On Class Imbalance and Background Filtering in Visual Relationship Detection
Alessio Sarullo
Tingting Mu
102
4
0
20 Mar 2019
Neural Sequential Phrase Grounding (SeqGROUND)
Pelin Dogan
Leonid Sigal
Markus Gross
ObjD
81
52
0
18 Mar 2019
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning
Peixi Xiong
Huayi Zhan
Xin Eric Wang
Baivab Sinha
Ying Nian Wu
41
16
0
16 Mar 2019
PifPaf: Composite Fields for Human Pose Estimation
S. Kreiss
Lorenzo Bertoni
Alexandre Alahi
3DH
100
425
0
15 Mar 2019
Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning
Dong-Jin Kim
Jinsoo Choi
Tae-Hyun Oh
In So Kweon
98
84
0
14 Mar 2019
MMKG: Multi-Modal Knowledge Graphs
Ye Liu
Hui Li
Alberto García-Durán
Mathias Niepert
Daniel Oñoro-Rubio
David S. Rosenblum
80
207
0
13 Mar 2019
Visual Semantic Information Pursuit: A Survey
Daqi Liu
M. Bober
J. Kittler
75
32
0
13 Mar 2019
Knowledge-Embedded Routing Network for Scene Graph Generation
Tianshui Chen
Weihao Yu
Riquan Chen
Liang Lin
GNN
95
377
0
08 Mar 2019
Graphical Contrastive Losses for Scene Graph Parsing
Ji Zhang
Kevin J. Shih
Ahmed Elgammal
Andrew Tao
Bryan Catanzaro
119
233
0
07 Mar 2019