Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

23 February 2016
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
Joshua Kravitz
Stephanie Chen
Yannis Kalantidis
Li-Jia Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li

Papers citing "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations"

50 / 1,103 papers shown
Multi-label Ranking: Mining Multi-label and Label Ranking Data
L. Dery
32
7
0
03 Jan 2021
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation
Yiran Xing
Z. Shi
Zhao Meng
Gerhard Lakemeyer
Yunpu Ma
Roger Wattenhofer
VLM
72
40
0
02 Jan 2021
DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue
Hung Le
Chinnadhurai Sankar
Seungwhan Moon
Ahmad Beirami
A. Geramifard
Satwik Kottur
VGen
36
18
0
01 Jan 2021
OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Visual Contexts
Yuxian Meng
Shuhe Wang
Qinghong Han
Xiaofei Sun
Fei Wu
Rui Yan
Jiwei Li
27
28
0
30 Dec 2020
Image-to-Image Retrieval by Learning Similarity between Scene Graphs
Sangwoong Yoon
Woo-Young Kang
Sungwook Jeon
SeongEun Lee
C. Han
Jonghun Park
Eun-Sol Kim
3DH
29
39
0
29 Dec 2020
Towards Overcoming False Positives in Visual Relationship Detection
Daisheng Jin
Xiao Ma
Chongzhi Zhang
Yizhuo Zhou
Jiashu Tao
...
Haiyu Zhao
Shuai Yi
Zhoujun Li
Xianglong Liu
Hongsheng Li
25
5
0
23 Dec 2020
MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification
Te-Lin Wu
Shikhar Singh
S. Paul
Gully A. Burns
Nanyun Peng
30
18
0
16 Dec 2020
AutoCaption: Image Captioning with Neural Architecture Search
Xinxin Zhu
Weining Wang
Longteng Guo
Jing Liu
29
9
0
16 Dec 2020
Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding
Qingxing Cao
Bailin Li
Xiaodan Liang
Keze Wang
Liang Lin
44
36
0
14 Dec 2020
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network
Jiayi Ji
Yunpeng Luo
Xiaoshuai Sun
Fuhai Chen
Gen Luo
Yongjian Wu
Yue Gao
Rongrong Ji
ViT
54
170
0
13 Dec 2020
MiniVLM: A Smaller and Faster Vision-Language Model
Jianfeng Wang
Xiaowei Hu
Pengchuan Zhang
Xiujun Li
Lijuan Wang
Lefei Zhang
Jianfeng Gao
Zicheng Liu
VLM
MLLM
35
59
0
13 Dec 2020
Image Captioning with Context-Aware Auxiliary Guidance
Zeliang Song
Xiaofei Zhou
Zhendong Mao
Jianlong Tan
36
31
0
10 Dec 2020
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption
Zhengyuan Yang
Yijuan Lu
Jianfeng Wang
Xi Yin
D. Florêncio
Lijuan Wang
Cha Zhang
Lei Zhang
Jiebo Luo
VLM
28
141
0
08 Dec 2020
WeaQA: Weak Supervision via Captions for Visual Question Answering
Pratyay Banerjee
Tejas Gokhale
Yezhou Yang
Chitta Baral
25
35
0
04 Dec 2020
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs
Emanuele Bugliarello
Ryan Cotterell
Naoaki Okazaki
Desmond Elliott
35
119
0
30 Nov 2020
Self-Supervised Real-to-Sim Scene Generation
Aayush Prakash
Shoubhik Debnath
Jean-Francois Lafleche
Eric Cameracci
Gavriel State
Stan Birchfield
M. Law
35
26
0
30 Nov 2020
Road Scene Graph: A Semantic Graph-Based Scene Representation Dataset for Intelligent Vehicles
Yafu Tian
Alexander Carballo
Ruifeng Li
K. Takeda
GNN
31
27
0
27 Nov 2020
Open-Vocabulary Object Detection Using Captions
Alireza Zareian
Kevin Dela Rosa
Derek Hao Hu
Shih-Fu Chang
VLM
ObjD
44
418
0
20 Nov 2020
ActBERT: Learning Global-Local Video-Text Representations
Linchao Zhu
Yi Yang
ViT
49
417
0
14 Nov 2020
Human-centric Spatio-Temporal Video Grounding With Visual Transformers
Zongheng Tang
Yue Liao
Si Liu
Guanbin Li
Xiaojie Jin
Hongxu Jiang
Qian Yu
Dong Xu
21
94
0
10 Nov 2020
After All, Only The Last Neuron Matters: Comparing Multi-modal Fusion Functions for Scene Graph Generation
Mohamed Karim Belaid
31
1
0
09 Nov 2020
Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts
Ece Takmaz
Mario Giulianelli
Sandro Pezzelle
Arabella J. Sinclair
Raquel Fernández
20
26
0
09 Nov 2020
CapWAP: Captioning with a Purpose
Adam Fisch
Kenton Lee
Ming-Wei Chang
J. Clark
Regina Barzilay
8
11
0
09 Nov 2020
Dual ResGCN for Balanced Scene Graph Generation
Jingyi Zhang
Yong Zhang
Baoyuan Wu
Yanbo Fan
Fumin Shen
Heng Tao Shen
28
12
0
09 Nov 2020
An Improved Attention for Visual Question Answering
Tanzila Rahman
Shih-Han Chou
Leonid Sigal
Giuseppe Carenini
13
42
0
04 Nov 2020
Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings
Yue Wang
Jing Li
M. Lyu
Irwin King
16
16
0
03 Nov 2020
Learning Dual Semantic Relations with Graph Attention for Image-Text Matching
Keyu Wen
Xiaodong Gu
Qingrong Cheng
24
95
0
22 Oct 2020
Contextual Heterogeneous Graph Network for Human-Object Interaction Detection
Hai Wang
Weishi Zheng
Yingbiao Ling
27
87
0
20 Oct 2020
Language and Visual Entity Relationship Graph for Agent Navigation
Yicong Hong
Cristian Rodriguez-Opazo
Yuankai Qi
Qi Wu
Stephen Gould
LM&Ro
179
132
0
19 Oct 2020
CAPT: Contrastive Pre-Training for Learning Denoised Sequence Representations
Fuli Luo
Pengcheng Yang
Shicheng Li
Xuancheng Ren
Xu Sun
VLM
SSL
18
16
0
13 Oct 2020
DORi: Discovering Object Relationship for Moment Localization of a Natural-Language Query in Video
Cristian Rodriguez-Opazo
Edison Marrese-Taylor
Basura Fernando
Hongdong Li
Stephen Gould
137
11
0
13 Oct 2020
Webly Supervised Image Classification with Metadata: Automatic Noisy Label Correction via Visual-Semantic Graph
Jingkang Yang
Weirong Chen
Xue Jiang
Xiaopeng Yan
Huabin Zheng
Wayne Zhang
NoLa
30
13
0
12 Oct 2020
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding
Qinxin Wang
Hao Tan
Sheng Shen
Michael W. Mahoney
Z. Yao
ObjD
47
11
0
12 Oct 2020
Background Learnable Cascade for Zero-Shot Object Detection
Ye Zheng
Ruoran Huang
Chuanqi Han
Xi Huang
Li Cui
ObjD
21
48
0
09 Oct 2020
Fine-Grained Grounding for Multimodal Speech Recognition
Tejas Srinivasan
Ramon Sanabria
Florian Metze
Desmond Elliott
23
11
0
05 Oct 2020
Learning Object Detection from Captions via Textual Scene Attributes
Achiya Jerbi
Roei Herzig
Jonathan Berant
Gal Chechik
Amir Globerson
27
21
0
30 Sep 2020
Spatial Attention as an Interface for Image Captioning Models
P. Sadler
28
0
0
29 Sep 2020
SceneGen: Generative Contextual Scene Augmentation using Scene Graph Priors
Mohammad Keshavarzi
Aakash Parikh
Xiyu Zhai
Melody Mao
Luisa Caldas
An Yang
29
24
0
25 Sep 2020
Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases
Gerhard Weikum
Luna Dong
Simon Razniewski
Fabian M. Suchanek
34
125
0
24 Sep 2020
X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers
Jaemin Cho
Jiasen Lu
Dustin Schwenk
Hannaneh Hajishirzi
Aniruddha Kembhavi
VLM
MLLM
30
102
0
23 Sep 2020
Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval
Andrés Mafla
S. Dey
Ali Furkan Biten
Lluís Gómez
Dimosthenis Karatzas
27
25
0
21 Sep 2020
Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition
Tianshui Chen
Liang Lin
Riquan Chen
X. Hui
Hefeng Wu
21
153
0
20 Sep 2020
CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation
Jiahao Yu
Yuan Chai
Yujing Wang
Yue Hu
Qi Wu
CML
38
111
0
16 Sep 2020
Exploring the Hierarchy in Relation Labels for Scene Graph Generation
Yi Zhou
Shuyang Sun
Chao Zhang
Yikang Li
Wanli Ouyang
32
7
0
12 Sep 2020
Multi-Task Learning with Deep Neural Networks: A Survey
M. Crawshaw
CVBM
53
609
0
10 Sep 2020
Towards Unique and Informative Captioning of Images
Zeyu Wang
Berthy Feng
Karthik R. Narasimhan
Olga Russakovsky
25
37
0
08 Sep 2020
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports
Yikuan Li
Hanyin Wang
Yuan Luo
19
63
0
03 Sep 2020
PCPL: Predicate-Correlation Perception Learning for Unbiased Scene Graph Generation
Shaotian Yan
Chen Shen
Zhongming Jin
Jianqiang Huang
Rongxin Jiang
Yao-wu Chen
Xiansheng Hua
34
131
0
02 Sep 2020
Visual Question Answering on Image Sets
Ankan Bansal
Yuting Zhang
Rama Chellappa
CoGe
16
40
0
27 Aug 2020
Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models
T. Wang
Selen Pehlivan
Jorma T. Laaksonen
29
34
0
18 Aug 2020