Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

23 February 2016
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
Joshua Kravitz
Stephanie Chen
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li

Papers citing "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations"

50 / 1,650 papers shown
Object Detection as a Positive-Unlabeled Problem
Yuewei Yang
Kevin J. Liang
Lawrence Carin
82
39
0
11 Feb 2020
Symbiotic Attention with Privileged Information for Egocentric Action Recognition
Xiaohan Wang
Yu Wu
Linchao Zhu
Yi Yang
74
63
0
08 Feb 2020
Controlling generative models with continuous factors of variations
Antoine Plumerault
Hervé Le Borgne
Céline Hudelot
DRL
87
127
0
28 Jan 2020
Explaining with Counter Visual Attributes and Examples
Sadaf Gulshad
A. Smeulders
XAI FAtt AAML
77
15
0
27 Jan 2020
aiTPR: Attribute Interaction-Tensor Product Representation for Image Caption
C. Sur
40
8
0
27 Jan 2020
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
Di Qi
Lin Su
Jianwei Song
Edward Cui
Taroon Bharti
Arun Sacheti
VLM
132
263
0
22 Jan 2020
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
M. Farazi
Salman H. Khan
Nick Barnes
79
18
0
20 Jan 2020
Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences
Zhu Zhang
Zhou Zhao
Yang Zhao
Qi Wang
Huasheng Liu
Lianli Gao
99
118
0
19 Jan 2020
GTNet: Generative Transfer Network for Zero-Shot Object Detection
Shizhen Zhao
Changxin Gao
Yuanjie Shao
Lerenhan Li
Changqian Yu
Zhong Ji
Nong Sang
ViT ObjD
144
54
0
19 Jan 2020
Show, Recall, and Tell: Image Captioning with Recall Mechanism
Li Wang
Zechen Bai
Yonghua Zhang
Hongtao Lu
75
67
0
15 Jan 2020
Ensemble based discriminative models for Visual Dialog Challenge 2018
Shubham Agarwal
Raghav Goyal
16
1
0
15 Jan 2020
NODIS: Neural Ordinary Differential Scene Understanding
Cong Yuren
H. Ackermann
Wentong Liao
M. Yang
Bodo Rosenhahn
104
16
0
14 Jan 2020
Cross-dataset Training for Class Increasing Object Detection
Yongqiang Yao
Yan Wang
Yu-Xiao Guo
Jiaojiao Lin
Hongwei Qin
Junjie Yan
ObjD
57
17
0
14 Jan 2020
Classifying All Interacting Pairs in a Single Shot
Sanaa Chafik
Astrid Orcesi
Romaric Audigier
B. Luvison
40
4
0
13 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD ObjD
85
320
0
10 Jan 2020
Weakly Supervised Visual Semantic Parsing
Alireza Zareian
Svebor Karaman
Shih-Fu Chang
GNN
108
57
0
08 Jan 2020
Bridging Knowledge Graphs to Generate Scene Graphs
Alireza Zareian
Svebor Karaman
Shih-Fu Chang
100
212
0
07 Jan 2020
Identifying and Compensating for Feature Deviation in Imbalanced Deep Learning
Han-Jia Ye
Hong-You Chen
De-Chuan Zhan
Wei-Lun Chao
138
102
0
06 Jan 2020
Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering
Lei Shi
Shijie Geng
Kai Shuang
Chiori Hori
Songxiang Liu
Peng Gao
Sen Su
85
11
0
03 Jan 2020
LayoutLM: Pre-training of Text and Layout for Document Image Understanding
Yiheng Xu
Minghao Li
Lei Cui
Shaohan Huang
Furu Wei
Ming Zhou
155
720
0
31 Dec 2019
PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection
Yue Liao
Si Liu
Fei Wang
Yanjie Chen
Chen Qian
Jiashi Feng
171
270
0
30 Dec 2019
A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning
Filippos Gouidis
Alexandros Vassiliades
Theodore Patkos
Antonis Argyros
Nick Bassiliades
Dimitris Plexousakis
OCL
68
12
0
26 Dec 2019
Look, Read and Feel: Benchmarking Ads Understanding with Multimodal Multitask Learning
Huaizheng Zhang
Yong Luo
Qiming Ai
Yonggang Wen
113
15
0
21 Dec 2019
Smart Home Appliances: Chat with Your Fridge
Denis A. Gudovskiy
Gyuri Han
Takuya Yamaguchi
Sotaro Tsukizawa
LRM
26
4
0
19 Dec 2019
Meshed-Memory Transformer for Image Captioning
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
110
889
0
17 Dec 2019
Towards Fairer Datasets: Filtering and Balancing the Distribution of the People Subtree in the ImageNet Hierarchy
Kaiyu Yang
Klint Qinami
Li Fei-Fei
Jia Deng
Olga Russakovsky
132
325
0
16 Dec 2019
Learning Canonical Representations for Scene Graph to Image Generation
Roei Herzig
Amir Bar
Huijuan Xu
Gal Chechik
Trevor Darrell
Amir Globerson
GNN OCL
109
109
0
16 Dec 2019
Action Genome: Actions as Composition of Spatio-temporal Scene Graphs
Jingwei Ji
Ranjay Krishna
Li Fei-Fei
Juan Carlos Niebles
94
346
0
15 Dec 2019
Towards Contextual Learning in Few-shot Object Classification
M. Fortin
B. Chaib-draa
61
3
0
13 Dec 2019
Multimodal Generative Models for Compositional Representation Learning
Mike Wu
Noah D. Goodman
GAN DRL
97
17
0
11 Dec 2019
Video action detection by learning graph-based spatio-temporal interactions
Matteo Tomei
Lorenzo Baraldi
Simone Calderara
Simone Bronzin
Rita Cucchiara
131
9
0
09 Dec 2019
A Real-time Global Inference Network for One-stage Referring Expression Comprehension
Yiyi Zhou
Rongrong Ji
Gen Luo
Xiaoshuai Sun
Jinsong Su
Xinghao Ding
Chia-Wen Lin
Q. Tian
ObjD
85
64
0
07 Dec 2019
Controlling Style and Semantics in Weakly-Supervised Image Generation
Dario Pavllo
Aurelien Lucchi
Thomas Hofmann
92
35
0
06 Dec 2019
Connecting Vision and Language with Localized Narratives
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
143
252
0
06 Dec 2019
Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
60
15
0
06 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
111
117
0
05 Dec 2019
12-in-1: Multi-Task Vision and Language Representation Learning
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM ObjD
131
481
0
05 Dec 2019
Siamese Natural Language Tracker: Tracking by Natural Language Descriptions with Siamese Trackers
Qi Feng
Vitaly Ablavsky
Qinxun Bai
Stan Sclaroff
68
17
0
04 Dec 2019
Knowledge-Enriched Visual Storytelling
Chao-Chun Hsu
Zi-Yuan Chen
Chi-Yang Hsu
Chih-Chia Li
Tzu-Yuan Lin
Ting-Hao 'Kenneth' Huang
Lun-Wei Ku
DiffM
90
47
0
03 Dec 2019
Deep Bayesian Active Learning for Multiple Correct Outputs
Khaled Jedoui
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
BDL OOD UQCV
93
14
0
02 Dec 2019
Learning to Relate from Captions and Bounding Boxes
Sarthak Garg
Joel Ruben Antony Moniz
Anshu Aviral
Priyatham Bollimpalli
38
3
0
01 Dec 2019
Assessing the Robustness of Visual Question Answering Models
Jia-Hong Huang
Modar Alfadly
Guohao Li
Marcel Worring
AAML OOD
100
24
0
30 Nov 2019
A Free Lunch in Generating Datasets: Building a VQG and VQA System with Attention and Humans in the Loop
Jihyeon Janel Lee
S. Arora
18
1
0
30 Nov 2019
PIQA: Reasoning about Physical Commonsense in Natural Language
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD LRM
326
1,853
0
26 Nov 2019
Identifying Model Weakness with Adversarial Examiner
Michelle Shu
Chenxi Liu
Weichao Qiu
Alan Yuille
AAML ELM
81
20
0
25 Nov 2019
Two Causal Principles for Improving Visual Dialog
Jiaxin Qi
Yulei Niu
Jianqiang Huang
Hanwang Zhang
CML
110
148
0
24 Nov 2019
Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences
Shizhe Chen
Bei Liu
Jianlong Fu
Ruihua Song
Qin Jin
Pingping Lin
Xiaoyu Qi
Chunting Wang
Jin Zhou
DiffM
75
33
0
24 Nov 2019
CRUR: Coupled-Recurrent Unit for Unification, Conceptualization and Context Capture for Language Representation -- A Generalization of Bi Directional LSTM
C. Sur
BDL
49
6
0
22 Nov 2019
Visual Relationship Detection with Low Rank Non-Negative Tensor Decomposition
Mohammed Haroon Dupty
Zhen Zhang
Wee Sun Lee
ViT
59
8
0
22 Nov 2019
Temporal Reasoning via Audio Question Answering
Haytham M. Fayek
Justin Johnson
65
54
0
21 Nov 2019