1811.00982
The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale
2 November 2018
Alina Kuznetsova
H. Rom
N. Alldrin
J. Uijlings
Ivan Krasin
Jordi Pont-Tuset
Shahab Kamali
S. Popov
Matteo Malloci
Alexander Kolesnikov
Tom Duerig
V. Ferrari
ObjD
VLM
Papers citing
"The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale"
50 / 243 papers shown
Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection
H. Rasheed
Muhammad Maaz
Muhammad Uzair Khattak
Salman Khan
Fahad Shahbaz Khan
ObjD
VLM
27
151
0
07 Jul 2022
FewSOL: A Dataset for Few-Shot Object Learning in Robotic Environments
Jishnu Jaykumar P
Yu-Wei Chao
Yu Xiang
21
11
0
06 Jul 2022
Image Amodal Completion: A Survey
Jiayang Ao
Qiuhong Ke
Krista A. Ehinger
43
16
0
05 Jul 2022
Deep Learning Models on CPUs: A Methodology for Efficient Training
Quchen Fu
Ramesh Chukka
Keith Achorn
Thomas Atta-fosu
Deepak R. Canchi
Zhongwei Teng
Jules White
Douglas C. Schmidt
21
1
0
20 Jun 2022
DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations
Ximeng Sun
Ping Hu
Kate Saenko
VLM
36
120
0
20 Jun 2022
All Mistakes Are Not Equal: Comprehensive Hierarchy Aware Multi-label Predictions (CHAMP)
A. Vaswani
Gaurav Aggarwal
Praneeth Netrapalli
N. Hegde
22
4
0
17 Jun 2022
ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
Matt Deitke
Eli VanderBilt
Alvaro Herrasti
Luca Weihs
Jordi Salvador
...
Winson Han
Eric Kolve
Ali Farhadi
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
44
237
0
14 Jun 2022
Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation
Wouter Van Gansbeke
Simon Vandenhende
Luc Van Gool
44
55
0
13 Jun 2022
Gradient Obfuscation Gives a False Sense of Security in Federated Learning
Kai Yue
Richeng Jin
Chau-Wai Wong
D. Baron
H. Dai
FedML
36
46
0
08 Jun 2022
A Survey on Long-Tailed Visual Recognition
Lu Yang
He Jiang
Q. Song
Jun Guo
27
123
0
27 May 2022
Penalizing Proposals using Classifiers for Semi-Supervised Object Detection
S. Hazra
P. Dasgupta
33
0
0
26 May 2022
Perceptual Learned Source-Channel Coding for High-Fidelity Image Semantic Transmission
Jun Wang
Sixian Wang
Jincheng Dai
Zhongwei Si
Dekun Zhou
K. Niu
19
31
0
26 May 2022
The Case for Perspective in Multimodal Datasets
Marcelo Viridiano
Tiago Timponi Torrent
Oliver Czulo
Arthur Lorenzi
E. Matos
Frederico Belcavello
19
5
0
22 May 2022
Simple Open-Vocabulary Object Detection with Vision Transformers
Matthias Minderer
A. Gritsenko
Austin Stone
Maxim Neumann
Dirk Weissenborn
...
Zhuoran Shen
Tianlin Li
Xiaohua Zhai
Thomas Kipf
N. Houlsby
ObjD
CLIP
VLM
ViT
OCL
34
307
0
12 May 2022
Deep Learning and Computer Vision Techniques for Microcirculation Analysis: A Review
Maged Abdalla Helmy Abdou
T. Truong
E. Jul
Paulo Ferreira
26
8
0
11 May 2022
Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations
Dan Oneaţă
H. Cucu
19
19
0
27 Apr 2022
Training and challenging models for text-guided fashion image retrieval
Eric Dodds
Jack Culpepper
Gaurav Srivastava
18
8
0
23 Apr 2022
Fast AdvProp
Jieru Mei
Yucheng Han
Yutong Bai
Yixiao Zhang
Yingwei Li
Xianhang Li
Alan Yuille
Cihang Xie
AAML
29
8
0
21 Apr 2022
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension
Sanjay Subramanian
William Merrill
Trevor Darrell
Matt Gardner
Sameer Singh
Anna Rohrbach
ObjD
44
125
0
12 Apr 2022
Pre-train, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction
Kalyan Vasudev Alwala
Abhinav Gupta
Shubham Tulsiani
32
30
0
07 Apr 2022
ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO
Sanghyuk Chun
Wonjae Kim
Song Park
Minsuk Chang
Seong Joon Oh
VLM
373
43
0
07 Apr 2022
How stable are Transferability Metrics evaluations?
A. Agostinelli
Michal Pándy
J. Uijlings
Thomas Mensink
V. Ferrari
35
23
0
04 Apr 2022
Data Cards: Purposeful and Transparent Dataset Documentation for Responsible AI
Mahima Pushkarna
Andrew Zaldivar
Oddur Kjartansson
AI4TS
38
197
0
03 Apr 2022
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Andy Zeng
Maria Attarian
Brian Ichter
K. Choromanski
Adrian S. Wong
...
Michael S. Ryoo
Vikas Sindhwani
Johnny Lee
Vincent Vanhoucke
Peter R. Florence
ReLM
LRM
47
573
0
01 Apr 2022
Image Retrieval from Contextual Descriptions
Benno Krojer
Vaibhav Adlakha
Vibhav Vineet
Yash Goyal
Edoardo Ponti
Siva Reddy
19
29
0
29 Mar 2022
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
Likun Cai
Zhi-Li Zhang
Yi Zhu
Li Zhang
Mu Li
Xiangyang Xue
VLM
ObjD
40
40
0
24 Mar 2022
UNIMO-2: End-to-End Unified Vision-Language Grounded Learning
Wei Li
Can Gao
Guocheng Niu
Xinyan Xiao
Hao Liu
Jiachen Liu
Hua Wu
Haifeng Wang
MLLM
19
21
0
17 Mar 2022
Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy
Yuanhan Zhang
Qi Sun
Yichun Zhou
Zexin He
Zhen-fei Yin
Kunze Wang
Lu Sheng
Yu Qiao
Jing Shao
Ziwei Liu
ObjD
VLM
32
19
0
15 Mar 2022
Synopses of Movie Narratives: a Video-Language Dataset for Story Understanding
Yidan Sun
Qin Chao
Yangfeng Ji
Boyang Albert Li
VGen
35
10
0
11 Mar 2022
Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition
Peipei Zhu
Tianlin Li
Yong Luo
Zhenglong Sun
Wei-Shi Zheng
Yaowei Wang
Chia-Ju Chen
30
12
0
07 Mar 2022
Attribute Descent: Simulating Object-Centric Datasets on the Content Level and Beyond
Yue Yao
Liang Zheng
Xiaodong Yang
Milind Naphade
Tom Gedeon
26
17
0
28 Feb 2022
Optical flow-based branch segmentation for complex orchard environments
A. You
C. Grimm
J. Davidson
23
9
0
26 Feb 2022
Object-Guided Day-Night Visual Localization in Urban Scenes
Assia Benbihi
Cédric Pradalier
Ondřej Chum
16
4
0
09 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
74
850
0
07 Feb 2022
Keyword localisation in untranscribed speech using visually grounded speech models
Kayode Olaleye
Dan Oneaţă
Herman Kamper
32
7
0
02 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
25
89
0
31 Jan 2022
RelTR: Relation Transformer for Scene Graph Generation
Yuren Cong
M. Yang
Bodo Rosenhahn
ViT
100
134
0
27 Jan 2022
CrossRectify: Leveraging Disagreement for Semi-supervised Object Detection
Cheng Ma
Xingjia Pan
QiXiang Ye
Fan Tang
Weiming Dong
Changsheng Xu
45
14
0
26 Jan 2022
CLIP-Event: Connecting Text and Images with Event Structures
Manling Li
Ruochen Xu
Shuohang Wang
Luowei Zhou
Xudong Lin
Chenguang Zhu
Michael Zeng
Heng Ji
Shih-Fu Chang
VLM
CLIP
27
123
0
13 Jan 2022
Equalized Focal Loss for Dense Long-Tailed Object Detection
Bo-wen Li
Yongqiang Yao
Jingru Tan
Gang Zhang
F. Yu
Jianwei Lu
Ye Luo
39
96
0
07 Jan 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA
Ali Furkan Biten
Ron Litman
Yusheng Xie
Srikar Appalaraju
R. Manmatha
ViT
32
100
0
23 Dec 2021
HODOR: High-level Object Descriptors for Object Re-segmentation in Video Learned from Static Images
A. Athar
Jonathon Luiten
Alexander Hermans
Deva Ramanan
Bastian Leibe
VOS
30
25
0
16 Dec 2021
Holistic Interpretation of Public Scenes Using Computer Vision and Temporal Graphs to Identify Social Distancing Violations
Gihan Chanaka Jayatilaka
Jameel Hassan
Suren Sritharan
J. B. Senanayaka
H. Weligampola
Roshan Godaliyadda
Parakrama Ekanayake
Vijitha Herath
Janaka Ekanayake
S. Dharmaratne
20
6
0
13 Dec 2021
Injecting Semantic Concepts into End-to-End Image Captioning
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lin Liang
Zhe Gan
Lijuan Wang
Yezhou Yang
Zicheng Liu
ViT
VLM
27
86
0
09 Dec 2021
Visual Persuasion in COVID-19 Social Media Content: A Multi-Modal Characterization
Mesut Erhan Unal
Adriana Kovashka
Wen-Ting Chung
Yu-Ru Lin
15
4
0
05 Dec 2021
Optimization of phase-only holograms calculated with scaled diffraction calculation through deep neural networks
Yoshiyuki Ishii
Tomoyoshi Shimobaba
David Blinder
Tobias Birnbaum
P. Schelkens
Takashi Kakue
T. Ito
14
10
0
02 Dec 2021
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Chenfei Wu
Jian Liang
Lei Ji
Fan Yang
Yuejian Fang
Daxin Jiang
Nan Duan
ViT
VGen
18
292
0
24 Nov 2021
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Faisal Ahmed
Zicheng Liu
Yumao Lu
Lijuan Wang
27
111
0
23 Nov 2021
Class-agnostic Object Detection with Multi-modal Transformer
Muhammad Maaz
H. Rasheed
Salman Khan
Fahad Shahbaz Khan
Rao Muhammad Anwer
Ming Yang
23
91
0
22 Nov 2021
Achieving Human Parity on Visual Question Answering
Ming Yan
Haiyang Xu
Chenliang Li
Junfeng Tian
Bin Bi
...
Ji Zhang
Songfang Huang
Fei Huang
Luo Si
Rong Jin
32
12
0
17 Nov 2021