1811.00982
The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale
2 November 2018
Alina Kuznetsova
H. Rom
N. Alldrin
J. Uijlings
Ivan Krasin
Jordi Pont-Tuset
Shahab Kamali
S. Popov
Matteo Malloci
Alexander Kolesnikov
Tom Duerig
V. Ferrari
ObjD
VLM
Papers citing
"The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale"
50 / 356 papers shown
Title
Object Placement for Anything
Bingjie Gao
Bo Zhang
Li Niu
OCL
95
0
0
16 Apr 2025
Salient Temporal Encoding for Dynamic Scene Graph Generation
Zhihao Zhu
91
0
0
15 Mar 2025
Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments
Luca Barsellotti
Roberto Bigazzi
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
233
1
0
20 Feb 2025
Foundation Model-Based Apple Ripeness and Size Estimation for Selective Harvesting
Keyi Zhu
Jiajia Li
Kaixiang Zhang
Chaaran Arunachalam
Siddhartha Bhattacharya
R. Lu
Zhaojian Li
189
0
0
03 Feb 2025
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
Pan Zhang
Xiaoyi Dong
Yuhang Cao
Yuhang Zang
Rui Qian
...
Xinsong Zhang
Kai Chen
Yu Qiao
Dahua Lin
Jiaqi Wang
KELM
201
16
0
12 Dec 2024
Efficient Progressive Image Compression with Variance-aware Masking
Alberto Presta
Enzo Tartaglione
Attilio Fiandrotti
Marco Grangetto
Pamela Cosman
175
0
0
15 Nov 2024
Label Convergence: Defining an Upper Performance Bound in Object Recognition through Contradictory Annotations
David Tschirschwitz
Volker Rodehorst
106
1
0
14 Sep 2024
Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection
Ting Lei
Shaofeng Yin
Yuxin Peng
Yang Liu
VLM
119
6
0
05 Aug 2024
BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation
Peng Hao
Xiaobing Wang
Yingying Jiang
Hanchao Jia
Xiaoshuai Hao
Shaowei Cui
Junhang Wei
154
3
0
26 Jul 2024
BIV-Priv-Seg: Locating Private Content in Images Taken by People With Visual Impairments
Yu-Yun Tseng
Tanusree Sharma
Lotus Zhang
Abigale Stangl
Leah Findlater
Yang Wang
Danna Gurari
204
0
0
25 Jul 2024
Error Detection and Constraint Recovery in Hierarchical Multi-Label Classification without Prior Knowledge
Joshua Shay Kricheli
Khoa Vo
Aniruddha Datta
Spencer Ozgur
Paulo Shakarian
107
3
0
21 Jul 2024
Cross-Architecture Auxiliary Feature Space Translation for Efficient Few-Shot Personalized Object Detection
F. Barbato
Umberto Michieli
J. Moon
Pietro Zanuttigh
Mete Ozay
108
2
0
01 Jul 2024
Extreme Point Supervised Instance Segmentation
Hyeonjun Lee
S. Hwang
Suha Kwak
68
2
0
31 May 2024
Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models
Katherine Xu
Lingzhi Zhang
Jianbo Shi
156
17
0
23 May 2024
Video Relationship Detection Using Mixture of Experts
A. Shaabana
Zahra Gharaee
Paul Fieguth
74
1
0
06 Mar 2024
Precise Extraction of Deep Learning Models via Side-Channel Attacks on Edge/Endpoint Devices
Younghan Lee
Sohee Jun
Yungi Cho
Woorim Han
Hyungon Moon
Y. Paek
AAML
40
2
0
05 Mar 2024
Enhancing Vision-Language Pre-training with Rich Supervisions
Yuan Gao
Kunyu Shi
Pengkai Zhu
Edouard Belval
Oren Nuriel
Srikar Appalaraju
Shabnam Ghadar
Vijay Mahadevan
Zhuowen Tu
Stefano Soatto
VLM
CLIP
168
12
0
05 Mar 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
248
116
0
08 Feb 2024
SniffyArt: The Dataset of Smelling Persons
Mathias Zinnen
Azhar Hussian
Hang Tran
Prathmesh Madhu
Andreas Maier
Vincent Christlein
70
9
0
20 Nov 2023
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao
Haiping Wu
Weijian Xu
Xiyang Dai
Houdong Hu
Yumao Lu
Michael Zeng
Ce Liu
Lu Yuan
VLM
129
175
0
10 Nov 2023
TextPSG: Panoptic Scene Graph Generation from Textual Descriptions
Chengyang Zhao
Songlin Yang
Zhenfang Chen
Mingyu Ding
Chuang Gan
162
17
0
10 Oct 2023
MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation
Jiahao Xie
Wei Li
Xiangtai Li
Ziwei Liu
Yew-Soon Ong
Chen Change Loy
DiffM
VLM
161
38
0
22 Sep 2023
Collecting Visually-Grounded Dialogue with A Game Of Sorts
Bram Willemsen
Dmytro Kalpakchi
Gabriel Skantze
47
2
0
10 Sep 2023
SCoRD: Subject-Conditional Relation Detection with Text-Augmented Data
Ziyan Yang
Kushal Kafle
Zhe Lin
Scott D. Cohen
Zhihong Ding
Vicente Ordonez
77
1
0
24 Aug 2023
Foreground Object Search by Distilling Composite Image Feature
Bo Zhang
Jiacheng Sui
Li Niu
99
5
0
09 Aug 2023
Label-noise-tolerant medical image classification via self-attention and self-supervised learning
Hongyang Jiang
Mengdi Gao
Yan Hu
Qi Ren
Zhaoheng Xie
Jiang-Dong Liu
NoLa
74
4
0
16 Jun 2023
Learning high-level visual representations from a child's perspective without strong inductive biases
A. Orhan
Brenden M. Lake
SSL
91
19
0
24 May 2023
NeSy4VRD: A Multifaceted Resource for Neurosymbolic AI Research using Knowledge Graphs in Visual Relationship Detection
D. Herron
Ernesto Jiménez-Ruiz
G. Tarroni
Tillman Weyde
55
2
0
22 May 2023
Rethinking Multimodal Content Moderation from an Asymmetric Angle with Mixed-modality
Jialing Yuan
Ye Yu
Gaurav Mittal
Matthew Hall
Sandra Sajeev
Mei Chen
93
11
0
17 May 2023
ElasticHash: Semantic Image Similarity Search by Deep Hashing with Elasticsearch
Nikolaus Korfhage
M. Mühling
Bernd Freisleben
46
3
0
08 May 2023
OpenViVQA: Task, Dataset, and Multimodal Fusion Models for Visual Question Answering in Vietnamese
Nghia Hieu Nguyen
Duong T.D. Vo
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
82
20
0
07 May 2023
Building Multimodal AI Chatbots
Mingyu Lee
61
3
0
21 Apr 2023
Tag2Text: Guiding Vision-Language Model via Image Tagging
Xinyu Huang
Youcai Zhang
Jinyu Ma
Weiwei Tian
Rui Feng
Yuejie Zhang
Yaqian Li
Yandong Guo
Lei Zhang
CLIP
MLLM
VLM
3DV
152
77
0
10 Mar 2023
Modulating Pretrained Diffusion Models for Multimodal Image Synthesis
Cusuh Ham
James Hays
Jingwan Lu
Krishna Kumar Singh
Zhifei Zhang
Tobias Hinz
DiffM
102
24
0
24 Feb 2023
Multistage Spatial Context Models for Learned Image Compression
Fangzheng Lin
Heming Sun
Jinming Liu
J. Katto
90
17
0
18 Feb 2023
Contour-based Interactive Segmentation
Danil Galeev
Polina Popenova
Anna Vorontsova
Anton Konushin
88
5
0
13 Feb 2023
Cut and Learn for Unsupervised Object Detection and Instance Segmentation
Xudong Wang
Rohit Girdhar
Stella X. Yu
Ishan Misra
VLM
131
173
0
26 Jan 2023
Implicit Shape Model Trees: Recognition of 3-D Indoor Scenes and Prediction of Object Poses for Mobile Robots
Pascal Meissner
Rüdiger Dillmann
3DPC
56
0
0
25 Jan 2023
Transfer Learning for Olfactory Object Detection
Mathias Zinnen
Prathmesh Madhu
Peter Bell
Andreas Maier
Vincent Christlein
ObjD
34
4
0
24 Jan 2023
ODOR: The ICPR2022 ODeuropa Challenge on Olfactory Object Recognition
Mathias Zinnen
Prathmesh Madhu
Ronak Kosti
Peter Bell
Andreas Maier
Vincent Christlein
ObjD
80
11
0
24 Jan 2023
Long-tail Detection with Effective Class-Margins
Jang Hyun Cho
Philipp Krahenbuhl
94
19
0
23 Jan 2023
OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation
Tong Wu
Jiarui Zhang
Xiao Fu
Yuxin Wang
Jiawei Ren
...
Lei Yang
Jiaqi Wang
Chao Qian
Dahua Lin
Ziwei Liu
125
223
0
18 Jan 2023
Training Semantic Segmentation on Heterogeneous Datasets
Panagiotis Meletis
Gijs Dubbelman
76
2
0
18 Jan 2023
Towards Models that Can See and Read
Roy Ganz
Oren Nuriel
Aviad Aberdam
Yair Kittenplon
Shai Mazor
Ron Litman
75
13
0
18 Jan 2023
Towards Spatial Equilibrium Object Detection
Zhaohui Zheng
Yuming Chen
Qibin Hou
Xiang Li
Ming-Ming Cheng
ObjD
56
0
0
14 Jan 2023
Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks
Xinsong Zhang
Yan Zeng
Jipeng Zhang
Hang Li
VLM
AI4CE
LRM
125
17
0
12 Jan 2023
Vision Transformers Are Good Mask Auto-Labelers
Shiyi Lan
Xitong Yang
Zhiding Yu
Zuxuan Wu
J. Álvarez
Anima Anandkumar
ISeg
ViT
MedIm
97
19
0
10 Jan 2023
Improving Human-AI Collaboration With Descriptions of AI Behavior
Ángel Alexander Cabrera
Adam Perer
Jason I. Hong
86
40
0
06 Jan 2023
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
Filip Radenovic
Abhimanyu Dubey
Abhishek Kadian
Todor Mihaylov
Simon Vandenhende
Yash J. Patel
Y. Wen
Vignesh Ramanathan
D. Mahajan
VLM
97
86
0
05 Jan 2023
GIVL: Improving Geographical Inclusivity of Vision-Language Models with Pre-Training Methods
Da Yin
Feng Gao
Govind Thattai
Michael F. Johnston
Kai-Wei Chang
VLM
94
15
0
05 Jan 2023