2212.03863
Cited By
X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion
7 December 2022
Hanqing Zhao
Dianmo Sheng
Jianmin Bao
Dongdong Chen
Dong Chen
Fang Wen
Lu Yuan
Ce Liu
Wenbo Zhou
Qi Chu
Weiming Zhang
Neng H. Yu
VLM
DiffM
Papers citing
"X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion"
29 / 29 papers shown
Title
SynRES: Towards Referring Expression Segmentation in the Wild via Synthetic Data
Dong-Hee Kim
Hyunjee Song
Donghyun Kim
56
0
0
23 May 2025
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Zilin Du
Haoxin Li
Jianfei Yu
Boyang Li
323
0
0
01 Dec 2024
AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation
Datao Tang
Xiangyong Cao
Xuan Wu
Jialin Li
Jing Yao
Xueru Bai
Yin Li
Deyu Meng
DiffM
97
7
0
23 Nov 2024
Enrich the content of the image Using Context-Aware Copy Paste
Qiushi Guo
VLM
67
0
0
11 Jul 2024
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
Yicheng Chen
Xiangtai Li
Yining Li
Yanhong Zeng
Jianzong Wu
Xiangyu Zhao
Kai Chen
VLM
DiffM
70
3
0
28 Jun 2024
OmniVL: One Foundation Model for Image-Language and Video-Language Tasks
Junke Wang
Dongdong Chen
Zuxuan Wu
Chong Luo
Luowei Zhou
Yucheng Zhao
Yujia Xie
Ce Liu
Yu-Gang Jiang
Lu Yuan
MLLM
VLM
56
150
0
15 Sep 2022
Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis
Wanshu Fan
Yen-Chun Chen
Dongdong Chen
Yu Cheng
Lu Yuan
Yu-Chiang Frank Wang
DiffM
43
92
0
29 Aug 2022
Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection
H. Rasheed
Muhammad Maaz
Muhammad Uzair Khattak
Salman Khan
Fahad Shahbaz Khan
ObjD
VLM
76
153
0
07 Jul 2022
Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization
Peixian Chen
Kekai Sheng
Mengdan Zhang
Mingbao Lin
Yunhang Shen
Shaohui Lin
Bo Ren
Ke Li
VLM
ObjD
58
27
0
22 Jun 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
148
1,089
0
22 Jun 2022
SelfReformer: Self-Refined Network with Transformer for Salient Object Detection
Y. Yun
Weisi Lin
ViT
69
29
0
23 May 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
252
6,768
0
13 Apr 2022
MatteFormer: Transformer-Based Image Matting via Prior-Tokens
Gyutae Park
S. Son
Jaeyoung Yoo
Seho Kim
Nojun Kwak
ViT
40
65
0
29 Mar 2022
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Alex Nichol
Prafulla Dhariwal
Aditya A. Ramesh
Pranav Shyam
Pamela Mishkin
Bob McGrew
Ilya Sutskever
Mark Chen
181
3,531
0
20 Dec 2021
RegionCLIP: Region-based Language-Image Pretraining
Yiwu Zhong
Jianwei Yang
Pengchuan Zhang
Chunyuan Li
Noel Codella
...
Luowei Zhou
Xiyang Dai
Lu Yuan
Yin Li
Jianfeng Gao
VLM
CLIP
81
568
0
16 Dec 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng Zhang
Li Dong
Furu Wei
B. Guo
ViT
164
1,783
0
18 Nov 2021
Dynamic Head: Unifying Object Detection Heads with Attentions
Xiyang Dai
Yinpeng Chen
Bin Xiao
Dongdong Chen
Mengchen Liu
Lu Yuan
Lei Zhang
31
566
0
15 Jun 2021
Probabilistic two-stage detection
Xingyi Zhou
V. Koltun
Philipp Krahenbuhl
ObjD
54
224
0
12 Mar 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
284
4,873
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
372
3,778
0
11 Feb 2021
Open-Vocabulary Object Detection Using Captions
Alireza Zareian
Kevin Dela Rosa
Derek Hao Hu
Shih-Fu Chang
VLM
ObjD
87
423
0
20 Nov 2020
1st Place Solution of LVIS Challenge 2020: A Good Box is not a Guarantee of a Good Mask
Jingru Tan
Gang Zhang
Hanming Deng
Changbao Wang
Lewei Lu
Quanquan Li
Jifeng Dai
40
18
0
03 Sep 2020
A Survey on Instance Segmentation: State of the art
A. M. Hafiz
G. M. Bhat
SSeg
ISeg
30
428
0
28 Jun 2020
U²-Net: Going Deeper with Nested U-Structure for Salient Object Detection
Xuebin Qin
Zichen Zhang
Chenyang Huang
Masood Dehghan
Osmar R. Zaiane
Martin Jägersand
35
1,639
0
18 May 2020
LVIS: A Dataset for Large Vocabulary Instance Segmentation
Agrim Gupta
Piotr Dollár
Ross B. Girshick
ISeg
VLM
66
1,352
0
08 Aug 2019
Modeling Visual Context is Key to Augmenting Object Detection Datasets
Nikita Dvornik
Julien Mairal
Cordelia Schmid
58
243
0
19 Jul 2018
On Pre-Trained Image Features and Synthetic Images for Deep Learning
Stefan Hinterstoißer
Vincent Lepetit
Paul Wohlhart
K. Konolige
VLM
ObjD
32
229
0
29 Oct 2017
Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection
Debidatta Dwibedi
Ishan Misra
M. Hebert
77
619
0
04 Aug 2017
Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views
Hao Su
C. Qi
Yangyan Li
Leonidas Guibas
70
737
0
21 May 2015