Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.12343
Cited By
v1
v2 (latest)
LD-ZNet: A Latent Diffusion Approach for Text-Based Image Segmentation
22 March 2023
K. Pnvr
Bharat Singh
P. Ghosh
Behjat Siddiquie
David Jacobs
DiffM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"LD-ZNet: A Latent Diffusion Approach for Text-Based Image Segmentation"
37 / 37 papers shown
Title
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo
Yoong Guo
Xuehui Yu
Wenbo Li
Yaoxing Wang
Shan Gao
VLM
165
0
0
16 Mar 2025
GazeHTA: End-to-end Gaze Target Detection with Head-Target Association
Zhi-Yi Lin
Jouh Yeong Chew
Jan van Gemert
Xucong Zhang
136
3
0
16 Apr 2024
SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
Shaoan Xie
Zhifei Zhang
Zhe Lin
Tobias Hinz
Kun Zhang
DiffM
73
244
0
09 Dec 2022
InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
203
1,813
0
17 Nov 2022
DiffEdit: Diffusion-based semantic image editing with mask guidance
Guillaume Couairon
Jakob Verbeek
Holger Schwenk
Matthieu Cord
DiffM
140
507
0
20 Oct 2022
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM
MLLM
CLIP
190
3,482
0
16 Oct 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
198
1,773
0
02 Aug 2022
GLIPv2: Unifying Localization and Vision-Language Understanding
Haotian Zhang
Pengchuan Zhang
Xiaowei Hu
Yen-Chun Chen
Liunian Harold Li
Xiyang Dai
Lijuan Wang
Lu Yuan
Lei Li
Jianfeng Gao
ObjD
VLM
90
300
0
12 Jun 2022
Improved Vector Quantized Diffusion Models
Zhicong Tang
Shuyang Gu
Jianmin Bao
Dong Chen
Fang Wen
DiffM
218
63
0
31 May 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
404
6,866
0
13 Apr 2022
Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors
Oran Gafni
Adam Polyak
Oron Ashual
Shelly Sheynin
Devi Parikh
Yaniv Taigman
DiffM
79
521
0
24 Mar 2022
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
Andreas Lugmayr
Martin Danelljan
Andrés Romero
Feng Yu
Radu Timofte
Luc Van Gool
DiffM
349
1,409
0
24 Jan 2022
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Alex Nichol
Prafulla Dhariwal
Aditya A. Ramesh
Pranav Shyam
Pamela Mishkin
Bob McGrew
Ilya Sutskever
Mark Chen
353
3,605
0
20 Dec 2021
Label-Efficient Semantic Segmentation with Diffusion Models
Dmitry Baranchuk
Ivan Rubachev
A. Voynov
Valentin Khrulkov
Artem Babenko
DiffM
VLM
274
536
0
06 Dec 2021
CRIS: CLIP-Driven Referring Image Segmentation
Zhaoqing Wang
Yu Lu
Qiang Li
Xunqiang Tao
Yan Guo
Ming Gong
Tongliang Liu
VLM
107
370
0
30 Nov 2021
LAFITE: Towards Language-Free Training for Text-to-Image Generation
Yufan Zhou
Ruiyi Zhang
Changyou Chen
Chunyuan Li
Chris Tensmeyer
Tong Yu
Jiuxiang Gu
Jinhui Xu
Tong Sun
VLM
77
168
0
27 Nov 2021
Vector-quantized Image Modeling with Improved VQGAN
Jiahui Yu
Xin Li
Jing Yu Koh
Han Zhang
Ruoming Pang
James Qin
Alexander Ku
Yuanzhong Xu
Jason Baldridge
Yonghui Wu
ViT
VLM
DRL
117
519
0
09 Oct 2021
CogView: Mastering Text-to-Image Generation via Transformers
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
...
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
ViT
VLM
116
781
0
26 May 2021
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal
Alex Nichol
230
7,857
0
11 May 2021
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
Aishwarya Kamath
Mannat Singh
Yann LeCun
Gabriel Synnaeve
Ishan Misra
Nicolas Carion
ObjD
VLM
172
883
0
26 Apr 2021
Semantic Image Matting
Yanan Sun
Chi-Keung Tang
Yu-Wing Tai
71
87
0
16 Apr 2021
DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort
Yuxuan Zhang
Huan Ling
Jun Gao
K. Yin
Jean-Francois Lafleche
Adela Barriuso
Antonio Torralba
Sanja Fidler
3DH
GAN
VLM
74
335
0
13 Apr 2021
Repurposing GANs for One-shot Semantic Part Segmentation
Nontawat Tritrong
Pitchaporn Rewatbowornwong
Supasorn Suwajanakorn
83
109
0
07 Mar 2021
Improved Denoising Diffusion Probabilistic Models
Alex Nichol
Prafulla Dhariwal
DiffM
340
3,702
0
18 Feb 2021
DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis
Ming Tao
Hao Tang
Leilei Gan
Xiaoyuan Jing
Bingkun Bao
Changsheng Xu
98
214
0
13 Aug 2020
DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis
Minfeng Zhu
Pingbo Pan
Wei Chen
Yi Yang
GAN
54
582
0
02 Apr 2019
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
586
10,561
0
12 Dec 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
97
828
0
24 Jan 2018
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
Richard Y. Zhang
Phillip Isola
Alexei A. Efros
Eli Shechtman
Oliver Wang
EGVM
377
11,795
0
11 Jan 2018
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
Tao Xu
Pengchuan Zhang
Qiuyuan Huang
Han Zhang
Zhe Gan
Xiaolei Huang
Xiaodong He
GAN
ViT
108
1,718
0
28 Nov 2017
Language-Based Image Editing with Recurrent Attentive Models
Jianbo Chen
Yelong Shen
Jianfeng Gao
Jingjing Liu
Xiaodong Liu
81
122
0
16 Nov 2017
Neural Discrete Representation Learning
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
226
5,019
0
02 Nov 2017
Recurrent Multimodal Interaction for Referring Image Segmentation
Chenxi Liu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Alan Yuille
EgoV
73
239
0
23 Mar 2017
Deep Image Matting
N. Xu
Brian L. Price
Scott D. Cohen
Thomas Huang
58
451
0
10 Mar 2017
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Ramprasaath R. Selvaraju
Michael Cogswell
Abhishek Das
Ramakrishna Vedantam
Devi Parikh
Dhruv Batra
FAtt
318
20,023
0
07 Oct 2016
Modeling Context Between Objects for Referring Expression Understanding
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
69
151
0
01 Aug 2016
Visualizing and Understanding Convolutional Networks
Matthew D. Zeiler
Rob Fergus
FAtt
SSL
595
15,893
0
12 Nov 2013
1