2301.07093
Cited By
GLIGEN: Open-Set Grounded Text-to-Image Generation
17 January 2023
Yuheng Li
Haotian Liu
Qingyang Wu
Fangzhou Mu
Jianwei Yang
Jianfeng Gao
Chunyuan Li
Yong Jae Lee
VLM
Papers citing
"GLIGEN: Open-Set Grounded Text-to-Image Generation"
50 / 472 papers shown
Title
Few-Shot Anomaly-Driven Generation for Anomaly Classification and Segmentation
Guan Gui
Bin-Bin Gao
Xiaozhong Liu
Chengjie Wang
Y. Wu
DiffM
31
0
0
14 May 2025
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Divyansh Srivastava
Xiang Zhang
He Wen
Chenru Wen
Zhuowen Tu
DiffM
34
0
0
07 May 2025
PiCo: Enhancing Text-Image Alignment with Improved Noise Selection and Precise Mask Control in Diffusion Models
Chang Xie
Chenyi Zhuang
Pan Gao
VLM
35
0
0
06 May 2025
MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation
Mingcheng Li
Xiaolu Hou
Ziyang Liu
Dingkang Yang
Ziyun Qian
Jiawei Chen
Jinjie Wei
Y. Jiang
Qingyao Xu
L. Zhang
DiffM
150
0
0
05 May 2025
Improving Editability in Image Generation with Layer-wise Memory
Daneul Kim
Jaeah Lee
Jaesik Park
DiffM
KELM
60
0
0
02 May 2025
YoChameleon: Personalized Vision and Language Generation
Thao Nguyen
Krishna Kumar Singh
Jing Shi
Trung H. Bui
Yong Jae Lee
Yuheng Li
MLLM
82
0
0
29 Apr 2025
EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation
Zhe Dong
Yuzhe Sun
Tianzhu Liu
Wangmeng Zuo
Yanfeng Gu
57
0
0
28 Apr 2025
DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks
Yinqi Li
Hong Chang
Ruibing Hou
Shiguang Shan
Xilin Chen
DiffM
55
0
0
24 Apr 2025
Text-to-Image Alignment in Denoising-Based Models through Step Selection
P. Grimal
Hervé Le Borgne
Olivier Ferret
DiffM
EGVM
48
0
0
24 Apr 2025
Gaussian Splatting is an Effective Data Generator for 3D Object Detection
F. G. Zanjani
Davide Abati
Auke Wiggers
Dimitris Kalatzis
Jens Petersen
Hong Cai
A. Habibian
3DGS
137
0
0
23 Apr 2025
ESPLoRA: Enhanced Spatial Precision with Low-Rank Adaption in Text-to-Image Diffusion Models for High-Definition Synthesis
Andrea Rigo
Luca Stornaiuolo
Mauro Martino
Bruno Lepri
N. Sebe
48
0
0
18 Apr 2025
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Chunyang Zhang
Zhenhong Sun
Zhicheng Zhang
Junyan Wang
Yu Zhang
Dong Gong
H. Mo
Daoyi Dong
45
0
0
14 Apr 2025
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Jiayang Sun
H. Wang
Jie Cao
Huaibo Huang
Ran He
DiffM
73
0
0
10 Apr 2025
PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering
Y. Gao
Zihang Lin
Chuanbin Liu
Min Zhou
T. Ge
Bo Zheng
Hongtao Xie
DiffM
35
0
0
09 Apr 2025
Compass Control: Multi Object Orientation Control for Text-to-Image Generation
Rishubh Parihar
Vaibhav Agrawal
Sachidanand VS
R. V. Babu
DiffM
36
0
0
09 Apr 2025
Storybooth: Training-free Multi-Subject Consistency for Improved Visual Storytelling
Jaskirat Singh
Junshen Kevin Chen
Jonas Kohler
Michael Cohen
DiffM
VGen
43
0
0
08 Apr 2025
PanoDreamer: Consistent Text to 360-Degree Scene Generation
Zhexiao Xiong
Z. Chen
Zhong Li
Yi Tian Xu
Nathan Jacobs
3DGS
VGen
26
0
0
07 Apr 2025
BrainMRDiff: A Diffusion Model for Anatomically Consistent Brain MRI Synthesis
Moinak Bhattacharya
Saumya Gupta
Annie Singh
Cheng Chen
Gagandeep Singh
Prateek Prasanna
MedIm
26
0
0
06 Apr 2025
Dynamic Objective MPC for Motion Planning of Seamless Docking Maneuvers
Oliver Schumann
Michael Buchholz
Klaus C. J. Dietmayer
40
0
0
04 Apr 2025
A$^\text{T}$A: Adaptive Transformation Agent for Text-Guided Subject-Position Variable Background Inpainting
Yizhe Tang
Zhimin Sun
Yuzhen Du
Ran Yi
Guangben Lu
T. Hu
Luying Li
Lizhuang Ma
Fangyuan Zou
DiffM
35
0
0
02 Apr 2025
Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis
Zixuan Wang
Duo Peng
Feng Chen
Y. Yang
Yinjie Lei
DiffM
76
0
0
02 Apr 2025
Diffusion Meets Few-shot Class Incremental Learning
Junsu Kim
Yunhoe Ku
Dongyoon Han
Seungryul Baek
DiffM
CLL
52
0
0
30 Mar 2025
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
Nikai Du
Zhennan Chen
Z. Chen
Shan Gao
Xi Chen
Zhengkai Jiang
Jian Yang
Ying Tai
DiffM
43
0
0
30 Mar 2025
High-Fidelity Diffusion Face Swapping with ID-Constrained Facial Conditioning
Dailan He
Xihuai Wang
Shulun Wang
Guanglu Song
Bingqi Ma
Hao Shao
Y. Liu
Hongsheng Li
DiffM
65
0
0
28 Mar 2025
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
Yunhong Min
Daehyeon Choi
Kyeongmin Yeo
Jihyun Lee
Minhyuk Sung
49
0
0
28 Mar 2025
Efficient Multi-Instance Generation with Janus-Pro-Dirven Prompt Parsing
Fan Qi
Yu Duan
Changsheng Xu
DiffM
55
0
0
27 Mar 2025
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models
Prin Phunyaphibarn
Phillip Y. Lee
Jaihoon Kim
Minhyuk Sung
DiffM
89
0
0
26 Mar 2025
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation
Yuyang Peng
Shishi Xiao
Keming Wu
Qisheng Liao
Bohan Chen
Kevin Lin
Danqing Huang
Ji Li
Yuhui Yuan
DiffM
76
1
0
26 Mar 2025
MMGen: Unified Multi-modal Image Generation and Understanding in One Go
Jiepeng Wang
Zhaoqing Wang
H. Pan
Yuan Liu
Dongdong Yu
Changhu Wang
Wenping Wang
DiffM
80
0
0
26 Mar 2025
LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration
Yuyao Zhang
Jinghao Li
Yu-Wing Tai
DiffM
64
0
0
25 Mar 2025
ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation
Oucheng Huang
Yuhang Ma
Zeng Zhao
Mingrui Wu
Jiayi Ji
Rongsheng Zhang
Z. Hu
Xiaoshuai Sun
Rongrong Ji
43
0
0
22 Mar 2025
UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models
Fanghua Yu
Jinjin Gu
Jinfan Hu
Zheyuan Li
Chao Dong
DiffM
52
0
0
21 Mar 2025
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
Quanhao Li
Zhen Xing
Rui Wang
Hui Zhang
Qi Dai
Zuxuan Wu
VGen
66
0
0
20 Mar 2025
VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness
SeungJu Cha
Kwanyoung Lee
Ye-Chan Kim
Hyunwoo Oh
Dong-Jin Kim
48
0
0
20 Mar 2025
The Power of Context: How Multimodality Improves Image Super-Resolution
Kangfu Mei
Hossein Talebi
Mojtaba Ardakani
Vishal M. Patel
P. Milanfar
M. Delbracio
DiffM
82
1
0
18 Mar 2025
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control
Nvidia
Hassan Abu Alhaija
Jose M. Alvarez
Maciej Bala
Tiffany Cai
...
Yuchong Ye
Xiaodong Yang
X. Yang
Xiaohui Zeng
Yu Zeng
VGen
92
1
0
18 Mar 2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
Dewei Zhou
Mingwei Li
Zongxin Yang
Yi Yang
94
0
0
17 Mar 2025
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
Tsu-jui Fu
Yusu Qian
Chen Chen
Wenze Hu
Zhe Gan
Y. Yang
97
1
0
16 Mar 2025
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Arsh Koneru
Yusuke Kato
Kazuki Kozuka
Aditya Grover
VLM
70
1
0
15 Mar 2025
Piece it Together: Part-Based Concepting with IP-Priors
Elad Richardson
Kfir Goldberg
Yuval Alaluf
Daniel Cohen-Or
DiffM
66
0
0
13 Mar 2025
Investigating and Improving Counter-Stereotypical Action Relation in Text-to-Image Diffusion Models
Sina Malakouti
Adriana Kovashka
EGVM
67
0
0
13 Mar 2025
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Rongyao Fang
Chengqi Duan
Kun Wang
Linjiang Huang
Hao Li
...
Xingyu Zeng
R. Zhao
Jifeng Dai
Xihui Liu
Hongsheng Li
MLLM
ReLM
LRM
109
5
0
13 Mar 2025
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models
Runze He
Bo Cheng
Yuhang Ma
Qingxiang Jia
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Liebucha Wu
Dawei Leng
Yuhui Yin
DiffM
VLM
54
0
0
13 Mar 2025
InteractEdit: Zero-Shot Editing of Human-Object Interactions in Images
Jiun Tian Hoe
Weipeng Hu
Wei Zhou
Chao Xie
Ziwei Wang
Chee Seng Chan
Xudong Jiang
Y. Tan
61
0
0
12 Mar 2025
UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer
Haoxuan Wang
Jinlong Peng
Q. He
Hao Yang
Ying Jin
...
Yanjie Pan
Zhenye Gan
M. Chi
Bo Peng
Yuxiang Wang
DiffM
57
0
0
12 Mar 2025
MGHanD: Multi-modal Guidance for authentic Hand Diffusion
Taehyeon Eum
Jieun Choi
Tae-Kyun Kim
52
0
0
11 Mar 2025
TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision
Shaobin Zhuang
Yiwei Guo
Yanbo Ding
Kunchang Li
Xinyuan Chen
Yaohui Wang
Fangyikang Wang
Ying Zhang
Chen Li
Y. Wang
45
0
0
10 Mar 2025
AnomalyPainter: Vision-Language-Diffusion Synergy for Zero-Shot Realistic and Diverse Industrial Anomaly Synthesis
Zhangyu Lai
Yilin Lu
Xinyang Li
Jianghang Lin
Yansong Qu
Liujuan Cao
Ming Li
Rongrong Ji
DiffM
149
0
0
10 Mar 2025
Consistent Image Layout Editing with Diffusion Models
Tao Xia
Yudi Zhang
Ting Liu
Lei Zhang
DiffM
64
1
0
09 Mar 2025
PixelPonder: Dynamic Patch Adaptation for Enhanced Multi-Conditional Text-to-Image Generation
Yanjie Pan
Q. He
Zhengkai Jiang
P. Xu
Chaoyi Wang
...
Yun Cao
Zhenye Gan
M. Chi
Bo Peng
Yuxiang Wang
DiffM
63
0
0
09 Mar 2025