2306.05427
Grounded Text-to-Image Synthesis with Attention Refocusing
Computer Vision and Pattern Recognition (CVPR), 2023
8 June 2023
Quynh Phung
Songwei Ge
Jia-Bin Huang
DiffM
Papers citing
"Grounded Text-to-Image Synthesis with Attention Refocusing"
50 / 112 papers shown
Title
OmniText: A Training-Free Generalist for Controllable Text-Image Manipulation
Agus Gunawan
Samuel Teodoro
Yun Chen
Soo Ye Kim
Jihyong Oh
Munchurl Kim
DiffM
83
0
0
28 Oct 2025
ConceptSplit: Decoupled Multi-Concept Personalization of Diffusion Models via Token-wise Adaptation and Attention Disentanglement
Habin Lim
Yeongseob Won
Juwon Seo
Gyeong-Moon Park
96
0
0
06 Oct 2025
CO3: Contrasting Concepts Compose Better
Debottam Dutta
Jianchong Chen
Rajalaxmi Rajagopalan
Yu-Lin Wei
Romit Roy Choudhury
DiffM
31
0
0
30 Sep 2025
OverLayBench: A Benchmark for Layout-to-Image Generation with Dense Overlaps
Bingnan Li
Chen Wang
Haiyang Xu
Xiang Zhang
Ethan Armand
Divyansh Srivastava
Xiaojun Shan
Zeyuan Chen
Jianwen Xie
Zhuowen Tu
VLM
62
0
0
23 Sep 2025
InstanceAssemble: Layout-Aware Image Generation via Instance Assembling Attention
Qiang Xiang
Shuang Sun
Binglei Li
Dejia Song
Huaxia Li
Nemo Chen
Xu Tang
Yao Hu
Junping Zhang
DiffM
90
0
0
20 Sep 2025
Human Preference-Aligned Concept Customization Benchmark via Decomposed Evaluation
Reina Ishikawa
Ryo Fujii
Hideo Saito
Ryo Hachiuma
52
0
0
03 Sep 2025
Color Bind: Exploring Color Perception in Text-to-Image Models
Shay Shomer Chai
Wenxuan Peng
Bharath Hariharan
Hadar Averbuch-Elor
DiffM
37
1
0
27 Aug 2025
Scaling Group Inference for Diverse and High-Quality Generation
Gaurav Parmar
Or Patashnik
Daniil Ostashev
Kuan-Chieh Wang
Kfir Aberman
Srinivasa Narasimhan
Jun-Yan Zhu
108
0
0
21 Aug 2025
7Bench: a Comprehensive Benchmark for Layout-guided Text-to-image Models
Elena Izzo
Luca Parolari
Davide Vezzaro
Lamberto Ballan
28
0
0
18 Aug 2025
LayerT2V: Interactive Multi-Object Trajectory Layering for Video Generation
Kangrui Cen
Baixuan Zhao
Yi Xin
Siqi Luo
Guoquan Zheng
Xiaohong Liu
DiffM
VGen
52
0
0
06 Aug 2025
ROVI: A VLM-LLM Re-Captioned Dataset for Open-Vocabulary Instance-Grounded Text-to-Image Generation
Cihang Peng
Qiming Hou
Zhong Ren
Kun Zhou
ObjD
70
0
0
01 Aug 2025
LACONIC: A 3D Layout Adapter for Controllable Image Creation
Léopold Maillard
Tom Durand
Adrien Ramanana Rahary
Maks Ovsjanikov
DiffM
103
0
0
04 Jul 2025
Control and Realism: Best of Both Worlds in Layout-to-Image without Training
Bonan Li
Yinhan Hu
Songhua Liu
Xinchao Wang
DiffM
132
2
0
18 Jun 2025
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
Zhengyao Lv
Tianlin Pan
Chenyang Si
Zhaoxi Chen
W. Zuo
Yu Qiao
Kwan-Yee K. Wong
173
3
0
09 Jun 2025
Controllable Coupled Image Generation via Diffusion Models
Chenfei Yuan
Nanshan Jia
Hangqi Li
Peter W. Glynn
Zeyu Zheng
DiffM
125
0
0
07 Jun 2025
When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration
Quan Shi
Carlos E. Jimenez
Shunyu Yao
Nick Haber
Diyi Yang
Karthik Narasimhan
214
1
0
05 Jun 2025
Psi-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models
Taehoon Yoon
Yunhong Min
Kyeongmin Yeo
Minhyuk Sung
221
1
0
02 Jun 2025
ComposeAnything: Composite Object Priors for Text-to-Image Generation
Zeeshan Khan
Shizhe Chen
Cordelia Schmid
DiffM
CoGe
177
0
0
30 May 2025
Interactive Video Generation via Domain Adaptation
Ishaan Rawal
Suryansh Kumar
DiffM
VGen
96
0
0
30 May 2025
EF-VI: Enhancing End-Frame Injection for Video Inbetweening
Liuhan Chen
Xiaodong Cun
Xiaoyu Li
Xianyi He
Shenghai Yuan
Jie Chen
Mingyu Ding
Lichao Sun
VGen
171
0
0
27 May 2025
CreatiDesign: A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design
H. Zhang
Dexiang Hong
Maoke Yang
Yutao Chen
Zhao Zhang
Jie Shao
Xinglong Wu
Zuxuan Wu
Yu Jiang
DiffM
AI4CE
327
7
0
25 May 2025
HCMA: Hierarchical Cross-model Alignment for Grounded Text-to-Image Generation
Hang Wang
Zhi-Qi Cheng
Chenhao Lin
Chao Shen
Lei Zhang
DiffM
290
1
0
10 May 2025
ESPLoRA: Enhanced Spatial Precision with Low-Rank Adaption in Text-to-Image Diffusion Models for High-Definition Synthesis
Andrea Rigo
Luca Stornaiuolo
Mauro Martino
Bruno Lepri
Andrii Zadaianchuk
151
0
0
18 Apr 2025
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Bingjie Gao
Xinyu Gao
Xiaoxue Wu
Yujie Zhou
Yu Qiao
Li Niu
Xinyuan Chen
Yaohui Wang
317
5
0
16 Apr 2025
Hierarchical and Step-Layer-Wise Tuning of Attention Specialty for Multi-Instance Synthesis in Diffusion Transformers
Chunyang Zhang
Zhenhong Sun
Zhicheng Zhang
Junyan Wang
Yu Zhang
Dong Gong
H. Mo
Daoyi Dong
258
1
0
14 Apr 2025
Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2025
Zixuan Wang
Duo Peng
Feng Chen
Yue Yang
Yinjie Lei
DiffM
239
2
0
02 Apr 2025
Geometrical Properties of Text Token Embeddings for Strong Semantic Binding in Text-to-Image Generation
H. Seo
Junseo Bang
Haechang Lee
Joohoon Lee
Byung Hyun Lee
Se Young Chun
225
0
0
29 Mar 2025
Efficient Multi-Instance Generation with Janus-Pro-Dirven Prompt Parsing
Fan Qi
Yu Duan
Changsheng Xu
DiffM
171
0
0
27 Mar 2025
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Yuyang Peng
Shishi Xiao
Keming Wu
Qisheng Liao
Bohan Chen
Kevin Lin
Danqing Huang
Ji Li
Yuhui Yuan
DiffM
238
6
0
26 Mar 2025
ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation
Oucheng Huang
Yuhang Ma
Zeng Zhao
Mingrui Wu
Jiayi Ji
Rongsheng Zhang
Zhibo Hu
Xiaoshuai Sun
Rongrong Ji
177
1
0
22 Mar 2025
MOSAIC: Generating Consistent, Privacy-Preserving Scenes from Multiple Depth Views in Multi-Room Environments
Zhixuan Liu
H. Zhu
R. Chen
Jonathan M Francis
Soonmin Hwang
Jiangning Zhang
Jean Oh
VGen
810
2
0
18 Mar 2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
Dewei Zhou
Mingwei Li
Zongxin Yang
Yi Yang
299
12
0
17 Mar 2025
Piece it Together: Part-Based Concepting with IP-Priors
Elad Richardson
Kfir Goldberg
Yuval Alaluf
Daniel Cohen-Or
DiffM
182
3
0
13 Mar 2025
InteractEdit: Zero-Shot Editing of Human-Object Interactions in Images
Jiun Tian Hoe
Weipeng Hu
Wei Zhou
Chao Xie
Ziwei Wang
Chee Seng Chan
Xudong Jiang
Y. Tan
182
0
0
12 Mar 2025
ToLo: A Two-Stage, Training-Free Layout-To-Image Generation Framework For High-Overlap Layouts
Linhao Huang
Jing Yu
DiffM
113
1
0
03 Mar 2025
VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing
International Conference on Learning Representations (ICLR), 2025
Xiangpeng Yang
Linchao Zhu
Hehe Fan
Yi Yang
DiffM
VGen
227
22
0
24 Feb 2025
Precise Parameter Localization for Textual Generation in Diffusion Models
International Conference on Learning Representations (ICLR), 2025
Łukasz Staniszewski
Bartosz Cywiński
Franziska Boenisch
Kamil Deja
Adam Dziedzic
DiffM
735
3
0
17 Feb 2025
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Computer Vision and Pattern Recognition (CVPR), 2024
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
296
5
0
28 Nov 2024
Leapfrog Latent Consistency Model (LLCM) for Medical Images Generation
IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), 2024
Lakshmikar R. Polamreddy
Kalyan Roy
Sheng-Han Yueh
Deepshikha Mahato
Shilpa Kuppili
Jialu Li
Youshan Zhang
MedIm
179
3
0
22 Nov 2024
Training-Free Layout-to-Image Generation with Marginal Attention Constraints
Huancheng Chen
Jingtao Li
Weiming Zhuang
H. Vikalo
Lingjuan Lyu
DiffM
244
2
0
15 Nov 2024
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis
Neural Information Processing Systems (NeurIPS), 2024
Taihang Hu
Linxuan Li
Joost van de Weijer
Hongcheng Gao
Fahad Shahbaz Khan
Zhiqiang Wang
Ming-Ming Cheng
Kai Wang
Yaxing Wang
DiffM
222
20
0
11 Nov 2024
Improving image synthesis with diffusion-negative sampling
European Conference on Computer Vision (ECCV), 2024
Alakh Desai
Nuno Vasconcelos
DiffM
111
6
0
08 Nov 2024
Towards Small Object Editing: A Benchmark Dataset and A Training-Free Approach
ACM Multimedia (MM), 2024
Qihe Pan
Zhen Zhao
Zicheng Wang
Sifan Long
Yiming Wu
Wei Ji
Haoran Liang
Ronghua Liang
78
2
0
03 Nov 2024
Adapting Diffusion Models for Improved Prompt Compliance and Controllable Image Synthesis
Neural Information Processing Systems (NeurIPS), 2024
Deepak Sridhar
Abhishek Peri
Rohith Rachala
Nuno Vasconcelos
DiffM
155
2
0
29 Oct 2024
GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation
Neural Information Processing Systems (NeurIPS), 2024
Phillip Y. Lee
Taehoon Yoon
Minhyuk Sung
217
16
1
27 Oct 2024
TopoDiffusionNet: A Topology-aware Diffusion Model
International Conference on Learning Representations (ICLR), 2024
Saumya Gupta
Dimitris Samaras
Chong Chen
DiffM
257
6
0
22 Oct 2024
Generating Intermediate Representations for Compositional Text-To-Image Generation
Ran Galun
Sagie Benaim
130
0
0
13 Oct 2024
A Cat Is A Cat (Not A Dog!): Unraveling Information Mix-ups in Text-to-Image Encoders through Causal Analysis and Embedding Optimization
Neural Information Processing Systems (NeurIPS), 2024
Chieh-Yun Chen
Chiang Tseng
Li-Wu Tsao
Hong-Han Shuai
312
14
0
01 Oct 2024
SpaceBlender: Creating Context-Rich Collaborative Spaces Through Generative 3D Scene Blending
ACM Symposium on User Interface Software and Technology (UIST), 2024
Nels Numan
Shwetha Rajaram
Balasaravanan Thoravi Kumaravel
Nicolai Marquardt
A. D. Wilson
125
11
0
20 Sep 2024
DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer
International Conference on 3D Vision (3DV), 2024
Runjia Li
Junlin Han
Luke Melas-Kyriazi
Chunyi Sun
Zhaochong An
Zhongrui Gui
Shuyang Sun
Philip Torr
Tomas Jakab
147
5
0
12 Sep 2024