Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2505.20935
Cited By
ISAC: Training-Free Instance-to-Semantic Attention Control for Improving Multi-Instance Generation
27 May 2025
Sanghyun Jo
Wooyeol Lee
Ziseok Lee
Kyungsu Kim
Re-assign community
ArXiv
PDF
HTML
Papers citing
"ISAC: Training-Free Instance-to-Semantic Attention Control for Improving Multi-Instance Generation"
47 / 47 papers shown
Title
DiffEGG: Diffusion-Driven Edge Generation as a Pixel-Annotation-Free Alternative for Instance Annotation
Sanghyun Jo
Ziseok Lee
Wooyeol Lee
Kyungsu Kim
94
1
0
11 Mar 2025
YOLOE: Real-Time Seeing Anything
Ao Wang
Lihao Liu
Hui Chen
Zijia Lin
Jiawei Han
Guiguang Ding
VLM
ObjD
106
4
0
10 Mar 2025
Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation
Amir Mohammad Izadi
Seyed Mohammad Hadi Hosseini
Soroush Vafaie Tabar
Ali Abdollahi
Armin Saghafian
M. Baghshah
EGVM
60
1
0
09 Mar 2025
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
Dongmin Park
Sebin Kim
Taehong Moon
Minkyu Kim
Kangwook Lee
Jaewoong Cho
DiffM
CoGe
87
5
0
08 Jan 2025
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis
Taihang Hu
Linxuan Li
Joost van de Weijer
Hongcheng Gao
Fahad Shahbaz Khan
Jian Yang
Ming-Ming Cheng
Kai Wang
Yaxing Wang
DiffM
79
7
0
11 Nov 2024
GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation
Phillip Y. Lee
Taehoon Yoon
Minhyuk Sung
92
7
1
27 Oct 2024
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Xinchen Zhang
Ling Yang
Ge Li
Yaqi Cai
Jiake Xie
Yong Tang
Yujiu Yang
Mengdi Wang
Bin Cui
EGVM
CoGe
59
9
0
09 Oct 2024
TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation
Gihyun Kwon
Jong Chul Ye
DiffM
75
5
0
08 Oct 2024
A Cat Is A Cat (Not A Dog!): Unraveling Information Mix-ups in Text-to-Image Encoders through Causal Analysis and Embedding Optimization
Chieh-Yun Chen
Chiang Tseng
Li-Wu Tsao
Hong-Han Shuai
45
8
0
01 Oct 2024
Beta Sampling is All You Need: Efficient Image Generation Strategy for Diffusion Models using Stepwise Spectral Analysis
Haeil Lee
Han S. Lee
Seoyeon Gye
Junmo Kim
DiffM
61
3
0
16 Jul 2024
Make It Count: Text-to-Image Generation with an Accurate Number of Objects
Lital Binyamin
Yoad Tewel
Hilit Segev
Eran Hirsch
Royi Rassin
Gal Chechik
53
8
0
14 Jun 2024
Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance
Dazhong Shen
Guanglu Song
Zeyue Xue
Fu-Yun Wang
Yu Liu
DiffM
52
13
0
08 Apr 2024
InitNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization
Xiefan Guo
Jinlin Liu
Miaomiao Cui
Jiankai Li
Hongyu Yang
Di Huang
59
30
0
06 Apr 2024
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
Omer Dahary
Or Patashnik
Kfir Aberman
Daniel Cohen-Or
DiffM
52
30
0
25 Mar 2024
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
Junsong Chen
Chongjian Ge
Enze Xie
Yue Wu
Lewei Yao
Xiaozhe Ren
Zhongdao Wang
Ping Luo
Huchuan Lu
Zhenguo Li
151
106
0
07 Mar 2024
NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging
Takahiro Shirakawa
Seiichi Uchida
DiffM
43
17
0
06 Mar 2024
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
...
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
210
1,244
0
05 Mar 2024
InstanceDiffusion: Instance-level Control for Image Generation
Xudong Wang
Trevor Darrell
Sai Saketh Rambhatla
Rohit Girdhar
Ishan Misra
VLM
DiffM
42
91
0
05 Feb 2024
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Ling Yang
Zhaochen Yu
Chenlin Meng
Minkai Xu
Stefano Ermon
Tengjiao Wang
CoGe
DiffM
68
127
0
22 Jan 2024
CONFORM: Contrast is All You Need For High-Fidelity Text-to-Image Diffusion Models
Tuna Han Salih Meral
Enis Simsar
Federico Tombari
Pinar Yanardag
DiffM
VLM
65
30
0
11 Dec 2023
Adversarial Diffusion Distillation
Axel Sauer
Dominik Lorenz
A. Blattmann
Robin Rombach
160
364
0
28 Nov 2023
R&B: Region and Boundary Aware Zero-shot Grounded Text-to-image Generation
Jiayu Xiao
Henglei Lv
Liang Li
Shuhui Wang
Qingming Huang
DiffM
66
21
0
13 Oct 2023
PixArt-
α
α
α
: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Junsong Chen
Jincheng Yu
Chongjian Ge
Lewei Yao
Enze Xie
...
Zhongdao Wang
James T. Kwok
Ping Luo
Huchuan Lu
Zhenguo Li
DiffM
70
414
0
30 Sep 2023
Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter
Jinglong Wang
Xiawei Li
Jing Zhang
Qingyuan Xu
Qin Zhou
Qian Yu
Lu Sheng
Dong Xu
VLM
DiffM
41
47
0
06 Sep 2023
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Jinheng Xie
Yuexiang Li
Yawen Huang
Haozhe Liu
Wentian Zhang
Yefeng Zheng
Mike Zheng Shou
DiffM
98
200
0
20 Jul 2023
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Dustin Podell
Zion English
Kyle Lacey
A. Blattmann
Tim Dockhorn
Jonas Muller
Joe Penna
Robin Rombach
166
2,242
0
04 Jul 2023
Counting Guidance for High Fidelity Text-to-Image Synthesis
Wonjune Kang
Kevin Galim
H. Koo
Nam Ik Cho
DiffM
74
9
0
30 Jun 2023
Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment
Royi Rassin
Eran Hirsch
Daniel Glickman
Shauli Ravfogel
Yoav Goldberg
Gal Chechik
DiffM
53
106
0
15 Jun 2023
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
Long Lian
Boyi Li
Adam Yala
Trevor Darrell
55
157
0
23 May 2023
Training-Free Layout Control with Cross-Attention Guidance
Minghao Chen
Iro Laina
Andrea Vedaldi
DiffM
147
229
0
06 Apr 2023
Segment Anything
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
...
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
264
7,047
0
05 Apr 2023
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
Shilong Liu
Zhaoyang Zeng
Tianhe Ren
Feng Li
Hao Zhang
...
Chun-yue Li
Jianwei Yang
Hang Su
Jun Zhu
Lei Zhang
ObjD
163
1,893
0
09 Mar 2023
MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation
Omer Bar-Tal
Lior Yariv
Y. Lipman
Tali Dekel
71
377
1
16 Feb 2023
Universal Guidance for Diffusion Models
Arpit Bansal
Hong-Min Chu
Avi Schwarzschild
Soumyadip Sengupta
Micah Goldblum
Jonas Geiping
Tom Goldstein
VLM
66
256
0
14 Feb 2023
Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models
Hila Chefer
Yuval Alaluf
Yael Vinker
Lior Wolf
Daniel Cohen-Or
DiffM
89
503
0
31 Jan 2023
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Weixi Feng
Xuehai He
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
P. Narayana
Sugato Basu
Xinze Wang
William Yang Wang
CoGe
89
309
0
09 Dec 2022
Flow Matching for Generative Modeling
Y. Lipman
Ricky T. Q. Chen
Heli Ben-Hamu
Maximilian Nickel
Matt Le
OOD
114
1,189
0
06 Oct 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
128
1,746
0
02 Aug 2022
Classifier-Free Diffusion Guidance
Jonathan Ho
Tim Salimans
FaML
86
3,830
0
26 Jul 2022
Learning to Count Anything: Reference-less Class-agnostic Counting with Weak Supervision
Michael A. Hobley
V. Prisacariu
92
39
0
20 May 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
268
15,081
0
20 Dec 2021
Noise2Score: Tweedie's Approach to Self-Supervised Image Denoising without Clean Images
Kwanyoung Kim
Jong Chul Ye
DiffM
53
105
0
13 Jun 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
681
28,659
0
26 Feb 2021
Simple multi-dataset detection
Xingyi Zhou
V. Koltun
Philipp Krahenbuhl
ObjD
255
115
0
25 Feb 2021
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
299
17,550
0
19 Jun 2020
Semi-Supervised Classification with Graph Convolutional Networks
Thomas Kipf
Max Welling
GNN
SSL
447
28,901
0
09 Sep 2016
Microsoft COCO: Common Objects in Context
Nayeon Lee
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
272
43,290
0
01 May 2014
1