Grounded Text-to-Image Synthesis with Attention Refocusing
arXiv: 2306.05427 (versions v1, v2; v2 is latest)

Computer Vision and Pattern Recognition (CVPR), 2023
8 June 2023
Quynh Phung
Songwei Ge
Jia-Bin Huang
    DiffM

Papers citing "Grounded Text-to-Image Synthesis with Attention Refocusing"

50 / 112 papers shown
Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis
AAAI Conference on Artificial Intelligence (AAAI), 2024
Zebin Yao
Fangxiang Feng
Ruifan Li
Xiaojie Wang
DiffM
121
1
0
07 Aug 2024
SceneTeller: Language-to-3D Scene Generation
Basak Melis Öcal
Maxim Tatarchenko
Sezer Karaoglu
Theo Gevers
160
26
0
30 Jul 2024
Faster Image2Video Generation: A Closer Look at CLIP Image Embedding's Impact on Spatio-Temporal Cross-Attentions
IEEE Access, 2024
Ashkan Taghipour
Morteza Ghahremani
Bennamoun
Aref Miri Rekavandi
Zinuo Li
Hamid Laga
F. Boussaïd
VGen
176
5
0
27 Jul 2024
The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation
Yi Yao
Chan-Feng Hsu
Jhe-Hao Lin
Hongxia Xie
Terence Lin
Yi-Ning Huang
Hong-Han Shuai
Wen-Huang Cheng
DiffM
155
5
0
17 Jul 2024
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A Survey
Chenyu Zhang
Mingwang Hu
Wenhui Li
Lanjun Wang
129
40
0
10 Jul 2024
Sketch-Guided Scene Image Generation
Tianyu Zhang
Xiaoxuan Xie
Xusheng Du
H. Xie
DiffM
118
3
0
09 Jul 2024
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
Zhaorun Chen
Yichao Du
Zichen Wen
Yiyang Zhou
Chenhang Cui
...
Jiawei Zhou
Zhuokai Zhao
Rafael Rafailov
Chelsea Finn
Huaxiu Yao
EGVM, MLLM
191
54
0
05 Jul 2024
AlignIT: Enhancing Prompt Alignment in Customization of Text-to-Image Models
Aishwarya Agarwal
Srikrishna Karanam
Balaji Vasan Srinivasan
135
2
0
27 Jun 2024
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models
Bingqi Ma
Zhuofan Zong
Guanglu Song
Hongsheng Li
Yu Liu
186
33
0
17 Jun 2024
Composing Object Relations and Attributes for Image-Text Matching
Khoi Pham
Chuong Huynh
Ser-Nam Lim
Abhinav Shrivastava
CoGe
157
18
0
17 Jun 2024
Understanding Multi-Granularity for Open-Vocabulary Part Segmentation
Jiho Choi
Seonho Lee
Seungho Lee
Minhyun Lee
Hyunjung Shim
OCL
132
2
0
17 Jun 2024
DiffusionPID: Interpreting Diffusion via Partial Information Decomposition
Neural Information Processing Systems (NeurIPS), 2024
Shaurya Dewan
Rushikesh Zawar
Prakanshul Saxena
Yingshan Chang
Andrew F. Luo
Yonatan Bisk
DiffM
241
6
0
07 Jun 2024
AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation
Neural Information Processing Systems (NeurIPS), 2024
Lianyu Pang
Jian Yin
Baoquan Zhao
Feize Wu
Fu Lee Wang
Qing Li
Xudong Mao
DiffM
207
5
0
07 Jun 2024
Coherent Zero-Shot Visual Instruction Generation
Quynh Phung
Songwei Ge
Jia-Bin Huang
211
2
0
06 Jun 2024
The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise
Yuanhao Ban
Ruochen Wang
Tianyi Zhou
Boqing Gong
Cho-Jui Hsieh
Minhao Cheng
DiffM
176
9
0
04 Jun 2024
AttenCraft: Attention-guided Disentanglement of Multiple Concepts for Text-to-Image Customization
Junjie Shentu
Matthew Watson
Noura Al Moubayed
DiffM
233
1
0
28 May 2024
Bridging the Intent Gap: Knowledge-Enhanced Visual Generation
Yi Cheng
Ziwei Xu
Dongyun Lin
Harry Cheng
Yongkang Wong
Ying Sun
Joo Hwee Lim
Mohan Kankanhalli
152
1
0
21 May 2024
Compositional Text-to-Image Generation with Dense Blob Representations
International Conference on Machine Learning (ICML), 2024
Weili Nie
Sifei Liu
Morteza Mardani
Chao Liu
Benjamin Eckart
Arash Vahdat
DiffM
233
31
0
14 May 2024
Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable
Haozhe Liu
Wentian Zhang
Bing Li
Bernard Ghanem
Jürgen Schmidhuber
DiffM, WIGM, AAML
180
1
0
01 May 2024
MaGGIe: Masked Guided Gradual Human Instance Matting
Chuong Huynh
Seoung Wug Oh
Abhinav Shrivastava
Joon-Young Lee
VOS
146
12
0
24 Apr 2024
Towards Better Text-to-Image Generation Alignment via Attention Modulation
Yihang Wu
Xiao Cao
Kaixin Li
Zitan Chen
Haonan Wang
Lei Meng
Zhiyong Huang
DiffM
150
8
0
22 Apr 2024
MultiBooth: Towards Generating All Your Concepts in an Image from Text
Chenyang Zhu
Kai Li
Yue Ma
Chunming He
Li Xiu
DiffM
375
43
0
22 Apr 2024
SmartControl: Enhancing ControlNet for Handling Rough Visual Conditions
European Conference on Computer Vision (ECCV), 2024
Xiaoyu Liu
Yuxiang Wei
Ming-Yu Liu
Xianhui Lin
Peiran Ren
Xuansong Xie
Wangmeng Zuo
DiffM
144
14
0
09 Apr 2024
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
Neural Information Processing Systems (NeurIPS), 2024
Dongzhi Jiang
Guanglu Song
Xiaoshi Wu
Renrui Zhang
Dazhong Shen
Zhuofan Zong
Yu Liu
Hongsheng Li
VLM
241
45
0
04 Apr 2024
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
Omer Dahary
Or Patashnik
Kfir Aberman
Daniel Cohen-Or
DiffM
168
44
0
25 Mar 2024
EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing
Xiangpeng Yang
Linchao Zhu
Hehe Fan
Yi Yang
DiffM, VGen
162
13
0
24 Mar 2024
Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization
Computer Vision and Pattern Recognition (CVPR), 2024
Jimyeong Kim
Jungwon Park
Wonjong Rhee
DiffM
160
7
0
22 Mar 2024
ReGround: Improving Textual and Spatial Grounding at No Cost
Yuseung Lee
Minhyuk Sung
DiffM
242
3
0
20 Mar 2024
LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models
Hao Frank Yang
Wen Wang
Liang Peng
Chaotian Song
Yao Chen
...
Xiaolong Yang
Qinglin Lu
Deng Cai
Boxi Wu
Wei Liu
MoMe
189
44
0
18 Mar 2024
SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data
Neural Information Processing Systems (NeurIPS), 2024
Jialu Li
Jaemin Cho
Yi-Lin Sung
Jaehong Yoon
Mohit Bansal
MoMe, DiffM
149
14
0
11 Mar 2024
DivCon: Divide and Conquer for Complex Numerical and Spatial Reasoning in Text-to-Image Generation
Yuhao Jia
Wenhan Tan
DiffM
203
1
0
11 Mar 2024
MACE: Mass Concept Erasure in Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2024
Shilin Lu
Zilan Wang
Leyang Li
Yanzhu Liu
A. Kong
DiffM
184
173
0
10 Mar 2024
Controllable Generation with Text-to-Image Diffusion Models: A Survey
Pu Cao
Feng Zhou
Qing-Huang Song
Lu Yang
198
63
0
07 Mar 2024
NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging
Takahiro Shirakawa
Seiichi Uchida
DiffM
129
29
0
06 Mar 2024
PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis
Zheng Lv
Yuxiang Wei
Wangmeng Zuo
Kwan-Yee K. Wong
143
21
0
04 Mar 2024
Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion
Xuantong Liu
Tianyang Hu
Wei Cao
Kenji Kawaguchi
Xingtai Lv
DiffM
147
3
0
26 Feb 2024
Layout-to-Image Generation with Localized Descriptions using ControlNet with Cross-Attention Control
Denis Lukovnikov
Asja Fischer
DiffM
133
5
0
20 Feb 2024
RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
Xinchen Zhang
Ling Yang
Yaqi Cai
Zhaochen Yu
Kai-Ni Wang
...
Ye Tian
Minkai Xu
Yong Tang
Yujiu Yang
Tengjiao Wang
DiffM
169
10
0
20 Feb 2024
Textual Localization: Decomposing Multi-concept Images for Subject-Driven Text-to-Image Generation
Junjie Shentu
Matthew Watson
Noura Al Moubayed
130
1
0
15 Feb 2024
PALP: Prompt Aligned Personalization of Text-to-Image Models
ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia), 2024
Moab Arar
Andrey Voynov
Amir Hertz
Omri Avrahami
Shlomi Fruchter
Yael Pritch
Daniel Cohen-Or
Ariel Shamir
DiffM
169
29
0
11 Jan 2024
MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance
Qi Mao
Lan Chen
Yuchao Gu
Zhen Fang
Mike Zheng Shou
DiffM
154
15
0
18 Dec 2023
PEEKABOO: Interactive Video Generation via Masked-Diffusion
Computer Vision and Pattern Recognition (CVPR), 2023
Yash Jain
Anshul Nasery
Vibhav Vineet
Harkirat Singh Behl
VGen
161
55
0
12 Dec 2023
ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
Maitreya Patel
Changhoon Kim
Sheng Cheng
Chitta Baral
Yezhou Yang
VLM
100
21
0
07 Dec 2023
MotionZero: Exploiting Motion Priors for Zero-shot Text-to-Video Generation
Jingkuan Song
Litao Guo
Lianli Gao
Hengtao Shen
Jingkuan Song
VGen
96
6
0
28 Nov 2023
Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Biao Gong
Siteng Huang
Yutong Feng
Shiwei Zhang
Yuyuan Li
Yu Liu
DiffM
188
19
0
27 Nov 2023
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
Jiaxi Lv
Yi Huang
Mingfu Yan
Jiancheng Huang
Jianzhuang Liu
Yifan Liu
Yafei Wen
Xiaoxin Chen
Shifeng Chen
VGen, DiffM
229
41
0
21 Nov 2023
LoCo: Locally Constrained Training-Free Layout-to-Image Synthesis
Peiang Zhao
Han Li
Ruiyang Jin
S. Kevin Zhou
DiffM
311
18
0
21 Nov 2023
AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort
Wen Wang
Canyu Zhao
Hao Chen
Zhekai Chen
Kecheng Zheng
Chunhua Shen
DiffM
167
37
0
19 Nov 2023
Semantic Generative Augmentations for Few-Shot Counting
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Perla Doubinsky
Nicolas Audebert
M. Crucianu
Hervé Le Borgne
VLM, DiffM
136
7
0
26 Oct 2023
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
Eyal Segalis
Dani Valevski
Danny Lumen
Yossi Matias
Yaniv Leviathan
DiffM
162
32
0
25 Oct 2023