2301.07093
GLIGEN: Open-Set Grounded Text-to-Image Generation
17 January 2023
Yuheng Li
Haotian Liu
Qingyang Wu
Fangzhou Mu
Jianwei Yang
Jianfeng Gao
Chunyuan Li
Yong Jae Lee
VLM
Papers citing
"GLIGEN: Open-Set Grounded Text-to-Image Generation"
50 / 472 papers shown
Title
How to Bridge the Gap between Modalities: Survey on Multimodal Large Language Model
Shezheng Song
Xiaopeng Li
Shasha Li
Shan Zhao
Jie Yu
Jun Ma
Xiaoguang Mao
Weimin Zhang
71
4
0
10 Nov 2023
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis
Hanrong Ye
Jason Kuen
Qing Liu
Zhe-nan Lin
Brian L. Price
Dan Xu
VLM
18
10
0
06 Nov 2023
Cross-Image Attention for Zero-Shot Appearance Transfer
Yuval Alaluf
Daniel Garibi
Or Patashnik
Hadar Averbuch-Elor
Daniel Cohen-Or
DiffM
35
69
0
06 Nov 2023
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing
Wei-Ge Chen
Irina Spiridonova
Jianwei Yang
Jianfeng Gao
Chunyuan Li
MLLM
VLM
13
33
0
01 Nov 2023
CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models
Ziyang Yuan
Mingdeng Cao
Xintao Wang
Zhongang Qi
Chun Yuan
Ying Shan
DiffM
25
23
0
30 Oct 2023
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
Haoxin Chen
Menghan Xia
Yin-Yin He
Yong Zhang
Xiaodong Cun
...
Yaofang Liu
Qifeng Chen
Xintao Wang
Chao-Liang Weng
Ying Shan
DiffM
26
278
0
30 Oct 2023
Gen2Sim: Scaling up Robot Learning in Simulation with Generative Models
Pushkal Katara
Zhou Xian
Katerina Fragkiadaki
LM&Ro
41
37
0
27 Oct 2023
Integrating View Conditions for Image Synthesis
Jinbin Bai
Zhen Dong
Aosong Feng
Xiao Zhang
Tian-Chun Ye
Kaicheng Zhou
67
13
0
24 Oct 2023
DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning
Abhaysinh Zala
Han Lin
Jaemin Cho
Mohit Bansal
35
12
0
18 Oct 2023
BiomedJourney: Counterfactual Biomedical Image Generation by Instruction-Learning from Multimodal Patient Journeys
Yu Gu
Jianwei Yang
Naoto Usuyama
Chunyuan Li
Sheng Zhang
M. Lungren
Jianfeng Gao
Hoifung Poon
MedIm
30
22
0
16 Oct 2023
TOSS: High-quality Text-guided Novel View Synthesis from a Single Image
Yukai Shi
Jianan Wang
He Cao
Boshi Tang
Xianbiao Qi
Tianyu Yang
Yukun Huang
Shilong Liu
Lei Zhang
H. Shum
DiffM
14
20
0
16 Oct 2023
LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts
Hanan Gani
Shariq Farooq Bhat
Muzammal Naseer
Salman Khan
Peter Wonka
DiffM
44
38
0
16 Oct 2023
Scene Graph Conditioning in Latent Diffusion
Frank Fundel
DiffM
34
0
0
16 Oct 2023
R&B: Region and Boundary Aware Zero-shot Grounded Text-to-image Generation
Jiayu Xiao
Henglei Lv
Liang Li
Shuhui Wang
Qingming Huang
DiffM
30
20
0
13 Oct 2023
OmniControl: Control Any Joint at Any Time for Human Motion Generation
Yiming Xie
Varun Jampani
Lei Zhong
Deqing Sun
Huaizu Jiang
DiffM
29
108
0
12 Oct 2023
Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models
Vishaal Udandarao
Max F. Burg
Samuel Albanie
Matthias Bethge
VLM
31
9
0
12 Oct 2023
Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation
Zhengyuan Yang
Jianfeng Wang
Linjie Li
Kevin Qinghong Lin
Chung-Ching Lin
Zicheng Liu
Lijuan Wang
LRM
MLLM
DiffM
13
22
0
12 Oct 2023
Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models
Zeqiang Lai
Xizhou Zhu
Jifeng Dai
Yu Qiao
Wenhai Wang
MLLM
DiffM
51
22
0
11 Oct 2023
Multi-Concept T2I-Zero: Tweaking Only The Text Embeddings and Nothing Else
Hazarapet Tunanyan
Dejia Xu
Shant Navasardyan
Zhangyang Wang
Humphrey Shi
DiffM
83
7
0
11 Oct 2023
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
Anton Razzhigaev
Arseniy Shakhmatov
Anastasia Maltseva
V.Ya. Arkhipkin
Igor Pavlov
Ilya Ryabov
Angelina Kuts
Alexander Panchenko
Andrey Kuznetsov
Denis Dimitrov
45
78
0
05 Oct 2023
MagicDrive: Street View Generation with Diverse 3D Geometry Control
Ruiyuan Gao
Kai Chen
Enze Xie
Lanqing Hong
Zhenguo Li
Dit-Yan Yeung
Qiang Xu
DiffM
36
103
0
04 Oct 2023
EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods
Samyadeep Basu
Mehrdad Saberi
S. Bhardwaj
Atoosa Malemir Chegini
Daniela Massiceti
Maziar Sanjabi
S. Hu
S. Feizi
53
16
0
03 Oct 2023
Adaptive Visual Scene Understanding: Incremental Scene Graph Generation
Naitik Khandelwal
Xiao Liu
Mengmi Zhang
CLL
31
0
0
02 Oct 2023
Ground-A-Video: Zero-shot Grounded Video Editing using Text-to-image Diffusion Models
Hyeonho Jeong
Jong Chul Ye
DiffM
VGen
35
41
0
02 Oct 2023
Completing Visual Objects via Bridging Generation and Segmentation
Xiang Li
Yinpeng Chen
Chung-Ching Lin
Hao Chen
Kai Hu
Rita Singh
Bhiksha Raj
Lijuan Wang
Zicheng Liu
DiffM
23
4
0
01 Oct 2023
LLM-grounded Video Diffusion Models
Long Lian
Baifeng Shi
Semih Yavuz
Ye Liu
Boyi Li
DiffM
19
54
0
29 Sep 2023
VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning
Han Lin
Abhaysinh Zala
Jaemin Cho
Joey Tianyi Zhou
LM&Ro
VGen
DiffM
43
74
0
26 Sep 2023
Identifying Systematic Errors in Object Detectors with the SCROD Pipeline
Valentyn Boreiko
Matthias Hein
J. H. Metzen
31
6
0
23 Sep 2023
DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving
Xiaofeng Wang
Zheng Hua Zhu
Guan Huang
Xinze Chen
Jiagang Zhu
Jiwen Lu
VGen
22
148
0
18 Sep 2023
DiffusionEngine: Diffusion Model is Scalable Data Engine for Object Detection
Manlin Zhang
Jie Wu
Yuxi Ren
Ming Li
Jie Qin
Xuefeng Xiao
Wei Liu
Rui Wang
Min Zheng
Andy J. Ma
DiffM
31
20
0
07 Sep 2023
Text2Control3D: Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model
Sungwon Hwang
J. Hyung
Jaegul Choo
DiffM
27
4
0
07 Sep 2023
Terrain Diffusion Network: Climatic-Aware Terrain Generation with Geological Sketch Guidance
Zexin Hu
Kun Hu
Clinton Mo
Lei Pan
Zhiyong Wang
DiffM
25
2
0
31 Aug 2023
Elucidating the Exposure Bias in Diffusion Models
Mang Ning
Mingxiao Li
Jianlin Su
A. A. Salah
Itir Onal Ertugrul
DiffM
119
35
0
29 Aug 2023
A Survey of Diffusion Based Image Generation Models: Issues and Their Solutions
Tianyi Zhang
Zheng Wang
Jin Huang
M. M. Tasnim
Wei Shi
VLM
16
21
0
25 Aug 2023
Dense Text-to-Image Generation with Attention Modulation
Yunji Kim
Jiyoung Lee
Jin-Hwa Kim
Jung-Woo Ha
Jun-Yan Zhu
DiffM
41
134
0
24 Aug 2023
Audio Generation with Multiple Conditional Diffusion Model
Zhifang Guo
Jianguo Mao
Ruijie Tao
Long Yan
Kazushige Ouchi
Hong Liu
Xiangdong Wang
DiffM
24
11
0
23 Aug 2023
SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation
Chengyou Jia
Minnan Luo
Zhuohang Dang
Guangwen Dai
Xiaojun Chang
Mengmeng Wang
Jingdong Wang
DiffM
44
13
0
20 Aug 2023
StableVideo: Text-driven Consistency-aware Diffusion Video Editing
Wenhao Chai
Xun Guo
Gaoang Wang
Yang Lu
VGen
DiffM
24
147
0
18 Aug 2023
StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models
Zhizhong Wang
Lei Zhao
Wei Xing
DiffM
27
120
0
15 Aug 2023
PromptPaint: Steering Text-to-Image Generation Through Paint Medium-like Interactions
John Joon Young Chung
Eytan Adar
DiffM
25
56
0
09 Aug 2023
LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation
Leigang Qu
Shengqiong Wu
Hao Fei
Liqiang Nie
Tat-Seng Chua
LM&Ro
DiffM
MLLM
35
88
0
09 Aug 2023
BEVControl: Accurately Controlling Street-view Elements with Multi-perspective Consistency via BEV Sketch Layout
Kairui Yang
Enhui Ma
Jibing Peng
Qing-Wu Guo
Di Lin
Kaicheng Yu
DiffM
28
57
0
03 Aug 2023
ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation
Yasheng Sun
Yifan Yang
Houwen Peng
Yifei Shen
Yuqing Yang
Hang-Rui Hu
Lili Qiu
Hideki Koike
DiffM
LM&Ro
37
33
0
02 Aug 2023
Visual Instruction Inversion: Image Editing via Visual Prompting
Thao Nguyen
Yuheng Li
Utkarsh Ojha
Yong Jae Lee
DiffM
32
22
0
26 Jul 2023
Benchmarking and Analyzing Generative Data for Visual Recognition
Bo-wen Li
Haotian Liu
Liangyu Chen
Yong Jae Lee
C. Li
Ziwei Liu
EGVM
VLM
18
4
0
25 Jul 2023
TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition
Shilin Lu
Yanzhu Liu
A. Kong
43
92
0
24 Jul 2023
Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning
Jiancang Ma
Junhao Liang
Chen Chen
H. Lu
28
138
0
21 Jul 2023
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Jinheng Xie
Yuexiang Li
Yawen Huang
Haozhe Liu
Wentian Zhang
Yefeng Zheng
Mike Zheng Shou
DiffM
36
193
0
20 Jul 2023
Planting a SEED of Vision in Large Language Model
Yuying Ge
Yixiao Ge
Ziyun Zeng
Xintao Wang
Ying Shan
VLM
MLLM
8
90
0
16 Jul 2023
Counting Guidance for High Fidelity Text-to-Image Synthesis
Wonjune Kang
Kevin Galim
H. Koo
Nam Ik Cho
DiffM
32
8
0
30 Jun 2023