2205.11487
Cited By
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
23 May 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
Emily L. Denton
Seyed Kamyar Seyed Ghasemipour
Burcu Karagol Ayan
S. S. Mahdavi
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
Papers citing
"Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"
50 / 4,340 papers shown
Image Manipulation via Multi-Hop Instructions -- A New Dataset and Weakly-Supervised Neuro-Symbolic Approach
Harman Singh
Poorva Garg
M. Gupta
Kevin Shah
Ashish Goswami
A. Mondal
Arnab Kumar Mondal
Dinesh Khandelwal
Dinesh Garg
Parag Singla
LM&Ro
21
1
0
23 May 2023
DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot Text-to-Video Generation
Susung Hong
Junyoung Seo
Heeseong Shin
Sung‐Jin Hong
Seung Wook Kim
DiffM
VGen
31
35
0
23 May 2023
Text-guided 3D Human Generation from 2D Collections
Tsu-Jui Fu
Wenhan Xiong
Yixin Nie
Jingyu Liu
Barlas Oğuz
William Yang Wang
47
1
0
23 May 2023
Weakly Supervised 3D Open-vocabulary Segmentation
Kunhao Liu
Fangneng Zhan
Jiahui Zhang
Muyu Xu
Yingchen Yu
Abdulmotaleb El Saddik
Christian Theobalt
Eric P. Xing
Shijian Lu
36
66
0
23 May 2023
Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models
Ruichen Wang
Zekang Chen
Chen Chen
Jiancang Ma
H. Lu
Xiaodong Lin
DiffM
52
67
0
23 May 2023
Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models
Weifeng Chen
Yatai Ji
Jie Wu
Hefeng Wu
Pan Xie
Jiashi Li
Xin Xia
Xuefeng Xiao
Liang Lin
VGen
127
6
0
23 May 2023
WaveDM: Wavelet-Based Diffusion Models for Image Restoration
Yi Huang
Jiancheng Huang
Jianzhuang Liu
Mingfu Yan
Yu Dong
Jiaxi Lv
Chaoqi Chen
Shifeng Chen
36
38
0
23 May 2023
VisorGPT: Learning Visual Prior via Generative Pre-Training
Jinheng Xie
Kai Ye
Yudong Li
Yuexiang Li
Kevin Qinghong Lin
Yefeng Zheng
Linlin Shen
Mike Zheng Shou
ViT
191
8
0
23 May 2023
i-Code Studio: A Configurable and Composable Framework for Integrative AI
Yuwei Fang
Mahmoud Khademi
Chenguang Zhu
Ziyi Yang
Reid Pryzant
...
Yao Qian
Takuya Yoshioka
Lu Yuan
Michael Zeng
Xuedong Huang
38
2
0
23 May 2023
Training Priors Predict Text-To-Image Model Performance
Charles Lovering
Ellie Pavlick
CoGe
52
3
0
23 May 2023
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
Long Lian
Boyi Li
Adam Yala
Trevor Darrell
48
152
0
23 May 2023
Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach
Yufan Zhou
Ruiyi Zhang
Tongfei Sun
Jinhui Xu
DiffM
109
38
0
23 May 2023
LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On
Davide Morelli
Alberto Baldrati
Giuseppe Cartella
Marcella Cornia
Marco Bertini
Rita Cucchiara
DiffM
68
102
0
22 May 2023
Efficient Large-Scale Visual Representation Learning And Evaluation
Eden Dolev
A. Awad
Denisa Roberts
Zahra Ebrahimzadeh
Marcin Mejran
Vaibhav Malpani
Mahir Yavuz
50
0
0
22 May 2023
If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection
Shyamgopal Karthik
Karsten Roth
Massimiliano Mancini
Zeynep Akata
41
21
0
22 May 2023
Training Diffusion Models with Reinforcement Learning
Kevin Black
Michael Janner
Yilun Du
Ilya Kostrikov
Sergey Levine
EGVM
49
322
0
22 May 2023
Target-Aware Generative Augmentations for Single-Shot Adaptation
Kowshik Thopalli
Rakshith Subramanyam
Pavan Turaga
Jayaraman J. Thiagarajan
TTA
50
5
0
22 May 2023
Adversarial Nibbler: A Data-Centric Challenge for Improving the Safety of Text-to-Image Models
Alicia Parrish
Hannah Rose Kirk
Jessica Quaye
Charvi Rastogi
Max Bartolo
...
Addison Howard
William J. Cukierski
D. Sculley
Vijay Janapa Reddi
Lora Aroyo
DiffM
43
12
0
22 May 2023
ControlVideo: Training-free Controllable Text-to-Video Generation
Yabo Zhang
Yuxiang Wei
Dongsheng Jiang
Xiaopeng Zhang
W. Zuo
Qi Tian
VGen
DiffM
48
238
0
22 May 2023
AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation
Guy Yariv
Itai Gat
Lior Wolf
Yossi Adi
Idan Schwartz
DiffM
44
21
0
22 May 2023
Is Synthetic Data From Diffusion Models Ready for Knowledge Distillation?
Zheng Li
Yuxuan Li
Penghai Zhao
Renjie Song
Xiang Li
Jian Yang
43
19
0
22 May 2023
DiffAVA: Personalized Text-to-Audio Generation with Visual Alignment
Shentong Mo
Jing Shi
Yapeng Tian
20
17
0
22 May 2023
FACTIFY3M: A Benchmark for Multimodal Fact Verification with Explainability through 5W Question-Answering
Megha Chakraborty
Khusbu Pahwa
Anku Rani
Shreyas Chatterjee
Dwip Dalal
...
Shreyash Mishra
K. Sensharma
Aman Chadha
Amit P. Sheth
Amitava Das
DiffM
40
8
0
22 May 2023
Interactive Data Synthesis for Systematic Vision Adaptation via LLMs-AIGCs Collaboration
Qifan Yu
Juncheng Li
Wentao Ye
Siliang Tang
Yueting Zhuang
38
13
0
22 May 2023
Guided Motion Diffusion for Controllable Human Motion Synthesis
Korrawe Karunratanakul
Konpat Preechakul
Supasorn Suwajanakorn
Siyu Tang
DiffM
39
124
0
21 May 2023
DreamWaltz: Make a Scene with Complex 3D Animatable Avatars
Yukun Huang
Jianan Wang
Ailing Zeng
He Cao
Xianbiao Qi
Yukai Shi
Zhengjun Zha
Lei Zhang
34
70
0
21 May 2023
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data
Ziyi Yang
Mahmoud Khademi
Yichong Xu
Reid Pryzant
Yuwei Fang
...
Yu Shi
Lu Yuan
Takuya Yoshioka
Michael Zeng
Xuedong Huang
22
2
0
21 May 2023
Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model
Jie Yang
Bing Li
Fengyu Yang
Ailing Zeng
Lei Zhang
Ruimao Zhang
VLM
DiffM
37
17
0
20 May 2023
SneakyPrompt: Jailbreaking Text-to-image Generative Models
Yuchen Yang
Bo Hui
Haolin Yuan
Neil Zhenqiang Gong
Yinzhi Cao
EGVM
49
74
0
20 May 2023
The Waymo Open Sim Agents Challenge
Nico Montali
John Lambert
Paul Mougin
Alex Kuefler
Nick Rhinehart
...
Tristan Emrich
Zoey Yang
Shimon Whiteson
Brandyn White
Drago Anguelov
LLMAG
48
46
0
19 May 2023
Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models
Byungjun Kim
Patrick Kwon
K. Lee
Myunggi Lee
Sookwan Han
Daesik Kim
Hanbyul Joo
DiffM
51
20
0
19 May 2023
AI's Regimes of Representation: A Community-centered Study of Text-to-Image Models in South Asia
Rida Qadri
Renee Shelby
Cynthia L. Bennett
Emily Denton
38
67
0
19 May 2023
MaGIC: Multi-modality Guided Image Completion
Yongsheng Yu
Hao Wang
Tiejian Luo
Hengrui Fan
Libo Zhang
24
12
0
19 May 2023
Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields
Jingbo Zhang
Xiaoyu Li
Bo Liu
Can Wang
Jing Liao
VGen
DiffM
24
85
0
19 May 2023
LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model
Chenjie Cao
Yunuo Cai
Qiaole Dong
Yikai Wang
Yanwei Fu
DiffM
40
15
0
19 May 2023
Efficient Cross-Lingual Transfer for Chinese Stable Diffusion with Images as Pivots
Jinyi Hu
Xu Han
Xiaoyuan Yi
Yutong Chen
Wenhao Li
Zhiyuan Liu
Maosong Sun
DiffM
33
4
0
19 May 2023
Incomplete Multi-view Clustering via Diffusion Completion
Sifan Fang
DiffM
22
4
0
19 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
52
83
0
19 May 2023
Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation
Wanrong Zhu
Xinyi Wang
Yujie Lu
Tsu-Jui Fu
Xinze Wang
Miguel P. Eckstein
William Yang Wang
31
4
0
18 May 2023
SlotDiffusion: Object-Centric Generative Modeling with Diffusion Models
Ziyi Wu
Jingyu Hu
Wuyue Lu
Igor Gilitschenski
Animesh Garg
DiffM
OCL
41
45
0
18 May 2023
Brain-inspired learning in artificial neural networks: a review
Samuel Schmidgall
Jascha Achterberg
Thomas Miconi
Louis Kirsch
Rojin Ziaei
S. P. Hajiseyedrazi
Jason K. Eshraghian
41
52
0
18 May 2023
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
Can Qin
Shu Zhen Zhang
Ning Yu
Yihao Feng
Xinyi Yang
...
Caiming Xiong
Silvio Savarese
Stefano Ermon
Yun Fu
Ran Xu
33
120
0
18 May 2023
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Yujie Lu
Xianjun Yang
Xiujun Li
Xinze Wang
William Yang Wang
EGVM
57
73
0
18 May 2023
Inspecting the Geographical Representativeness of Images from Text-to-Image Models
Aparna Basu
R. Venkatesh Babu
Danish Pruthi
DiffM
44
39
0
18 May 2023
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
Xingang Pan
A. Tewari
Thomas Leimkuhler
Lingjie Liu
Abhimitra Meka
Christian Theobalt
DiffM
65
235
0
18 May 2023
Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
Wenjing Wang
Huan Yang
Zixi Tuo
Huiguo He
Sitong Su
Jianlong Fu
Jiaying Liu
DiffM
VGen
65
114
0
18 May 2023
TextDiffuser: Diffusion Models as Text Painters
Jingye Chen
Yupan Huang
Tengchao Lv
Lei Cui
Qifeng Chen
Furu Wei
66
114
0
18 May 2023
LDM3D: Latent Diffusion Model for 3D
Gabriela Ben-Melech Stan
Diana Wofk
Scottie Fox
Alex Redden
Will Saxton
...
Estelle Aflalo
Shao-Yen Tseng
Fabio Nonato
Matthias Muller
Vasudev Lal
35
45
0
18 May 2023
X-IQE: eXplainable Image Quality Evaluation for Text-to-Image Generation with Visual Large Language Models
Yixiong Chen
Li Liu
C. Ding
34
21
0
18 May 2023
DiffUTE: Universal Text Editing Diffusion Model
Haoxing Chen
Zhuoer Xu
Zhangxuan Gu
Jun Lan
Xing Zheng
Yaohui Li
Changhua Meng
Huijia Zhu
Weiqiang Wang
DiffM
38
34
0
18 May 2023