2205.11487
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
23 May 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
Emily L. Denton
Seyed Kamyar Seyed Ghasemipour
Burcu Karagol Ayan
S. S. Mahdavi
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
Papers citing
"Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"
50 / 1,364 papers shown
Title
Block Flow: Learning Straight Flow on Data Blocks
Zibin Wang
Zhiyuan Ouyang
Xiangyun Zhang
78
0
0
20 Jan 2025
Nested Annealed Training Scheme for Generative Adversarial Networks
Chang Wan
Ming-Hsuan Yang
Minglu Li
Yunliang Jiang
Zhonglong Zheng
GAN
129
0
0
20 Jan 2025
Ditto: Accelerating Diffusion Model via Temporal Value Similarity
Sungbin Kim
Hyunwuk Lee
Wonho Cho
Mincheol Park
Won Woo Ro
151
1
0
20 Jan 2025
DPCL-Diff: The Temporal Knowledge Graph Reasoning Based on Graph Node Diffusion Model with Dual-Domain Periodic Contrastive Learning
Yukun Cao
Lisheng Wang
Luobing Huang
DiffM
110
2
0
20 Jan 2025
Text-guided Synthetic Geometric Augmentation for Zero-shot 3D Understanding
Kohei Torimi
Ryosuke Yamada
Daichi Otsuka
Kensho Hara
Yuki M. Asano
Hirokatsu Kataoka
Y. Aoki
3DV
138
0
0
20 Jan 2025
Can AI-Generated Text be Reliably Detected?
Vinu Sankar Sadasivan
Aounon Kumar
S. Balasubramanian
Wenxiao Wang
Soheil Feizi
DeLMO
291
389
0
20 Jan 2025
Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance
Jin Zhu
Huimin Ma
Jiansheng Chen
Jian Yuan
160
4
0
20 Jan 2025
Diffusion-Based Imitation Learning for Social Pose Generation
Antonio Lech Martin-Ozimek
Isuru Jayarathne
Su Larb Mon
Jouh Yeong Chew
DiffM
62
0
0
18 Jan 2025
SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces
Sumit Chaturvedi
Mengwei Ren
Yannick Hold-Geoffroy
Jingyuan Liu
Julie Dorsey
Zhixin Shu
DiffM
99
0
0
17 Jan 2025
Simplified and Generalized Masked Diffusion for Discrete Data
Jiaxin Shi
Kehang Han
Zehao Wang
Arnaud Doucet
Michalis K. Titsias
DiffM
225
105
0
17 Jan 2025
Joint Learning of Depth and Appearance for Portrait Image Animation
Xinya Ji
Gaspard Zoss
Prashanth Chandran
Lingchen Yang
Xun Cao
B. Solenthaler
D. Bradley
3DH
MDE
133
1
0
15 Jan 2025
Grounding Text-to-Image Diffusion Models for Controlled High-Quality Image Generation
Ahmad Süleyman
Göksel Biricik
87
2
0
15 Jan 2025
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Dongwon Kim
Ju He
Qihang Yu
Chenglin Yang
Xiaohui Shen
Suha Kwak
Liang-Chieh Chen
VLM
137
11
0
13 Jan 2025
IP-FaceDiff: Identity-Preserving Facial Video Editing with Diffusion
Tharun Anand
Aryan Garg
Kaushik Mitra
VGen
DiffM
88
0
0
13 Jan 2025
Multi-subject Open-set Personalization in Video Generation
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Yuwei Fang
Kwot Sin Lee
Ivan Skorokhodov
Kfir Aberman
Jun-Yan Zhu
Ming-Hsuan Yang
Sergey Tulyakov
DiffM
VGen
192
13
0
10 Jan 2025
INFELM: In-depth Fairness Evaluation of Large Text-To-Image Models
Di Jin
Xing Liu
Yu Liu
Jia Qing Yap
Andrea Wong
Adriana Crespo
Qi Lin
Zhiyuan Yin
Qiang Yan
Ryan Ye
EGVM
VLM
500
0
0
10 Jan 2025
Beyond Flat Text: Dual Self-inherited Guidance for Visual Text Generation
Minxing Luo
Zixun Xia
L. Chen
Zhenhang Li
Weichao Zeng
Jinqiao Wang
Wentao Cheng
Yaxing Wang
Yu Zhou
Jian Yang
DiffM
149
1
0
10 Jan 2025
TextToucher: Fine-Grained Text-to-Touch Generation
Jiahang Tu
Hao Fu
Fengyu Yang
Hanbin Zhao
Chao Zhang
Hui Qian
VLM
DiffM
157
12
0
10 Jan 2025
Unity by Diversity: Improved Representation Learning in Multimodal VAEs
Thomas M. Sutter
Yang Meng
Andrea Agostini
Daphné Chopard
Norbert Fortin
Julia E. Vogt
Bahbak Shahbaba
Stephan Mandt
SSL
111
2
0
08 Jan 2025
Clinical Insights: A Comprehensive Review of Language Models in Medicine
Nikita Neveditsin
Pawan Lingras
V. Mago
LM&MA
117
5
0
08 Jan 2025
XGeM: A Multi-Prompt Foundation Model for Multimodal Medical Data Generation
Daniele Molino
Francesco Di Feola
E. Faiella
Deborah Fazzini
D. Santucci
Linlin Shen
V. Guarrasi
Paolo Soda
SyDa
MedIm
127
1
0
08 Jan 2025
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
Dongmin Park
Sebin Kim
Taehong Moon
Minkyu Kim
Kangwook Lee
Jaewoong Cho
DiffM
CoGe
117
5
0
08 Jan 2025
ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling
Chaojie Mao
Junxuan Zhang
Yulin Pan
Zeyinzi Jiang
Zhen Han
Yu Liu
Jingren Zhou
DiffM
135
21
0
05 Jan 2025
Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models
Yuzhu Cai
Sheng Yin
Yuxi Wei
Chenxin Xu
Weibo Mao
Felix Juefei Xu
Siheng Chen
Yanfeng Wang
EGVM
200
3
0
03 Jan 2025
Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model
Omid Saghatchian
Atiyeh Gh. Moghadam
Ahmad Nickabadi
MoMe
147
1
0
03 Jan 2025
GeoDiffuser: Geometry-Based Image Editing with Diffusion Models
Rahul Sajnani
Jeroen Vanbaar
Jie Min
Kapil D. Katyal
Srinath Sridhar
DiffM
169
13
0
03 Jan 2025
SOEDiff: Efficient Distillation for Small Object Editing
Yiming Wu
Qihe Pan
Zhen Zhao
Zicheng Wang
Sifan Long
Ronghua Liang
DiffM
176
0
0
03 Jan 2025
DuMo: Dual Encoder Modulation Network for Precise Concept Erasure
Feng Han
Kai-xiang Chen
Chao Gong
Zhipeng Wei
Jingjing Chen
Yu-Gang Jiang
89
3
0
03 Jan 2025
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo
Jingwen Chen
Yehao Li
Yingwei Pan
Jianlin Feng
Hongyang Chao
Ting Yao
DiffM
VLM
139
0
0
03 Jan 2025
Adapting to Unknown Low-Dimensional Structures in Score-Based Diffusion Models
Gen Li
Yuling Yan
DiffM
117
23
0
03 Jan 2025
RealCustom++: Representing Images as Real-Word for Real-Time Customization
Zhendong Mao
Mengqi Huang
Fei Ding
Mingcong Liu
Qian He
Xiaojun Chang
DiffM
168
6
0
03 Jan 2025
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control
Yuanpeng Tu
Hao Luo
Xi Chen
S. Ji
Xiang Bai
Hengshuang Zhao
DiffM
VGen
160
6
0
02 Jan 2025
TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions
Vriksha Srihari
R. Bhavya
Shruti Jayaraman
V. Mary Anita Rajam
DiffM
VGen
128
0
0
02 Jan 2025
VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis
Zhipeng Chen
Lan Yang
Yonggang Qi
Honggang Zhang
Kaiyue Pang
Ke Li
Yi-Zhe Song
DiffM
198
0
0
31 Dec 2024
AdaDiff: Adaptive Step Selection for Fast Diffusion Models
Hui Zhang
Zuxuan Wu
Zhen Xing
Jie Shao
Yu-Gang Jiang
147
13
0
31 Dec 2024
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving
Jiehui Huang
Xiao Dong
Wenhui Song
Zheng Chong
Zhiqiang Zhang
...
Long Chen
Hanhui Li
Yiqiang Yan
Shengcai Liao
Xiaodan Liang
DiffM
84
23
0
31 Dec 2024
HyperNet Fields: Efficiently Training Hypernetworks without Ground Truth by Learning Weight Trajectories
Eric Hedlin
Munawar Hayat
Fatih Porikli
Kwang Moo Yi
Shweta Mahajan
3DH
167
0
0
22 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
253
10
0
19 Dec 2024
Threshold Neuron: A Brain-inspired Artificial Neuron for Efficient On-device Inference
Zihao Zheng
Yan Liang
Jiayu Chen
Peng Zhou
Xiang Chen
Yunxin Liu
204
0
0
18 Dec 2024
CLIP-SR: Collaborative Linguistic and Image Processing for Super-Resolution
Bingwen Hu
Heng Liu
Zhedong Zheng
Ping Liu
SupR
259
0
0
16 Dec 2024
EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting
Dong In Lee
Hyeongcheol Park
Jiyoung Seo
Eunbyung Park
Hyunje Park
Ha Dam Baek
Shin Sangheon
Sangmin Kim
Sangpil Kim
3DGS
207
3
0
16 Dec 2024
Can video generation replace cinematographers? Research on the cinematic language of generated video
Xuelong Li
Kai Wu
Siyi Yang
YiZhan Qu
Guohua Zhang
...
Mingliang Xiong
Hao Deng
Qingwen Liu
Gang Li
Bin He
VGen
DiffM
173
1
0
16 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
400
3
0
14 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
Hong Chen
Zihan Wang
Xianrui Li
Xingwu Sun
Fangyi Chen
Jiang Liu
Jiadong Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
286
10
0
14 Dec 2024
LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity
Hongjie Wang
Chih-Yao Ma
Yen-Cheng Liu
Ji Hou
Tao Xu
...
Peizhao Zhang
Tingbo Hou
Peter Vajda
N. Jha
Xiaoliang Dai
LMTD
VGen
VLM
DiffM
197
11
0
13 Dec 2024
Mojito: Motion Trajectory and Intensity Control for Video Generation
Xuehai He
Shuohang Wang
Jianwei Yang
Xiaoxia Wu
Yansen Wang
Kuan-Chieh Wang
Z. Zhan
Olatunji Ruwase
Yelong Shen
Xinze Wang
VGen
236
2
0
12 Dec 2024
Olympus: A Universal Task Router for Computer Vision Tasks
Yuanze Lin
Yunsheng Li
Dongdong Chen
Weijian Xu
Ronald Clark
Philip Torr
VLM
ObjD
548
1
0
12 Dec 2024
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
Ximing Xing
Juncheng Hu
Jing Zhang
Dong Xu
Qian Yu
216
4
0
11 Dec 2024
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
Zhen Liu
Tim Z. Xiao
Weiyang Liu
Yoshua Bengio
Dinghuai Zhang
252
6
0
10 Dec 2024
FIRE: Robust Detection of Diffusion-Generated Images via Frequency-Guided Reconstruction Error
Beilin Chu
Xuan Xu
Xin Wang
Yanzhe Zhang
Weike You
Linna Zhou
DiffM
161
4
0
10 Dec 2024