Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

23 May 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
Emily L. Denton
Seyed Kamyar Seyed Ghasemipour
Burcu Karagol Ayan
S. S. Mahdavi
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
    VLM
arXiv: 2205.11487

Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"

50 / 4,340 papers shown
OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding
Minghua Liu
Ruoxi Shi
Kaiming Kuang
Yinhao Zhu
Xuanlin Li
Shizhong Han
H. Cai
Fatih Porikli
Hao Su
3DPC
47
116
0
18 May 2023
Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners
Xuehai He
Weixi Feng
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
P. Narayana
Sugato Basu
William Yang Wang
Xinze Wang
DiffM
62
7
0
18 May 2023
Content-based Unrestricted Adversarial Attack
Zhaoyu Chen
Yue Liu
Shuang Wu
Kaixun Jiang
Shouhong Ding
Wenqiang Zhang
DiffM
34
63
0
18 May 2023
IMAD: IMage-Augmented multi-modal Dialogue
Viktor Moskvoretskii
Anton Frolov
Denis Kuznetsov
32
4
0
17 May 2023
Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
Songwei Ge
Seungjun Nah
Guilin Liu
Tyler Poon
Andrew Tao
Bryan Catanzaro
David Jacobs
Jia-Bin Huang
Ming Liu
Yogesh Balaji
DiffM
VGen
51
254
0
17 May 2023
What You See is What You Read? Improving Text-Image Alignment Evaluation
Michal Yarom
Yonatan Bitton
Soravit Changpinyo
Roee Aharoni
Jonathan Herzig
Oran Lang
E. Ofek
Idan Szpektor
EGVM
62
75
0
17 May 2023
Exploring the Space of Key-Value-Query Models with Intention
M. Garnelo
Wojciech M. Czarnecki
43
7
0
17 May 2023
Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models
Alvin Heng
Harold Soh
VLM
KELM
DiffM
40
109
0
17 May 2023
Towards Generalist Robots: A Promising Paradigm via Generative Simulation
Zhou Xian
Théophile Gervet
Zhenjia Xu
Yi-Ling Qiao
Tsun-Hsuan Wang
Yian Wang
LM&Ro
54
8
0
17 May 2023
Selective Guidance: Are All the Denoising Steps of Guided Diffusion Important?
Pareesa Ameneh Golnari
Z. Yao
Yuxiong He
DiffM
29
4
0
16 May 2023
Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation
S. Azadi
Akbar Shah
Thomas Hayes
Devi Parikh
Sonal Gupta
DiffM
35
42
0
16 May 2023
AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation
Tong Wu
Zhihao Fan
Xiao Liu
Yeyun Gong
Yelong Shen
...
Juntao Li
Zhongyu Wei
Jian Guo
Nan Duan
Weizhu Chen
VLM
85
54
0
16 May 2023
Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models
Krishna Sri Ipsit Mantri
Nevasini Sasikumar
DiffM
45
1
0
15 May 2023
Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts
Yuyang Zhao
Enze Xie
Lanqing Hong
Zhenguo Li
G. Lee
DiffM
VGen
49
33
0
15 May 2023
Common Diffusion Noise Schedules and Sample Steps are Flawed
Shanchuan Lin
Bingchen Liu
Jiashi Li
Xiao Yang
DiffM
34
203
0
15 May 2023
TESS: Text-to-Text Self-Conditioned Simplex Diffusion
Rabeeh Karimi Mahabadi
Hamish Ivison
Jaesung Tae
James Henderson
Iz Beltagy
Matthew E. Peters
Arman Cohan
37
22
0
15 May 2023
Neural Boltzmann Machines
Alex H. Lang
A. Loukianov
Charles K. Fisher
AI4CE
BDL
35
2
0
15 May 2023
Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity
Raman Dutt
Linus Ericsson
Pedro Sanchez
Sotirios A. Tsaftaris
Timothy M. Hospedales
MedIm
42
50
0
14 May 2023
Diffusion Models for Imperceptible and Transferable Adversarial Attack
Jianqi Chen
Hechang Chen
Keyan Chen
Yilan Zhang
Zhengxia Zou
Z. Shi
DiffM
37
59
0
14 May 2023
Beware of diffusion models for synthesizing medical images -- A comparison with GANs in terms of memorizing brain MRI and chest x-ray images
Muhammad Usman Akbar
Wuhao Wang
Anders Eklund
DiffM
MedIm
31
15
0
12 May 2023
An Inverse Scaling Law for CLIP Training
Xianhang Li
Zeyu Wang
Cihang Xie
VLM
CLIP
48
55
0
11 May 2023
Exploiting Diffusion Prior for Real-World Image Super-Resolution
Jianyi Wang
Zongsheng Yue
Shangchen Zhou
Kelvin C. K. Chan
Chen Change Loy
53
285
0
11 May 2023
Learning the Visualness of Text Using Large Vision-Language Models
Gaurav Verma
Ryan A. Rossi
Chris Tensmeyer
Jiuxiang Gu
A. Nenkova
VLM
32
0
0
11 May 2023
Null-text Guidance in Diffusion Models is Secretly a Cartoon-style Creator
Jing Zhao
Heliang Zheng
Chaoyue Wang
Long Lan
Wanrong Huang
Wenjing Yang
DiffM
41
10
0
11 May 2023
Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era
Chenghao Li
Chaoning Zhang
Atish Waghwase
Lik-Hang Lee
François Rameau
Yang Yang
Sung-Ho Bae
Choong Seon Hong
54
75
0
10 May 2023
MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis
Jinsheng Zheng
Daqing Liu
Chaoyue Wang
Minghui Hu
Zuopeng Yang
Changxing Ding
Dacheng Tao
40
1
0
10 May 2023
iEdit: Localised Text-guided Image Editing with Weak Supervision
Rumeysa Bodur
Erhan Gundogdu
Binod Bhattarai
Tae-Kyun Kim
M. Donoser
Loris Bazzani
DiffM
33
14
0
10 May 2023
Comprehensive Dataset of Synthetic and Manipulated Overhead Imagery for Development and Evaluation of Forensic Tools
Brandon B. May
K. Trapeznikov
Shengbang Fang
Matthew C. Stamm
DiffM
33
4
0
09 May 2023
Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer
Nisha Huang
Yuxin Zhang
Weiming Dong
DiffM
VGen
40
16
0
09 May 2023
Tomography of Quantum States from Structured Measurements via quantum-aware transformer
Hailan Ma
Zhenhong Sun
Daoyi Dong
Chunlin Chen
H. Rabitz
45
3
0
09 May 2023
Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models
Wenkai Dong
Song Xue
Xiaoyue Duan
Shumin Han
DiffM
53
58
0
08 May 2023
Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning
Shengfang Zhai
Yinpeng Dong
Qingni Shen
Shih-Chieh Pu
Yuejian Fang
Hang Su
38
72
0
07 May 2023
Exploring One-shot Semi-supervised Federated Learning with A Pre-trained Diffusion Model
Min Yang
Shangchao Su
Bin Li
Xiangyang Xue
DiffM
41
30
0
06 May 2023
AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion
Seungwoo Lee
Chaerin Kong
D. Jeon
Nojun Kwak
DiffM
26
19
0
06 May 2023
Towards Prompt-robust Face Privacy Protection via Adversarial Decoupling Augmentation Framework
Ruijia Wu
Yuhang Wang
Huafeng Shi
Zhipeng Yu
Yichao Wu
Ding Liang
DiffM
29
9
0
06 May 2023
The Role of Data Curation in Image Captioning
Wenyan Li
Jonas F. Lotz
Chen Qiu
Desmond Elliott
DiffM
40
6
0
05 May 2023
Iterative α-(de)Blending: a Minimalist Deterministic Diffusion Model
Eric Heitz
Laurent Belcour
T. Chambon
DiffM
17
36
0
05 May 2023
Generative Steganography Diffusion
Ping Wei
Qing Zhou
Zichi Wang
Zhenxing Qian
Xinpeng Zhang
Sheng Li
DiffM
22
10
0
05 May 2023
DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation
Hong Chen
Yipeng Zhang
Simin Wu
Xin Eric Wang
Xuguang Duan
Yuwei Zhou
Wenwu Zhu
DiffM
30
48
0
05 May 2023
Controllable Visual-Tactile Synthesis
Ruihan Gao
Wenzhen Yuan
Jun-Yan Zhu
DiffM
27
6
0
04 May 2023
Personalize Segment Anything Model with One Shot
Renrui Zhang
Zhengkai Jiang
Ziyu Guo
Shilin Yan
Junting Pan
Xianzheng Ma
Hao Dong
Peng Gao
Hongsheng Li
MLLM
VLM
44
209
0
04 May 2023
Image Captioners Sometimes Tell More Than Images They See
Honori Udo
Takafumi Koshinaka
VLM
17
4
0
04 May 2023
Multimodal-driven Talking Face Generation via a Unified Diffusion-based Generator
Chao Xu
Shaoting Zhu
Junwei Zhu
Alexander I. Rudnicky
Jiangning Zhang
Ying Tai
Yong Liu
DiffM
65
14
0
04 May 2023
Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder
Xinmiao Lin
Yikang Li
Jenhao Hsiao
C. Ho
Yu Kong
90
18
0
04 May 2023
Shap-E: Generating Conditional 3D Implicit Functions
Heewoo Jun
Alex Nichol
DiffM
209
311
0
03 May 2023
Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows
Chao Du
Tianbo Li
Tianyu Pang
Shuicheng Yan
Min Lin
DiffM
BDL
50
12
0
03 May 2023
Diverse and Vivid Sound Generation from Text Descriptions
Guangwei Li
Xuenan Xu
Lingfeng Dai
Mengyue Wu
K. Yu
58
4
0
03 May 2023
Multimodal Data Augmentation for Image Captioning using Diffusion Models
Changrong Xiao
S. Xu
Kunpeng Zhang
DiffM
34
10
0
03 May 2023
Multimodal Procedural Planning via Dual Text-Image Prompting
Yujie Lu
Pan Lu
Zhiyu Zoey Chen
Wanrong Zhu
Xinze Wang
William Yang Wang
LM&Ro
64
43
0
02 May 2023
Key-Locked Rank One Editing for Text-to-Image Personalization
Yoad Tewel
Rinon Gal
Gal Chechik
Yuval Atzmon
DiffM
146
168
0
02 May 2023