
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

23 May 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
Emily L. Denton
Seyed Kamyar Seyed Ghasemipour
Burcu Karagol Ayan
S. S. Mahdavi
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
    VLM

Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"

50 / 4,339 papers shown
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng-Wei Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffM
VLM
31
17
0
21 Nov 2022
VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models
Ajay Jain
Amber Xie
Pieter Abbeel
DiffM
38
89
0
21 Nov 2022
Video Background Music Generation: Dataset, Method and Evaluation
Le Zhuo
Zhaokai Wang
Baisen Wang
Yue Liao
Chenxi Bao
Stanley Peng
Miao Lu
Xiaobo Li
Fei Fang
Si Liu
VGen
28
28
0
21 Nov 2022
Investigating Prompt Engineering in Diffusion Models
Sam Witteveen
Martin Andrews
11
58
0
21 Nov 2022
Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training
Ling Yang
Zhilin Huang
Yang Song
Shenda Hong
Ge Li
Wentao Zhang
Bin Cui
Guohao Li
Ming-Hsuan Yang
33
52
0
21 Nov 2022
MagicVideo: Efficient Video Generation With Latent Diffusion Models
Daquan Zhou
Weimin Wang
Hanshu Yan
Weiwei Lv
Yizhe Zhu
Jiashi Feng
DiffM
VGen
41
373
0
20 Nov 2022
Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
Xichen Pan
Pengda Qin
Yuhong Li
Hui Xue
Wenhu Chen
DiffM
29
63
0
20 Nov 2022
IC3D: Image-Conditioned 3D Diffusion for Shape Generation
Cristian Sbrolli
Paolo Cudrano
Matteo Frosi
Matteo Matteucci
DiffM
22
7
0
20 Nov 2022
DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization
Nisha Huang
Yuxin Zhang
Fan Tang
Chongyang Ma
Haibin Huang
Yong Zhang
Weiming Dong
Changsheng Xu
DiffM
28
41
0
19 Nov 2022
EDGE: Editable Dance Generation From Music
Jo-Han Tseng
Rodrigo Castellon
Chenxi Liu
28
223
0
19 Nov 2022
Magic3D: High-Resolution Text-to-3D Content Creation
Chen-Hsuan Lin
Jun Gao
Luming Tang
Towaki Takikawa
Fangyin Wei
Xun Huang
Karsten Kreis
Sanja Fidler
Ming Liu
Nayeon Lee
67
1,119
0
18 Nov 2022
Invariant Learning via Diffusion Dreamed Distribution Shifts
Priyatham Kattakinda
Alexander Levine
S. Feizi
DiffM
23
10
0
18 Nov 2022
RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation
Titas Anciukevicius
Zexiang Xu
Matthew Fisher
Paul Henderson
Hakan Bilen
Niloy J. Mitra
Paul Guerrero
53
155
0
17 Nov 2022
InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
94
1,711
0
17 Nov 2022
Conffusion: Confidence Intervals for Diffusion Models
Eliahu Horwitz
Yedid Hoshen
DiffM
29
28
0
17 Nov 2022
Null-text Inversion for Editing Real Images using Guided Diffusion Models
Ron Mokady
Amir Hertz
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
25
537
0
17 Nov 2022
DiffusionDet: Diffusion Model for Object Detection
Shoufa Chen
Pei Sun
Yibing Song
Ping Luo
68
443
0
17 Nov 2022
Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models
Simon Alexanderson
Rajmund Nagy
Jonas Beskow
G. Henter
DiffM
VGen
26
166
0
17 Nov 2022
Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models
Ninareh Mehrabi
Palash Goyal
Apurv Verma
Jwala Dhamala
Varun Kumar
Qian Hu
Kai-Wei Chang
R. Zemel
Aram Galstyan
Rahul Gupta
31
6
0
17 Nov 2022
GLAMI-1M: A Multilingual Image-Text Fashion Dataset
Vaclav Kosar
A. Hoskovec
Milan Šulc
Radek Bartyzal
VLM
32
3
0
17 Nov 2022
A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks
Tommaso Salvatori
Yuhang Song
Yordan Yordanov
Beren Millidge
Zheng R. Xu
Lei Sha
Cornelius Emde
Rafal Bogacz
Thomas Lukasiewicz
36
10
0
16 Nov 2022
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model
Xingqian Xu
Zhangyang Wang
Eric Zhang
Kai Wang
Humphrey Shi
DiffM
43
186
0
15 Nov 2022
Will Large-scale Generative Models Corrupt Future Datasets?
Ryuichiro Hataya
Han Bao
Hiromi Arai
27
52
0
15 Nov 2022
Cross-Reality Re-Rendering: Manipulating between Digital and Physical Realities
Siddhartha Datta
33
0
0
15 Nov 2022
Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models
Adham Elarabawy
Harish Kamath
Samuel Denton
DiffM
22
16
0
15 Nov 2022
Extreme Generative Image Compression by Learning Text Embedding from Diffusion Models
Zhihong Pan
Xiaoxia Zhou
Hao Tian
DiffM
33
23
0
14 Nov 2022
Arbitrary Style Guidance for Enhanced Diffusion-Based Text-to-Image Generation
Zhihong Pan
Xiaoxia Zhou
Hao Tian
DiffM
20
11
0
14 Nov 2022
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
CLIP
89
681
0
14 Nov 2022
Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
G. Metzer
Elad Richardson
Or Patashnik
Raja Giryes
Daniel Cohen-Or
DiffM
71
453
0
14 Nov 2022
Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification
Juan Pisula
Katarzyna Bozek
VLM
MedIm
36
3
0
14 Nov 2022
A Novel Sampling Scheme for Text- and Image-Conditional Image Synthesis in Quantized Latent Spaces
Dominic Rampas
Pablo Pernias
Marc Aubreville
DiffM
19
11
0
14 Nov 2022
Large-Scale Bidirectional Training for Zero-Shot Image Captioning
Taehoon Kim
Mark A Marsden
Pyunghwan Ahn
Sangyun Kim
Sihaeng Lee
Alessandra Sala
S. Kim
VLM
35
4
0
13 Nov 2022
Design of Unmanned Air Vehicles Using Transformer Surrogate Models
Adam D. Cobb
Anirban Roy
Daniel Elenius
Susmit Jha
AI4CE
22
1
0
11 Nov 2022
Efficient HLA imputation from sequential SNPs data by Transformer
Kaho Tanaka
Kosuke Kato
Naoki Nonaka
J. Seita
BDL
27
5
0
11 Nov 2022
SSGVS: Semantic Scene Graph-to-Video Synthesis
Yuren Cong
Jinhui Yi
Bodo Rosenhahn
M. Yang
67
7
0
11 Nov 2022
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
P. Schramowski
Manuel Brack
Bjorn Deiseroth
Kristian Kersting
54
272
0
09 Nov 2022
DiffPhase: Generative Diffusion-based STFT Phase Retrieval
Tal Peer
Simon Welker
Timo Gerkmann
DiffM
45
7
0
08 Nov 2022
Self-conditioned Embedding Diffusion for Text Generation
Robin Strudel
Corentin Tallec
Florent Altché
Yilun Du
Yaroslav Ganin
...
Will Grathwohl
Nikolay Savinov
Sander Dieleman
Laurent Sifre
Rémi Leblond
DiffM
26
83
0
08 Nov 2022
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Michael J. Smith
James E. Geach
37
32
0
07 Nov 2022
Medical Diffusion: Denoising Diffusion Probabilistic Models for 3D Medical Image Generation
Firas Khader
Gustav Mueller-Franzes
Soroosh Tayebi Arasteh
T. Han
Christoph Haarburger
...
Johannes Stegmaier
Christiane Kuhl
S. Nebelung
Jakob Nikolas Kather
Daniel Truhn
DiffM
MedIm
22
65
0
07 Nov 2022
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis
Lukas Struppek
Dominik Hintersdorf
Kristian Kersting
SILM
22
36
0
04 Nov 2022
Evaluating a Synthetic Image Dataset Generated with Stable Diffusion
Andreas Stöckl
29
21
0
03 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLM
MoE
81
804
0
02 Nov 2022
DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models
Cheng Lu
Yuhao Zhou
Fan Bao
Jianfei Chen
Chongxuan Li
Jun Zhu
DiffM
61
557
0
02 Nov 2022
MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model
Junde Wu
Rao Fu
Huihui Fang
Yu Zhang
Yehui Yang
Haoyi Xiong
Huiying Liu
Yanwu Xu
MedIm
VLM
DiffM
103
241
0
01 Nov 2022
MagicMix: Semantic Mixing with Diffusion Models
Jun Hao Liew
Hanshu Yan
Daquan Zhou
Jiashi Feng
DiffM
187
60
0
28 Oct 2022
UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance
Wei Li
Xue Xu
Xinyan Xiao
Jiacheng Liu
Hu Yang
...
Zhanpeng Wang
Zhifan Feng
Qiaoqiao She
Yajuan Lyu
Hua Wu
121
29
0
28 Oct 2022
Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training
Junfan Lin
Jianlong Chang
Lingbo Liu
Guanbin Li
Liang Lin
Qi Tian
Changan Chen
VGen
66
40
0
28 Oct 2022
Deep Generative Models on 3D Representations: A Survey
Zifan Shi
Sida Peng
Yinghao Xu
Andreas Geiger
Yiyi Liao
Yujun Shen
MedIm
3DV
47
0
0
27 Oct 2022
How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?
Hritik Bansal
Da Yin
Masoud Monajatipoor
Kai-Wei Chang
53
99
0
27 Oct 2022