ResearchTrend.AI
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

23 May 2022
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
Emily L. Denton
Seyed Kamyar Seyed Ghasemipour
Burcu Karagol Ayan
S. S. Mahdavi
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
    VLM

Papers citing "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding"

50 / 1,364 papers shown
IGSM: Improved Geometric and Sensitivity Matching for Finetuning Pruned Diffusion Models
Caleb Zheng
Eli Shlizerman
DiffM
35
0
0
03 Jun 2025
SViMo: Synchronized Diffusion for Video and Motion Generation in Hand-object Interaction Scenarios
Lingwei Dang
Ruizhi Shao
Hongwen Zhang
Wei Min
Yebin Liu
Qingyao Wu
DiffM, VGen
82
0
0
03 Jun 2025
FlexPainter: Flexible and Multi-View Consistent Texture Generation
Dongyu Yan
Leyi Wu
Jiantao Lin
Luozhou Wang
Tianshuo Xu
Zhifei Chen
Zhen Yang
Lie Xu
Shunsi Zhang
Yingcong Chen
DiffM
62
0
0
03 Jun 2025
ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions
Di Chang
Mingdeng Cao
Yichun Shi
Bo Liu
Shengqu Cai
Shijie Zhou
Weilin Huang
Gordon Wetzstein
M. Soleymani
Peng Wang
DiffM, VGen
42
0
0
03 Jun 2025
RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions
Bimsara Pathiraja
Maitreya Patel
Shivam Singh
Yezhou Yang
Chitta Baral
22
0
0
03 Jun 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Hyojin Bahng
Caroline Chan
F. Durand
Phillip Isola
EGVM
25
0
0
02 Jun 2025
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
50
0
0
02 Jun 2025
DiffuseSlide: Training-Free High Frame Rate Video Generation Diffusion
Geunmin Hwang
Hyun-kyu Ko
Younghyun Kim
S. W. Lee
Eunbyung Park
VGen
50
0
0
02 Jun 2025
WorldExplorer: Towards Generating Fully Navigable 3D Scenes
Manuel-Andreas Schneider
Lukas Höllein
Matthias Nießner
VGen
51
0
0
02 Jun 2025
TaxaDiffusion: Progressively Trained Diffusion Model for Fine-Grained Species Generation
Amin Karimi Monsefi
Mridul Khurana
R. Ramnath
Anuj Karpatne
Wei-Lun Chao
Cheng Zhang
60
0
0
02 Jun 2025
IVY-FAKE: A Unified Explainable Framework and Benchmark for Image and Video AIGC Detection
Wayne Zhang
Changjiang Jiang
Zhonghao Zhang
Chenyang Si
Fengchang Yu
Wei Peng
42
0
0
01 Jun 2025
Self-supervised ControlNet with Spatio-Temporal Mamba for Real-world Video Super-resolution
Shijun Shi
Jing Xu
Lijing Lu
Zhihang Li
Kai Hu
37
0
0
01 Jun 2025
Inference-Time Alignment of Diffusion Models with Evolutionary Algorithms
Purvish Jajal
Nick Eliopoulos
Benjamin Shiue-Hal Chou
George K. Thiruvathukal
James C. Davis
Yung-Hsiang Lu
27
0
0
30 May 2025
MotionPersona: Characteristics-aware Locomotion Control
Mingyi Shi
Wei Liu
Jidong Mei
Wangpok Tse
Rui Chen
Xuelin Chen
Taku Komura
VGen
27
0
0
30 May 2025
Draw ALL Your Imagine: A Holistic Benchmark and Agent Framework for Complex Instruction-based Image Generation
Yucheng Zhou
Jiahao Yuan
Qianning Wang
EGVM
25
0
0
30 May 2025
LTM3D: Bridging Token Spaces for Conditional 3D Generation with Auto-Regressive Diffusion Framework
Xin Kang
Zihan Zheng
Lei Chu
Yue Gao
Jiahao Li
Hao Pan
Xuejin Chen
Yan Lu
DiffM
33
0
0
30 May 2025
Generative AI for Urban Design: A Stepwise Approach Integrating Human Expertise with Multimodal Diffusion Models
Mingyi He
Yuebing Liang
Shenhao Wang
Yunhan Zheng
Qingyi Wang
Dingyi Zhuang
Li Tian
Jinhua Zhao
AI4CE
25
0
0
30 May 2025
InteractAnything: Zero-shot Human Object Interaction Synthesis via LLM Feedback and Object Affordance Parsing
Jinlu Zhang
Yixin Chen
Zan Wang
Jie Yang
Yizhou Wang
Siyuan Huang
39
1
0
30 May 2025
Reason-SVG: Hybrid Reward RL for Aha-Moments in Vector Graphics Generation
Ximing Xing
Yandong Guan
Jing Zhang
Dong Xu
Qian Yu
LRM
67
0
0
30 May 2025
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers
Yusuf Dalva
Hidir Yesiltepe
Pinar Yanardag
OffRL
78
0
0
29 May 2025
Efficiently Access Diffusion Fisher: Within the Outer Product Span Space
Fangyikang Wang
Hubery Yin
Shaobin Zhuang
Huminhao Zhu
Yinan Li
Lei Qian
Chao Zhang
Hanbin Zhao
Hui Qian
Chen Li
48
1
0
29 May 2025
Cora: Correspondence-aware image editing using few step diffusion
Amirhossein Almohammadi
Aryan Mikaeili
Sauradip Nag
Negar Hassanpour
Andrea Tagliasacchi
Ali Mahdavi-Amiri
DiffM
21
0
0
29 May 2025
A Survey of Generative Categories and Techniques in Multimodal Large Language Models
Longzhen Han
Awes Mubarak
Almas Baimagambetov
Nikolaos Polatidis
Thar Baker
LRM
37
0
0
29 May 2025
GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion
Gwanghyun Kim
Xueting Li
Ye Yuan
Koki Nagano
Tianye Li
Jan Kautz
Se Young Chun
Umar Iqbal
DiffM
63
0
0
29 May 2025
UrbanCraft: Urban View Extrapolation via Hierarchical Sem-Geometric Priors
Tianhang Wang
Fan Lu
Sanqing Qu
Guo Yu
Shihang Du
Ya Wu
Yuan Huang
G. Chen
42
0
0
29 May 2025
Research on Driving Scenario Technology Based on Multimodal Large Lauguage Model Optimization
Wang Mengjie
Zhu Huiping
Li Jian
Shi Wenxiu
Zhang Song
20
0
0
28 May 2025
One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models
S. Li
Lei Wang
Kai Wang
Tao Liu
J. Xie
Joost van de Weijer
Fahad Shahbaz Khan
Shiqi Yang
Yaxing Wang
Jian Yang
53
0
0
28 May 2025
AlignGen: Boosting Personalized Image Generation with Cross-Modality Prior Alignment
Yiheng Lin
Shifang Zhao
Ting Liu
Xiaochao Qu
Luoqi Liu
Yao Zhao
Yunchao Wei
DiffM
44
0
0
28 May 2025
SPIRAL: Semantic-Aware Progressive LiDAR Scene Generation
Dekai Zhu
Yixuan Hu
Youquan Liu
Dongyue Lu
Lingdong Kong
Slobodan Ilic
DiffM
60
0
0
28 May 2025
CoC: Chain-of-Cancer based on Cross-Modal Autoregressive Traction for Survival Prediction
Haipeng Zhou
Sicheng Yang
Sihan Yang
J. Qin
Lei Chen
Lei Zhu
10
0
0
28 May 2025
PADAM: Parallel averaged Adam reduces the error for stochastic optimization in scientific machine learning
Arnulf Jentzen
Julian Kranz
Adrian Riekert
ODL
60
0
0
28 May 2025
Normalized Attention Guidance: Universal Negative Guidance for Diffusion Models
Dar-Yen Chen
Hmrishav Bandyopadhyay
Kai Zou
Yi-Zhe Song
51
0
0
27 May 2025
OrienText: Surface Oriented Textual Image Generation
Shubham Paliwal
Arushi Jain
Monika Sharma
Vikram Jamwal
Lovekesh Vig
DiffM
788
0
0
27 May 2025
Advancing high-fidelity 3D and Texture Generation with 2.5D latents
Xin Yang
Jiantao Lin
Yingjie Xu
Haodong Li
Yingcong Chen
3DV
62
0
0
27 May 2025
ConsiStyle: Style Diversity in Training-Free Consistent T2I Generation
Yohai Mazuz
Janna Bruner
Lior Wolf
DiffM
56
0
0
27 May 2025
Minimalist Softmax Attention Provably Learns Constrained Boolean Functions
Jerry Yao-Chieh Hu
Xiwen Zhang
Maojiang Su
Zhao Song
Han Liu
MLT
243
1
0
26 May 2025
Regularized Personalization of Text-to-Image Diffusion Models without Distributional Drift
Gihoon Kim
Hyungjin Park
Taesup Kim
DiffM, VLM
195
0
0
26 May 2025
Structure Disruption: Subverting Malicious Diffusion-Based Inpainting via Self-Attention Query Perturbation
Yuhao He
Jinyu Tian
Haiwei Wu
Jianqing Li
DiffM, AAML
46
0
0
26 May 2025
ART-DECO: Arbitrary Text Guidance for 3D Detailizer Construction
Qimin Chen
Yuezhi Yang
Yifang Wang
Vladimir G. Kim
Siddhartha Chaudhuri
Hao Zhang
Zhiqin Chen
DiffM
75
0
0
26 May 2025
CreatiDesign: A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design
H. Zhang
Dexiang Hong
Maoke Yang
Yutao Chen
Zhao Zhang
Jie Shao
Xinglong Wu
Zuxuan Wu
Yu Jiang
DiffM, AI4CE
170
0
0
25 May 2025
Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning
Yu Zhang
Jialei Zhou
Xinchen Li
Qi Zhang
Zhongwei Wan
Tianyu Wang
Duoqian Miao
Changwei Wang
LongBing Cao
DiffM
55
2
0
25 May 2025
Training-free Stylized Text-to-Image Generation with Fast Inference
X. Ma
Yaohui Wang
Xinyuan Chen
Tien-Tsin Wong
C. L. P. Chen
811
0
0
25 May 2025
Adaptive Diffusion Guidance via Stochastic Optimal Control
Iskander Azangulov
Peter Potaptchik
Qinyu Li
Eddie Aamari
George Deligiannidis
Judith Rousseau
25
0
0
25 May 2025
Rethinking Direct Preference Optimization in Diffusion Models
Junyong Kang
Seohyun Lim
Kyungjune Baek
Hyunjung Shim
777
0
0
24 May 2025
LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders
Borna Khodabandeh
Amirabbas Afzali
Amirhossein Afsharrad
Seyed Shahabeddin Mousavi
Sanjay Lall
Sajjad Amini
Seyed-Mohsen Moosavi-Dezfooli
AAML
36
0
0
24 May 2025
OmniGenBench: A Benchmark for Omnipotent Multimodal Generation across 50+ Tasks
Jiayu Wang
Yang Jiao
Yue Yu
Tianwen Qian
Shaoxiang Chen
Jingjing Chen
Yu Jiang
MLLM, LM&MA, ELM
110
0
0
24 May 2025
Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models
Min Cheng
Fatemeh Doudi
D. Kalathil
Mohammad Ghavamzadeh
P. R. Kumar
57
0
0
24 May 2025
Align Beyond Prompts: Evaluating World Knowledge Alignment in Text-to-Image Generation
Wenchao Zhang
Jiahe Tian
Runze He
Jizhong Han
Jiao Dai
Miaomiao Feng
Wei Mi
Xiaodan Zhang
111
0
0
24 May 2025
Affective Image Editing: Shaping Emotional Factors via Text Descriptions
Peixuan Zhang
Shuchen Weng
Chengxuan Zhu
Binghao Tang
Zijian Jia
Si Li
Boxin Shi
DiffM
29
0
0
24 May 2025
A Minimalist Method for Fine-tuning Text-to-Image Diffusion Models
Yanting Miao
William Loh
Suraj Kothawade
Pascal Poupart
35
0
0
23 May 2025