2302.05543
Adding Conditional Control to Text-to-Image Diffusion Models
10 February 2023
Lvmin Zhang
Anyi Rao
Maneesh Agrawala
AI4CE
Papers citing
"Adding Conditional Control to Text-to-Image Diffusion Models"
50 / 3,090 papers shown
Title
SketchFusion: Learning Universal Sketch Features through Fusing Foundation Models
Subhadeep Koley
Tapas Kumar Dutta
Aneeshan Sain
Pinaki Nath Chowdhury
A. Bhunia
Yi-Zhe Song
VLM
121
0
0
18 Mar 2025
MOSAIC: Generating Consistent, Privacy-Preserving Scenes from Multiple Depth Views in Multi-Room Environments
Zhixuan Liu
H. Zhu
R. Chen
Jonathan M Francis
Soonmin Hwang
Jiangning Zhang
Jean Oh
VGen
487
0
0
18 Mar 2025
SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model
Yucheng Mao
Boyang Wang
Nilesh Kulkarni
Jeong Joon Park
DiffM
107
0
0
18 Mar 2025
Stochastic Trajectory Prediction under Unstructured Constraints
Hao Ma
Zhiqiang Pu
Shijie Wang
Boyin Liu
Huimu Wang
Yanyan Liang
Jianqiang Yi
90
0
0
18 Mar 2025
ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing
Yulin Pan
Xiangteng He
Chaojie Mao
Zhen Han
Zeyinzi Jiang
Junxuan Zhang
Yu Liu
EGVM
VLM
114
2
0
18 Mar 2025
MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling
Damian Boborzi
Phillip Mueller
Jonas Emrich
Dominik Schmid
Sebastian Mueller
Lars Mikelsons
DiffM
119
0
0
18 Mar 2025
Motion Synthesis with Sparse and Flexible Keyjoint Control
I. Hwang
Jinseok Bae
Donggeun Lim
Y. Kim
88
0
0
18 Mar 2025
TarPro: Targeted Protection against Malicious Image Editing
Kaixin Shen
Ruijie Quan
Jiaxu Miao
Jun Xiao
Yi Yang
111
1
0
18 Mar 2025
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control
Nvidia
Hassan Abu Alhaija
Jose M. Alvarez
Maciej Bala
Tiffany Cai
...
Yuchong Ye
Xiaodong Yang
Boxin Wang
Fangyin Wei
Yu Zeng
VGen
169
8
0
18 Mar 2025
Advances in 4D Generation: A Survey
Qiaowei Miao
Kehan Li
Jinsheng Quan
Zhiyuan Min
Shaojie Ma
Yichao Xu
Yi Yang
Yawei Luo
148
2
0
18 Mar 2025
Diffusion-based Facial Aesthetics Enhancement with 3D Structure Guidance
Lisha Li
Jingwen Hou
Weide Liu
Yuming Fang
Jiebin Yan
DiffM
81
1
0
18 Mar 2025
The Power of Context: How Multimodality Improves Image Super-Resolution
Kangfu Mei
Hossein Talebi
Mojtaba Ardakani
Vishal M. Patel
P. Milanfar
M. Delbracio
DiffM
124
2
0
18 Mar 2025
From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Calibration
Mingyang Song
Xiaoye Qu
Jiawei Zhou
Yu Cheng
VLM
168
1
0
17 Mar 2025
Adams Bashforth Moulton Solver for Inversion and Editing in Rectified Flow
Yongjia Ma
Donglin Di
Xuan Liu
Xiaokai Chen
Lei Fan
Wei Chen
Tonghua Su
76
1
0
17 Mar 2025
TextInVision: Text and Prompt Complexity Driven Visual Text Generation Benchmark
Forouzan Fallah
Maitreya Patel
Agneet Chatterjee
Vlad I. Morariu
Chitta Baral
Yezhou Yang
CoGe
116
1
0
17 Mar 2025
DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Mode
Junjia Huang
Pengxiang Yan
Jinhang Cai
Jiyang Liu
Zhao Wang
Yitong Wang
Xinglong Wu
Guanbin Li
DiffM
93
0
0
17 Mar 2025
GenStereo: Towards Open-World Generation of Stereo Images and Unsupervised Matching
Feng Qiao
Zhexiao Xiong
Eric Xing
Nathan Jacobs
DiffM
3DV
92
1
0
17 Mar 2025
AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations
Quang-Trung Truong
Wong Yuk Kwan
Duc Thanh Nguyen
Binh-Son Hua
Sai-Kit Yeung
VGen
113
0
0
17 Mar 2025
BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing
Yaowei Li
Lingen Li
Zhaoyang Zhang
Xiaoyu Li
Guangzhi Wang
Hongxiang Li
Xiaodong Cun
Ying Shan
Yuexian Zou
DiffM
107
2
0
17 Mar 2025
Reward-Instruct: A Reward-Centric Approach to Fast Photo-Realistic Image Generation
Yihong Luo
Tianyang Hu
Weijian Luo
Kenji Kawaguchi
Jing Tang
EGVM
461
0
0
17 Mar 2025
PoseSyn: Synthesizing Diverse 3D Pose Data from In-the-Wild 2D Data
ChangHee Yang
H. Song
Seokhun Choi
Seungwoo Lee
Jaechul Kim
Hoseok Do
92
0
0
17 Mar 2025
Evolution-based Region Adversarial Prompt Learning for Robustness Enhancement in Vision-Language Models
Xiaojun Jia
Sensen Gao
Simeng Qin
Ke Ma
Xianrui Li
Yihao Huang
Wei Dong
Yang Liu
Xiaochun Cao
AAML
VLM
120
2
0
17 Mar 2025
One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation
Daniil Selikhanovych
David Li
Aleksei Leonov
Nikita Gushchin
Sergei Kushneriuk
Alexander N. Filippov
Evgeny Burnaev
Iaroslav Koshelev
Alexander Korotin
DiffM
157
0
0
17 Mar 2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
Dewei Zhou
Mingwei Li
Zongxin Yang
Yi Yang
182
3
0
17 Mar 2025
UniHOPE: A Unified Approach for Hand-Only and Hand-Object Pose Estimation
Yinqiao Wang
Hao Xu
Pheng Ann Heng
Chi-Wing Fu
3DH
95
1
0
17 Mar 2025
Next-Scale Autoregressive Models are Zero-Shot Single-Image Object View Synthesizers
Shiran Yuan
Hao Zhao
DiffM
117
0
0
17 Mar 2025
PASTA: Part-Aware Sketch-to-3D Shape Generation with Text-Aligned Prior
Seanie Lee
Hwanhee Jung
Byoungsoo Koh
Qixing Huang
Sangho Yoon
Sangpil Kim
73
0
0
17 Mar 2025
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
Tsu-Jui Fu
Yusu Qian
Chen Chen
Wenze Hu
Zhe Gan
Yue Yang
221
2
0
16 Mar 2025
EQ-TAA: Equivariant Traffic Accident Anticipation via Diffusion-Based Accident Video Synthesis
Jianwu Fang
Lei-lei Li
Zhedong Zheng
Hongkai Yu
Jianru Xue
Zhengguo Li
Tat-Seng Chua
21
0
0
16 Mar 2025
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Arsh Koneru
Yusuke Kato
Kazuki Kozuka
Aditya Grover
VLM
138
5
0
15 Mar 2025
DiffAD: A Unified Diffusion Modeling Approach for Autonomous Driving
Tao Wang
Cong Zhang
Xingguang Qu
Kun Li
Wen Liu
Chenyu Huang
117
1
0
15 Mar 2025
Snapmoji: Instant Generation of Animatable Dual-Stylized Avatars
Eric M. Chen
Di Liu
Sizhuo Ma
Michael Vasilkovsky
Bing Zhou
...
Wei Wang
Jiahao Luo
Dimitris N. Metaxas
Vincent Sitzmann
Jian Wang
3DGS
167
0
0
15 Mar 2025
VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction
Zijian He
Yuwei Ning
Yipeng Qin
Wangrun Wang
Sibei Yang
Liang Lin
G. Li
190
2
0
15 Mar 2025
Tailor: An Integrated Text-Driven CG-Ready Human and Garment Generation System
Zhiyao Sun
Yu-Hui Wen
Matthieu Lin
Ho-Jui Fang
Sheng Ye
Tian Lv
Yang Liu
127
0
0
15 Mar 2025
Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption
Du Chen
Tianhe Wu
Kede Ma
Lei Zhang
84
4
0
14 Mar 2025
GaussianIP: Identity-Preserving Realistic 3D Human Generation via Human-Centric Diffusion Prior
Zichen Tang
Yuan Yao
Miaomiao Cui
Liefeng Bo
Hongyu Yang
3DGS
DiffM
99
0
0
14 Mar 2025
Advancing 3D Gaussian Splatting Editing with Complementary and Consensus Information
Xuanqi Zhang
Jieun Lee
Chris Joslin
WonSook Lee
3DGS
107
0
0
14 Mar 2025
ACMo: Attribute Controllable Motion Generation
Mingjie Wei
Xuemei Xie
G. Shi
114
0
0
14 Mar 2025
Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control
Hejia Chen
Haoxian Zhang
Shoulong Zhang
Xiaoqiang Liu
Sisi Zhuang
Yuan Zhang
Pengfei Wan
Di Zhang
Shuai Li
85
3
0
14 Mar 2025
Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards
Zijing Hu
Fengda Zhang
Long Chen
Kun Kuang
Jiahui Li
Kaifeng Gao
Jun Xiao
X. Wang
Wenwu Zhu
EGVM
235
5
0
14 Mar 2025
DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation
Hongbin Lin
Zilu Guo
Yiming Zhang
Shuaicheng Niu
Yafeng Li
Ruiyi Zhang
Shuguang Cui
Zhen Li
DiffM
82
1
0
14 Mar 2025
Industrial-Grade Sensor Simulation via Gaussian Splatting: A Modular Framework for Scalable Editing and Full-Stack Validation
Xianming Zeng
Sicong Du
Qifeng Chen
Lizhe Liu
Haoyu Shu
...
Peng Chen
Yapeng Xue
Chunming Zhao
Sheng Yang
Qiang Li
3DGS
104
0
0
14 Mar 2025
Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models
Hongyang Wei
Shixuan Liu
C. Yuan
Lefei Zhang
54
1
0
14 Mar 2025
Upcycling Text-to-Image Diffusion Models for Multi-Task Capabilities
Ruchika Chavhan
Abhinav Mehrotra
Malcolm Chadwick
Alberto Gil C. P. Ramos
Luca Morreale
Mehdi Noroozi
Sourav Bhattacharya
91
0
0
14 Mar 2025
LUSD: Localized Update Score Distillation for Text-Guided Image Editing
Worameth Chinchuthakun
Tossaporn Saengja
Nontawat Tritrong
Pitchaporn Rewatbowornwong
Pramook Khungurn
Supasorn Suwajanakorn
DiffM
104
0
0
14 Mar 2025
Multi-Stage Generative Upscaler: Reconstructing Football Broadcast Images via Diffusion Models
Luca Martini
Daniele Zolezzi
Saverio Iacono
Gianni Vercelli
DiffM
60
0
0
14 Mar 2025
RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models
Yijing Lin
Mengqi Huang
Shuhan Zhuang
Zhendong Mao
VGen
99
3
0
13 Mar 2025
DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image
Qi Zhao
Zhan Ma
Pan Zhou
VGen
144
0
0
13 Mar 2025
Fine-Tuning Diffusion Generative Models via Rich Preference Optimization
Hanyang Zhao
Haoxian Chen
Yucheng Guo
Genta Indra Winata
Tingting Ou
Ziyu Huang
D. Yao
Wenpin Tang
130
0
0
13 Mar 2025
PanoGen++: Domain-Adapted Text-Guided Panoramic Environment Generation for Vision-and-Language Navigation
Sen Wang
Dongliang Zhou
Liang Xie
Chao Xu
Ye Yan
Erwei Yin
DiffM
154
3
0
13 Mar 2025