2211.09800
InstructPix2Pix: Learning to Follow Image Editing Instructions
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
Papers citing
"InstructPix2Pix: Learning to Follow Image Editing Instructions"
50 / 1,348 papers shown
Title
A Unified Conditional Framework for Diffusion-based Image Restoration
Y. Zhang
Xiaoyu Shi
Dasong Li
Xiaogang Wang
Jian Wang
Hongsheng Li
DiffM
24
22
0
31 May 2023
PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation
Jialu Li
Mohit Bansal
DiffM
27
49
0
30 May 2023
Video ControlNet: Towards Temporally Consistent Synthetic-to-Real Video Translation Using Conditional Image Diffusion Models
Ernie Chu
Shuohao Lin
Jun-Cheng Chen
DiffM
25
20
0
30 May 2023
LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images
Viraj Prabhu
Sriram Yenamandra
Prithvijit Chattopadhyay
Judy Hoffman
15
37
0
30 May 2023
Diffusion Model for Dense Matching
Jisu Nam
Gyuseong Lee
Sunwoo Kim
Ines Hyeonsu Kim
Hyoungwon Cho
Seyeong Kim
Seung Wook Kim
DiffM
21
9
0
30 May 2023
Nested Diffusion Processes for Anytime Image Generation
Noam Elata
Bahjat Kawar
T. Michaeli
Michael Elad
DiffM
21
4
0
30 May 2023
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction
Rui Yang
Lin Song
Yanwei Li
Sijie Zhao
Yixiao Ge
Xiu Li
Ying Shan
SyDa
MLLM
26
208
0
30 May 2023
Real-World Image Variation by Aligning Diffusion Inversion Chain
Yuechen Zhang
Jinbo Xing
Eric Lo
Jiaya Jia
27
34
0
30 May 2023
Controllable Text-to-Image Generation with GPT-4
Tianjun Zhang
Yi Zhang
Vibhav Vineet
Neel Joshi
Xin Eric Wang
DiffM
16
42
0
29 May 2023
InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions
Qian Wang
Biao Zhang
Michael Birsak
Peter Wonka
DiffM
28
31
0
29 May 2023
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
Noam Rotstein
David Bensaid
Shaked Brody
Roy Ganz
Ron Kimmel
VLM
24
27
0
28 May 2023
Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference
Zihao Yu
Haoyang Li
Fangcheng Fu
Xupeng Miao
Bin Cui
DiffM
22
8
0
27 May 2023
CRoSS: Diffusion Model Makes Controllable, Robust and Secure Image Steganography
Jiwen Yu
Xuanyu Zhang
You-song Xu
Jian Zhang
DiffM
35
45
0
26 May 2023
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Shihao Zhao
Dongdong Chen
Yen-Chun Chen
Jianmin Bao
Shaozhe Hao
Lu Yuan
Kwan-Yee Kenneth Wong
25
234
0
25 May 2023
Break-A-Scene: Extracting Multiple Concepts from a Single Image
Omri Avrahami
Kfir Aberman
Ohad Fried
Daniel Cohen-Or
Dani Lischinski
VLM
DiffM
30
165
0
25 May 2023
Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation
Lisa Dunlap
Alyssa Umino
Han Zhang
Jiezhi Yang
Joseph E. Gonzalez
Trevor Darrell
DiffM
23
71
0
25 May 2023
CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion
Guangyao Zhai
Evin Pınar Örnek
Shun-cheng Wu
Yan Di
F. Tombari
Nassir Navab
Benjamin Busam
DiffM
32
12
0
25 May 2023
ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models
Yu-xin Zhang
Weiming Dong
Fan Tang
Nisha Huang
Haibin Huang
Chongyang Ma
Tong-Yee Lee
Oliver Deussen
Changsheng Xu
DiffM
25
75
0
25 May 2023
Towards Language-guided Interactive 3D Generation: LLMs as Layout Interpreter with Generative Feedback
Yiqi Lin
Hao Wu
Ruichen Wang
H. Lu
Xiaodong Lin
Hui Xiong
Lin Wang
3DV
40
12
0
25 May 2023
Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets
Brandon Smith
Miguel Farinha
S. Hall
Hannah Rose Kirk
Aleksandar Shtedritski
Max Bain
34
19
0
24 May 2023
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Weixi Feng
Wanrong Zhu
Tsu-jui Fu
Varun Jampani
Arjun Reddy Akula
Xuehai He
Sugato Basu
X. Wang
William Yang Wang
MLLM
25
161
0
24 May 2023
InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields
Dongqing Wang
Tong Zhang
Alaa Abboud
Sabine Süsstrunk
32
12
0
24 May 2023
ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation
Dongxu Yue
Qin Guo
Munan Ning
Jiaxi Cui
Yuesheng Zhu
Liuliang Yuan
DiffM
26
11
0
24 May 2023
I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors
Tuhin Chakrabarty
Arkadiy Saakyan
Olivia Winn
Artemis Panagopoulou
Yue Yang
Marianna Apidianaki
Smaranda Muresan
DiffM
25
41
0
24 May 2023
BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing
Dongxu Li
Junnan Li
Steven C. H. Hoi
28
303
0
24 May 2023
Vision + Language Applications: A Survey
Yutong Zhou
N. Shimada
VLM
30
5
0
24 May 2023
Image Manipulation via Multi-Hop Instructions -- A New Dataset and Weakly-Supervised Neuro-Symbolic Approach
Harman Singh
Poorva Garg
M. Gupta
Kevin Shah
Ashish Goswami
A. Mondal
Arnab Kumar Mondal
Dinesh Khandelwal
Dinesh Garg
Parag Singla
LM&Ro
16
1
0
23 May 2023
DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot Text-to-Video Generation
Susung Hong
Junyoung Seo
Heeseong Shin
Sung‐Jin Hong
Seung Wook Kim
DiffM
VGen
23
34
0
23 May 2023
Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models
Weifeng Chen
Yatai Ji
Jie Wu
Hefeng Wu
Pan Xie
Jiashi Li
Xin Xia
Xuefeng Xiao
Liang Lin
VGen
121
6
0
23 May 2023
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
Long Lian
Boyi Li
Adam Yala
Trevor Darrell
28
151
0
23 May 2023
Interactive Data Synthesis for Systematic Vision Adaptation via LLMs-AIGCs Collaboration
Qifan Yu
Juncheng Li
Wentao Ye
Siliang Tang
Yueting Zhuang
25
13
0
22 May 2023
The CLIP Model is Secretly an Image-to-Prompt Converter
Yuxuan Ding
Chunna Tian
Haoxuan Ding
Lingqiao Liu
DiffM
14
14
0
22 May 2023
Guided Motion Diffusion for Controllable Human Motion Synthesis
Korrawe Karunratanakul
Konpat Preechakul
Supasorn Suwajanakorn
Siyu Tang
DiffM
29
122
0
21 May 2023
InstructVid2Vid: Controllable Video Editing with Natural Language Instructions
Bosheng Qin
Juncheng Li
Siliang Tang
Tat-Seng Chua
Yueting Zhuang
VGen
DiffM
21
16
0
21 May 2023
Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models
Byungjun Kim
Patrick Kwon
K. Lee
Myunggi Lee
Sookwan Han
Daesik Kim
Hanbyul Joo
DiffM
36
20
0
19 May 2023
RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture
Liangchen Song
Liangliang Cao
Hongyu Xu
Kai Kang
Feng Tang
Junsong Yuan
Yang Zhao
VGen
DiffM
21
44
0
18 May 2023
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Yujie Lu
Xianjun Yang
Xiujun Li
X. Wang
William Yang Wang
EGVM
44
35
0
18 May 2023
DiffUTE: Universal Text Editing Diffusion Model
Haoxing Chen
Zhuoer Xu
Zhangxuan Gu
Jun Lan
Xing Zheng
Yaohui Li
Changhua Meng
Huijia Zhu
Weiqiang Wang
DiffM
26
34
0
18 May 2023
Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
Songwei Ge
Seungjun Nah
Guilin Liu
Tyler Poon
Andrew Tao
Bryan Catanzaro
David Jacobs
Jia-Bin Huang
Ming-Yu Liu
Yogesh Balaji
DiffM
VGen
37
252
0
17 May 2023
Face Recognition Using Synthetic Face Data
Omer Granoviter
Alexey Gruzdev
V. Loginov
Max Kogan
Orly Zvitia
41
1
0
17 May 2023
Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts
Yuyang Zhao
Enze Xie
Lanqing Hong
Zhenguo Li
G. Lee
DiffM
VGen
27
32
0
15 May 2023
Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era
Chenghao Li
Chaoning Zhang
Atish Waghwase
Lik-Hang Lee
François Rameau
Yang Yang
Sung-Ho Bae
Choong Seon Hong
46
74
0
10 May 2023
iEdit: Localised Text-guided Image Editing with Weak Supervision
Rumeysa Bodur
Erhan Gundogdu
Binod Bhattarai
Tae-Kyun Kim
M. Donoser
Loris Bazzani
DiffM
25
14
0
10 May 2023
Text-guided High-definition Consistency Texture Model
Zhibin Tang
Tiantong He
DiffM
15
6
0
10 May 2023
Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer
Nisha Huang
Yu-xin Zhang
Weiming Dong
DiffM
VGen
27
16
0
09 May 2023
ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation
Yupei Lin
Senyang Zhang
Xiaojun Yang
Xiao Wang
Yukai Shi
DiffM
30
5
0
08 May 2023
AADiff: Audio-Aligned Video Synthesis with Text-to-Image Diffusion
Seungwoo Lee
Chaerin Kong
D. Jeon
Nojun Kwak
DiffM
18
18
0
06 May 2023
DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation
Hong Chen
Yipeng Zhang
Simin Wu
Xin Eric Wang
Xuguang Duan
Yuwei Zhou
Wenwu Zhu
DiffM
26
47
0
05 May 2023
Multimodal Procedural Planning via Dual Text-Image Prompting
Yujie Lu
Pan Lu
Zhiyu Zoey Chen
Wanrong Zhu
X. Wang
William Yang Wang
LM&Ro
62
43
0
02 May 2023
Key-Locked Rank One Editing for Text-to-Image Personalization
Yoad Tewel
Rinon Gal
Gal Chechik
Y. Atzmon
DiffM
138
168
0
02 May 2023