Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2211.09800
Cited By
InstructPix2Pix: Learning to Follow Image Editing Instructions
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"InstructPix2Pix: Learning to Follow Image Editing Instructions"
50 / 1,348 papers shown
Title
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Rongyao Fang
Chengqi Duan
Kun Wang
Linjiang Huang
Hao Li
...
Xingyu Zeng
R. Zhao
Jifeng Dai
Xihui Liu
Hongsheng Li
MLLM
ReLM
LRM
104
5
0
13 Mar 2025
V2Edit: Versatile Video Diffusion Editor for Videos and 3D Scenes
Yanming Zhang
Jun-Kun Chen
Jipeng Lyu
Yu-Xiong Wang
DiffM
VGen
48
0
0
13 Mar 2025
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models
Runze He
Bo Cheng
Yuhang Ma
Qingxiang Jia
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Liebucha Wu
Dawei Leng
Yuhui Yin
DiffM
VLM
47
0
0
13 Mar 2025
Attention Reallocation: Towards Zero-cost and Controllable Hallucination Mitigation of MLLMs
Chongjun Tu
Peng Ye
Dongzhan Zhou
Lei Bai
Gang Yu
Tao Chen
Wanli Ouyang
59
0
0
13 Mar 2025
Fine-Tuning Diffusion Generative Models via Rich Preference Optimization
Hanyang Zhao
Haoxian Chen
Yucheng Guo
Genta Indra Winata
Tingting Ou
Ziyu Huang
D. Yao
Wenpin Tang
59
0
0
13 Mar 2025
Enhancing Facial Privacy Protection via Weakening Diffusion Purification
Ali Salar
Qing Liu
Yingli Tian
Guoying Zhao
DiffM
56
0
0
13 Mar 2025
ConsisLoRA: Enhancing Content and Style Consistency for LoRA-based Style Transfer
Bolin Chen
Baoquan Zhao
H. Xie
Yi Cai
Qing Li
Xudong Mao
DiffM
54
0
0
13 Mar 2025
AudioX: Diffusion Transformer for Anything-to-Audio Generation
Zeyue Tian
Yizhu Jin
Zhaoyang Liu
Ruibin Yuan
Xu Tan
Qifeng Chen
Wei Xue
Y. Guo
67
3
0
13 Mar 2025
Long-horizon Visual Instruction Generation with Logic and Attribute Self-reflection
Yucheng Suo
Fan Ma
Kaixin Shen
Linchao Zhu
Yi Yang
VLM
47
0
0
12 Mar 2025
InteractEdit: Zero-Shot Editing of Human-Object Interactions in Images
Jiun Tian Hoe
Weipeng Hu
Wei Zhou
Chao Xie
Ziwei Wang
Chee Seng Chan
Xudong Jiang
Y. Tan
61
0
0
12 Mar 2025
Context-guided Responsible Data Augmentation with Diffusion Models
Khawar Islam
Naveed Akhtar
46
1
0
12 Mar 2025
Leveraging Semantic Attribute Binding for Free-Lunch Color Control in Diffusion Models
Héctor Laria
Alexandra Gomez-Villa
Jiang Qin
Muhammad Atif Butt
Bogdan Raducanu
Javier Vázquez-Corral
J. Weijer
Kai Wang
DiffM
60
0
0
12 Mar 2025
Identity Preserving Latent Diffusion for Brain Aging Modeling
Gexin Huang
Zhangsihao Yang
Yalin Wang
Guido Gerig
Mengwei Ren
Xiaoxiao Li
MedIm
DiffM
72
0
0
11 Mar 2025
Preserving Product Fidelity in Large Scale Image Recontextualization with Diffusion Models
Ishaan Malhi
Praneet Dutta
Ellie Talius
Sally Ma
Brendan Driscoll
Krista Holden
G. Pruthi
Arunachalam Narayanaswamy
DiffM
51
0
0
11 Mar 2025
MGHanD: Multi-modal Guidance for authentic Hand Diffusion
Taehyeon Eum
Jieun Choi
Tae-Kyun Kim
38
0
0
11 Mar 2025
GarmentCrafter: Progressive Novel View Synthesis for Single-View 3D Garment Reconstruction and Editing
Yuanhao Wang
Cheng Zhang
Gonçalo Frazão
Jinlong Yang
Alexandru-Eugen Ichim
Thabo Beeler
Fernando De la Torre
DiffM
80
0
0
11 Mar 2025
Goal Conditioned Reinforcement Learning for Photo Finishing Tuning
Jiarui Wu
Yujin Wang
Lingen Li
Zhang Fan
Tianfan Xue
32
0
0
10 Mar 2025
Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model
Lixue Gong
Xiaoxia Hou
Fanshi Li
Liang Li
Xiaochen Lian
...
Qi Zhang
Yuwei Zhang
Shijia Zhao
Jianchao Yang
Weilin Huang
DiffM
VLM
55
6
0
10 Mar 2025
VACE: All-in-One Video Creation and Editing
Zeyinzi Jiang
Zhen Han
Chaojie Mao
J. Zhang
Yulin Pan
Yu Liu
DiffM
VGen
53
5
0
10 Mar 2025
Color Alignment in Diffusion
Ka Chun Shum
Binh-Son Hua
Duc Thanh Nguyen
Sai-Kit Yeung
60
0
0
09 Mar 2025
PixelPonder: Dynamic Patch Adaptation for Enhanced Multi-Conditional Text-to-Image Generation
Yanjie Pan
Q. He
Zhengkai Jiang
P. Xu
Chaoyi Wang
...
Yun Cao
Zhenye Gan
M. Chi
Bo Peng
Y. Wang
DiffM
61
0
0
09 Mar 2025
Consistent Image Layout Editing with Diffusion Models
Tao Xia
Yudi Zhang
Ting Liu Lei Zhang
DiffM
56
1
0
09 Mar 2025
VLForgery Face Triad: Detection, Localization and Attribution via Multimodal Large Language Models
Xinan He
Yue Zhou
Bing Fan
Bin Li
Guopu Zhu
Feng Ding
72
1
0
08 Mar 2025
Get In Video: Add Anything You Want to the Video
Shaobin Zhuang
Zhipeng Huang
Binxin Yang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Chong Sun
Zheng-Jun Zha
Chen Li
Y. Wang
DiffM
VGen
56
0
0
08 Mar 2025
Object-Centric World Model for Language-Guided Manipulation
Youngjoon Jeong
Junha Chun
S. Cha
Taesup Kim
OCL
VGen
138
1
0
08 Mar 2025
Towards Locally Explaining Prediction Behavior via Gradual Interventions and Measuring Property Gradients
Niklas Penzel
Joachim Denzler
FAtt
48
0
0
07 Mar 2025
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
90
0
0
06 Mar 2025
Underlying Semantic Diffusion for Effective and Efficient In-Context Learning
Zhong Ji
Weilong Cao
Yan Zhang
Yanwei Pang
Jungong Han
X. Li
DiffM
VLM
47
0
0
06 Mar 2025
GenColor: Generative Color-Concept Association in Visual Design
Yihan Hou
Xingchen Zeng
Yusong Wang
Manling Yang
Xiaojiao Chen
Wei Zeng
DiffM
76
0
0
05 Mar 2025
Generative Artificial Intelligence in Robotic Manipulation: A Survey
Kun Zhang
Peng Yun
Jun Cen
Junhao Cai
DiDi Zhu
...
Qifeng Chen
Jia Pan
Wei K. Zhang
Bo Yang
Hua Chen
59
1
0
05 Mar 2025
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification
Zhen Yang
Guibao Shen
Liang Hou
Mushui Liu
Luozhou Wang
Xin Tao
Pengfei Wan
Di Zhang
Ying-cong Chen
DiffM
74
0
0
04 Mar 2025
Composed Multi-modal Retrieval: A Survey of Approaches and Applications
Kun Zhang
Jingyu Li
Z. Li
Jingjing Zhang
36
0
0
03 Mar 2025
Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation
Jiantao Lin
Xin Yang
Meixi Chen
Yingjie Xu
D. Yan
Leyi Wu
Xinli Xu
Lie Xu
Shunsi Zhang
Ying-Cong Chen
58
1
0
03 Mar 2025
CoInD: Enabling Logical Compositions in Diffusion Models
Sachit Gaudi
Gautam Sreekumar
Vishnu Naresh Boddeti
CoGe
73
0
0
03 Mar 2025
Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization
Jamie Wynn
Z. Qureshi
Jakub Powierza
Jamie Watson
Mohamed Sayed
3DGS
DiffM
71
0
0
03 Mar 2025
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data
Haoxin Li
Boyang Li
CoGe
71
0
0
03 Mar 2025
MINT: Multi-modal Chain of Thought in Unified Generative Models for Enhanced Image Generation
Yi Wang
Mushui Liu
Wanggui He
Longxiang Zhang
Z. Huang
...
H. Li
Weilong Dai
Mingli Song
Jie Song
Hao Jiang
MLLM
MoE
LRM
78
1
0
03 Mar 2025
Zero-Shot Head Swapping in Real-World Scenarios
S. Jeong
Taewoong Kang
Hyojin Jang
Jaegul Choo
34
0
0
02 Mar 2025
DiffBrush:Just Painting the Art by Your Hands
Jiaming Chu
Lei Jin
Tao Wang
Junliang Xing
Jian-jun Zhao
DiffM
37
0
0
28 Feb 2025
High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model
Mingtao Guo
Guanyu Xing
Yanli Liu
DiffM
VGen
57
0
0
27 Feb 2025
SubZero: Composing Subject, Style, and Action via Zero-Shot Personalization
Shubhankar Borse
K. Bhardwaj
Mohammad Reza Karimi Dastjerdi
Hyojin Park
Shreya Kadambi
...
Prathamesh Mandke
Ankita Nayak
Harris Teague
Munawar Hayat
Fatih Porikli
DiffM
81
1
0
27 Feb 2025
Identity-preserving Distillation Sampling by Fixed-Point Iterator
SeonHwa Kim
Jiwon Kim
S. Park
Donghoon Ahn
Jiwon Kang
Seungryong Kim
Kyong Hwan Jin
Eunju Cha
41
0
0
27 Feb 2025
FLAP: Fully-controllable Audio-driven Portrait Video Generation through 3D head conditioned diffusion model
Lingzhou Mu
Baiji Liu
Ruonan Zhang
Guiming Mo
Jiawei Jin
Kai Zhang
Haozhi Huang
DiffM
VGen
56
1
0
26 Feb 2025
SVGEditBench V2: A Benchmark for Instruction-based SVG Editing
Kunato Nishina
Yusuke Matsui
65
0
0
26 Feb 2025
Bayesian Optimization for Controlled Image Editing via LLMs
Chengkun Cai
Haoliang Liu
Xu Zhao
Zhongyu Jiang
Tianfang Zhang
Zongkai Wu
Jenq-Neng Hwang
Serge Belongie
Lei Li
BDL
OffRL
103
2
0
25 Feb 2025
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
Yunhai Feng
Jiaming Han
Z. Yang
Xiangyu Yue
Sergey Levine
Jianlan Luo
LM&Ro
46
2
0
23 Feb 2025
DualNeRF: Text-Driven 3D Scene Editing via Dual-Field Representation
Yuxuan Xiong
Yue Shi
Yishun Dou
Bingbing Ni
DiffM
37
0
0
22 Feb 2025
A Critical Assessment of Modern Generative Models' Ability to Replicate Artistic Styles
Andrea Asperti
Franky George
Tiberio Marras
Razvan Ciprian Stricescu
Fabio Zanotti
EGVM
44
0
0
21 Feb 2025
A Comprehensive Survey on Composed Image Retrieval
Xuemeng Song
Haoqiang Lin
Haokun Wen
Bohan Hou
Mingzhu Xu
Liqiang Nie
44
1
0
19 Feb 2025
Precise Parameter Localization for Textual Generation in Diffusion Models
Łukasz Staniszewski
Bartosz Cywiñski
Franziska Boenisch
Kamil Deja
Adam Dziedzic
DiffM
146
0
0
17 Feb 2025
Previous
1
2
3
4
5
...
25
26
27
Next