Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2211.09800
Cited By
v1
v2 (latest)
InstructPix2Pix: Learning to Follow Image Editing Instructions
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"InstructPix2Pix: Learning to Follow Image Editing Instructions"
50 / 1,418 papers shown
Title
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
140
2
0
06 Mar 2025
Underlying Semantic Diffusion for Effective and Efficient In-Context Learning
Zhong Ji
Weilong Cao
Yan Zhang
Yanwei Pang
Jungong Han
Xuelong Li
DiffM
VLM
88
0
0
06 Mar 2025
Generative Artificial Intelligence in Robotic Manipulation: A Survey
Kun Zhang
Peng Yun
Jun Cen
Junhao Cai
DiDi Zhu
...
Qifeng Chen
Jia Pan
Wei Zhang
Bo Yang
Hua Chen
177
1
0
05 Mar 2025
GenColor: Generative Color-Concept Association in Visual Design
Yihan Hou
Xingchen Zeng
Yusong Wang
Manling Yang
Xiaojiao Chen
Wei Zeng
DiffM
118
0
0
05 Mar 2025
Efficient Training-Free High-Resolution Synthesis with Energy Rectification in Diffusion Models
Zhen Yang
Guibao Shen
Liang Hou
Mushui Liu
Luozhou Wang
Xin Tao
Pengfei Wan
Di Zhang
Di Zhang
Ying-Cong Chen
DiffM
126
1
0
04 Mar 2025
Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization
Jamie Wynn
Z. Qureshi
Jakub Powierza
Jamie Watson
Mohamed Sayed
3DGS
DiffM
177
1
0
03 Mar 2025
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data
Haoxin Li
Boyang Li
CoGe
188
1
0
03 Mar 2025
Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation
Jiantao Lin
Xin Yang
Meixi Chen
Yingjie Xu
D. Yan
Leyi Wu
Xinli Xu
Lie Xu
Shunsi Zhang
Ying-Cong Chen
127
2
0
03 Mar 2025
MINT: Multi-modal Chain of Thought in Unified Generative Models for Enhanced Image Generation
Yi Wang
Mushui Liu
Wanggui He
Longxiang Zhang
Z. Huang
...
Haoyang Li
Weilong Dai
Mingli Song
Jie Song
Hao Jiang
MLLM
MoE
LRM
124
9
0
03 Mar 2025
Composed Multi-modal Retrieval: A Survey of Approaches and Applications
Kun Zhang
Jingyu Li
Zhiyu Li
Jingjing Zhang
93
0
0
03 Mar 2025
CoInD: Enabling Logical Compositions in Diffusion Models
Sachit Gaudi
Gautam Sreekumar
Vishnu Boddeti
CoGe
117
1
0
03 Mar 2025
Zero-Shot Head Swapping in Real-World Scenarios
S. Jeong
Taewoong Kang
Hyojin Jang
Jaegul Choo
94
0
0
02 Mar 2025
DiffBrush:Just Painting the Art by Your Hands
Jiaming Chu
Lei Jin
Tao Wang
Junliang Xing
Jian-jun Zhao
DiffM
68
0
0
28 Feb 2025
High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model
Mingtao Guo
Guanyu Xing
Yanli Liu
DiffM
VGen
104
1
0
27 Feb 2025
SubZero: Composing Subject, Style, and Action via Zero-Shot Personalization
Shubhankar Borse
K. Bhardwaj
Mohammad Reza Karimi Dastjerdi
Hyojin Park
Shreya Kadambi
...
Prathamesh Mandke
Ankita Nayak
Harris Teague
Munawar Hayat
Fatih Porikli
DiffM
182
1
0
27 Feb 2025
Identity-preserving Distillation Sampling by Fixed-Point Iterator
SeonHwa Kim
Jiwon Kim
S. Park
Donghoon Ahn
Jiwon Kang
Seungryong Kim
Kyong Hwan Jin
Eunju Cha
75
0
0
27 Feb 2025
FLAP: Fully-controllable Audio-driven Portrait Video Generation through 3D head conditioned diffusion model
Lingzhou Mu
Baiji Liu
Ruonan Zhang
Guiming Mo
Jiawei Jin
Kai Zhang
Haozhi Huang
DiffM
VGen
146
2
0
26 Feb 2025
SVGEditBench V2: A Benchmark for Instruction-based SVG Editing
Kunato Nishina
Yusuke Matsui
97
1
0
26 Feb 2025
Bayesian Optimization for Controlled Image Editing via LLMs
Chengkun Cai
Haoliang Liu
Xu Zhao
Zhongyu Jiang
Tianfang Zhang
Zongkai Wu
Lei Li
Lei Li
Lei Li
BDL
OffRL
174
2
0
25 Feb 2025
Contrastive Visual Data Augmentation
Yu Zhou
B. Li
Mohan Tang
Xiaomeng Jin
Te-Lin Wu
Kuan-Hao Huang
Heng Ji
Kai-Wei Chang
Nanyun Peng
117
0
0
24 Feb 2025
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
Yunhai Feng
Jiaming Han
Zhiyong Yang
Xiangyu Yue
Sergey Levine
Jianlan Luo
LM&Ro
125
7
0
23 Feb 2025
DualNeRF: Text-Driven 3D Scene Editing via Dual-Field Representation
Yuxuan Xiong
Yue Shi
Yishun Dou
Bingbing Ni
DiffM
69
0
0
22 Feb 2025
A Critical Assessment of Modern Generative Models' Ability to Replicate Artistic Styles
Andrea Asperti
Franky George
Tiberio Marras
Razvan Ciprian Stricescu
Fabio Zanotti
EGVM
93
0
0
21 Feb 2025
A Comprehensive Survey on Composed Image Retrieval
Xuemeng Song
Haoqiang Lin
Haokun Wen
Bohan Hou
Mingzhu Xu
Liqiang Nie
131
3
0
19 Feb 2025
CHATS: Combining Human-Aligned Optimization and Test-Time Sampling for Text-to-Image Generation
Minghao Fu
Guo-Hua Wang
Liangfu Cao
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
79
0
0
18 Feb 2025
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
Junxian Ma
Shiwen Wang
Jian Yang
Junyi Hu
Jian Liang
Guosheng Lin
Jingbo Chen
Kai Li
Yu Meng
DiffM
VGen
122
4
0
17 Feb 2025
FreeBlend: Advancing Concept Blending with Staged Feedback-Driven Interpolation Diffusion
Yufan Zhou
Haoyu Shen
Huan Wang
DiffM
265
1
0
17 Feb 2025
Precise Parameter Localization for Textual Generation in Diffusion Models
Łukasz Staniszewski
Bartosz Cywiński
Franziska Boenisch
Kamil Deja
Adam Dziedzic
DiffM
471
1
0
17 Feb 2025
Human-Centric Foundation Models: Perception, Generation and Agentic Modeling
Shixiang Tang
Yanjie Wang
Lu Chen
Yuan Wang
Sida Peng
Dan Xu
W. Ouyang
VGen
209
2
0
12 Feb 2025
Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation
H. Seo
Wongi Jeong
Jae-sun Seo
Se Young Chun
140
0
0
12 Feb 2025
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
Zhenxing Mi
Kuan-Chieh Wang
Guocheng Qian
Hanrong Ye
Runtao Liu
Sergey Tulyakov
Kfir Aberman
Dan Xu
LRM
97
2
0
12 Feb 2025
Dual Caption Preference Optimization for Diffusion Models
Amir Saeidi
Yiran Luo
Agneet Chatterjee
Shamanthak Hegde
Bimsara Pathiraja
Yezhou Yang
Chitta Baral
DiffM
106
0
0
09 Feb 2025
AdaFlow: Efficient Long Video Editing via Adaptive Attention Slimming And Keyframe Selection
Shuheng Zhang
Yang Liu
Hongbo Zhou
Jun Peng
Yiyi Zhou
Xiaoshuai Sun
Rongrong Ji
VGen
90
2
0
08 Feb 2025
MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation
Jinbo Xing
Long Mai
Cusuh Ham
Jiahui Huang
Aniruddha Mahapatra
Chi-Wing Fu
T. Wong
Feng Liu
DiffM
VGen
279
5
0
06 Feb 2025
CAD-Editor: A Locate-then-Infill Framework with Automated Training Data Synthesis for Text-Based CAD Editing
Yu Yuan
Shizhao Sun
Qi Liu
Jiang Bian
144
2
0
06 Feb 2025
LoCA: Location-Aware Cosine Adaptation for Parameter-Efficient Fine-Tuning
Zhekai Du
Yinjie Min
Jingjing Li
Ke Lu
Changliang Zou
Liuhua Peng
Tingjin Chu
Mingming Gong
464
2
0
05 Feb 2025
Controllable Satellite-to-Street-View Synthesis with Precise Pose Alignment and Zero-Shot Environmental Control
Xianghui Ze
Zhenbo Song
Qiwei Wang
Jianfeng Lu
Yujiao Shi
108
1
0
05 Feb 2025
Improved Training Technique for Latent Consistency Models
Quan Dao
Khanh Doan
Di Liu
Trung Le
Dimitris N. Metaxas
161
3
0
03 Feb 2025
SliderSpace: Decomposing the Visual Capabilities of Diffusion Models
Rohit Gandikota
Zongze Wu
Richard Zhang
David Bau
Eli Shechtman
Nick Kolkin
DiffM
83
2
0
03 Feb 2025
Consistent Video Colorization via Palette Guidance
Han Wang
Yuang Zhang
Yuhong Zhang
Lingxiao Lu
Li Song
DiffM
VGen
129
0
0
31 Jan 2025
Inkspire: Supporting Design Exploration with Generative AI through Analogical Sketching
David Chuan-En Lin
Hyeonsu B Kang
Nikolas Martelaro
A. Kittur
Yan-Ying Chen
Matthew K. Hong
162
3
0
30 Jan 2025
An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control
Aosong Feng
Weikang Qiu
Jinbin Bai
Xiao Zhang
Zhen Dong
Kaicheng Zhou
Rex Ying
Leandros Tassiulas
DiffM
122
6
0
28 Jan 2025
CAFuser: Condition-Aware Multimodal Fusion for Robust Semantic Perception of Driving Scenes
Tim Broedermann
Daniel Gehrig
Yuqian Fu
Luc Van Gool
135
31
0
28 Jan 2025
Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings
Hossein Mirzaei
Mackenzie W. Mathis
OODD
AAML
129
4
0
28 Jan 2025
LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps
Andrey Palaev
Adil Mehmood Khan
S. M. Ahsan Kazmi
DiffM
120
0
0
23 Jan 2025
3D Object Manipulation in a Single Image using Generative Models
Ruisi Zhao
Zechuan Zhang
Zongxin Yang
Yi Yang
99
1
0
22 Jan 2025
Accelerate High-Quality Diffusion Models with Inner Loop Feedback
M. Gwilliam
Han Cai
Di Wu
Abhinav Shrivastava
Zhiyu Cheng
226
1
0
22 Jan 2025
Regressor-Guided Image Editing Regulates Emotional Response to Reduce Online Engagement
Christoph Gebhardt
Robin Willardt
Seyedmorteza Sadat
Chih-Wei Ning
Andreas Brombach
Jie Song
Otmar Hilliges
Christian Holz
105
0
0
21 Jan 2025
ComposeAnyone: Controllable Layout-to-Human Generation with Decoupled Multimodal Conditions
Shiyue Zhang
Zheng Chong
Xi Lu
Wenqing Zhang
Haoxiang Li
Xujie Zhang
Jiehui Huang
Xiao Dong
Xiaodan Liang
DiffM
81
0
0
21 Jan 2025
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
Zibo Zhao
Zeqiang Lai
Qingxiang Lin
Yunfei Zhao
Haolin Liu
...
Jingwei Huang
Chunchao Guo
Jie Jiang
Jingwei Huang
Chunchao Guo
263
45
0
21 Jan 2025
Previous
1
2
3
...
5
6
7
...
27
28
29
Next