2211.09800
InstructPix2Pix: Learning to Follow Image Editing Instructions
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"
50 / 1,418 papers shown
Title
F-ViTA: Foundation Model Guided Visible to Thermal Translation
Jay N. Paranjape
C. D. Melo
Vishal M. Patel
VGen
82
0
0
03 Apr 2025
Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization
Jiadong Wang
Jingyuan Liu
Xin Sun
Krishna Kumar Singh
Zhixin Shu
...
Nanxuan Zhao
Tuanfeng Y. Wang
Simon Chen
Ulrich Neumann
Jae Shin Yoon
72
0
0
03 Apr 2025
Concept Lancet: Image Editing with Compositional Representation Transplant
Jinqi Luo
Tianjiao Ding
Kwan Ho Ryan Chan
Hancheng Min
Chris Callison-Burch
Rene Vidal
DiffM
KELM
149
0
0
03 Apr 2025
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
Xiangyu Zhao
Peiyuan Zhang
Kexian Tang
Hao Li
Zicheng Zhang
...
Guangtao Zhai
Junchi Yan
Hua Yang
Xue Yang
Haodong Duan
VLM
LRM
161
6
0
03 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Li Yuan
MLLM
EGVM
191
24
0
03 Apr 2025
ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement
Runhui Huang
Chunwei Wang
Junwei Yang
Guansong Lu
Yunlong Yuan
...
Lu Hou
Wei Zhang
Lanqing Hong
Hengshuang Zhao
Hang Xu
MLLM
168
7
0
02 Apr 2025
Multi-party Collaborative Attention Control for Image Customization
Han Yang
Chuanguang Yang
Qiuli Wang
Zhulin An
Weilun Feng
Libo Huang
Yongjun Xu
DiffM
108
1
0
02 Apr 2025
Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation
Aleksander Plocharski
Jan Swidzinski
Przemyslaw Musialski
DiffM
50
0
0
02 Apr 2025
Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration
Zilong Huang
Jun-Jian He
Junyan Ye
Lihan Jiang
Weijia Li
Yuxiao Chen
Ting Han
140
0
0
01 Apr 2025
AP-CAP: Advancing High-Quality Data Synthesis for Animal Pose Estimation via a Controllable Image Generation Pipeline
Lei Wang
Yujie Zhong
Xiaopeng Sun
Jingchun Cheng
C. Feng
Qiong Cao
Lin Ma
Zhaoxin Fan
91
0
0
01 Apr 2025
Scaling Prompt Instructed Zero Shot Composed Image Retrieval with Image-Only Data
Yiqun Duan
Sameera Ramasinghe
Stephen Gould
Ajanthan Thalaiyasingam
111
1
0
01 Apr 2025
The HCI GenAI CO2ST Calculator: A Tool for Calculating the Carbon Footprint of Generative AI Use in Human-Computer Interaction Research
Nanna Inie
Jeanette Falk
Raghavendra Selvan
139
0
0
01 Apr 2025
SPF-Portrait: Towards Pure Text-to-Portrait Customization with Semantic Pollution-Free Fine-Tuning
Xiaole Xian
Zhichao Liao
Qingyu Li
Wenyu Qin
Pengfei Wan
Weicheng Xie
Long Zeng
Linlin Shen
Pingfa Feng
DiffM
154
0
0
01 Apr 2025
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Sudong Wang
Yize Zhang
Yao Zhu
Jianing Li
Zizhe Wang
Yi Liu
Xiangyang Ji
350
1
0
31 Mar 2025
AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents
Jiaxiang Chen
Jingwei Shi
Lei Gan
Jiale Zhang
Qingyu Zhang
Dongqian Zhang
Xin Pang
Zhucong Li
Yinghui Xu
LLMAG
89
0
0
31 Mar 2025
MuseFace: Text-driven Face Editing via Diffusion-based Mask Generation Approach
Xin Zhang
Siting Huang
Xiangyang Luo
Yifan Xie
Weijiang Yu
Heng Chang
Fei Ma
Fei Richard Yu
DiffM
136
0
0
31 Mar 2025
InstructRestore: Region-Customized Image Restoration with Human Instructions
Shixuan Liu
Jianqi Ma
Lingchen Sun
Xiangtao Kong
Lei Zhang
DiffM
105
0
0
31 Mar 2025
Training-Free Text-Guided Image Editing with Visual Autoregressive Model
Yufei Wang
Lanqing Guo
Zhihao Li
Jiaxing Huang
Pichao Wang
Bihan Wen
Jingchao Wang
DiffM
111
1
0
31 Mar 2025
A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
Leander Girrbach
Stephan Alaniz
Genevieve Smith
Zeynep Akata
143
0
0
30 Mar 2025
An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval
Min Cao
Ziyin Zeng
YuXin Lu
Mang Ye
Dong Yi
Jinqiao Wang
SyDa
89
0
0
28 Mar 2025
EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation
Hadrien Reynaud
Alberto Gomez
Paul Leeson
Qingjie Meng
Bernhard Kainz
MedIm
82
2
0
28 Mar 2025
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation
Minho Park
S. Park
Jungsoo Lee
Hyojin Park
Kyuwoong Hwang
Fatih Porikli
Jaegul Choo
Sungha Choi
77
0
0
28 Mar 2025
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Jaywon Koo
J. Hernandez
Moayed Haji-Ali
Ziyan Yang
Vicente Ordonez
EGVM
119
0
0
27 Mar 2025
LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing
Achint Soni
Meet Soni
Sirisha Rambhatla
DiffM
104
0
0
27 Mar 2025
StyledStreets: Multi-style Street Simulator with Spatial and Temporal Consistency
Yuyin Chen
Yida Wang
Xinyu Zhang
Kun Zhan
Peng Jia
Yifei Zhan
Xianpeng Lang
3DGS
89
1
0
27 Mar 2025
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models
Prin Phunyaphibarn
Phillip Y. Lee
Jaihoon Kim
Minhyuk Sung
DiffM
184
1
0
26 Mar 2025
TD-BFR: Truncated Diffusion Model for Efficient Blind Face Restoration
Ziying Zhang
Xiang Gao
Zhixin Wang
Q. Hu
Xiaoyun Zhang
DiffM
109
1
0
26 Mar 2025
Contrastive Learning Guided Latent Diffusion Model for Image-to-Image Translation
Qi Si
Bo Wang
Zhao Zhang
107
0
0
26 Mar 2025
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
Shijie Zhou
Hui Ren
Yijia Weng
Shuwang Zhang
Zhen Wang
...
Zhiwen Fan
Suya You
Ziyi Wang
Leonidas Guibas
A. Kadambi
VGen
3DGS
148
3
0
26 Mar 2025
EditCLIP: Representation Learning for Image Editing
Qian Wang
Aleksandar Cvejic
Abdelrahman Eldesokey
Peter Wonka
99
0
0
26 Mar 2025
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
Zhiqiang Zhang
Jia-Nan Li
Zunnan Xu
Hanhui Li
Yiji Cheng
Fa-Ting Hong
Qin Lin
Qinglin Lu
Xiaodan Liang
DiffM
140
2
0
25 Mar 2025
ICE: Intrinsic Concept Extraction from a Single Image via Diffusion Models
Fernando Julio Cendra
Kai Han
VLM
136
0
0
25 Mar 2025
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy
Zhi Hou
Tianyi Zhang
Yuwen Xiong
Haonan Duan
Hengjun Pu
...
Chengyang Zhao
X. Zhu
Yu Qiao
Jifeng Dai
Yuxiao Chen
139
6
0
25 Mar 2025
FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing
Yufan Ren
Zicong Jiang
Tong Zhang
Søren Forchhammer
Sabine Süsstrunk
DiffM
96
0
0
24 Mar 2025
DiffV2IR: Visible-to-Infrared Diffusion Model via Vision-Language Understanding
Lingyan Ran
Lidong Wang
Guangcong Wang
Peng Wang
Yize Zhang
88
0
0
24 Mar 2025
RomanTex: Decoupling 3D-aware Rotary Positional Embedded Multi-Attention Network for Texture Synthesis
Yifei Feng
M. Yang
Steve Yang
Sheng Zhang
Jianwei Yu
Zibo Zhao
Yuhong Liu
Jie Jiang
Chunchao Guo
DiffM
110
2
0
24 Mar 2025
Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models
Jinho Jeong
Sangmin Han
Jinwoo Kim
Seon Joo Kim
72
1
0
24 Mar 2025
Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning
Sherry X. Chen
Misha Sra
Pradeep Sen
130
0
0
24 Mar 2025
Target-Aware Video Diffusion Models
Taeksoo Kim
Hanbyul Joo
DiffM
VGen
169
1
0
24 Mar 2025
SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction
Zhengyuan Li
Kai Cheng
Anindita Ghosh
Uttaran Bhattacharya
Liangyan Gui
Aniket Bera
DiffM
VGen
100
1
0
23 Mar 2025
MotionDiff: Training-free Zero-shot Interactive Motion Editing via Flow-assisted Multi-view Diffusion
Yikun Ma
Yiqing Li
Jiawei Wu
Xing Luo
Zhi Jin
DiffM
VGen
150
0
0
22 Mar 2025
Guidance Free Image Editing via Explicit Conditioning
Mehdi Noroozi
Alberto Gil C. P. Ramos
Luca Morreale
Ruchika Chavhan
Malcolm Chadwick
Abhinav Mehrotra
Sourav Bhattacharya
DiffM
119
0
0
22 Mar 2025
good4cir: Generating Detailed Synthetic Captions for Composed Image Retrieval
Pranavi Kolouju
Eric Xing
Robert Pless
Nathan Jacobs
Abby Stylianou
3DV
84
0
0
22 Mar 2025
InstructVEdit: A Holistic Approach for Instructional Video Editing
Chi Zhang
C. Feng
Feng Yan
Qiming Zhang
Mingjin Zhang
Yujie Zhong
Jing Zhang
Lin Ma
DiffM
VGen
88
1
0
22 Mar 2025
What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models
Keyon Vafa
Sarah Bentley
Jon M. Kleinberg
S. Mullainathan
70
2
0
21 Mar 2025
Enabling Versatile Controls for Video Diffusion Models
Xu Zhang
Hao Zhou
Haoming Qin
Xiaobin Lu
Jiaxing Yan
Guanzhong Wang
Zeyu Chen
Yi Liu
DiffM
VGen
96
1
0
21 Mar 2025
Enhancing Product Search Interfaces with Sketch-Guided Diffusion and Language Agents
Edward Sun
DiffM
70
0
0
21 Mar 2025
MagicColor: Multi-Instance Sketch Colorization
Yize Zhang
Yue Ma
Bingyuan Wang
Qifeng Chen
Zeyu Wang
DiffM
129
4
0
21 Mar 2025
Controlling Avatar Diffusion with Learnable Gaussian Embedding
Xuan Gao
Jingtao Zhou
Dongyu Liu
Yuqi Zhou
Juyong Zhang
3DGS
DiffM
79
0
0
20 Mar 2025
FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing
Tianyi Wei
Yifan Zhou
DongDong Chen
Xingang Pan
131
1
0
20 Mar 2025