2211.09800
InstructPix2Pix: Learning to Follow Image Editing Instructions
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
Papers citing
"InstructPix2Pix: Learning to Follow Image Editing Instructions"
50 / 1,418 papers shown
Title
FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven by Referential Segmentation
Fan Yang
Yousong Zhu
Xin Li
Yufei Zhan
Hongyin Zhao
Shurong Zheng
Yaowei Wang
Ming Tang
Jinqiao Wang
MLLM
VLM
38
0
0
20 Jun 2025
VectorEdits: A Dataset and Benchmark for Instruction-Based Editing of Vector Graphics
Josef Kuchař
Marek Kadlcík
Michal Spiegel
Michal Štefánik
5
0
0
18 Jun 2025
Break Stylistic Sophon: Are We Really Meant to Confine the Imagination in Style Transfer?
Gary Song Yan
Yusen Zhang
Jinyu Zhao
Hao Zhang
Zhangping Yang
...
Tao Zhang
Yujie He
Siyuan Tian
Yao Gou
Min Li
DiffM
56
0
0
18 Jun 2025
Control and Realism: Best of Both Worlds in Layout-to-Image without Training
Bonan Li
Yinhan Hu
Songhua Liu
Xinchao Wang
DiffM
43
0
0
18 Jun 2025
Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model
Anirud Aggarwal
Abhinav Shrivastava
M. Gwilliam
52
0
0
18 Jun 2025
One-shot Face Sketch Synthesis in the Wild via Generative Diffusion Prior and Instruction Tuning
Han Wu
Junyao Li
Kangbo Zhao
Sen Zhang
Yukai Shi
Liang Lin
DiffM
16
0
0
18 Jun 2025
FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space
Black Forest Labs
Stephen Batifol
A. Blattmann
Frederic Boesel
Saksham Consul
...
Dustin Podell
Robin Rombach
Harry Saini
Axel Sauer
Luke Smith
DiffM
25
0
0
17 Jun 2025
AttentionDrag: Exploiting Latent Correlation Knowledge in Pre-trained Diffusion Models for Image Editing
Biao Yang
Muqi Huang
Yuhui Zhang
Yun Xiong
Kun Zhou
...
Shiyang Zhou
Huishuai Bao
Chuan Li
Feng Shi
Hualei Liu
DiffM
27
0
0
16 Jun 2025
Balancing Preservation and Modification: A Region and Semantic Aware Metric for Instruction-Based Image Editing
Zhuoying Li
Zhu Xu
Yuxin Peng
Yang Liu
16
0
0
15 Jun 2025
ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies
Chenglin Wang
Yucheng Zhou
Qianning Wang
Zhe Wang
Kai Zhang
CoGe
24
0
0
15 Jun 2025
LoRA-Gen: Specializing Large Language Model via Online LoRA Generation
Yicheng Xiao
Lin Song
Rui Yang
Cheng Cheng
Yixiao Ge
Xiu Li
Y. Shan
OffRL
24
0
0
13 Jun 2025
Enhance Multimodal Consistency and Coherence for Text-Image Plan Generation
Xiaoxin Lu
Ranran Haoran Zhang
Yusen Zhang
Rui Zhang
DiffM
20
0
0
13 Jun 2025
EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits
Ron Yosef
Moran Yanuka
Yonatan Bitton
Dani Lischinski
57
0
0
11 Jun 2025
How Much To Guide: Revisiting Adaptive Guidance in Classifier-Free Guidance Text-to-Vision Diffusion Models
Huixuan Zhang
Junzhe Zhang
Xiaojun Wan
38
0
0
10 Jun 2025
Revolutionizing Clinical Trials: A Manifesto for AI-Driven Transformation
M. Schaar
Richard W. Peck
E. McKinney
Jim Weatherall
Stuart Bailey
...
Rafik Salama
Christina Gunther
Francesca Frau
Antoine Pugeat
Ramon Hernandez
MedIm
69
6
0
10 Jun 2025
Dreamland: Controllable World Creation with Simulator and Generative Models
Sicheng Mo
Ziyang Leng
Leon Liu
Weizhen Wang
Honglin He
Bolei Zhou
VGen
12
0
0
09 Jun 2025
Consistent Video Editing as Flow-Driven Image-to-Video Generation
Ge Wang
Songlin Fan
Hangxu Liu
Quanjian Song
Hewei Wang
Jinfeng Xu
DiffM
VGen
29
0
0
09 Jun 2025
R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation
William Ljungbergh
Bernardo Taveira
Wenzhao Zheng
Adam Tonderski
Chensheng Peng
...
Christoffer Petersson
Michael Felsberg
Kurt Keutzer
Masayoshi Tomizuka
Wei Zhan
22
0
0
09 Jun 2025
DragNeXt: Rethinking Drag-Based Image Editing
Yuan Zhou
Junbao Zhou
Qingshan Xu
Kesen Zhao
Yuxuan Wang
Hao Fei
Richang Hong
Hanwang Zhang
DiffM
12
0
0
09 Jun 2025
Difference Inversion: Interpolate and Isolate the Difference with Token Consistency for Image Analogy Generation
H. Kim
Donghyun Kim
Suhyun Kim
DiffM
31
1
0
09 Jun 2025
PairEdit: Learning Semantic Variations for Exemplar-based Image Editing
Haoguang Lu
Jiacheng Chen
Zhenguo Yang
Aurele Tohokantche Gnanha
Fu Lee Wang
Li Qing
Xudong Mao
DiffM
26
0
0
09 Jun 2025
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning
Tianyi Bai
Yuxuan Fan
Jiantao Qiu
Fupeng Sun
Jiayi Song
Junlin Han
Zichen Liu
Conghui He
Wentao Zhang
Binhang Yuan
MLLM
VLM
26
0
0
08 Jun 2025
TV-LiVE: Training-Free, Text-Guided Video Editing via Layer Informed Vitality Exploitation
M. Kim
Dongjin Kim
Seokju Yun
Jaegul Choo
DiffM
VGen
29
0
0
08 Jun 2025
Controllable Coupled Image Generation via Diffusion Models
Chenfei Yuan
Nanshan Jia
Hangqi Li
Peter W. Glynn
Zeyu Zheng
DiffM
23
0
0
07 Jun 2025
AssetDropper: Asset Extraction via Diffusion Models with Reward-Driven Optimization
Lanjiong Li
Guanhua Zhao
Lingting Zhu
Zeyu Cai
Lequan Yu
Jian Zhang
Zeyu Wang
20
0
0
06 Jun 2025
FADE: Frequency-Aware Diffusion Model Factorization for Video Editing
Yixuan Zhu
Haolin Wang
Shilin Ma
Wenliang Zhao
Yansong Tang
Lei Chen
Jie Zhou
DiffM
VGen
52
0
0
06 Jun 2025
FlowDirector: Training-Free Flow Steering for Precise Text-to-Video Editing
Guangzhao Li
Yanming Yang
Chenxi Song
Chi Zhang
DiffM
VGen
107
0
0
05 Jun 2025
SeedEdit 3.0: Fast and High-Quality Generative Image Editing
Peng Wang
Yichun Shi
Xiaochen Lian
Zhonghua Zhai
Xin Xia
Xuefeng Xiao
Weilin Huang
Jianchao Yang
135
0
0
05 Jun 2025
PCEvolve: Private Contrastive Evolution for Synthetic Dataset Generation via Few-Shot Private Data and Generative APIs
Jianqing Zhang
Yang Liu
Jie Fu
Yang Hua
Tianyuan Zou
Jian Cao
Qiang Yang
28
0
0
04 Jun 2025
ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions
Di Chang
Mingdeng Cao
Yichun Shi
Bo Liu
Shengqu Cai
Shijie Zhou
Weilin Huang
Gordon Wetzstein
M. Soleymani
Peng Wang
DiffM
VGen
49
0
0
03 Jun 2025
RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers
Yan Gong
Yiren Song
Yicheng Li
Chenglin Li
Yin Zhang
KELM
58
0
0
03 Jun 2025
RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions
Bimsara Pathiraja
Maitreya Patel
Shivam Singh
Yezhou Yang
Chitta Baral
31
0
0
03 Jun 2025
UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
Bin Lin
Zongjian Li
Xinhua Cheng
Yuwei Niu
Yang Ye
...
Wangbo Yu
Shaodong Wang
Yunyang Ge
Yatian Pang
Li Yuan
VLM
61
0
0
03 Jun 2025
Many-for-Many: Unify the Training of Multiple Video and Image Generation and Manipulation Tasks
Tao Yang
Ruibin Li
Yangming Shi
Yuqi Zhang
Qide Dong
Haoran Cheng
Weiguo Feng
Shilei Wen
Bingyue Peng
Lei Zhang
DiffM
VGen
64
0
0
02 Jun 2025
WorldExplorer: Towards Generating Fully Navigable 3D Scenes
Manuel-Andreas Schneider
Lukas Höllein
Matthias Nießner
VGen
53
0
0
02 Jun 2025
IMAGHarmony: Controllable Image Editing with Consistent Object Quantity and Layout
Fei Shen
Xiaoyu Du
Yutong Gao
Jian Yu
Yushe Cao
Xing Lei
Jinhui Tang
DiffM
61
0
0
02 Jun 2025
Hanfu-Bench: A Multimodal Benchmark on Cross-Temporal Cultural Understanding and Transcreation
Li Zhou
Lutong Yu
Dongchu Xie
Shaohuan Cheng
Wenyan Li
Haizhou Li
VLM
66
0
0
02 Jun 2025
PromptVFX: Text-Driven Fields for Open-World 3D Gaussian Animation
Mert Kiray
Paul Uhlenbruck
Nassir Navab
Benjamin Busam
VGen
3DGS
AI4CE
33
0
0
01 Jun 2025
Multiverse Through Deepfakes: The MultiFakeVerse Dataset of Person-Centric Visual and Conceptual Manipulations
Parul Gupta
Shreya Ghosh
Tom Gedeon
Thanh-Toan Do
Abhinav Dhall
50
0
0
01 Jun 2025
Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions
Jihyoung Jang
Minwook Bae
Minji Kim
Dilek Z. Hakkani-Tür
Hyounghun Kim
22
0
0
31 May 2025
ArtiScene: Language-Driven Artistic 3D Scene Generation Through Image Intermediary
Zeqi Gu
Yin Cui
Zhaoshuo Li
Fangyin Wei
Yunhao Ge
Jinwei Gu
Ming-Yu Liu
Abe Davis
Yifan Ding
25
0
0
31 May 2025
GenSpace: Benchmarking Spatially-Aware Image Generation
Zehan Wang
Jiayang Xu
Ziang Zhang
Tianyu Pan
Chao Du
Hengshuang Zhao
Zhou Zhao
EGVM
51
0
0
30 May 2025
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering
Runnan Lu
Yuxuan Zhang
Jailing Liu
Haifa Wang
Yiren Song
DiffM
34
0
0
30 May 2025
Cora: Correspondence-aware image editing using few step diffusion
Amirhossein Almohammadi
Aryan Mikaeili
Sauradip Nag
Negar Hassanpour
Andrea Tagliasacchi
Ali Mahdavi-Amiri
DiffM
26
0
0
29 May 2025
DarkDiff: Advancing Low-Light Raw Enhancement by Retasking Diffusion Models for Camera ISP
Amber Yijia Zheng
Yu Zhang
Jun Hu
Raymond A. Yeh
Chen Chen
DiffM
28
0
0
29 May 2025
FlowAlign: Trajectory-Regularized, Inversion-Free Flow-based Image Editing
Jeongsol Kim
Yeobin Hong
Jong Chul Ye
30
0
0
29 May 2025
TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance
Keren Ye
Ignacio Garcia Dorado
Michalis Raptis
M. Delbracio
Irene Zhu
P. Milanfar
Hossein Talebi
31
0
0
29 May 2025
D-Fusion: Direct Preference Optimization for Aligning Diffusion Models with Visually Consistent Samples
Zijing Hu
Fengda Zhang
Kun Kuang
53
1
0
28 May 2025
Identity-Preserving Text-to-Image Generation via Dual-Level Feature Decoupling and Expert-Guided Fusion
Kewen Chen
Xiaobin Hu
Wenqi Ren
DiffM
51
0
0
28 May 2025
ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval
Eric Xing
Pranavi Kolouju
Robert Pless
Abby Stylianou
Nathan Jacobs
18
0
0
27 May 2025