ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2211.09800
  4. Cited By
InstructPix2Pix: Learning to Follow Image Editing Instructions

InstructPix2Pix: Learning to Follow Image Editing Instructions

17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
    DiffM
ArXivPDFHTML

Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"

50 / 1,348 papers shown
Title
Omni$^2$: Unifying Omnidirectional Image Generation and Editing in an Omni Model
Omni2^22: Unifying Omnidirectional Image Generation and Editing in an Omni Model
Liu Yang
Huiyu Duan
Yucheng Zhu
Xiaohong Liu
Lu Liu
Zitong Xu
Guangji Ma
Xiongkuo Min
Guangtao Zhai
P. Callet
VLM
VGen
137
0
0
15 Apr 2025
ViMo: A Generative Visual GUI World Model for App Agent
ViMo: A Generative Visual GUI World Model for App Agent
Dezhao Luo
Bohan Tang
Kang Li
Georgios Papoudakis
Jifei Song
S. Gong
Jianye Hao
Jun Wang
Kun Shao
LM&Ro
VGen
44
0
0
15 Apr 2025
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
62
0
0
15 Apr 2025
Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes
Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes
Huijie Liu
Bingcan Wang
Jie Hu
Xiaoming Wei
Guoliang Kang
63
0
0
14 Apr 2025
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing
Taihang Hu
Linxuan Li
Kai Wang
Yaxing Wang
Jian Yang
Ming-Ming Cheng
DiffM
VGen
23
0
0
14 Apr 2025
SPICE: A Synergistic, Precise, Iterative, and Customizable Image Editing Workflow
SPICE: A Synergistic, Precise, Iterative, and Customizable Image Editing Workflow
Kenan Tang
Yanhong Li
Yao Qin
DiffM
36
0
0
13 Apr 2025
Towards Explainable Partial-AIGC Image Quality Assessment
Towards Explainable Partial-AIGC Image Quality Assessment
Jiaying Qian
Ziheng Jia
Zicheng Zhang
Zeyu Zhang
Guangtao Zhai
Xiongkuo Min
35
0
0
12 Apr 2025
POEM: Precise Object-level Editing via MLLM control
POEM: Precise Object-level Editing via MLLM control
Marco Schouten
Mehmet Onurcan Kaya
Serge Belongie
Dim P. Papadopoulos
DiffM
75
0
0
10 Apr 2025
FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation
FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation
Linyan Huang
Haonan Lin
Yanning Zhou
Kaiwen Xiao
42
0
0
10 Apr 2025
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Jiayang Sun
H. Wang
Jie Cao
Huaibo Huang
R. He
DiffM
71
0
0
10 Apr 2025
A Unified Agentic Framework for Evaluating Conditional Image Generation
A Unified Agentic Framework for Evaluating Conditional Image Generation
Jifang Wang
Xue Yang
Longyue Wang
Zhenran Xu
Y. Wang
Yaowei Wang
Weihua Luo
Kaifu Zhang
Baotian Hu
Min Zhang
EGVM
DiffM
72
0
0
09 Apr 2025
IGG: Image Generation Informed by Geodesic Dynamics in Deformation Spaces
IGG: Image Generation Informed by Geodesic Dynamics in Deformation Spaces
Nian Wu
Nivetha Jayakumar
Jiarui Xing
Miaomiao Zhang
26
0
0
09 Apr 2025
Probability Density Geodesics in Image Diffusion Latent Space
Probability Density Geodesics in Image Diffusion Latent Space
Qingtao Yu
Jaskirat Singh
Zhaoyuan Yang
Peter Tu
Jing Zhang
Hongdong Li
Richard Hartley
Dylan Campbell
DiffM
60
0
0
09 Apr 2025
Mind the Trojan Horse: Image Prompt Adapter Enabling Scalable and Deceptive Jailbreaking
Mind the Trojan Horse: Image Prompt Adapter Enabling Scalable and Deceptive Jailbreaking
Junxi Chen
Junhao Dong
Xiaohua Xie
33
0
0
08 Apr 2025
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model
Qi Mao
L. Chen
Yuchao Gu
Mike Zheng Shou
Ming-Hsuan Yang
DiffM
39
0
0
08 Apr 2025
A Training-Free Style-aligned Image Generation with Scale-wise Autoregressive Model
A Training-Free Style-aligned Image Generation with Scale-wise Autoregressive Model
Jihun Park
Jongmin Gim
Kyoungmin Lee
Minseok Oh
Minwoo Choi
Jaeyeul Kim
Woo Chool Park
Sunghoon Im
DiffM
30
0
0
08 Apr 2025
Transfer between Modalities with MetaQueries
Transfer between Modalities with MetaQueries
Xichen Pan
Satya Narayan Shukla
Aashu Singh
Zhuokai Zhao
Shlok Kumar Mishra
...
Jiuhai Chen
Kunpeng Li
F. Xu
Ji Hou
Saining Xie
DiffM
43
6
0
08 Apr 2025
D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition
D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition
Rupayan Mallick
Sibo Dong
Nataniel Ruiz
Sarah Adel Bargal
DiffM
47
0
0
08 Apr 2025
TactileNet: Bridging the Accessibility Gap with AI-Generated Tactile Graphics for Individuals with Vision Impairment
TactileNet: Bridging the Accessibility Gap with AI-Generated Tactile Graphics for Individuals with Vision Impairment
Adnan Khan
Alireza Choubineh
Mai A. Shaaban
Abbas Akkasi
Majid Komeili
DiffM
35
0
0
07 Apr 2025
Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision
Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision
Yuandong Pu
Le Zhuo
Kaiwen Zhu
Liangbin Xie
Wenlong Zhang
Xiangyu Chen
Peng Gao
Yu Qiao
Chao Dong
Yihao Liu
MLLM
61
1
0
07 Apr 2025
CREA: A Collaborative Multi-Agent Framework for Creative Content Generation with Diffusion Models
CREA: A Collaborative Multi-Agent Framework for Creative Content Generation with Diffusion Models
Kavana Venkatesh
Connor Dunlop
Pinar Yanardag
DiffM
38
0
0
07 Apr 2025
PartStickers: Generating Parts of Objects for Rapid Prototyping
PartStickers: Generating Parts of Objects for Rapid Prototyping
Mo Zhou
Josh Myers-Dean
Danna Gurari
23
0
0
07 Apr 2025
Disentangling Instruction Influence in Diffusion Transformers for Parallel Multi-Instruction-Guided Image Editing
Disentangling Instruction Influence in Diffusion Transformers for Parallel Multi-Instruction-Guided Image Editing
Hui Liu
Bin Zou
Suiyun Zhang
Kecheng Chen
Rui Liu
Haoliang Li
DiffM
64
0
0
07 Apr 2025
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
Wulin Xie
Y. Zhang
Chaoyou Fu
Yang Shi
Bingyan Nie
Hongkai Chen
Z. Zhang
Liang Wang
T. Tan
31
1
0
04 Apr 2025
F-ViTA: Foundation Model Guided Visible to Thermal Translation
F-ViTA: Foundation Model Guided Visible to Thermal Translation
Jay N. Paranjape
C. D. Melo
Vishal M. Patel
VGen
44
0
0
03 Apr 2025
Concept Lancet: Image Editing with Compositional Representation Transplant
Concept Lancet: Image Editing with Compositional Representation Transplant
Jinqi Luo
Tianjiao Ding
Kwan Ho Ryan Chan
Hancheng Min
Chris Callison-Burch
René Vidal
DiffM
KELM
72
0
0
03 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Li Yuan
MLLM
EGVM
88
8
0
03 Apr 2025
Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization
Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization
J. Wang
Jingyuan Liu
Xin Sun
Krishna Kumar Singh
Zhixin Shu
...
Nanxuan Zhao
Tuanfeng Y. Wang
Simon Chen
Ulrich Neumann
Jae Shin Yoon
27
0
0
03 Apr 2025
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
Xiangyu Zhao
Peiyuan Zhang
Kexian Tang
Hao Li
Zicheng Zhang
Guangtao Zhai
Junchi Yan
Hua Yang
Xue Yang
Haodong Duan
VLM
LRM
41
0
0
03 Apr 2025
Multi-party Collaborative Attention Control for Image Customization
Multi-party Collaborative Attention Control for Image Customization
Han Yang
Chuanguang Yang
Qiuli Wang
Zhulin An
Weilun Feng
Libo Huang
Y. Xu
DiffM
30
0
0
02 Apr 2025
Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation
Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation
Aleksander Plocharski
Jan Swidzinski
Przemyslaw Musialski
DiffM
30
0
0
02 Apr 2025
ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement
ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement
Runhui Huang
Chunwei Wang
Junwei Yang
Guansong Lu
Yunlong Yuan
...
Lu Hou
Wei Zhang
Lanqing Hong
Hengshuang Zhao
Hang Xu
MLLM
83
2
0
02 Apr 2025
FreSca: Unveiling the Scaling Space in Diffusion Models
FreSca: Unveiling the Scaling Space in Diffusion Models
Chao Huang
Susan Liang
Yunlong Tang
Li Ma
Yapeng Tian
Chenliang Xu
DiffM
48
0
0
02 Apr 2025
Scaling Prompt Instructed Zero Shot Composed Image Retrieval with Image-Only Data
Scaling Prompt Instructed Zero Shot Composed Image Retrieval with Image-Only Data
Yiqun Duan
Sameera Ramasinghe
Stephen Gould
Ajanthan Thalaiyasingam
43
0
0
01 Apr 2025
The HCI GenAI CO2ST Calculator: A Tool for Calculating the Carbon Footprint of Generative AI Use in Human-Computer Interaction Research
The HCI GenAI CO2ST Calculator: A Tool for Calculating the Carbon Footprint of Generative AI Use in Human-Computer Interaction Research
Nanna Inie
Jeanette Falk
Raghavendra Selvan
46
0
0
01 Apr 2025
Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration
Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration
Zilong Huang
Jun-Jian He
Junyan Ye
Lihan Jiang
Weijia Li
Y. Chen
Ting Han
56
0
0
01 Apr 2025
SPF-Portrait: Towards Pure Portrait Customization with Semantic Pollution-Free Fine-tuning
SPF-Portrait: Towards Pure Portrait Customization with Semantic Pollution-Free Fine-tuning
Xiaole Xian
Zhichao Liao
Qingyu Li
Wenyu Qin
Pengfei Wan
Weicheng Xie
Long Zeng
L. Shen
P. Feng
DiffM
61
0
0
01 Apr 2025
AP-CAP: Advancing High-Quality Data Synthesis for Animal Pose Estimation via a Controllable Image Generation Pipeline
AP-CAP: Advancing High-Quality Data Synthesis for Animal Pose Estimation via a Controllable Image Generation Pipeline
Lei Wang
Yujie Zhong
Xiaopeng Sun
Jingchun Cheng
C. Feng
Qiong Cao
Lin Ma
Zhaoxin Fan
44
0
0
01 Apr 2025
Training-Free Text-Guided Image Editing with Visual Autoregressive Model
Training-Free Text-Guided Image Editing with Visual Autoregressive Model
Yufei Wang
Lanqing Guo
Z. Li
Jiaxing Huang
Pichao Wang
Bihan Wen
J. Wang
DiffM
60
1
0
31 Mar 2025
AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents
AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents
Jiaxiang Chen
Jingwei Shi
Lei Gan
Jiale Zhang
Qingyu Zhang
Dongqian Zhang
Xin Pang
Zhucong Li
Yinghui Xu
LLMAG
54
0
0
31 Mar 2025
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Sudong Wang
Y. Zhang
Yao Zhu
Jianing Li
Zizhe Wang
Y. Liu
Xiangyang Ji
120
0
0
31 Mar 2025
InstructRestore: Region-Customized Image Restoration with Human Instructions
InstructRestore: Region-Customized Image Restoration with Human Instructions
S. Liu
Jianqi Ma
Lingchen Sun
Xiangtao Kong
Lei Zhang
DiffM
44
0
0
31 Mar 2025
MuseFace: Text-driven Face Editing via Diffusion-based Mask Generation Approach
MuseFace: Text-driven Face Editing via Diffusion-based Mask Generation Approach
Xin Zhang
Siting Huang
Xiangyang Luo
Yifan Xie
Weijiang Yu
Heng Chang
Fei Ma
Fei Richard Yu
DiffM
38
0
0
31 Mar 2025
A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
Leander Girrbach
Stephan Alaniz
Genevieve Smith
Zeynep Akata
40
0
0
30 Mar 2025
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation
Minho Park
S. Park
Jungsoo Lee
Hyojin Park
Kyuwoong Hwang
Fatih Porikli
Jaegul Choo
Sungha Choi
34
0
0
28 Mar 2025
EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation
EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation
Hadrien Reynaud
Alberto Gomez
Paul Leeson
Qingjie Meng
B. Kainz
MedIm
54
0
0
28 Mar 2025
An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval
An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval
Min Cao
Ziyin Zeng
YuXin Lu
Mang Ye
Dong Yi
Jinqiao Wang
SyDa
52
0
0
28 Mar 2025
LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing
LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing
Achint Soni
Meet Soni
Sirisha Rambhatla
DiffM
61
0
0
27 Mar 2025
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Jaywon Koo
J. Hernandez
Moayed Haji-Ali
Ziyan Yang
Vicente Ordonez
EGVM
72
0
0
27 Mar 2025
StyledStreets: Multi-style Street Simulator with Spatial and Temporal Consistency
StyledStreets: Multi-style Street Simulator with Spatial and Temporal Consistency
Yuyin Chen
Yida Wang
X. Zhang
Kun Zhan
Peng Jia
Yifei Zhan
Xianpeng Lang
3DGS
47
0
0
27 Mar 2025
Previous
12345...252627
Next