InstructPix2Pix: Learning to Follow Image Editing Instructions


17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
    DiffM

Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"

50 / 1,348 papers shown
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
Junxian Ma
Shiwen Wang
Jian Yang
Junyi Hu
Jian Liang
Guosheng Lin
Jingbo Chen
Kai Li
Yu Meng
DiffM
VGen
61
3
0
17 Feb 2025
Precise Parameter Localization for Textual Generation in Diffusion Models
Łukasz Staniszewski
Bartosz Cywiński
Franziska Boenisch
Kamil Deja
Adam Dziedzic
DiffM
157
0
0
17 Feb 2025
Human-Centric Foundation Models: Perception, Generation and Agentic Modeling
Shixiang Tang
Y. Wang
Lu Chen
Yuan Wang
Sida Peng
Dan Xu
W. Ouyang
VGen
131
2
0
12 Feb 2025
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
Zhenxing Mi
Kuan-Chieh Jackson Wang
Guocheng Qian
Hanrong Ye
Runtao Liu
Sergey Tulyakov
Kfir Aberman
Dan Xu
LRM
42
0
0
12 Feb 2025
Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation
H. Seo
Wongi Jeong
Jae-sun Seo
Se Young Chun
55
0
0
12 Feb 2025
Dual Caption Preference Optimization for Diffusion Models
Amir Saeidi
Yiran Luo
Agneet Chatterjee
Shamanthak Hegde
Bimsara Pathiraja
Yezhou Yang
Chitta Baral
DiffM
53
0
0
09 Feb 2025
AdaFlow: Efficient Long Video Editing via Adaptive Attention Slimming And Keyframe Selection
Shuheng Zhang
Y. Liu
Hongbo Zhou
Jun Peng
Yiyi Zhou
Xiaoshuai Sun
Rongrong Ji
VGen
38
0
0
08 Feb 2025
CAD-Editor: A Locate-then-Infill Framework with Automated Training Data Synthesis for Text-Based CAD Editing
Yu Yuan
Shizhao Sun
Qi Liu
Jiang Bian
91
0
0
06 Feb 2025
MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation
Jinbo Xing
Long Mai
Cusuh Ham
Jiahui Huang
Aniruddha Mahapatra
Chi-Wing Fu
T. Wong
Feng Liu
DiffM
VGen
124
2
0
06 Feb 2025
Controllable Satellite-to-Street-View Synthesis with Precise Pose Alignment and Zero-Shot Environmental Control
Xianghui Ze
Zhenbo Song
Qiwei Wang
Jianfeng Lu
Yujiao Shi
46
0
0
05 Feb 2025
LoCA: Location-Aware Cosine Adaptation for Parameter-Efficient Fine-Tuning
Zhekai Du
Yinjie Min
Jingjing Li
Ke Lu
Changliang Zou
Liuhua Peng
Tingjin Chu
M. Gong
153
1
0
05 Feb 2025
Improved Training Technique for Latent Consistency Models
Quan Dao
Khanh Doan
Di Liu
Trung Le
Dimitris N. Metaxas
62
3
0
03 Feb 2025
SliderSpace: Decomposing the Visual Capabilities of Diffusion Models
Rohit Gandikota
Zongze Wu
Richard Zhang
David Bau
Eli Shechtman
Nick Kolkin
DiffM
48
1
0
03 Feb 2025
Consistent Video Colorization via Palette Guidance
Han Wang
Yuang Zhang
Yuhong Zhang
Lingxiao Lu
Li-Na Song
DiffM
VGen
86
0
0
31 Jan 2025
Inkspire: Supporting Design Exploration with Generative AI through Analogical Sketching
David Chuan-En Lin
Hyeonsu B Kang
Nikolas Martelaro
A. Kittur
Yan-Ying Chen
Matthew K. Hong
99
3
0
30 Jan 2025
CAFuser: Condition-Aware Multimodal Fusion for Robust Semantic Perception of Driving Scenes
Tim Broedermann
Christos Sakaridis
Yuqian Fu
Luc Van Gool
57
23
0
28 Jan 2025
An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control
Aosong Feng
Weikang Qiu
Jinbin Bai
Xiao Zhang
Zhen Dong
Kaicheng Zhou
Rex Ying
Leandros Tassiulas
DiffM
58
6
0
28 Jan 2025
Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings
Hossein Mirzaei
Mackenzie W. Mathis
OODD
AAML
40
2
0
28 Jan 2025
LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps
Andrey Palaev
Adil Mehmood Khan
S. M. Ahsan Kazmi
DiffM
48
0
0
23 Jan 2025
3D Object Manipulation in a Single Image using Generative Models
Ruisi Zhao
Zechuan Zhang
Zongxin Yang
Yi Yang
38
1
0
22 Jan 2025
Accelerate High-Quality Diffusion Models with Inner Loop Feedback
M. Gwilliam
Han Cai
Di Wu
Abhinav Shrivastava
Zhiyu Cheng
90
0
0
22 Jan 2025
Regressor-Guided Image Editing Regulates Emotional Response to Reduce Online Engagement
Christoph Gebhardt
Robin Willardt
Seyedmorteza Sadat
Chih-Wei Ning
Andreas Brombach
Jie Song
Otmar Hilliges
Christian Holz
65
0
0
21 Jan 2025
ComposeAnyone: Controllable Layout-to-Human Generation with Decoupled Multimodal Conditions
Shiyue Zhang
Zheng Chong
Xi Lu
Wenqing Zhang
Haoxiang Li
Xujie Zhang
Jiehui Huang
Xiao Dong
Xiaodan Liang
DiffM
42
0
0
21 Jan 2025
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
Zibo Zhao
Zeqiang Lai
Qingxiang Lin
Yunfei Zhao
Haolin Liu
...
Jingwei Huang
Chunchao Guo
Jie Jiang
111
21
0
21 Jan 2025
Disharmony: Forensics using Reverse Lighting Harmonization
P. W. Shin
Jack Sampson
Vijaykrishnan Narayanan
Andres Marquez
Mahantesh Halappanavar
DiffM
46
0
0
20 Jan 2025
Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance
Jin Zhu
Huimin Ma
Jiansheng Chen
Jian Yuan
73
4
0
20 Jan 2025
SuperNeRF-GAN: A Universal 3D-Consistent Super-Resolution Framework for Efficient and Enhanced 3D-Aware Image Synthesis
Peng Zheng
Linzhi Huang
Yizhou Yu
Y. Chang
Yilin Wang
Rui Ma
38
0
0
20 Jan 2025
SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces
Sumit Chaturvedi
Mengwei Ren
Yannick Hold-Geoffroy
Jingyuan Liu
Julie Dorsey
Zhixin Shu
DiffM
64
0
0
17 Jan 2025
IP-FaceDiff: Identity-Preserving Facial Video Editing with Diffusion
Tharun Anand
Aryan Garg
Kaushik Mitra
VGen
DiffM
47
0
0
13 Jan 2025
Focus-N-Fix: Region-Aware Fine-Tuning for Text-to-Image Generation
Xiaoying Xing
Avinab Saha
Junfeng He
Susan Hao
Paul Vicol
...
Sahil Singla
Sarah Young
Yinxiao Li
Feng Yang
Deepak Ramachandran
DiffM
48
0
0
11 Jan 2025
Qffusion: Controllable Portrait Video Editing via Quadrant-Grid Attention Learning
Maomao Li
Lijian Lin
Yunfei Liu
Ye Zhu
Yu Li
DiffM
VGen
39
0
0
11 Jan 2025
HFMF: Hierarchical Fusion Meets Multi-Stream Models for Deepfake Detection
Anant Mehta
Bryant McArthur
Nagarjuna Kolloju
Zhengzhong Tu
42
0
0
10 Jan 2025
Multi-subject Open-set Personalization in Video Generation
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Yuwei Fang
Kwot Sin Lee
Ivan Skorokhodov
Kfir Aberman
Jun-Yan Zhu
Ming Yang
Sergey Tulyakov
DiffM
VGen
69
7
0
10 Jan 2025
EditAR: Unified Conditional Generation with Autoregressive Models
Jiteng Mu
Nuno Vasconcelos
X. Wang
DiffM
38
4
0
08 Jan 2025
Instructive3D: Editing Large Reconstruction Models with Text Instructions
Kunal Kathare
Ankit Dhiman
K Vikas Gowda
Siddharth Aravindan
Shubham Monga
Basavaraja Shanthappa Vandrotti
Lokesh R. Boregowda
DiffM
31
3
0
08 Jan 2025
Edit as You See: Image-guided Video Editing via Masked Motion Modeling
Zhi-Lin Huang
Y. Liu
Chujun Qin
Z. Wang
Dong Zhou
Dong Li
E. Barsoum
DiffM
VGen
41
0
0
08 Jan 2025
Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling
Nannan Li
Kevin J. Shih
Bryan A. Plummer
DiffM
54
0
0
08 Jan 2025
SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild
Jiawei Liu
Yuanzhi Zhu
Feiyu Gao
Z. Yang
P. Wang
Junyang Lin
X. Wang
Wenyu Liu
DiffM
43
0
0
08 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Yifan Li
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
88
11
0
06 Jan 2025
ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling
Chaojie Mao
J. Zhang
Yulin Pan
Zeyinzi Jiang
Zhen Han
Yu Liu
Jingren Zhou
DiffM
46
15
0
05 Jan 2025
TDM: Temporally-Consistent Diffusion Model for All-in-One Real-World Video Restoration
Yizhou Li
Zihua Liu
Yusuke Monno
Masatoshi Okutomi
DiffM
VGen
26
1
0
04 Jan 2025
PatchRefiner V2: Fast and Lightweight Real-Domain High-Resolution Metric Depth Estimation
Zhenyu Li
Wenqing Cui
S. Bhat
Peter Wonka
MDE
36
0
0
03 Jan 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
96
48
0
03 Jan 2025
SOEDiff: Efficient Distillation for Small Object Editing
Yiming Wu
Qihe Pan
Zhen Zhao
Zicheng Wang
Sifan Long
Ronghua Liang
DiffM
60
0
0
03 Jan 2025
Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model
Omid Saghatchian
Atiyeh Gh. Moghadam
Ahmad Nickabadi
MoMe
41
1
0
03 Jan 2025
GeoDiffuser: Geometry-Based Image Editing with Diffusion Models
Rahul Sajnani
Jeroen Vanbaar
Jie Min
Kapil D. Katyal
Srinath Sridhar
DiffM
49
11
0
03 Jan 2025
RORem: Training a Robust Object Remover with Human-in-the-Loop
Ruibin Li
Tao Yang
Song Guo
L. Zhang
42
3
0
01 Jan 2025
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Hao Fei
Shengqiong Wu
H. Zhang
Tat-Seng Chua
Shuicheng Yan
64
37
0
31 Dec 2024
Grid Diffusion Models for Text-to-Video Generation
Taegyeong Lee
Soyeong Kwon
Taehwan Kim
51
5
0
31 Dec 2024
Edicho: Consistent Image Editing in the Wild
Qingyan Bai
Hao Ouyang
Yinghao Xu
Qiuyu Wang
Ceyuan Yang
Ka Leong Cheng
Yujun Shen
Qifeng Chen
DiffM
74
1
0
30 Dec 2024