InstructPix2Pix: Learning to Follow Image Editing Instructions

17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
    DiffM

Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"

50 / 1,418 papers shown
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
Shiqi Yang
Zhi-Wei Zhong
Mengjie Zhao
Shusuke Takahashi
Masato Ishii
Takashi Shibuya
Yuki Mitsufuji
101
4
0
23 May 2024
TIGER: Text-Instructed 3D Gaussian Retrieval and Coherent Editing
Teng Xu
Jiamin Chen
Peng Chen
Youjia Zhang
Junqing Yu
Wei Yang
3DGS, DiffM
89
5
0
23 May 2024
Text Prompting for Multi-Concept Video Customization by Autoregressive Generation
D. Kothandaraman
Kihyuk Sohn
Ruben Villegas
P. Voigtlaender
Dinesh Manocha
Mohammad Babaeizadeh
VGen, DiffM
59
2
0
22 May 2024
MotionCraft: Physics-based Zero-Shot Video Generation
L. S. Aira
Antonio Montanaro
Emanuele Aiello
D. Valsesia
E. Magli
DiffM, VGen
76
14
0
22 May 2024
Enhanced Creativity and Ideation through Stable Video Synthesis
Elijah Miller
Thomas Dupont
Mingming Wang
VGen
61
1
0
22 May 2024
Personalized Residuals for Concept-Driven Text-to-Image Generation
Cusuh Ham
Matthew Fisher
James Hays
Nicholas I. Kolkin
Yuchen Liu
Richard Y. Zhang
Tobias Hinz
DiffM
62
8
0
21 May 2024
Customize Your Own Paired Data via Few-shot Way
Jinshu Chen
Bingchuan Li
Miao Hua
Panpan Xu
Qian He
DiffM
70
0
0
21 May 2024
EmoEdit: Evoking Emotions through Image Manipulation
Jingyuan Yang
Jiawei Feng
Weibin Luo
Dani Lischinski
Daniel Cohen-Or
Hui Huang
DiffM
82
2
0
21 May 2024
Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices
Nathaniel Cohen
Vladimir Kulikov
Matan Kleiner
Inbar Huberman-Spiegelglas
T. Michaeli
VGen, DiffM
71
17
0
20 May 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
170
9
0
20 May 2024
Searching Realistic-Looking Adversarial Objects For Autonomous Driving Systems
Shengxiang Sun
Shenzhe Zhu
AAML
131
0
0
19 May 2024
ReasonPix2Pix: Instruction Reasoning Dataset for Advanced Image Editing
Ying Jin
Pengyang Ling
Xiao-wen Dong
Pan Zhang
Jiaqi Wang
Dahua Lin
88
3
0
18 May 2024
Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model
Zheng Gu
Shiyuan Yang
Jing Liao
Jing Huo
Yang Gao
VLM, DiffM
77
8
0
16 May 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma
Yash Bhalgat
Brandon Smart
Shuai Chen
Xinghui Li
...
Matthias Nießner
Ian D Reid
Angel X. Chang
Iro Laina
V. Prisacariu
LRM
130
21
0
16 May 2024
Coarse or Fine? Recognising Action End States without Labels
Davide Moltisanti
Hakan Bilen
Laura Sevilla-Lara
Frank Keller
76
0
0
13 May 2024
GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting
Haodong Chen
Yongle Huang
Haojian Huang
Xiangsheng Ge
Dian Shao
DiffM
147
16
0
13 May 2024
Distilling Diffusion Models into Conditional GANs
Minguk Kang
Richard Zhang
Connelly Barnes
Sylvain Paris
Suha Kwak
Jaesik Park
Eli Shechtman
Jun-Yan Zhu
Taesung Park
117
45
0
09 May 2024
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers
Peng Gao
Le Zhuo
Ziyi Lin
Ruoyi Du
Xu Luo
...
Weicai Ye
He Tong
Jingwen He
Yu Qiao
Hongsheng Li
VGen
103
91
0
09 May 2024
DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation
Sitian Shen
Jing Xu
Yuheng Yuan
Xingyi Yang
Qiuhong Shen
Xinchao Wang
3DGS
116
3
0
09 May 2024
FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation
Xuehai He
Jian Zheng
Jacob Zhiyuan Fang
Robinson Piramuthu
Mohit Bansal
Vicente Ordonez
Gunnar Sigurdsson
Nanyun Peng
Xin Eric Wang
DiffM
98
1
0
08 May 2024
Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing
Yi Zuo
Lingling Li
Licheng Jiao
Fang Liu
Xu Liu
Wenping Ma
Shuyuan Yang
Yuwei Guo
VGen, DiffM
90
1
0
07 May 2024
SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing
Yuying Ge
Sijie Zhao
Chen Li
Yixiao Ge
Ying Shan
73
35
0
07 May 2024
Video Diffusion Models: A Survey
Andrew Melnik
Michal Ljubljanac
Cong Lu
Qi Yan
Weiming Ren
Helge J. Ritter
VGen
145
16
0
06 May 2024
Exploring Text-based Realistic Building Facades Editing Applicaiton
Jing Wang
Xin Zhang
AI4CE
79
1
0
05 May 2024
Auto-Encoding Morph-Tokens for Multimodal LLM
Kaihang Pan
Siliang Tang
Juncheng Li
Zhaoyu Fan
Wei Chow
Shuicheng Yan
Tat-Seng Chua
Yueting Zhuang
Hanwang Zhang
MLLM
80
24
0
03 May 2024
Customizing Text-to-Image Models with a Single Image Pair
Maxwell Jones
Sheng-Yu Wang
Nupur Kumari
David Bau
Jun-Yan Zhu
DiffM
98
21
0
02 May 2024
LocInv: Localization-aware Inversion for Text-Guided Image Editing
Chuanming Tang
Kai Wang
Fei Yang
Joost van de Weijer
DiffM
78
5
0
02 May 2024
TexSliders: Diffusion-Based Texture Editing in CLIP Space
Julia Guerrero-Viu
Milos Hasan
Arthur Roullier
Midhun Harikumar
Yiwei Hu
Paul Guerrero
Diego F. F. Gutierrez
B. Masiá
Valentin Deschaintre
DiffM
66
13
0
01 May 2024
RGB↔X: Image decomposition and synthesis using material- and lighting-aware diffusion models
Zheng Zeng
Valentin Deschaintre
Iliyan Georgiev
Yannick Hold-Geoffroy
Yiwei Hu
Fujun Luan
Ling-Qi Yan
Miloš Hašan
DiffM
84
47
0
01 May 2024
GraCo: Granularity-Controllable Interactive Segmentation
Yian Zhao
Kehan Li
Ze-Long Cheng
Pengchong Qiao
Xiawu Zheng
Rongrong Ji
Chang Liu
Li-ming Yuan
Jie Chen
112
9
0
01 May 2024
Streamlining Image Editing with Layered Diffusion Brushes
Peyman Gholami
Robert Xiao
DiffM
89
1
0
01 May 2024
Synthetic Image Verification in the Era of Generative AI: What Works and What Isn't There Yet
D. Tariang
Riccardo Corvi
D. Cozzolino
Giovanni Poggi
Koki Nagano
L. Verdoliva
114
8
0
30 Apr 2024
NeRF-Insert: 3D Local Editing with Multimodal Control Signals
Benet Oriol Sabat
Alessandro Achille
Matthew Trager
Stefano Soatto
69
2
0
30 Apr 2024
DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing
Minghao Chen
Iro Laina
Andrea Vedaldi
3DGS
98
29
0
29 Apr 2024
G-Refine: A General Quality Refiner for Text-to-Image Generation
Chunyi Li
Haoning Wu
Hongkun Hao
Zicheng Zhang
Tengchaun Kou
Chaofeng Chen
Lei Bai
Xiaohong Liu
Weisi Lin
Guangtao Zhai
96
4
0
29 Apr 2024
WorldGPT: Empowering LLM as Multimodal World Model
Zhiqi Ge
Hongzhe Huang
Mingze Zhou
Juncheng Li
Guoming Wang
Siliang Tang
Yueting Zhuang
77
29
0
28 Apr 2024
Paint by Inpaint: Learning to Add Image Objects by Removing Them First
Navve Wasserman
Noam Rotstein
Roy Ganz
Ron Kimmel
DiffM
135
16
0
28 Apr 2024
DM-Align: Leveraging the Power of Natural Language Instructions to Make Changes to Images
Maria Mihaela Truşcǎ
Tinne Tuytelaars
Marie-Francine Moens
DiffM
79
1
0
27 Apr 2024
SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes
Georgia Baltsou
Ioannis Sarridis
C. Koutlis
Symeon Papadopoulos
70
3
0
26 Apr 2024
ObjectAdd: Adding Objects into Image via a Training-Free Diffusion Modification Fashion
Ziyue Zhang
Mingbao Lin
Rongrong Ji
DiffM
142
3
0
26 Apr 2024
Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior
Han Wang
Xinning Chai
Yiwen Wang
Yuhong Zhang
Rong Xie
Li Song
DiffM
68
2
0
25 Apr 2024
Editable Image Elements for Controllable Synthesis
Jiteng Mu
Michael Gharbi
Richard Zhang
Eli Shechtman
Nuno Vasconcelos
Xiaolong Wang
Taesung Park
DiffM
92
9
0
24 Apr 2024
ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning
Weifeng Chen
Jiacheng Zhang
Jie Wu
Hefeng Wu
Xuefeng Xiao
Liang Lin
100
13
0
23 Apr 2024
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
Amirmojtaba Sabour
Sanja Fidler
Karsten Kreis
DiffM
96
37
0
22 Apr 2024
Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting
Weili Zeng
Yichao Yan
Qi Zhu
Zhuo Chen
Pengzhi Chu
Weiming Zhao
Xiaokang Yang
164
10
0
22 Apr 2024
U Can't Gen This? A Survey of Intellectual Property Protection Methods for Data in Generative AI
Tanja Sarcevic
Alicja Karlowicz
Rudolf Mayer
Ricardo A. Baeza-Yates
Andreas Rauber
103
7
0
22 Apr 2024
Gorgeous: Create Your Desired Character Facial Makeup from Any Ideas
Jia Wei Sii
Chee Seng Chan
DiffM
103
0
0
22 Apr 2024
A Multimodal Automated Interpretability Agent
Tamar Rott Shaham
Sarah Schwettmann
Franklin Wang
Achyuta Rajaram
Evan Hernandez
Jacob Andreas
Antonio Torralba
221
28
0
22 Apr 2024
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation
Yuying Ge
Sijie Zhao
Jinguo Zhu
Yixiao Ge
Kun Yi
Lin Song
Chen Li
Xiaohan Ding
Ying Shan
VLM
142
142
0
22 Apr 2024
LASER: Tuning-Free LLM-Driven Attention Control for Efficient Text-conditioned Image-to-Animation
Haoyu Zheng
Wenqiao Zhang
Yaoke Wang
Hao Zhou
Jiang Liu
Juncheng Li
Zheqi Lv
Siliang Tang
Yueting Zhuang
138
2
0
21 Apr 2024