InstructPix2Pix: Learning to Follow Image Editing Instructions
arXiv: 2211.09800

17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
    DiffM

Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"

50 / 1,418 papers shown
Latent Diffusion Counterfactual Explanations
Karim Farid
Simon Schrodi
Max Argus
Thomas Brox
DiffM
99
14
0
10 Oct 2023
FireAct: Toward Language Agent Fine-tuning
Baian Chen
Chang Shu
Ehsan Shareghi
Nigel Collier
Karthik Narasimhan
Shunyu Yao
ALM LLMAG
177
112
0
09 Oct 2023
IPDreamer: Appearance-Controllable 3D Object Generation with Complex Image Prompts
Bo-Wen Zeng
Shanglin Li
Yutang Feng
Ling Yang
Hong Li
...
Conghui He
Wentao Zhang
Jianzhuang Liu
Baochang Zhang
Shuicheng Yan
DiffM
97
2
0
09 Oct 2023
Efficient-3DiM: Learning a Generalizable Single-image Novel-view Synthesizer in One Day
Yi Ding
Hao Tang
Jen-Hao Rick Chang
Liangchen Song
Zhangyang Wang
Liangliang Cao
DiffM
106
11
0
04 Oct 2023
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
Xichen Pan
Li Dong
Shaohan Huang
Zhiliang Peng
Wenhu Chen
Furu Wei
VLM
152
68
0
04 Oct 2023
Probing Intersectional Biases in Vision-Language Models with Counterfactual Examples
Phillip Howard
Avinash Madasu
Tiep Le
Gustavo Lujan Moreno
Vasudev Lal
VLM
58
5
0
04 Oct 2023
T$^3$Bench: Benchmarking Current Progress in Text-to-3D Generation
Yuze He
Yushi Bai
Matthieu Lin
Wang Zhao
Yubin Hu
Jenny Sheng
Ran Yi
Juanzi Li
Yong Liu
128
33
0
04 Oct 2023
Magicremover: Tuning-free Text-guided Image inpainting with Diffusion Models
Si-hang Yang
Lu Zhang
Liqian Ma
Yu Liu
JingJing Fu
You He
DiffM
55
13
0
04 Oct 2023
MagicDrive: Street View Generation with Diverse 3D Geometry Control
Ruiyuan Gao
Kai Chen
Enze Xie
Lanqing Hong
Zhenguo Li
Dit-Yan Yeung
Qiang Xu
DiffM
96
122
0
04 Oct 2023
EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods
Samyadeep Basu
Mehrdad Saberi
S. Bhardwaj
Atoosa Malemir Chegini
Daniela Massiceti
Maziar Sanjabi
S. Hu
Soheil Feizi
94
22
0
03 Oct 2023
TP2O: Creative Text Pair-to-Object Generation using Balance Swap-Sampling
Jun Li
Zedong Zhang
Jian Yang
DiffM
83
7
0
03 Oct 2023
ImagenHub: Standardizing the evaluation of conditional image generation models
Max Ku
Tianle Li
Kai Zhang
Yujie Lu
Xingyu Fu
Wenwen Zhuang
Wenhu Chen
EGVM
132
48
0
02 Oct 2023
Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code
Xu Ju
Ailing Zeng
Hao Wang
Shaoteng Liu
Qiang Xu
DiffM
127
78
0
02 Oct 2023
CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation
Kangfu Mei
M. Delbracio
Hossein Talebi
Zhengzhong Tu
Vishal M. Patel
P. Milanfar
VLM DiffM
97
15
0
02 Oct 2023
Making LLaMA SEE and Draw with SEED Tokenizer
Yuying Ge
Sijie Zhao
Ziyun Zeng
Yixiao Ge
Chen Li
Xintao Wang
Ying Shan
80
137
0
02 Oct 2023
Controlling Vision-Language Models for Multi-Task Image Restoration
Ziwei Luo
Fredrik K. Gustafsson
Zheng Zhao
Jens Sjölund
Thomas B. Schön
VLM
146
41
0
02 Oct 2023
DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models
Zhiyao Sun
Tian Lv
Sheng Ye
Matthieu Lin
Jenny Sheng
Yuhui Wen
Minjing Yu
Yong Liu
DiffM
144
48
0
30 Sep 2023
InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists
Yulu Gan
Sungwoo Park
Alexander Schubert
Anthony Philippakis
Ahmed Alaa
VLM
109
25
0
30 Sep 2023
Guiding Instruction-based Image Editing via Multimodal Large Language Models
Tsu-Jui Fu
Wenze Hu
Xianzhi Du
William Yang Wang
Yinfei Yang
Zhe Gan
114
98
0
29 Sep 2023
Leveraging Optimization for Adaptive Attacks on Image Watermarks
Nils Lukas
Abdulrahman Diaa
L. Fenaux
Florian Kerschbaum
AAML WIGM
105
27
0
29 Sep 2023
RealFill: Reference-Driven Generation for Authentic Image Completion
Luming Tang
Nataniel Ruiz
Qinghao Chu
Yuanzhen Li
Aleksander Holynski
...
Bharath Hariharan
Yael Pritch
Neal Wadhwa
Kfir Aberman
Michael Rubinstein
DiffM
89
45
0
28 Sep 2023
KV Inversion: KV Embeddings Learning for Text-Conditioned Real Image Action Editing
Jiarui Yao
Yifan Liu
Simon S. Du
Shifeng Chen
DiffM
64
24
0
28 Sep 2023
Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack
Xiaoliang Dai
Ji Hou
Chih-Yao Ma
Sam S. Tsai
Jialiang Wang
...
Roshan Sumbaly
Vignesh Ramanathan
Zijian He
Peter Vajda
Devi Parikh
VLM
91
216
0
27 Sep 2023
Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing
Kai Wang
Fei Yang
Shiqi Yang
Muhammad Atif Butt
Joost van de Weijer
DiffM
111
57
0
27 Sep 2023
FEC: Three Finetuning-free Methods to Enhance Consistency for Real Image Editing
Songyan Chen
Jiancheng Huang
DiffM
43
14
0
26 Sep 2023
Directional Texture Editing for 3D Models
Shengqi Liu
Zhuo Chen
Jin Gao
Yichao Yan
Wenhan Zhu
Jia-Ming Lyu
Xiaokang Yang
DiffM
96
0
0
26 Sep 2023
COCO-Counterfactuals: Automatically Constructed Counterfactual Examples for Image-Text Pairs
Tiep Le
Vasudev Lal
Phillip Howard
DiffM
81
30
0
23 Sep 2023
Multimodal Deep Learning for Scientific Imaging Interpretation
Abdulelah S. Alshehri
Franklin L. Lee
Shihu Wang
38
2
0
21 Sep 2023
PIE: Simulating Disease Progression via Progressive Image Editing
Kaizhao Liang
Xu Cao
Kuei-Da Liao
Tianren Gao
Wenqian Ye
Zhengyu Chen
Jianguo Cao
Tejas Nama
Jimeng Sun
MedIm AI4CE
86
5
0
21 Sep 2023
Interactive Flexible Style Transfer for Vector Graphics
Jeremy Warner
Kyu Won Kim
Bjoern Hartmann
54
9
0
20 Sep 2023
Language-driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates
Kashun Shum
Jaeyeon Kim
Binh-Son Hua
Duc Thanh Nguyen
Sai-Kit Yeung
3DH AI4CE
74
8
0
20 Sep 2023
Forgedit: Text Guided Image Editing via Learning and Forgetting
Shiwen Zhang
Shuai Xiao
Weilin Huang
DiffM
76
21
0
19 Sep 2023
Diffusion Methods for Generating Transition Paths
Luke Triplett
Jianfeng Lu
50
6
0
19 Sep 2023
Progressive Text-to-Image Diffusion with Soft Latent Direction
Yuteng Ye
Jiale Cai
Hang Zhou
Guanwen Li
Youjia Zhang
Zikai Song
Chenxing Gao
Junqing Yu
Wei Yang
104
5
0
18 Sep 2023
PoseFix: Correcting 3D Human Poses with Natural Language
Ginger Delmas
Philippe Weinzaepfel
Francesc Moreno-Noguer
Grégory Rogez
97
23
0
15 Sep 2023
Limitations of Face Image Generation
Harrison Rosenberg
Shimaa Ahmed
Guruprasad V Ramesh
Ramya Korlakai Vinayak
Kassem Fawaz
60
1
0
13 Sep 2023
ITI-GEN: Inclusive Text-to-Image Generation
Cheng Zhang
Xuanbai Chen
Siqi Chai
Chen Henry Wu
Dmitry Lagun
Thabo Beeler
Fernando de la Torre
VLM
122
58
0
11 Sep 2023
Editing 3D Scenes via Text Prompts without Retraining
Shuangkang Fang
Yufeng Wang
Yezhou Yang
Yi-Hsuan Tsai
Wenrui Ding
Shuchang Zhou
Ming-Hsuan Yang
DiffM
63
2
0
10 Sep 2023
MoEController: Instruction-based Arbitrary Image Manipulation with Mixture-of-Expert Controllers
Sijia Li
Chen Chen
H. Lu
DiffM
81
10
0
08 Sep 2023
InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
Zigang Geng
Binxin Yang
Tiankai Hang
Chen Li
Shuyang Gu
...
Jianmin Bao
Zheng Zhang
Han Hu
DongDong Chen
Baining Guo
DiffM VLM
118
107
0
07 Sep 2023
My Art My Choice: Adversarial Protection Against Unruly AI
Anthony Rhodes
Ram Bhagat
U. Ciftci
Ilke Demir
DiffM
102
4
0
06 Sep 2023
SLiMe: Segment Like Me
Aliasghar Khani
Saeid Asgari Taghanaki
Aditya Sanghi
Ali Mahdavi-Amiri
Ghassan Hamarneh
VLM
145
30
0
06 Sep 2023
Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
L. Yu
Bowen Shi
Ramakanth Pasunuru
Benjamin Muller
O. Yu. Golovneva
...
Yaniv Taigman
Maryam Fazel-Zarandi
Asli Celikyilmaz
Luke Zettlemoyer
Armen Aghajanyan
MLLM
101
142
0
05 Sep 2023
Hierarchical Masked 3D Diffusion Model for Video Outpainting
Fanda Fan
Chaoxu Guo
Litong Gong
Biao Wang
T. Ge
Yuning Jiang
Chunjie Luo
Jianfeng Zhan
DiffM VGen
85
15
0
05 Sep 2023
Iterative Multi-granular Image Editing using Diffusion Models
K. J. Joseph
Prateksha Udhayanan
Tripti Shukla
Aishwarya Agarwal
Srikrishna Karanam
Koustava Goswami
Balaji Vasan Srinivasan
DiffM
97
17
0
01 Sep 2023
VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation
Xin Li
Wenqing Chu
Ye Wu
Weihang Yuan
Fanglong Liu
Qi Zhang
Fu Li
Haocheng Feng
Errui Ding
Jingdong Wang
VGen
135
53
0
01 Sep 2023
Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations
Kilichbek Haydarov
Xiaoqian Shen
Avinash Madasu
Mahmoud Salem
Jia Li
Gamaleldin F. Elsayed
Mohamed Elhoseiny
67
4
0
30 Aug 2023
CoVR: Learning Composed Video Retrieval from Web Video Captions
Lucas Ventura
Antoine Yang
Cordelia Schmid
Gül Varol
75
21
0
28 Aug 2023
Pixel-Aware Stable Diffusion for Realistic Image Super-resolution and Personalized Stylization
Tao Yang
Rongyuan Wu
Peiran Ren
Xuansong Xie
Lei Zhang
DiffM
124
152
0
28 Aug 2023
ORES: Open-vocabulary Responsible Visual Synthesis
Minheng Ni
Chenfei Wu
Xiaodong Wang
Sheng-Siang Yin
Lijuan Wang
Zicheng Liu
Nan Duan
DiffM
75
9
0
26 Aug 2023