2211.09800
Cited By
v1
v2 (latest)
InstructPix2Pix: Learning to Follow Image Editing Instructions
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
Papers citing
"InstructPix2Pix: Learning to Follow Image Editing Instructions"
50 / 1,418 papers shown
Title
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
Yoad Tewel
Rinon Gal
Dvir Samuel
Yuval Atzmon
Lior Wolf
Gal Chechik
VLM
118
9
0
11 Nov 2024
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision
Cong Wei
Zheyang Xiong
Weiming Ren
Xinrun Du
Ge Zhang
Wenhu Chen
176
28
0
11 Nov 2024
Extreme Rotation Estimation in the Wild
Hana Bezalel
Dotan Ankri
Ruojin Cai
Hadar Averbuch-Elor
146
2
0
11 Nov 2024
ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing
Jun-Kun Chen
Yu-Xiong Wang
DiffM
129
5
0
07 Nov 2024
Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation
Benito Buchheim
M. Reimann
Jürgen Döllner
63
0
0
07 Nov 2024
ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models
Ashutosh Srivastava
Tarun Ram Menta
Abhinav Java
Avadhoot Jadhav
Silky Singh
Surgan Jandial
Balaji Krishnamurthy
DiffM
74
1
0
06 Nov 2024
AutoVFX: Physically Realistic Video Editing from Natural Language Instructions
Hao-Yu Hsu
Zhi-Hao Lin
Albert Zhai
Hongchi Xia
Shenlong Wang
VGen
105
11
0
04 Nov 2024
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution
Yang Yue
Yulin Wang
Bingyi Kang
Yizeng Han
Shenzhi Wang
Shiji Song
Jiashi Feng
Gao Huang
VLM
108
26
0
04 Nov 2024
NeRF-Aug: Data Augmentation for Robotics with Neural Radiance Fields
Eric Zhu
Mara Levy
M. Gwilliam
Abhinav Shrivastava
94
0
0
04 Nov 2024
TI-PREGO: Chain of Thought and In-Context Learning for Online Mistake Detection in PRocedural EGOcentric Videos
Leonardo Plini
Luca Scofano
Edoardo De Matteis
Guido Maria D'Amely di Melendugno
Alessandro Flaborea
Andrea Sanchietti
G. Farinella
Fabio Galasso
Antonino Furnari
LRM
EgoV
116
1
0
04 Nov 2024
Towards Small Object Editing: A Benchmark Dataset and A Training-Free Approach
Qihe Pan
Zhen Zhao
Zicheng Wang
Sifan Long
Yiming Wu
Wei Ji
Haoran Liang
Ronghua Liang
56
2
0
03 Nov 2024
MultiPull: Detailing Signed Distance Functions by Pulling Multi-Level Queries at Multi-Step
Takeshi Noda
C. L. P. Chen
Weiqi Zhang
Xinhai Liu
Yuhang Liu
Zhizhong Han
3DPC
117
8
0
02 Nov 2024
X-Drive: Cross-modality consistent multi-sensor data synthesis for driving scenarios
Yichen Xie
Chenfeng Xu
C-T.John Peng
Shuqi Zhao
Nhat Ho
Alexander T. Pham
Mingyu Ding
Masayoshi Tomizuka
Weidong Zhan
DiffM
92
3
0
02 Nov 2024
Cityscape-Adverse: Benchmarking Robustness of Semantic Segmentation with Realistic Scene Modifications via Diffusion-Based Image Editing
Naufal Suryanto
Andro Aprila Adiputra
Ahmada Yusril Kadiptya
Thi-Thu-Huong Le
Derry Pratama
Yongsu Kim
Howon Kim
DiffM
128
0
0
01 Nov 2024
Fashion-VDM: Video Diffusion Model for Virtual Try-On
J. Karras
Yingwei Li
Nan Liu
Luyang Zhu
Innfarn Yoo
Andreas Lugmayr
Chris Lee
Ira Kemelmacher-Shlizerman
DiffM
VGen
84
7
0
31 Oct 2024
Scaling Concept With Text-Guided Diffusion Models
Chao Huang
Susan Liang
Yunlong Tang
Yapeng Tian
Anurag Kumar
Chenliang Xu
DiffM
92
6
0
31 Oct 2024
PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement
Shutong Jin
Ruiyu Wang
Kuangyi Chen
Florian T. Pokorny
76
0
0
29 Oct 2024
Adapting Diffusion Models for Improved Prompt Compliance and Controllable Image Synthesis
Deepak Sridhar
Abhishek Peri
Rohith Rachala
Nuno Vasconcelos
DiffM
64
1
0
29 Oct 2024
AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion models
Yaopei Zeng
Yuanpu Cao
Bochuan Cao
Yurui Chang
Jinghui Chen
Lu Lin
DiffM
92
3
0
28 Oct 2024
Novel Object Synthesis via Adaptive Text-Image Harmony
Zeren Xiong
Zedong Zhang
Zikun Chen
Shuo Chen
Xianrui Li
Gan Sun
Jian Yang
Jun Li
DiffM
95
4
0
28 Oct 2024
GHIL-Glue: Hierarchical Control with Filtered Subgoal Images
Kyle Hatch
Ashwin Balakrishna
Oier Mees
Suraj Nair
Seohong Park
...
Masha Itkina
Benjamin Eysenbach
Sergey Levine
Thomas Kollar
Benjamin Burchfiel
119
4
0
26 Oct 2024
ArCSEM: Artistic Colorization of SEM Images via Gaussian Splatting
Takuma Nishimura
Andreea Dogaru
Martin Oeggerli
Bernhard Egger
70
0
0
25 Oct 2024
BIFRÖST: 3D-Aware Image compositing with Language Instructions
Lingxiao Li
Kaixiong Gong
Weihong Li
Xili Dai
Tao Chen
Xiaojun Yuan
Xiangyu Yue
101
2
0
24 Oct 2024
Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing
Haonan Lin
Mengmeng Wang
Jiahao Wang
Wenbin An
Yan Chen
Yong Liu
Feng Tian
Guang Dai
Jingdong Wang
Qianying Wang
DiffM
87
12
0
24 Oct 2024
ChatSearch: a Dataset and a Generative Retrieval Model for General Conversational Image Retrieval
Zijia Zhao
Longteng Guo
Tongtian Yue
Erdong Hu
Shuai Shao
Zehuan Yuan
Hua Huang
Qingbin Liu
48
3
0
24 Oct 2024
Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances
Shilin Lu
Zihan Zhou
Jiayou Lu
Yuanzhi Zhu
A. Kong
WIGM
145
15
0
24 Oct 2024
Neural Cover Selection for Image Steganography
Karl Chahine
Hyeji Kim
DiffM
86
0
0
23 Oct 2024
WorldSimBench: Towards Video Generation Models as World Simulators
Yiran Qin
Zhelun Shi
Jiwen Yu
Xijun Wang
Enshen Zhou
...
Lu Sheng
Jing Shao
Junlin Wu
Wanli Ouyang
Ruimao Zhang
EGVM
VGen
218
477
0
23 Oct 2024
One-Step Diffusion Distillation through Score Implicit Matching
Weijian Luo
Zemin Huang
Zhengyang Geng
J. Zico Kolter
Guo-Jun Qi
DiffM
92
21
0
22 Oct 2024
Progressive Compositionality in Text-to-Image Generative Models
Xu Han
Linghao Jin
Xiaofeng Liu
Paul Pu Liang
CoGe
149
4
0
22 Oct 2024
DocEdit-v2: Document Structure Editing Via Multimodal LLM Grounding
Manan Suri
Puneet Mathur
Franck Dernoncourt
R. Jain
Vlad I. Morariu
Ramit Sawhney
Preslav Nakov
Dinesh Manocha
114
3
0
21 Oct 2024
A roadmap for generative mapping: unlocking the power of generative AI for map-making
Sidi Wu
Katharina Henggeler
Yizi Chen
L. Hurni
31
1
0
21 Oct 2024
MedDiff-FM: A Diffusion-based Foundation Model for Versatile Medical Image Applications
Yongrui Yu
Yannian Gu
Shanghang Zhang
Xiaofan Zhang
MedIm
121
2
0
20 Oct 2024
DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer
Ying Hu
Chenyi Zhuang
Pan Gao
DiffM
55
1
0
19 Oct 2024
SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning
Zhewei Dai
Shilei Zeng
Haotian Liu
Xurui Li
Feng Xue
Yu Zhou
DiffM
64
3
0
19 Oct 2024
HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation
Bo Cheng
Yuhang Ma
Liebucha Wu
Shanyuan Liu
Ao Ma
Xiaoyu Wu
Dawei Leng
Yuhui Yin
DiffM
54
13
0
18 Oct 2024
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation
Rongyao Fang
Chengqi Duan
Kun Wang
Hao Li
H. Tian
Xingyu Zeng
Rui Zhao
Jifeng Dai
Hongsheng Li
Xihui Liu
MLLM
124
15
0
17 Oct 2024
AdaptiveDrag: Semantic-Driven Dragging on Diffusion-Based Image Editing
DuoSheng Chen
Binghui Chen
Yifeng Geng
Liefeng Bo
DiffM
86
1
0
16 Oct 2024
Imagine2Servo: Intelligent Visual Servoing with Diffusion-Driven Goal Generation for Robotic Tasks
Pranjali Pathre
Gunjan Gupta
M. N. Qureshi
Mandyam Brunda
Samarth Brahmbhatt
K. M. Krishna
VGen
39
0
0
16 Oct 2024
SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing
Zhiyuan Zhang
Dongdong Chen
J. Liao
DiffM
122
3
0
15 Oct 2024
DreamSteerer: Enhancing Source Image Conditioned Editability using Personalized Diffusion Models
Zhengyang Yu
Zhaoyuan Yang
Jing Zhang
DiffM
96
3
0
15 Oct 2024
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Luping Liu
Chao Du
Tianyu Pang
Zehan Wang
Chongxuan Li
Dong Xu
VLM
121
8
0
15 Oct 2024
Incorporating Task Progress Knowledge for Subgoal Generation in Robotic Manipulation through Image Edits
Xuhui Kang
Yen-Ling Kuo
82
3
0
14 Oct 2024
MagicEraser: Erasing Any Objects via Semantics-Aware Control
Fan Li
Zixiao Zhang
Yi Huang
Jianzhuang Liu
Renjing Pei
Bin Shao
Songcen Xu
DiffM
77
8
0
14 Oct 2024
Learning to Customize Text-to-Image Diffusion In Diverse Context
Taewook Kim
Wei Chen
Qiang Qiu
DiffM
60
2
0
14 Oct 2024
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai
Tian-Chun Ye
Wei Chow
Enxin Song
Qing-Guo Chen
Hefei Ling
Zhen Dong
Lei Zhu
162
19
0
10 Oct 2024
AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation
Yukang Cao
Liang Pan
Kai Han
Kwan-Yee K. Wong
Ziwei Liu
VGen
129
6
0
09 Oct 2024
InstructG2I: Synthesizing Images from Multimodal Attributed Graphs
Bowen Jin
Ziqi Pang
Bingjun Guo
Yu-Xiong Wang
Jiaxuan You
Jiawei Han
DiffM
94
2
0
09 Oct 2024
Jointly Generating Multi-view Consistent PBR Textures using Collaborative Control
Shimon Vainer
Konstantin Kutsy
Dante De Nigris
Ciara Rowles
Slava Elizarov
Simon Donné
DiffM
110
3
0
09 Oct 2024
HFH-Font: Few-shot Chinese Font Synthesis with Higher Quality, Faster Speed, and Higher Resolution
Haoyang Li
Zhouhui Lian
101
2
0
09 Oct 2024