2211.09800
Cited By
v1
v2 (latest)
InstructPix2Pix: Learning to Follow Image Editing Instructions
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
Papers citing
"InstructPix2Pix: Learning to Follow Image Editing Instructions"
50 / 1,418 papers shown
Title
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks
Jiarui Xu
Yossi Gandelsman
Amir Bar
Jianwei Yang
Jianfeng Gao
Trevor Darrell
Xiaolong Wang
VLM
53
3
0
04 Dec 2023
Multimodality-guided Image Style Transfer using Cross-modal GAN Inversion
Hanyu Wang
Pengxiang Wu
Kevin Dela Rosa
Chen Wang
Abhinav Shrivastava
118
9
0
04 Dec 2023
Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training
Runze He
Shaofei Huang
Xuecheng Nie
Tianrui Hui
Luoqi Liu
Jiao Dai
Jizhong Han
Guanbin Li
Si Liu
DiffM
64
8
0
04 Dec 2023
The Contemporary Art of Image Search: Iterative User Intent Expansion via Vision-Language Model
Yilin Ye
Qian Zhu
Shishi Xiao
Kang Zhang
Wei Zeng
102
4
0
04 Dec 2023
Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D
Karran Pandey
Paul Guerrero
Matheus Gadelha
Yannick Hold-Geoffroy
Karan Singh
Niloy Mitra
DiffM
80
33
0
02 Dec 2023
Sequential Modeling Enables Scalable Learning for Large Vision Models
Yutong Bai
Xinyang Geng
K. Mangalam
Amir Bar
Alan Yuille
Trevor Darrell
Jitendra Malik
Alexei A. Efros
MLLM
VLM
88
169
0
01 Dec 2023
Gaussian Grouping: Segment and Edit Anything in 3D Scenes
Mingqiao Ye
Martin Danelljan
Fisher Yu
Lei Ke
3DGS
DiffM
129
188
0
01 Dec 2023
Text-Guided 3D Face Synthesis -- From Generation to Editing
Yunjie Wu
Yapeng Meng
Zhipeng Hu
Lincheng Li
Haoqian Wu
Kun Zhou
Weiwei Xu
Xin Yu
DiffM
130
10
0
01 Dec 2023
Lasagna: Layered Score Distillation for Disentangled Object Relighting
D. Bashkirova
Arijit Ray
Rupayan Mallick
Sarah Adel Bargal
Jianming Zhang
Ranjay Krishna
Kate Saenko
86
4
0
30 Nov 2023
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models
Zhen Xing
Qi Dai
Zihao Zhang
Hui Zhang
Hang-Rui Hu
Zuxuan Wu
Yu-Gang Jiang
VGen
102
17
0
30 Nov 2023
S2ST: Image-to-Image Translation in the Seed Space of Latent Diffusion
V. Kolmogorov
Rustem Takhanov
Dani Lischinski
DiffM
86
3
0
30 Nov 2023
Exploiting Diffusion Prior for Generalizable Dense Prediction
Hsin-Ying Lee
Hung-Yu Tseng
Hsin-Ying Lee
Ming-Hsuan Yang
DiffM
MDE
96
23
0
30 Nov 2023
Motion-Conditioned Image Animation for Video Editing
Wilson Yan
Andrew Brown
Pieter Abbeel
Rohit Girdhar
S. Azadi
DiffM
VGen
131
12
0
30 Nov 2023
SocialCounterfactuals: Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples
Phillip Howard
Avinash Madasu
Tiep Le
Gustavo Lujan Moreno
Anahita Bhiwandiwalla
Vasudev Lal
123
24
0
30 Nov 2023
CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation
Zineng Tang
Ziyi Yang
Mahmoud Khademi
Yang Liu
Chenguang Zhu
Mohit Bansal
LRM
MLLM
AuLLM
127
52
0
30 Nov 2023
Detailed Human-Centric Text Description-Driven Large Scene Synthesis
Gwanghyun Kim
Dong un Kang
H. Seo
Hayeon Kim
Se Young Chun
3DV
DiffM
61
2
0
30 Nov 2023
Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing
Hyelin Nam
Gihyun Kwon
Geon Yeong Park
Jong Chul Ye
DiffM
92
29
0
30 Nov 2023
CosAvatar: Consistent and Animatable Portrait Video Tuning with Text Prompt
Haiyao Xiao
Chenglai Zhong
Xuan Gao
Yudong Guo
Juyong Zhang
73
0
0
30 Nov 2023
Non-Cross Diffusion for Semantic Consistency
Ziyang Zheng
Ruiyuan Gao
Qiang Xu
DiffM
72
2
0
30 Nov 2023
HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models
Zhonghao Wang
Wei Wei
Yang Zhao
Zhisheng Xiao
M. Hasegawa-Johnson
Humphrey Shi
Tingbo Hou
DiffM
123
12
0
30 Nov 2023
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Qidong Huang
Xiao-wen Dong
Pan Zhang
Bin Wang
Conghui He
Jiaqi Wang
Dahua Lin
Weiming Zhang
Neng H. Yu
MLLM
144
206
0
29 Nov 2023
SODA: Bottleneck Diffusion Models for Representation Learning
Drew A. Hudson
Daniel Zoran
Mateusz Malinowski
Andrew Kyle Lampinen
Andrew Jaegle
James L. McClelland
Loic Matthey
Felix Hill
Alexander Lerchner
DiffM
106
56
0
29 Nov 2023
VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models
Xiang Li
Qianli Shen
Kenji Kawaguchi
81
5
0
29 Nov 2023
SmoothVideo: Smooth Video Synthesis with Noise Constraints on Diffusion Models for One-shot Video Tuning
Liang Peng
Haoran Cheng
Zheng Yang
Ruisi Zhao
Linxuan Xia
Chaotian Song
Qinglin Lu
Boxi Wu
Wei Liu
VGen
60
2
0
29 Nov 2023
MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing
Haoyu Zhao
Tianyi Lu
Jiaxi Gu
Xing Zhang
Qingping Zheng
Zuxuan Wu
Hang Xu
Yu-Gang Jiang
VGen
DiffM
119
12
0
29 Nov 2023
Unlocking Spatial Comprehension in Text-to-Image Diffusion Models
Mohammad Mahdi Derakhshani
Menglin Xia
Harkirat Singh Behl
Cees G. M. Snoek
Victor Rühle
86
2
0
28 Nov 2023
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers
Cong Wei
Yang Chen
Haonan Chen
Hexiang Hu
Ge Zhang
Jie Fu
Alan Ritter
Wenhu Chen
88
70
0
28 Nov 2023
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
Yutong Feng
Biao Gong
Di Chen
Yujun Shen
Yu Liu
Jingren Zhou
DiffM
119
50
0
28 Nov 2023
COLE: A Hierarchical Generation Framework for Multi-Layered and Editable Graphic Design
Peidong Jia
Chenxuan Li
Yuhui Yuan
Zeyu Liu
Yichao Shen
...
Dong Chen
Ji Li
Xiaodong Xie
Shanghang Zhang
Baining Guo
67
8
0
28 Nov 2023
Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis
Xiaohui Chen
Yongfei Liu
Yingxiang Yang
Jianbo Yuan
Quanzeng You
Liping Liu
Hongxia Yang
DiffM
86
13
0
28 Nov 2023
LEDITS++: Limitless Image Editing using Text-to-Image Models
Manuel Brack
Felix Friedrich
Katharina Kornmeier
Linoy Tsaban
P. Schramowski
Kristian Kersting
Apolinário Passos
DiffM
105
76
0
28 Nov 2023
ROSO: Improving Robotic Policy Inference via Synthetic Observations
Yusuke Miyashita
Dimitris Gahtidis
Colin La
Jeremy Rabinowicz
Juxi Leitner
50
2
0
28 Nov 2023
MotionZero: Exploiting Motion Priors for Zero-shot Text-to-Video Generation
Jingkuan Song
Litao Guo
Lianli Gao
Hengtao Shen
Jingkuan Song
VGen
73
4
0
28 Nov 2023
Text-Driven Image Editing via Learnable Regions
Yuanze Lin
Yi-Wen Chen
Yi-Hsuan Tsai
Lu Jiang
Ming-Hsuan Yang
DiffM
103
20
0
28 Nov 2023
CLiC: Concept Learning in Context
Mehdi Safaee
Aryan Mikaeili
Or Patashnik
Daniel Cohen-Or
Ali Mahdavi-Amiri
84
11
0
28 Nov 2023
Self-correcting LLM-controlled Diffusion Models
Tsung-Han Wu
Long Lian
Joseph E. Gonzalez
Boyi Li
Trevor Darrell
127
67
0
27 Nov 2023
GaussianEditor: Editing 3D Gaussians Delicately with Text Instructions
Jiemin Fang
Junjie Wang
Xiaopeng Zhang
Lingxi Xie
Qi Tian
3DGS
DiffM
130
117
0
27 Nov 2023
LLMGA: Multimodal Large Language Model based Generation Assistant
Bin Xia
Shiyin Wang
Yingfan Tao
Yitong Wang
Jiaya Jia
MLLM
95
12
0
27 Nov 2023
Instruct2Attack: Language-Guided Semantic Adversarial Attacks
Jiang-Long Liu
Chen Wei
Yuxiang Guo
Heng Yu
Alan Yuille
Soheil Feizi
Chun Pong Lau
Rama Chellappa
DiffM
AAML
95
7
0
27 Nov 2023
DreamCreature: Crafting Photorealistic Virtual Creatures from Imagination
KamWoh Ng
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
DiffM
47
6
0
27 Nov 2023
Z*: Zero-shot Style Transfer via Attention Rearrangement
Yingying Deng
Xiangyu He
Fan Tang
Weiming Dong
DiffM
84
8
0
25 Nov 2023
GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting
Yiwen Chen
Zilong Chen
Chi Zhang
Feng Wang
Xiaofeng Yang
Yikai Wang
Zhongang Cai
Lei Yang
Huaping Liu
Guosheng Lin
3DGS
189
200
0
24 Nov 2023
Highly Detailed and Temporal Consistent Video Stylization via Synchronized Multi-Frame Diffusion
M. Xie
Hanyuan Liu
Chengze Li
Tien-Tsin Wong
VGen
DiffM
113
0
0
24 Nov 2023
DemoFusion: Democratising High-Resolution Image Generation With No $$$
Ruoyi Du
Dongliang Chang
Timothy M. Hospedales
Yi-Zhe Song
Zhanyu Ma
127
56
0
24 Nov 2023
Image Super-Resolution with Text Prompt Diffusion
Zheng Chen
Yulun Zhang
Jinjin Gu
Xin Yuan
Linghe Kong
Guihai Chen
Xiaokang Yang
DiffM
152
21
0
24 Nov 2023
Posterior Distillation Sampling
Juil Koo
Chanho Park
Minhyuk Sung
DiffM
114
30
0
23 Nov 2023
A Somewhat Robust Image Watermark against Diffusion-based Editing Models
Mingtian Tan
Tianhao Wang
Somesh Jha
WIGM
81
3
0
22 Nov 2023
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
Jiaxi Lv
Yi Huang
Mingfu Yan
Jiancheng Huang
Jianzhuang Liu
Yifan Liu
Yafei Wen
Xiaoxin Chen
Shifeng Chen
VGen
DiffM
119
25
0
21 Nov 2023
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models
Rohit Gandikota
Joanna Materzyńska
Tingrui Zhou
Antonio Torralba
David Bau
DiffM
116
77
0
20 Nov 2023
FrePolad: Frequency-Rectified Point Latent Diffusion for Point Cloud Generation
Chenliang Zhou
Fangcheng Zhong
Param Hanji
Zhilin Guo
Kyle Fogarty
Alejandro Sztrajman
Hongyun Gao
Cengiz Öztireli
80
3
0
20 Nov 2023