2211.09800
InstructPix2Pix: Learning to Follow Image Editing Instructions
17 November 2022
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
Papers citing "InstructPix2Pix: Learning to Follow Image Editing Instructions"
50 / 1,418 papers shown
Title
Quantifying and Enabling the Interpretability of CLIP-like Models
Avinash Madasu
Yossi Gandelsman
Vasudev Lal
Phillip Howard
VLM
99
2
0
10 Sep 2024
PoseEmbroider: Towards a 3D, Visual, Semantic-aware Human Pose Representation
Ginger Delmas
Philippe Weinzaepfel
Francesc Moreno-Noguer
Grégory Rogez
65
2
0
10 Sep 2024
NeIn: Telling What You Don't Want
Nhat-Tan Bui
Dinh-Hieu Hoang
Quoc-Huy Trinh
Minh-Triet Tran
Truong Nguyen
Susan Gauch
146
2
0
09 Sep 2024
Rethinking The Training And Evaluation of Rich-Context Layout-to-Image Generation
Jiaxin Cheng
Zixu Zhao
Tong He
Tianjun Xiao
Yicong Zhou
Zheng Zhang
DiffM
146
0
0
07 Sep 2024
DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation
Wenliang Zhao
Haolin Wang
Jie Zhou
Jiwen Lu
DiffM
54
3
0
05 Sep 2024
DiVE: DiT-based Video Generation with Enhanced Control
Junpeng Jiang
Gangyi Hong
Lijun Zhou
Enhui Ma
Hengtong Hu
...
Kaicheng Yu
Haiyang Sun
Kun Zhan
Peng Jia
Miao Zhang
VGen
DiffM
54
14
0
03 Sep 2024
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing
Vadim Titov
Madina Khalmatova
Alexandra Ivanova
Dmitry Vetrov
Aibek Alanov
DiffM
131
8
0
02 Sep 2024
COMOGen: A Controllable Text-to-3D Multi-object Generation Framework
Shaorong Sun
Shuchao Pang
Yazhou Yao
Xiaoshui Huang
69
1
0
01 Sep 2024
EraseDraw: Learning to Insert Objects by Erasing Them from Images
Alper Canberk
Maksym Bondarenko
Ege Ozguroglu
Ruoshi Liu
Carl Vondrick
DiffM
117
2
0
31 Aug 2024
Training-Free Sketch-Guided Diffusion with Latent Optimization
Sandra Zhang Ding
Jiafeng Mao
Kiyoharu Aizawa
DiffM
184
3
0
31 Aug 2024
Box2Flow: Instance-based Action Flow Graphs from Videos
Jiatong Li
Kalliopi Basioti
Vladimir Pavlovic
126
0
0
30 Aug 2024
GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models
Moreno D'Incà
E. Peruzzo
Massimiliano Mancini
Xingqian Xu
Humphrey Shi
N. Sebe
103
0
0
29 Aug 2024
TEDRA: Text-based Editing of Dynamic and Photoreal Actors
Basavaraj Sunagad
Heming Zhu
Mohit Mendiratta
Adam Kortylewski
Christian Theobalt
Marc Habermann
DiffM
99
1
0
28 Aug 2024
Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas
Fabio Quattrini
Vittorio Pippi
Silvia Cascianelli
Rita Cucchiara
72
3
0
28 Aug 2024
Alfie: Democratising RGBA Image Generation With No
Fabio Quattrini
Vittorio Pippi
Silvia Cascianelli
Rita Cucchiara
DiffM
93
6
0
27 Aug 2024
CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis
Weijia Li
Jun He
Junyan Ye
Huaping Zhong
Zhimeng Zheng
Zilong Huang
Dahua Lin
Conghui He
88
7
0
27 Aug 2024
DefectTwin: When LLM Meets Digital Twin for Railway Defect Inspection
Rahatara Ferdousi
M. Anwar Hossain
Chunsheng Yang
Abdulmotaleb El Saddik
31
3
0
26 Aug 2024
GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy
Peiyan Li
Hongtao Wu
Yan Huang
Chilam Cheang
Liang Wang
Tao Kong
VGen
95
13
0
26 Aug 2024
ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty
Xindi Wu
Dingli Yu
Yangsibo Huang
Olga Russakovsky
Sanjeev Arora
CoGe
EGVM
102
22
0
26 Aug 2024
I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing
Yiwei Ma
Jiayi Ji
Ke Ye
Weihuang Lin
Zhibin Wang
Yonghan Zheng
Qiang-feng Zhou
Xiaoshuai Sun
Rongrong Ji
130
11
0
26 Aug 2024
Avatar Concept Slider: Controllable Editing of Concepts in 3D Human Avatars
Yixuan He
Lin Geng Foo
Ajmal Mian
Hossein Rahmani
Jun Liu
Christian Theobalt
77
1
0
26 Aug 2024
Prompt-Softbox-Prompt: A free-text Embedding Control for Image Editing
Yitong Yang
Yinglin Wang
Jing Wang
Tian Zhang
DiffM
88
1
0
24 Aug 2024
Latent Space Disentanglement in Diffusion Transformers Enables Zero-shot Fine-grained Semantic Editing
Zitao Shuai
Chenwei Wu
Zhengxu Tang
Bowen Song
Liyue Shen
61
0
0
23 Aug 2024
Abstract Art Interpretation Using ControlNet
Rishabh Srivastava
Addrish Roy
26
0
0
23 Aug 2024
Diffusion-Based Visual Art Creation: A Survey and New Perspectives
Bingyuan Wang
Qifeng Chen
Zeyu Wang
114
7
0
22 Aug 2024
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning
Haoning Wu
Shaocheng Shen
Qiang Hu
Xiaoyun Zhang
Ya Zhang
Yanfeng Wang
114
11
0
20 Aug 2024
Learning Instruction-Guided Manipulation Affordance via Large Models for Embodied Robotic Tasks
Dayou Li
Chenkun Zhao
Shuo Yang
Lin Ma
Yibin Li
Wei Zhang
LM&Ro
69
1
0
20 Aug 2024
Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data
Tao Yang
Yangming Shi
Yunwen Huang
Feng Chen
Yin Zheng
Lei Zhang
DiffM
VGen
87
0
0
19 Aug 2024
ARMADA: Attribute-Based Multimodal Data Augmentation
Xiaomeng Jin
Jeonghwan Kim
Yu Zhou
Kuan-Hao Huang
Te-Lin Wu
Nanyun Peng
Heng Ji
76
2
0
19 Aug 2024
Style-Editor: Text-driven object-centric style editing
Jihun Park
Jongmin Gim
Kyoungmin Lee
Seunghun Lee
Sunghoon Im
DiffM
72
0
0
16 Aug 2024
TurboEdit: Instant text-based image editing
Zongze Wu
Nicholas I. Kolkin
Jonathan Brandt
Richard Zhang
Eli Shechtman
DiffM
86
13
0
14 Aug 2024
DeCo: Decoupled Human-Centered Diffusion Video Editing with Motion Consistency
Xiaojing Zhong
Xinyi Huang
Xiaofeng Yang
Guosheng Lin
Qingyao Wu
DiffM
77
4
0
14 Aug 2024
Connecting Dreams with Visual Brainstorming Instruction
Yasheng Sun
Bohan Li
Mingchen Zhuge
Deng-Ping Fan
Salman Khan
Fahad Shahbaz Khan
Hideki Koike
DiffM
64
0
0
14 Aug 2024
GRIF-DM: Generation of Rich Impression Fonts using Diffusion Models
Lei Kang
Fei Yang
Kai Wang
Mohamed Ali Souibgui
Lluís Gómez
Alicia Fornés
Ernest Valveny
Dimosthenis Karatzas
DiffM
78
0
0
14 Aug 2024
Controlling the World by Sleight of Hand
Sruthi Sudhakar
Ruoshi Liu
Basile Van Hoorick
Carl Vondrick
Richard Zemel
98
4
0
13 Aug 2024
EditScribe: Non-Visual Image Editing with Natural Language Verification Loops
Ruei-Che Chang
Yuxuan Liu
Lotus Zhang
Anhong Guo
DiffM
65
2
0
13 Aug 2024
Egocentric Vision Language Planning
Zhirui Fang
Ming Yang
Weishuai Zeng
Boyu Li
Junpeng Yue
Ziluo Ding
Xiu Li
Zongqing Lu
LM&Ro
69
1
0
11 Aug 2024
Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models
Qirui Jiao
Daoyuan Chen
Yilun Huang
Yaliang Li
Ying Shen
VLM
113
8
0
08 Aug 2024
InstantStyleGaussian: Efficient Art Style Transfer with 3D Gaussian Splatting
Xin-Yi Yu
Jun-Xin Yu
Li-Bo Zhou
Yan Wei
Lin-Lin Ou
3DGS
71
7
0
08 Aug 2024
IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts
Ciara Rowles
Shimon Vainer
Dante De Nigris
Slava Elizarov
Konstantin Kutsy
Simon Donné
DiffM
90
10
0
06 Aug 2024
Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations
Leo Donisch
Sigurd Schacht
Carsten Lanquillon
81
2
0
06 Aug 2024
UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model
Zhaowei Li
Wei Wang
Yiqing Cai
Xu Qi
Pengyu Wang
Dong Zhang
Hang Song
Botian Jiang
Zhida Huang
Tao Wang
AIFin
LRM
97
5
0
05 Aug 2024
ExoViP: Step-by-step Verification and Exploration with Exoskeleton Modules for Compositional Visual Reasoning
Yanjie Wang
Alan Yuille
Zhuowan Li
Zilong Zheng
LRM
123
5
0
05 Aug 2024
SAT3D: Image-driven Semantic Attribute Transfer in 3D
Zhijun Zhai
Zengmao Wang
Xiaoxiao Long
Kaixuan Zhou
Bo Du
80
0
0
03 Aug 2024
Stimulating Imagination: Towards General-purpose Object Rearrangement
Jianyang Wu
Jie Gu
Xiaokang Ma
Chu Tang
Jingmin Chen
DiffM
LM&Ro
OCL
57
0
0
03 Aug 2024
Leveraging BEV Paradigm for Ground-to-Aerial Image Synthesis
Junyan Ye
Jun He
Weijia Li
Zhutao Lv
Yi Lin
Haote Yang
Conghui He
93
0
0
03 Aug 2024
FBSDiff: Plug-and-Play Frequency Band Substitution of Diffusion Features for Highly Controllable Text-Driven Image Translation
Xiang Gao
Jiaying Liu
118
2
0
02 Aug 2024
TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models
Gilad Deutch
Rinon Gal
Daniel Garibi
Or Patashnik
Daniel Cohen-Or
DiffM
86
28
0
01 Aug 2024
MotionFix: Text-Driven 3D Human Motion Editing
Lavrentia Aravani
Alpár Cseke
Markos Diomataris
Michael J. Black
Gül Varol
VGen
DiffM
107
19
0
01 Aug 2024
Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360°
Yuxiao He
Yi Zhuang
Yuanxun Lu
Yao Yao
Siyu Zhu
Xiaofei Wu
Zixiao Zhang
Xun Cao
Hao Zhu
3DH
90
3
0
01 Aug 2024