Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2408.14180
Cited By
I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing
26 August 2024
Yiwei Ma
Jiayi Ji
Ke Ye
Weihuang Lin
Zhibin Wang
Yonghan Zheng
Qiang-feng Zhou
Xiaoshuai Sun
Rongrong Ji
Re-assign community
ArXiv
PDF
HTML
Papers citing
"I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing"
50 / 52 papers shown
Title
What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models
Lorenzo Baraldi
Davide Bucciarelli
Federico Betti
Marcella Cornia
Lorenzo Baraldi
N. Sebe
Rita Cucchiara
200
0
0
26 May 2025
CompBench: Benchmarking Complex Instruction-guided Image Editing
Bohan Jia
Wenxuan Huang
Yuntian Tang
Junbo Qiao
Jincheng Liao
...
Lin Chen
Fei Zhao
Zihan Wang
Yuan Xie
Shaohui Lin
CoGe
125
1
0
18 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
253
0
0
05 May 2025
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
Hao Fei
Shengqiong Wu
Wei Ji
Hao Zhang
Hao Fei
Mong Li Lee
Wynne Hsu
LRM
VGen
111
78
0
08 Jan 2025
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Hao Fei
Shengqiong Wu
Hao Zhang
Tat-Seng Chua
Shuicheng Yan
160
41
0
31 Dec 2024
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
...
Xingcheng Zhang
Jifeng Dai
Yuxin Qiao
Dahua Lin
Jiaqi Wang
VLM
MLLM
87
122
0
09 Apr 2024
Yi: Open Foundation Models by 01.AI
01. AI
Alex Young
01.AI Alex Young
Bei Chen
Chao Li
...
Yue Wang
Yuxuan Cai
Zhenyu Gu
Zhiyuan Liu
Zonghong Dai
OSLM
LRM
263
562
0
07 Mar 2024
Diffusion Model-Based Image Editing: A Survey
Yi Huang
Jiancheng Huang
Yifan Liu
Mingfu Yan
Jiaxi Lv
Jianzhuang Liu
Wei Xiong
He Zhang
Liangliang Cao
Liangliang Cao
EGVM
140
98
0
27 Feb 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
179
113
0
08 Feb 2024
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model
Xiangxiang Chu
Limeng Qiao
Xinyu Zhang
Shuang Xu
Fei Wei
...
Xiaofei Sun
Yiming Hu
Xinyang Lin
Bo Zhang
Chunhua Shen
VLM
MLLM
74
105
0
06 Feb 2024
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
...
Conghui He
Xingcheng Zhang
Yu Qiao
Dahua Lin
Jiaqi Wang
VLM
MLLM
139
261
0
29 Jan 2024
InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following
Shufan Li
Harkanwar Singh
Aditya Grover
DiffM
60
8
0
11 Dec 2023
InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions
Qian Wang
Biao Zhang
Michael Birsak
Peter Wonka
DiffM
52
34
0
29 May 2023
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Wenliang Dai
Junnan Li
Dongxu Li
A. M. H. Tiong
Junqi Zhao
Weisheng Wang
Boyang Albert Li
Pascale Fung
Steven C. H. Hoi
MLLM
VLM
107
2,049
0
11 May 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
283
950
0
27 Apr 2023
RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors
Ruixia Wu
Zheng-Peng Duan
Chunle Guo
Zhi Chai
Chongyi Li
59
92
0
08 Apr 2023
Learning A Sparse Transformer Network for Effective Image Deraining
Xiang Chen
Hao‐Ran Li
Mingqiang Li
Jin-shan Pan
ViT
87
231
0
21 Mar 2023
GPT-4 Technical Report
OpenAI OpenAI
OpenAI Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
1.4K
14,359
0
15 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
426
4,563
0
30 Jan 2023
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
Su Wang
Chitwan Saharia
Ceslee Montgomery
Jordi Pont-Tuset
Shai Noy
...
Radu Soricut
Jason Baldridge
Mohammad Norouzi
Peter Anderson
William Chan
65
183
0
13 Dec 2022
SINE: SINgle Image Editing with Text-to-Image Diffusion Models
Zhixing Zhang
Ligong Han
Arna Ghosh
Dimitris N. Metaxas
Jian Ren
DiffM
111
159
0
08 Dec 2022
InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
203
1,799
0
17 Nov 2022
GENIE: Higher-Order Denoising Diffusion Solvers
Tim Dockhorn
Arash Vahdat
Karsten Kreis
DiffM
82
111
0
11 Oct 2022
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELM
ReLM
LRM
278
1,245
0
20 Sep 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
186
1,769
0
02 Aug 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
382
3,542
0
29 Apr 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
401
6,866
0
13 Apr 2022
Generative Adversarial Networks
Gilad Cohen
Raja Giryes
GAN
277
30,109
0
01 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
532
4,360
0
28 Jan 2022
Denoising Diffusion Restoration Models
Bahjat Kawar
Michael Elad
Stefano Ermon
Jiaming Song
DiffM
269
836
0
27 Jan 2022
Reflash Dropout in Image Super-Resolution
Xiangtao Kong
Xina Liu
Jinjin Gu
Yu Qiao
Chao Dong
UQCV
76
58
0
22 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
419
15,515
0
20 Dec 2021
Blended Diffusion for Text-driven Editing of Natural Images
Omri Avrahami
Dani Lischinski
Ohad Fried
DiffM
116
947
0
29 Nov 2021
From Synthetic to Real: Image Dehazing Collaborating with Unlabeled Real Data
Ye Liu
Lei Zhu
Shunda Pei
Huazhu Fu
Jing Qin
Qing Zhang
Liang Wan
Wei Feng
52
132
0
06 Aug 2021
Structured Denoising Diffusion Models in Discrete State-Spaces
Jacob Austin
Daniel D. Johnson
Jonathan Ho
Daniel Tarlow
Rianne van den Berg
DiffM
167
941
0
07 Jul 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
927
29,436
0
26 Feb 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
397
4,953
0
24 Feb 2021
Improved Denoising Diffusion Probabilistic Models
Alex Nichol
Prafulla Dhariwal
DiffM
337
3,686
0
18 Feb 2021
WDNet: Watermark-Decomposition Network for Visible Watermark Removal
Yang Liu
Zhen Zhu
X. Bai
48
49
0
14 Dec 2020
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
275
7,384
0
06 Oct 2020
Training Generative Adversarial Networks with Limited Data
Tero Karras
M. Aittala
Janne Hellsten
S. Laine
J. Lehtinen
Timo Aila
GAN
157
1,886
0
11 Jun 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
780
42,055
0
28 May 2020
Analyzing and Improving the Image Quality of StyleGAN
Tero Karras
S. Laine
M. Aittala
Janne Hellsten
J. Lehtinen
Timo Aila
GAN
284
5,815
0
03 Dec 2019
Dense Haze: A benchmark for image dehazing with dense-haze and haze-free images
C. Ancuti
Cosmin Ancuti
M. Sbert
Radu Timofte
48
299
0
05 Apr 2019
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
583
10,561
0
12 Dec 2018
Deep Retinex Decomposition for Low-Light Enhancement
Chen Wei
Wenjing Wang
Wenhan Yang
Jiaying Liu
102
1,725
0
14 Aug 2018
Self-Attention Generative Adversarial Networks
Han Zhang
Ian Goodfellow
Dimitris N. Metaxas
Augustus Odena
GAN
145
3,726
0
21 May 2018
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
Richard Y. Zhang
Phillip Isola
Alexei A. Efros
Eli Shechtman
Oliver Wang
EGVM
377
11,795
0
11 Jan 2018
Deep Multi-scale Convolutional Neural Network for Dynamic Scene Deblurring
Seungjun Nah
Tae Hyun Kim
Kyoung Mu Lee
139
1,974
0
07 Dec 2016
Least Squares Generative Adversarial Networks
Xudong Mao
Qing Li
Haoran Xie
Raymond Y. K. Lau
Zhen Wang
Stephen Paul Smolley
GAN
329
4,574
0
13 Nov 2016
1
2
Next