ResearchTrend.AI
MIND-Edit: MLLM Insight-Driven Editing via Language-Vision Projection

25 May 2025
Shuyu Wang
Weiqi Li
Qian Wang
Shijie Zhao
Jian Zhang
    DiffM

Papers citing "MIND-Edit: MLLM Insight-Driven Editing via Language-Vision Projection"

22 / 22 papers shown
Step1X-Edit: A Practical Framework for General Image Editing
Shixuan Liu
Yucheng Han
Peng Xing
Fukun Yin
Rui Wang
...
Yibo Zhu
Binxing Jiao
Wei Wei
Gang Yu
Daxin Jiang
DiffM
145
9
0
24 Apr 2025
Insert Anything: Image Insertion via In-Context Editing in DiT
Wensong Song
Hong Jiang
Zongxing Yang
Ruijie Quan
Yi Yang
DiffM
87
2
0
21 Apr 2025
Transfer between Modalities with MetaQueries
Xichen Pan
Satya Narayan Shukla
Aashu Singh
Zhuokai Zhao
Shlok Kumar Mishra
...
Jiuhai Chen
Kunpeng Li
F. Xu
Ji Hou
Saining Xie
DiffM
73
12
0
08 Apr 2025
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
Nikai Du
Zhennan Chen
Zheyu Chen
Shan Gao
Xi Chen
Zhengkai Jiang
Jian Yang
Ying Tai
DiffM
58
1
0
30 Mar 2025
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning
Weiqi Li
Xinyu Zhang
Shijie Zhao
Yize Zhang
Junlin Li
Li Zhang
Jian Zhang
56
7
0
28 Mar 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
202
51
0
03 Jan 2025
HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing
Jinbin Bai
Wei Chow
L. Yang
Hefei Ling
Juncheng Billy Li
Hao Zhang
Shuicheng Yan
135
6
0
05 Dec 2024
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Qifan Yu
Wei Chow
Zhongqi Yue
Kaihang Pan
Yang Wu
Xiaoyang Wan
Juncheng Billy Li
Siliang Tang
Hao Zhang
Yueting Zhuang
DiffM
134
19
0
24 Nov 2024
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
T. Dao
Thuan Hoang Nguyen
T. Le
D. Vu
Khoi Nguyen
Cuong Pham
Anh Tran
DiffM
72
18
0
26 Aug 2024
FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing
Jue Wang
Yuxiang Lin
Tianshuo Yuan
Zhi-Qi Cheng
Xiaolong Wang
Jiao GH
Wei Chen
Xiaojiang Peng
DiffM
20
3
0
22 Aug 2024
GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing
Zhenyu Wang
Aoxue Li
Zhenguo Li
Xihui Liu
MLLM
DiffM
98
31
0
08 Jul 2024
Zero-shot Image Editing with Reference Imitation
Xi Chen
Yutong Feng
Mengting Chen
Yiyang Wang
Shilong Zhang
Yu Liu
Yujun Shen
Hengshuang Zhao
DiffM
51
24
0
11 Jun 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
134
290
0
16 May 2024
Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation
Haofeng Liu
Chenshu Xu
Yifei Yang
Lihua Zeng
Shengfeng He
DiffM
86
25
0
01 Apr 2024
DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing
Chong Mou
Xintao Wang
Jie Song
Ying Shan
Jian Zhang
DiffM
40
51
0
04 Feb 2024
360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model
Qian Wang
Weiqi Li
Chong Mou
Xinhua Cheng
Jian Zhang
VGen
77
19
0
12 Jan 2024
DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models
Chong Mou
Xintao Wang
Jie Song
Ying Shan
Jian Zhang
DiffM
73
148
0
05 Jul 2023
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
132
1,746
0
02 Aug 2022
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
703
28,659
0
26 Feb 2021
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
339
17,550
0
19 Jun 2020
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
513
10,500
0
12 Dec 2018
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
Richard Y. Zhang
Phillip Isola
Alexei A. Efros
Eli Shechtman
Oliver Wang
EGVM
302
11,610
0
11 Jan 2018