Less-to-More Generalization: Unlocking More Controllability by In-Context Generation

2 April 2025
Shaojin Wu
Mengqi Huang
Wenxu Wu
Yufeng Cheng
Fei Ding
Qian He
Abstract

Although subject-driven generation has been extensively explored in image generation due to its wide applications, it still faces challenges in data scalability and subject expansibility. For the first challenge, moving from curating single-subject datasets to multi-subject ones and scaling them up is particularly difficult. For the second, most recent methods center on single-subject generation, making them hard to apply in multi-subject scenarios. In this study, we propose a highly consistent data synthesis pipeline to tackle this challenge. This pipeline harnesses the intrinsic in-context generation capabilities of diffusion transformers and generates high-consistency multi-subject paired data. Additionally, we introduce UNO, a multi-image conditioned subject-to-image model iteratively trained from a text-to-image model, which consists of progressive cross-modal alignment and universal rotary position embedding. Extensive experiments show that our method can achieve high consistency while ensuring controllability in both single-subject and multi-subject driven generation.
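The universal rotary position embedding mentioned in the abstract points at a general idea: when a diffusion transformer attends jointly over a target image and several reference images, the flattened latent grids should not share 2D position indices. The sketch below is a hypothetical illustration of that idea only, assuming 2D rotary positions over flattened latent grids; the function names and the diagonal offset scheme are assumptions for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch: assign non-overlapping 2D position indices to a target
# latent grid and several reference-image grids before applying 2D RoPE.
# make_2d_positions / build_positions are illustrative names, not UNO's API.
import torch

def make_2d_positions(h: int, w: int, row_offset: int = 0, col_offset: int = 0) -> torch.Tensor:
    """Return (h*w, 2) integer grid positions, shifted by the given offsets."""
    rows = torch.arange(h).repeat_interleave(w) + row_offset
    cols = torch.arange(w).repeat(h) + col_offset
    return torch.stack([rows, cols], dim=-1)

def build_positions(target_hw, ref_hws):
    """Target tokens keep the usual grid; each reference grid is placed at a
    diagonal offset past the previous block so indices never collide."""
    h, w = target_hw
    positions = [make_2d_positions(h, w)]
    row_off, col_off = h, w
    for rh, rw in ref_hws:
        positions.append(make_2d_positions(rh, rw, row_off, col_off))
        row_off += rh
        col_off += rw
    return torch.cat(positions, dim=0)  # (num_tokens, 2), fed to a 2D RoPE

# Example: one 32x32 target latent grid plus two 32x32 reference grids.
pos = build_positions((32, 32), [(32, 32), (32, 32)])
print(pos.shape)  # torch.Size([3072, 2])
```

Keeping reference tokens on disjoint position ranges lets the same rotary scheme handle one or many condition images without the target grid's positions being reused, which is one plausible reading of why the embedding is described as "universal" across single- and multi-subject settings.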

@article{wu2025_2504.02160,
  title={Less-to-More Generalization: Unlocking More Controllability by In-Context Generation},
  author={Shaojin Wu and Mengqi Huang and Wenxu Wu and Yufeng Cheng and Fei Ding and Qian He},
  journal={arXiv preprint arXiv:2504.02160},
  year={2025}
}