U-REPA: Aligning Diffusion U-Nets to ViTs

24 March 2025
Yuchuan Tian
Hanting Chen
Mengyu Zheng
Yuchen Liang
Chao Xu
Yunhe Wang

Papers citing "U-REPA: Aligning Diffusion U-Nets to ViTs"

22 / 22 papers shown
REPA Works Until It Doesn't: Early-Stopped, Holistic Alignment Supercharges Diffusion Training
Ziqiao Wang
Wangbo Zhao
Yuhao Zhou
Zekai Li
Zhiyuan Liang
...
Pengfei Zhou
Kai Zhang
Zhangyang Wang
Kai Wang
Yang You
86
0
0
22 May 2025
Open-Sora: Democratizing Efficient Video Production for All
Zangwei Zheng
Xiangyu Peng
Tianji Yang
Chenhui Shen
Shenggui Li
Hongxin Liu
Yukun Zhou
Tianyi Li
Yang You
VGen
160
256
0
31 Dec 2024
DiC: Rethinking Conv3x3 Designs in Diffusion Models
Yuchuan Tian
Jing Han
Chengcheng Wang
Yuchen Liang
Chao Xu
Hanting Chen
DiffM
142
2
0
31 Dec 2024
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu
Sangkyung Kwak
Huiwon Jang
Jongheon Jeong
Jonathan Huang
Jinwoo Shin
Saining Xie
OCL
157
102
0
09 Oct 2024
Pyramidal Flow Matching for Efficient Video Generative Modeling
Yang Jin
Zhicheng Sun
Ningyuan Li
Kun Xu
K. Xu
...
Nan Zhuang
Quzhe Huang
Yang Song
Yadong Mu
Zhouchen Lin
VGen
140
86
0
08 Oct 2024
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
Junsong Chen
Chongjian Ge
Enze Xie
Yue Wu
Lewei Yao
Xiaozhe Ren
Zhongdao Wang
Ping Luo
Huchuan Lu
Zhenguo Li
202
123
0
07 Mar 2024
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers
Nanye Ma
Mark Goldstein
M. S. Albergo
Nicholas M. Boffi
Eric Vanden-Eijnden
Saining Xie
DiffM
126
214
0
16 Jan 2024
Fast Training of Diffusion Models with Masked Transformers
Hongkai Zheng
Weili Nie
Arash Vahdat
Anima Anandkumar
DiffM
94
73
0
15 Jun 2023
Structural Pruning for Diffusion Models
Gongfan Fang
Xinyin Ma
Xinchao Wang
96
140
0
18 May 2023
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
120
2,434
0
19 Dec 2022
All are Worth Words: A ViT Backbone for Diffusion Models
Fan Bao
Shen Nie
Kaiwen Xue
Yue Cao
Chongxuan Li
Hang Su
Jun Zhu
VLM
146
365
0
25 Sep 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
502
15,788
0
20 Dec 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM
477
7,827
0
11 Nov 2021
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal
Alex Nichol
271
7,958
0
11 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
730
6,135
0
29 Apr 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
327
2,533
0
20 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP, VLM
999
29,871
0
26 Feb 2021
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song
Jascha Narain Sohl-Dickstein
Diederik P. Kingma
Abhishek Kumar
Stefano Ermon
Ben Poole
DiffM, SyDa
367
6,586
0
26 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
682
41,483
0
22 Oct 2020
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM, DiffM
295
7,492
0
06 Oct 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
742
18,364
0
19 Jun 2020
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg, 3DV
1.9K
77,441
0
18 May 2015