ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2405.08748
  4. Cited By
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with
  Fine-Grained Chinese Understanding

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

14 May 2024
Zhimin Li
Jianwei Zhang
Qin Lin
Jiangfeng Xiong
Yanxin Long
Xinchi Deng
Yingfang Zhang
Xingchao Liu
Minbin Huang
Zedong Xiao
Dayou Chen
Jiajun He
Jiahao Li
Wenyue Li
Chen Zhang
Rongwei Quan
Jianxiang Lu
Jiabin Huang
Xiaoyan Yuan
Xiao-Ting Zheng
Yixuan Li
Jihong Zhang
Chao Zhang
Mengxi Chen
Jie Liu
Zheng Fang
Weiyan Wang
Jinbao Xue
Yang-Dan Tao
Jianchen Zhu
Kai Liu
Si-Da Lin
Yifu Sun
Yun Li
Dongdong Wang
Ming-Dao Chen
Zhichao Hu
Xiao Xiao
Yan Chen
Yuhong Liu
Wei Liu
Dingyong Wang
Yong Yang
Jie Jiang
Qinglin Lu
    ViT
ArXivPDFHTML

Papers citing "Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding"

31 / 31 papers shown
Title
Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition
Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition
Zhiyuan Chen
Keyi Li
Yifan Jia
Le Ye
Yufei Ma
DiffM
35
0
0
09 May 2025
WILD: a new in-the-Wild Image Linkage Dataset for synthetic image attribution
WILD: a new in-the-Wild Image Linkage Dataset for synthetic image attribution
Pietro Bongini
S. Mandelli
Andrea Montibeller
Mirko Casu
Orazio Pontorno
...
Paolo Bestagini
Irene Amerini
F. D. De Natale
Sebastiano Battiato
Mauro Barni
VLM
83
0
0
28 Apr 2025
RepText: Rendering Visual Text via Replicating
RepText: Rendering Visual Text via Replicating
Haozhao Wang
Yongjun Xu
Yong Li
Jiajun Li
Chaowei Zhang
Jingchao Wang
Kejia Yang
Z. Chen
VLM
66
0
0
28 Apr 2025
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
Le Zhuo
Liangbing Zhao
Sayak Paul
Yue Liao
Renrui Zhang
Yi Xin
Peng Gao
Mohamed Elhoseiny
Yiming Li
VLM
75
0
0
22 Apr 2025
SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization
SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization
Liang Peng
Boxi Wu
Haoran Cheng
Yibo Zhao
Xiaofei He
36
0
0
20 Apr 2025
Low-hallucination Synthetic Captions for Large-Scale Vision-Language Model Pre-training
Low-hallucination Synthetic Captions for Large-Scale Vision-Language Model Pre-training
Xinsong Zhang
Yarong Zeng
Xinting Huang
Hu Hu
Runquan Xie
Han Hu
Zhanhui Kang
MLLM
VLM
55
0
0
17 Apr 2025
Understanding Attention Mechanism in Video Diffusion Models
Understanding Attention Mechanism in Video Diffusion Models
Bingyan Liu
Chengyu Wang
Tongtong Su
Huan Ten
Jun Huang
K. Guo
Kui Jia
VGen
64
0
0
16 Apr 2025
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Size Wu
Feiyu Xiong
Lumin Xu
Sheng Jin
Zhonghua Wu
Qingyi Tao
Wentao Liu
Wei Li
Chen Change Loy
VGen
177
2
0
27 Mar 2025
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
Zhiqiang Zhang
J. Li
Zunnan Xu
Hanhui Li
Yiji Cheng
Fa-Ting Hong
Qin Lin
Qinglin Lu
Xiaodan Liang
DiffM
73
1
0
25 Mar 2025
Unleashing Vecset Diffusion Model for Fast Shape Generation
Unleashing Vecset Diffusion Model for Fast Shape Generation
Zeqiang Lai
Yunfei Zhao
Zibo Zhao
Haolin Liu
Fuyun Wang
...
Jinwei Huang
Yuhong Liu
Jie Jiang
Chunchao Guo
Xiangyu Yue
DiffM
179
1
0
20 Mar 2025
Personalize Anything for Free with Diffusion Transformer
Personalize Anything for Free with Diffusion Transformer
Haoran Feng
Zehuan Huang
Lin Li
Hairong Lv
Lu Sheng
DiffM
87
1
0
16 Mar 2025
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification
Zhen Yang
Guibao Shen
Liang Hou
Mushui Liu
Luozhou Wang
Xin Tao
Pengfei Wan
Di Zhang
Ying-cong Chen
DiffM
79
0
0
04 Mar 2025
T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation
T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation
Lijun Li
Zhelun Shi
Xuhao Hu
Bowen Dong
Yiran Qin
Xihui Liu
Lu Sheng
Jing Shao
114
1
0
21 Feb 2025
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
Guoqing Ma
Haoyang Huang
K. Yan
L. Chen
Nan Duan
...
Yansen Wang
Yuanwei Lu
Yu-Cheng Chen
Yu-Juan Luo
Y. Luo
DiffM
VGen
175
18
0
14 Feb 2025
Matrix3D: Large Photogrammetry Model All-in-One
Matrix3D: Large Photogrammetry Model All-in-One
Yuanxun Lu
Jingyang Zhang
Tian Fang
Jean-Daniel Nahmias
Yanghai Tsin
Long Quan
Xun Cao
Yao Yao
Shiwei Li
122
4
0
11 Feb 2025
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
Zibo Zhao
Zeqiang Lai
Qingxiang Lin
Yunfei Zhao
Haolin Liu
...
Jingwei Huang
Chunchao Guo
Jie Jiang
Jingwei Huang
Chunchao Guo
113
22
0
21 Jan 2025
CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation
CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation
Zheng Chong
Wenqing Zhang
Shiyue Zhang
Jun Zheng
Xiao Dong
Haoxiang Li
Yiling Wu
D. Jiang
Xiaodan Liang
DiffM
32
1
0
20 Jan 2025
Enhancing Image Generation Fidelity via Progressive Prompts
Enhancing Image Generation Fidelity via Progressive Prompts
Zhen Xiong
Yuqi Li
Chuanguang Yang
Tiao Tan
Zhihong Zhu
Siyuan Li
Yue Ma
45
1
0
13 Jan 2025
ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling
ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling
Chaojie Mao
J. Zhang
Yulin Pan
Zeyinzi Jiang
Zhen Han
Yu Liu
Jingren Zhou
DiffM
48
15
0
05 Jan 2025
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking
  Face Generation, Customization, and Restoration
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration
Lu Liu
Huiyu Duan
Qiang Hu
Liu Yang
Chunlei Cai
Tianxiao Ye
Huayu Liu
Xiaoyun Zhang
Guangtao Zhai
EGVM
97
1
0
17 Dec 2024
AI-generated Image Detection: Passive or Watermark?
AI-generated Image Detection: Passive or Watermark?
Moyang Guo
Yuepeng Hu
Zhengyuan Jiang
Zeyu Li
Amir Sadovnik
Arka Daw
Neil Zhenqiang Gong
83
1
0
20 Nov 2024
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Luping Liu
Chao Du
Tianyu Pang
Zehan Wang
Chongxuan Li
Dong Xu
VLM
53
4
0
15 Oct 2024
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion
  Transformers
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Enze Xie
Junsong Chen
Junyu Chen
Han Cai
Haotian Tang
...
Zhekai Zhang
Muyang Li
Ligeng Zhu
Yunfan LU
Song Han
VLM
46
51
0
14 Oct 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Ping Luo
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
67
48
0
05 Aug 2024
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models
Zheng Chong
Xiao Dong
Haoxiang Li
Shiyue Zhang
Wenqing Zhang
Xujie Zhang
Hanqing Zhao
D. Jiang
Xiaodan Liang
DiffM
57
17
0
21 Jul 2024
DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation
DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation
Minbin Huang
Yanxin Long
Xinchi Deng
Ruihang Chu
Jiangfeng Xiong
Xiaodan Liang
Hong Cheng
Qinglin Lu
Wei Liu
MLLM
EGVM
65
8
0
13 Mar 2024
One-step Diffusion with Distribution Matching Distillation
One-step Diffusion with Distribution Matching Distillation
Tianwei Yin
Michael Gharbi
Richard Zhang
Eli Shechtman
Frédo Durand
William T. Freeman
Taesung Park
DiffM
124
221
0
30 Nov 2023
Adversarial Diffusion Distillation
Adversarial Diffusion Distillation
Axel Sauer
Dominik Lorenz
A. Blattmann
Robin Rombach
138
332
0
28 Nov 2023
UFOGen: You Forward Once Large Scale Text-to-Image Generation via
  Diffusion GANs
UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
Yanwu Xu
Yang Zhao
Zhisheng Xiao
Tingbo Hou
137
107
0
14 Nov 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image
  Encoders and Large Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
287
4,261
0
30 Jan 2023
Muse: Text-To-Image Generation via Masked Generative Transformers
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
197
521
0
02 Jan 2023
1