ResearchTrend.AI

Seedance 1.0: Exploring the Boundaries of Video Generation Models

1 July 2025
Yu Gao
Haoyuan Guo
Tuyen Hoang
Weilin Huang
Lu Jiang
Fangyuan Kong
Huixia Li
Jiashi Li
Liang Li
Xiaojie Li
Xunsong Li
Yifu Li
Shanchuan Lin
Zhijie Lin
Jiawei Liu
Shu Liu
Xiaonan Nie
Zhiwu Qing
Yuxi Ren
Li Sun
Zhi Tian
Rui Wang
Sen Wang
Guoqiang Wei
Guohong Wu
Jie Wu
Ruiqi Xia
Fei Xiao
Xuefeng Xiao
Jiangqiao Yan
Ceyuan Yang
Jianchao Yang
Runkai Yang
Tao Yang
Yihang Yang
Zilyu Ye
Xuejiao Zeng
Yan Zeng
Heng Zhang
Yang Zhao
Xiaozheng Zheng
Peihao Zhu
Jiaxin Zou
Feilong Zuo
DiffM, VGen, VLM
Main: 22 Pages
14 Figures
Bibliography: 2 Pages
Appendix: 2 Pages
Abstract

Notable breakthroughs in diffusion modeling have propelled rapid improvements in video generation, yet current foundation models still face critical challenges in simultaneously balancing prompt following, motion plausibility, and visual quality. In this report, we introduce Seedance 1.0, a high-performance and inference-efficient video foundation generation model that integrates several core technical improvements: (i) multi-source data curation augmented with precise and meaningful video captioning, enabling comprehensive learning across diverse scenarios; (ii) an efficient architecture design with a proposed training paradigm, which natively supports multi-shot generation and joint learning of both text-to-video and image-to-video tasks; (iii) carefully optimized post-training approaches leveraging fine-grained supervised fine-tuning and video-specific RLHF with multi-dimensional reward mechanisms for comprehensive performance improvements; (iv) model acceleration achieving ~10x inference speedup through multi-stage distillation strategies and system-level optimizations. Seedance 1.0 can generate a 5-second video at 1080p resolution in only 41.4 seconds (NVIDIA L20). Compared to state-of-the-art video generation models, Seedance 1.0 stands out with high-quality and fast video generation, offering superior spatiotemporal fluidity with structural stability, precise instruction adherence in complex multi-subject contexts, and native multi-shot narrative coherence with consistent subject representation.
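To put the reported latency in perspective, the short Python sketch below works through the arithmetic implied by the abstract's figures. It is an illustration, not something taken from the paper: the variable names are hypothetical, and the baseline estimate assumes that the approximate 10x speedup applies to the same 5-second, 1080p setting, which the abstract does not state explicitly.

```python
# Figures quoted in the abstract.
video_seconds = 5.0         # length of the generated clip
generation_seconds = 41.4   # reported generation time (NVIDIA L20)
speedup = 10.0              # "~10x" in the abstract, so only approximate

# Compute time spent per second of generated video.
seconds_per_video_second = generation_seconds / video_seconds

# ASSUMPTION: the ~10x speedup applies to this same 5-second 1080p setting.
# The abstract does not report a pre-acceleration time for it.
implied_baseline_seconds = generation_seconds * speedup

print(f"{seconds_per_video_second:.2f} s of compute per second of video")
print(f"about {implied_baseline_seconds:.0f} s implied without acceleration")
```

Under these assumptions, generation runs at roughly 8.3 seconds of compute per second of output video, and the un-accelerated model would need on the order of seven minutes for the same clip.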

@article{gao2025_2506.09113,
  title={Seedance 1.0: Exploring the Boundaries of Video Generation Models},
  author={Yu Gao and Haoyuan Guo and Tuyen Hoang and Weilin Huang and Lu Jiang and Fangyuan Kong and Huixia Li and Jiashi Li and Liang Li and Xiaojie Li and Xunsong Li and Yifu Li and Shanchuan Lin and Zhijie Lin and Jiawei Liu and Shu Liu and Xiaonan Nie and Zhiwu Qing and Yuxi Ren and Li Sun and Zhi Tian and Rui Wang and Sen Wang and Guoqiang Wei and Guohong Wu and Jie Wu and Ruiqi Xia and Fei Xiao and Xuefeng Xiao and Jiangqiao Yan and Ceyuan Yang and Jianchao Yang and Runkai Yang and Tao Yang and Yihang Yang and Zilyu Ye and Xuejiao Zeng and Yan Zeng and Heng Zhang and Yang Zhao and Xiaozheng Zheng and Peihao Zhu and Jiaxin Zou and Feilong Zuo},
  journal={arXiv preprint arXiv:2506.09113},
  year={2025}
}