ResearchTrend.AI


I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

7 November 2023
Shiwei Zhang
Jiayu Wang
Yingya Zhang
Kang Zhao
Hangjie Yuan
Zhanyue Qin
Xiang Wang
Deli Zhao
Jingren Zhou
    DiffM
    VGen

Papers citing "I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models"

50 / 157 papers shown
EWMBench: Evaluating Scene, Motion, and Semantic Quality in Embodied World Models
Hu Yue
Siyuan Huang
Yue Liao
Shengcong Chen
Pengfei Zhou
Liliang Chen
Maoqing Yao
Guanghui Ren
VGen
34
0
0
14 May 2025
ACT-R: Adaptive Camera Trajectories for 3D Reconstruction from Single Image
Yishuo Wang
Mingrui Zhao
Ali Mahdavi Amiri
Hao Zhang
26
0
0
13 May 2025
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Jae-Won Chung
Jiachen Liu
Jeff J. Ma
Ruofan Wu
Oh Jun Kweon
Yuxuan Xia
Zhiyu Wu
Mosharaf Chowdhury
31
0
0
09 May 2025
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Teng Hu
Zhentao Yu
Zhengguang Zhou
Sen Liang
Yuan Zhou
Qin Lin
Qinglin Lu
DiffM
VGen
57
0
0
07 May 2025
We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
Minkyu Choi
Sundar Sripada V. S.
Harsh Goel
Sahil Shah
Sandeep P. Chinchali
DiffM
VGen
91
0
0
24 Apr 2025
ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance
Ying Li
Xiaobao Wei
Xiaowei Chi
Y. K. Li
Zhongyu Zhao
Hao Wang
Ningning MA
Ming Lu
Shanghang Zhang
VGen
41
0
0
23 Apr 2025
VideoMark: A Distortion-Free Robust Watermarking Framework for Video Diffusion Models
Xuming Hu
Yiming Li
Jiajun Li
Aiwei Liu
WIGM
VGen
58
1
0
23 Apr 2025
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
Xuzhao Li
Chenming Wu
Zhao Yang
Zhihao Xu
Dingkang Liang
Wenjie Qu
Ji Wan
Jiadong Wang
VGen
67
1
0
22 Apr 2025
Visual Prompting for One-shot Controllable Video Editing without Inversion
Zhengbo Zhang
Yuxi Zhou
Duo Peng
Joo-Hwee Lim
Zhigang Tu
De Wen Soh
Lin Geng Foo
DiffM
47
1
0
19 Apr 2025
Physical Reservoir Computing in Hook-Shaped Rover Wheel Spokes for Real-Time Terrain Identification
Xiao Jin
Zihan Wang
Zhenhua Yu
Changrak Choi
Kalind Carpenter
T. Nanayakkara
40
0
0
17 Apr 2025
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
J. Xu
Yuanmin Huang
Baoqi Pei
Junlin Hou
Qingqiu Li
Guo Chen
Yuhui Zhang
Rui Feng
Weidi Xie
DiffM
51
1
0
16 Apr 2025
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
Xinyu Wang
Shiwei Zhang
Longxiang Tang
Yang Zhang
Changxin Gao
Yuehuan Wang
Nong Sang
VGen
33
0
0
15 Apr 2025
Taming Consistency Distillation for Accelerated Human Image Animation
Xinyu Wang
Shiwei Zhang
Hangjie Yuan
Yujie Wei
Yang Zhang
Changxin Gao
Yuehuan Wang
Nong Sang
VGen
32
0
0
15 Apr 2025
TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation
Ruineng Li
Daitao Xing
Huiming Sun
Yuanzhou Ha
Jinglin Shen
C. Ho
DiffM
VGen
46
0
0
11 Apr 2025
CamContextI2V: Context-aware Controllable Video Generation
Luis Denninger
Sina Mokhtarzadeh Azar
Juergen Gall
VGen
38
0
0
08 Apr 2025
Beyond Static Scenes: Camera-controllable Background Generation for Human Motion
Mingshuai Yao
Mengting Chen
Qinye Zhou
Yuyao Zhang
Ming-Yu Liu
...
Chen Ju
Shuai Xiao
Qingwen Liu
Jinsong Lan
Wangmeng Zuo
DiffM
VGen
54
1
0
01 Apr 2025
Exploring the Evolution of Physics Cognition in Video Generation: A Survey
Minghui Lin
Xiang Wang
Yansen Wang
Shu Wang
Fengqi Dai
...
Cunxiang Wang
Zhengrong Zuo
Nong Sang
Siteng Huang
Donglin Wang
EGVM
VGen
87
3
0
27 Mar 2025
Multi-Object Sketch Animation by Scene Decomposition and Motion Planning
Jingyu Liu
Zijie Xin
Yuhan Fu
Ruixiang Zhao
Bangxiang Lan
Xirong Li
39
0
0
25 Mar 2025
TransAnimate: Taming Layer Diffusion to Generate RGBA Video
Xuewei Chen
Zhimin Chen
Yiren Song
VGen
70
0
0
23 Mar 2025
RDTF: Resource-efficient Dual-mask Training Framework for Multi-frame Animated Sticker Generation
Zhiqiang Yuan
Ting Zhang
Ying Deng
Jiapei Zhang
Yeshuang Zhu
Zexi Jia
Jie Zhou
Jinchao Zhang
VGen
41
0
0
22 Mar 2025
Enabling Versatile Controls for Video Diffusion Models
Xu Zhang
Hao Zhou
Haoming Qin
Xiaobin Lu
Jiaxing Yan
Guanzhong Wang
Zeyu Chen
Yi Liu
DiffM
VGen
65
0
0
21 Mar 2025
Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks
Bhishma Dedhia
David Bourgin
Krishna Kumar Singh
Yuheng Li
Yan Kang
Zhan Xu
N. Jha
Y. Liu
DiffM
VGen
72
0
0
21 Mar 2025
FiVE: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models
Minghan Li
C. Xie
Yongpeng Wu
Lei Zhang
Hao Wu
DiffM
VGen
59
0
0
17 Mar 2025
DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image
Qi Zhao
Zhan Ma
Pan Zhou
VGen
75
0
0
13 Mar 2025
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
Yufan Deng
Xun Guo
Yanjie Wang
Jacob Zhiyuan Fang
Angtian Wang
Shenghai Yuan
Yiding Yang
Bo Liu
Haibin Huang
Chongyang Ma
DiffM
VGen
72
0
0
13 Mar 2025
Long Context Tuning for Video Generation
Yuwei Guo
Ceyuan Yang
Ziyan Yang
Zhibei Ma
Zhijie Lin
Zhenheng Yang
Dahua Lin
Lu Jiang
DiffM
VGen
76
3
0
13 Mar 2025
NIL: No-data Imitation Learning by Leveraging Pre-trained Video Diffusion Models
Mert Albaba
Chenhao Li
Markos Diomataris
Omid Taheri
Andreas Krause
M. Black
VGen
63
0
0
13 Mar 2025
Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework
Jing Wang
Fengzhuo Zhang
Xiaoli Li
Vincent Y. F. Tan
Tianyu Pang
Chao Du
Aixin Sun
Zhuoran Yang
VGen
67
1
0
12 Mar 2025
AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models
Kwan Yun
Seokhyeon Hong
Chaelin Kim
Junyong Noh
DiffM
VGen
48
0
0
11 Mar 2025
VACE: All-in-One Video Creation and Editing
Zeyinzi Jiang
Zhen Han
Chaojie Mao
J. Zhang
Yulin Pan
Yu Liu
DiffM
VGen
56
5
0
10 Mar 2025
Automated Movie Generation via Multi-Agent CoT Planning
Weijia Wu
Zeyu Zhu
Mike Zheng Shou
VGen
80
2
0
10 Mar 2025
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation
Runze Zhang
Guoguang Du
Xiaochuan Li
Qi Jia
Liang Jin
...
Zhenhua Guo
Yaqian Zhao
Xiaoli Gong
Rengang Li
Baoyu Fan
VGen
75
0
0
08 Mar 2025
FaceShot: Bring Any Character into Life
Junyao Gao
Yanan Sun
Fei Shen
Xin Jiang
Zhening Xing
Kai-xiang Chen
Cairong Zhao
CVBM
3DH
47
1
0
02 Mar 2025
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
Jie Tian
Xiaoye Qu
Zhenyi Lu
Wei Wei
Sichen Liu
Yu-Xi Cheng
DiffM
VGen
44
0
0
02 Mar 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
92
0
0
27 Feb 2025
A Survey: Spatiotemporal Consistency in Video Generation
Zhiyu Yin
Kehai Chen
Xuefeng Bai
Ruili Jiang
J. Li
Hongdong Li
Jin Liu
Yang Xiang
Jun Yu
Min Zhang
EGVM
VGen
AI4TS
62
0
0
25 Feb 2025
Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach
Yunuo Chen
Junli Cao
Anil Kag
Vidit Goel
Sergei Korolev
Chenfanfu Jiang
Sergey Tulyakov
Jian Ren
DiffM
VGen
90
1
0
05 Feb 2025
IPO: Iterative Preference Optimization for Text-to-Video Generation
Xiaomeng Yang
Zhiyu Tan
Xuecheng Nie
VGen
109
1
0
04 Feb 2025
VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking
Runyi Hu
Jun Zhang
Yongbin Li
Jiwei Li
Qing Guo
Han Qiu
Tianwei Zhang
WIGM
VGen
81
5
0
24 Jan 2025
Qffusion: Controllable Portrait Video Editing via Quadrant-Grid Attention Learning
Maomao Li
Lijian Lin
Yunfei Liu
Ye Zhu
Yu Li
DiffM
VGen
39
0
0
11 Jan 2025
MEt3R: Measuring Multi-View Consistency in Generated Images
Mohammad Asim
Christopher Wewer
Thomas Wimmer
Bernt Schiele
J. E. Lenssen
EGVM
3DGS
VGen
48
7
0
10 Jan 2025
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning
Yuzhou Huang
Ziyang Yuan
Quande Liu
Qiulin Wang
Xintao Wang
Ruimao Zhang
Pengfei Wan
Di Zhang
Kun Gai
VGen
DiffM
45
10
0
08 Jan 2025
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
Rui Xie
Yinhong Liu
Penghao Zhou
Chen Zhao
Jun Zhou
Kaicheng Zhang
Zhenru Zhang
Jian Yang
Zhengyuan Yang
Ying Tai
VGen
DiffM
41
2
0
06 Jan 2025
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Hao Fei
Shengqiong Wu
Hao Zhang
Tat-Seng Chua
Shuicheng Yan
64
39
0
31 Dec 2024
ILDiff: Generate Transparent Animated Stickers by Implicit Layout Distillation
Ting Zhang
Zhiqiang Yuan
Yeshuang Zhu
Jinchao Zhang
DiffM
101
0
0
31 Dec 2024
GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control
Mariam Hassan
Sebastian Stapf
Ahmad Rahimi
Pedro M B Rezende
Yasaman Haghighi
...
Mathieu Salzmann
Davide Scaramuzza
Marc Pollefeys
Paolo Favaro
Alexandre Alahi
VLM
VGen
77
5
0
15 Dec 2024
Dynamic Try-On: Taming Video Virtual Try-on with Dynamic Attention Mechanism
Jun Zheng
Jing Wang
Fuwei Zhao
Xujie Zhang
Xiaodan Liang
DiffM
VGen
73
0
0
13 Dec 2024
Owl-1: Omni World Model for Consistent Long Video Generation
Yuanhui Huang
Wenzhao Zheng
Yuan Gao
Xin Tao
Pengfei Wan
Di Zhang
Jie Zhou
Jiwen Lu
VGen
VLM
87
0
0
12 Dec 2024
InfinityDrive: Breaking Time Limits in Driving World Models
Xi Guo
C. Ding
Haoxuan Dou
Xin Zhang
Weixuan Tang
Wei Wu
VGen
86
5
0
02 Dec 2024
Fleximo: Towards Flexible Text-to-Human Motion Video Generation
Yuhang Zhang
Yuan Zhou
Zeyu Liu
Yuxuan Cai
Qiuyue Wang
Aidong Men
Huan Yang
VGen
DiffM
84
0
0
29 Nov 2024