ResearchTrend.AI
Tora: Trajectory-oriented Diffusion Transformer for Video Generation

31 July 2024
Zhenghao Zhang
Junchao Liao
Menghao Li
Zuozhuo Dai
Bingxue Qiu
Hao Hu
Shaowei Cai
Weizhi Wang
    VGen

Papers citing "Tora: Trajectory-oriented Diffusion Transformer for Video Generation"

50 / 55 papers shown
SketchVideo: Sketch-based Video Generation and Editing
Feng-Lin Liu
Hongbo Fu
Xintao Wang
Weicai Ye
Pengfei Wan
Di Zhang
Lin Gao
DiffM, VGen
111
0
0
30 Mar 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
184
3
0
03 Jan 2025
Open-Sora: Democratizing Efficient Video Production for All
Zangwei Zheng
Xiangyu Peng
Tianji Yang
Chenhui Shen
Shenggui Li
Hongxin Liu
Yukun Zhou
Tianyi Li
Yang You
VGen
154
252
0
31 Dec 2024
Mojito: Motion Trajectory and Intensity Control for Video Generation
Xuehai He
Shuohang Wang
Jianwei Yang
Xiaoxia Wu
Yansen Wang
Kuan-Chieh Wang
Z. Zhan
Olatunji Ruwase
Yelong Shen
Xinze Wang
VGen
216
2
0
12 Dec 2024
Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer
Jiahao Cui
Hui Li
Yun Zhan
Hanlin Shang
K. Cheng
Yuqi Ma
Shan Mu
Hang Zhou
Jingdong Wang
Siyu Zhu
ViT, VGen
167
9
0
01 Dec 2024
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
Koichi Namekata
Sherwin Bahmani
Ziyi Wu
Yash Kant
Igor Gilitschenski
David B. Lindell
VGen
148
16
0
07 Nov 2024
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Zongyi Li
Shujie Hu
Shujie Liu
Long Zhou
Jeongsoo Choi
Lingwei Meng
Xun Guo
Jiajian Li
H. Ling
Furu Wei
VGen, DiffM
143
7
0
27 Oct 2024
WorldSimBench: Towards Video Generation Models as World Simulators
Yiran Qin
Zhelun Shi
Jiwen Yu
Xijun Wang
Enshen Zhou
...
Lu Sheng
Jing Shao
Junlin Wu
Wanli Ouyang
Ruimao Zhang
EGVM, VGen
192
471
0
23 Oct 2024
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
...
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
DiffM, VGen
233
558
0
12 Aug 2024
ReVideo: Remake a Video with Motion and Content Control
Chong Mou
Mingdeng Cao
Xintao Wang
Zhaoyang Zhang
Ying Shan
Jian Zhang
DiffM, VGen
44
31
0
22 May 2024
Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models
Fan Bao
Chendong Xiang
Gang Yue
Guande He
Hongzhou Zhu
Kaiwen Zheng
Min Zhao
Shilong Liu
Yaole Wang
Jun Zhu
VGen
179
71
0
07 May 2024
PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning
Lin Xu
Yilin Zhao
Daquan Zhou
Zhijie Lin
See Kiong Ng
Jiashi Feng
MLLM, VLM
80
184
0
25 Apr 2024
DragAnything: Motion Control for Anything using Entity Representation
Weijia Wu
Zhuang Li
Yuchao Gu
Rui Zhao
Yefei He
David Junhao Zhang
Mike Zheng Shou
Yan Li
Yan Li
Di Zhang
VGen
112
61
0
12 Mar 2024
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Ekaterina Deyneka
Hsiang-wei Chao
...
Yuwei Fang
Hsin-Ying Lee
Jian Ren
Ming-Hsuan Yang
Sergey Tulyakov
VGen
138
207
0
29 Feb 2024
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
Xiaoyu Shi
Zhaoyang Huang
Fu-Yun Wang
Weikang Bian
Dasong Li
...
Ka Chun Cheung
Simon See
Hongwei Qin
Jifeng Dai
Hongsheng Li
VGen, DiffM
97
93
0
29 Jan 2024
Lumiere: A Space-Time Diffusion Model for Video Generation
Omer Bar-Tal
Hila Chefer
Omer Tov
Charles Herrmann
Roni Paiss
...
T. Michaeli
Oliver Wang
Deqing Sun
Tali Dekel
Inbar Mosseri
VGen
195
252
0
23 Jan 2024
TrailBlazer: Trajectory Control for Diffusion-Based Video Generation
W. Ma
J. P. Lewis
W. Kleijn
DiffM, VGen
85
42
0
31 Dec 2023
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos
Xiang Wang
Shiwei Zhang
Hangjie Yuan
Zhiwu Qing
Biao Gong
Yingya Zhang
Yujun Shen
Changxin Gao
Nong Sang
DiffM, VGen
93
28
0
25 Dec 2023
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
Zhouxia Wang
Ziyang Yuan
Xintao Wang
Tianshui Chen
Menghan Xia
Ping Luo
Ying Shan
DiffM, VGen
115
229
0
06 Dec 2023
VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models
Hyeonho Jeong
Geon Yeong Park
Jong Chul Ye
VGen, DiffM
144
59
0
01 Dec 2023
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
...
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
VGen
272
1,183
0
25 Nov 2023
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
Shiwei Zhang
Jiayu Wang
Yingya Zhang
Kang Zhao
Hangjie Yuan
Zhan Qin
Xiang Wang
Deli Zhao
Jingren Zhou
DiffM, VGen
110
230
0
07 Nov 2023
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
Haoxin Chen
Menghan Xia
Yin-Yin He
Yong Zhang
Xiaodong Cun
...
Yaofang Liu
Qifeng Chen
Xintao Wang
Chao-Liang Weng
Ying Shan
DiffM
70
309
0
30 Oct 2023
PUCA: Patch-Unshuffle and Channel Attention for Enhanced Self-Supervised Image Denoising
Hyemi Jang
Junsung Park
Dahuin Jung
Jaihyun Lew
Ho Bae
Sung-Hoon Yoon
43
17
0
16 Oct 2023
MotionDirector: Motion Customization of Text-to-Video Diffusion Models
Rui Zhao
Yuchao Gu
Jay Zhangjie Wu
David Junhao Zhang
Jia-Wei Liu
Weijia Wu
Jussi Keppo
Mike Zheng Shou
DiffM, VGen
88
117
0
12 Oct 2023
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation
Lijun Yu
José Lezama
N. B. Gundavarapu
Luca Versari
Kihyuk Sohn
...
Boqing Gong
Ming-Hsuan Yang
Irfan Essa
David A. Ross
Lu Jiang
107
323
0
09 Oct 2023
DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory
Sheng-Siang Yin
Chenfei Wu
Jian Liang
Jie Shi
Houqiang Li
Gong Ming
Nan Duan
VGen
128
145
0
16 Aug 2023
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Dustin Podell
Zion English
Kyle Lacey
A. Blattmann
Tim Dockhorn
Jonas Muller
Joe Penna
Robin Rombach
245
2,440
0
04 Jul 2023
VideoComposer: Compositional Video Synthesis with Motion Controllability
Xiang Wang
Hangjie Yuan
Shiwei Zhang
Dayou Chen
Jiuniu Wang
Yingya Zhang
Yujun Shen
Deli Zhao
Jingren Zhou
VGen, DiffM
98
340
0
03 Jun 2023
ControlVideo: Training-free Controllable Text-to-Video Generation
Yabo Zhang
Yuxiang Wei
Dongsheng Jiang
Xiaopeng Zhang
W. Zuo
Qi Tian
VGen, DiffM
111
252
0
22 May 2023
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
Levon Khachatryan
A. Movsisyan
Vahram Tadevosyan
Roberto Henschel
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
VGen
74
574
0
23 Mar 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG, MLLM
1.5K
14,699
0
15 Mar 2023
Spring: A High-Resolution High-Detail Dataset and Benchmark for Scene Flow, Optical Flow and Stereo
Lukas Mehl
Jenny Schmalfuss
Azin Jahedi
Yaroslava Nalivayko
Andrés Bruhn
VGen
98
62
0
03 Mar 2023
T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
Chong Mou
Xintao Wang
Liangbin Xie
Yanze Wu
Shuai Liu
Zhongang Qi
Ying Shan
Xiaohu Qie
DiffM
128
1,030
0
16 Feb 2023
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
118
2,418
0
19 Dec 2022
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM, VGen
77
248
0
10 Dec 2022
Unifying Flow, Stereo and Depth Estimation
Haofei Xu
Jing Zhang
Jianfei Cai
Hamid Rezatofighi
Feng Yu
Dacheng Tao
Andreas Geiger
MDE
117
216
0
10 Nov 2022
Imagen Video: High Definition Video Generation with Diffusion Models
Jonathan Ho
William Chan
Chitwan Saharia
Jay Whang
Ruiqi Gao
...
Diederik P. Kingma
Ben Poole
Mohammad Norouzi
David J. Fleet
Tim Salimans
VGen
162
1,541
0
05 Oct 2022
Make-A-Video: Text-to-Video Generation without Text-Video Data
Uriel Singer
Adam Polyak
Thomas Hayes
Xiaoyue Yin
Jie An
...
Oron Ashual
Oran Gafni
Devi Parikh
Sonal Gupta
Yaniv Taigman
DiffM, VGen
83
1,428
0
29 Sep 2022
ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild
Wang Zhao
Shaohui Liu
Hengkai Guo
Wenping Wang
Yang Liu
120
65
0
19 Jul 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM, DiffM
413
6,908
0
13 Apr 2022
Video Diffusion Models
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM, VGen
209
1,638
0
07 Apr 2022
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal
Alex Nichol
265
7,938
0
11 May 2021
GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions
Chenfei Wu
Lun Huang
Qianxi Zhang
Binyang Li
Lei Ji
Fan Yang
Guillermo Sapiro
Nan Duan
DiffM, VGen
80
242
0
30 Apr 2021
Occluded Video Instance Segmentation: A Benchmark
Jiyang Qi
Yan Gao
Yao Hu
Xinggang Wang
Xiaoyu Liu
Xiang Bai
Serge Belongie
Alan Yuille
Philip Torr
S. Bai
VOS, VLM
70
140
0
02 Feb 2021
Decision-Making with Auto-Encoding Variational Bayes
Romain Lopez
Pierre Boyeau
Nir Yosef
Michael I. Jordan
Jeffrey Regier
BDL
500
10,591
0
17 Feb 2020
Virtual KITTI 2
Yohann Cabon
Naila Murray
Martin Humenberger
3DPC
69
288
0
29 Jan 2020
Learning Multi-Human Optical Flow
Anurag Ranjan
David T. Hoffmann
Dimitrios Tzionas
Siyu Tang
Javier Romero
Michael J. Black
3DH
48
43
0
24 Oct 2019
Towards Accurate Generative Models of Video: A New Metric & Challenges
Thomas Unterthiner
Sjoerd van Steenkiste
Karol Kurach
Raphaël Marinier
Marcin Michalski
Sylvain Gelly
EGVM, VGen
91
745
0
03 Dec 2018
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt, AIMat, OffRL, AI4CE
361
2,233
0
22 Sep 2017