2204.03638
v4 (latest)
Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
7 April 2022
Songwei Ge
Thomas Hayes
Harry Yang
Xiaoyue Yin
Guan Pang
David Jacobs
Jia-Bin Huang
Devi Parikh
ViT
Papers citing
"Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer"
50 / 176 papers shown
Title
Audio-Sync Video Generation with Multi-Stream Temporal Control
Shuchen Weng
Haojie Zheng
Zheng Chang
Si Li
Boxin Shi
Xinlong Wang
DiffM
VGen
28
0
0
09 Jun 2025
LumosFlow: Motion-Guided Long Video Generation
Jiahao Chen
Hangjie Yuan
Yichen Qian
Jingyun Liang
Jiazheng Xing
Pengwei Liu
Weihua Chen
Fan Wang
Bing Su
VGen
47
0
0
03 Jun 2025
Multiverse Through Deepfakes: The MultiFakeVerse Dataset of Person-Centric Visual and Conceptual Manipulations
Parul Gupta
Shreya Ghosh
Tom Gedeon
Thanh-Toan Do
Abhinav Dhall
48
0
0
01 Jun 2025
Video-GPT via Next Clip Diffusion
Shaobin Zhuang
Zhipeng Huang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Binxin Yang
Chong Sun
Chen Li
Yali Wang
DiffM
VGen
243
0
0
18 May 2025
Generative Pre-trained Autoregressive Diffusion Transformer
Yuan Zhang
Jiacheng Jiang
Guoqing Ma
Zhiying Lu
Haoyang Huang
Jianlong Yuan
Nan Duan
VGen
136
2
0
12 May 2025
A Survey of Interactive Generative Video
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xinyu Wang
Pengfei Wan
Di Zhang
Kun Gai
Hao Chen
Xihui Liu
VGen
109
3
0
30 Apr 2025
Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos
Rundong Luo
Matthew Wallingford
Ali Farhadi
Noah Snavely
Wei-Chiu Ma
VGen
148
1
0
10 Apr 2025
One-Minute Video Generation with Test-Time Training
Karan Dalal
Daniel Koceja
Gashon Hussein
Jiarui Xu
Yue Zhao
...
Tatsunori Hashimoto
Sanmi Koyejo
Yejin Choi
Yu Sun
Xiaolong Wang
ViT
181
13
0
07 Apr 2025
JointTuner: Appearance-Motion Adaptive Joint Training for Customized Video Generation
Fangda Chen
Shanshan Zhao
Chuanfu Xu
Long Lan
VGen
91
2
0
31 Mar 2025
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Yuchao Gu
Weijia Mao
Mike Zheng Shou
VGen
176
11
0
25 Mar 2025
LongDiff: Training-Free Long Video Generation in One Go
Zhuoling Li
Hossein Rahmani
Qiuhong Ke
Jing Liu
DiffM
VGen
VLM
101
0
0
23 Mar 2025
MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving
Haiguang Wang
Daqi Liu
Hongwei Xie
Haisong Liu
Enhui Ma
Kaicheng Yu
Limin Wang
Bing Wang
VGen
119
2
0
20 Mar 2025
MusicInfuser: Making Video Diffusion Listen and Dance
Susung Hong
Ira Kemelmacher-Shlizerman
Brian L. Curless
Steven M. Seitz
VGen
114
0
0
18 Mar 2025
^RFLAV: Rolling Flow matching for infinite Audio Video generation
Alex Ergasti
Giuseppe Tarollo
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
VGen
80
0
0
13 Mar 2025
Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space
Jian Zhu
Zhengyu Jia
Tian Gao
Jiaxin Deng
Shidi Li
Fu Liu
Peng Jia
Xianpeng Lang
Xiaolong Sun
VGen
437
1
0
12 Mar 2025
Neighboring Autoregressive Modeling for Efficient Visual Generation
Yefei He
Yuanyu He
Shaoxuan He
Feng Chen
Hong Zhou
Kai Zhang
Bohan Zhuang
116
5
0
12 Mar 2025
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion
Mingzhen Sun
Weining Wang
Gen Li
Jiawei Liu
Jiahui Sun
Wanquan Feng
Shanshan Lao
Siyu Zhou
Qian He
Qingbin Liu
DiffM
VGen
152
6
0
10 Mar 2025
RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers
Min Zhao
Guande He
Yixiao Chen
Hongzhou Zhu
Chong Li
Jun Zhu
VGen
130
11
0
21 Feb 2025
FreqPrior: Improving Video Diffusion Models with Frequency Filtering Gaussian Noise
Yunlong Yuan
Yuanfan Guo
Chunwei Wang
Wei Zhang
Hang Xu
L. Zhang
DiffM
VGen
207
3
0
20 Feb 2025
SMITE: Segment Me In TimE
Amirhossein Alimohammadi
Sauradip Nag
Saeid Asgari Taghanaki
Andrea Tagliasacchi
Ghassan Hamarneh
Ali Mahdavi-Amiri
VLM
VOS
533
3
0
20 Feb 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM
VGen
117
0
0
18 Feb 2025
MotionMatcher: Motion Customization of Text-to-Video Diffusion Models via Motion Feature Matching
Yen-Siang Wu
Chi-Pin Huang
Fu-En Yang
Yu-Jie Wang
DiffM
VGen
125
1
0
18 Feb 2025
VILP: Imitation Learning with Latent Video Planning
Zhengtong Xu
Qiang Qiu
Yu She
VGen
173
1
0
03 Feb 2025
DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers
Yuntao Chen
Yuqi Wang
Zhaoxiang Zhang
465
11
0
24 Dec 2024
VidTwin: Video VAE with Decoupled Structure and Dynamics
Yuchi Wang
Junliang Guo
Xinyi Xie
Tianyu He
Xu Sun
Li Zhao
DRL
VGen
161
5
0
23 Dec 2024
Parallelized Autoregressive Visual Generation
Yanjie Wang
Shuhuai Ren
Zhijie Lin
Yujin Han
Haoyuan Guo
Zhenheng Yang
Difan Zou
Jiashi Feng
Xihui Liu
VGen
191
17
0
19 Dec 2024
Owl-1: Omni World Model for Consistent Long Video Generation
Yuanhui Huang
Wenzhao Zheng
Yuan Gao
Xin Tao
Pengfei Wan
Di Zhang
Jie Zhou
Jiwen Lu
VGen
VLM
198
3
0
12 Dec 2024
From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Tianwei Yin
Qiang Zhang
Richard Zhang
William T. Freeman
F. Durand
Eli Shechtman
Xun Huang
VGen
DiffM
188
11
0
10 Dec 2024
DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models
Yizhuo Li
Yuying Ge
Yixiao Ge
Ping Luo
Ying Shan
DiffM
VGen
186
0
0
05 Dec 2024
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
Kaifeng Gao
Jiaxin Shi
Hanwang Zhang
Chunping Wang
Jun Xiao
Long Chen
VGen
DiffM
211
4
0
25 Nov 2024
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Zhichao Zhang
Wei Sun
Xinyue Li
Yunhao Li
Qihang Ge
...
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guangtao Zhai
EGVM
250
1
0
25 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
191
14
0
08 Nov 2024
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
Yining Hong
Beide Liu
Maxine Wu
Yuanhao Zhai
Kai-Wei Chang
...
Chung-Ching Lin
Jianfeng Wang
Zhiyong Yang
Yingnian Wu
Lijuan Wang
VGen
118
8
0
30 Oct 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
107
12
0
28 Oct 2024
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Zongyi Li
Shujie Hu
Shujie Liu
Long Zhou
Jeongsoo Choi
Lingwei Meng
Xun Guo
Jiajian Li
H. Ling
Furu Wei
VGen
DiffM
150
7
0
27 Oct 2024
MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion
Onkar Susladkar
Jishu Sen Gupta
Chirag Sehgal
Sparsh Mittal
Rekha Singhal
DiffM
VGen
105
0
0
10 Oct 2024
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
161
35
0
03 Oct 2024
COMUNI: Decomposing Common and Unique Video Signals for Diffusion-based Video Generation
Mingzhen Sun
Weining Wang
Xinxin Zhu
Jing Liu
VGen
DiffM
61
0
0
02 Oct 2024
A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation
Masato Ishii
Akio Hayakawa
Takashi Shibuya
Yuki Mitsufuji
VGen
DiffM
161
4
0
26 Sep 2024
JVID: Joint Video-Image Diffusion for Visual-Quality and Temporal-Consistency in Video Generation
Hadrien Reynaud
Matthew Baugh
Mischa Dombrowski
Sarah Cechnicka
Qingjie Meng
Bernhard Kainz
VLM
66
0
0
21 Sep 2024
Controllable retinal image synthesis using conditional StyleGAN and latent space manipulation for improved diagnosis and grading of diabetic retinopathy
Somayeh Pakdelmoez
Saba Omidikia
Seyyed Ali Seyyedsalehi
Seyyede Zohreh Seyyedsalehi
MedIm
81
1
0
11 Sep 2024
1M-Deepfakes Detection Challenge
Zhixi Cai
Abhinav Dhall
Shreya Ghosh
Munawar Hayat
D. Kollias
Kalin Stefanov
Usman Tariq
102
2
0
11 Sep 2024
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation
Zhuoyan Luo
Fengyuan Shi
Yixiao Ge
Yujiu Yang
Limin Wang
Ying Shan
VLM
160
59
0
06 Sep 2024
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model
Liuhan Chen
Zongjian Li
Bin Lin
Bin Zhu
Qian Wang
Shenghai Yuan
X. Zhou
Xinhua Cheng
Li Yuan
DiffM
163
16
0
02 Sep 2024
DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving
Yongjie Fu
Anmol Jain
Xuan Di
Xu Chen
Zhaobin Mo
VGen
107
6
0
29 Aug 2024
GenRec: Unifying Video Generation and Recognition with Diffusion Models
Zejia Weng
Xitong Yang
Zhen Xing
Zuxuan Wu
Yu-Gang Jiang
VGen
DiffM
106
7
0
27 Aug 2024
DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays
Yiran Sun
Hana Baroudi
Tucker Netherton
Laurence Court
Osama Mawlawi
Ashok Veeraraghavan
Guha Balakrishnan
DiffM
MedIm
86
5
0
27 Aug 2024
Real-Time Video Generation with Pyramid Attention Broadcast
Xuanlei Zhao
Xiaolong Jin
Kai Wang
Yang You
VGen
DiffM
180
45
0
22 Aug 2024
Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model
Zhichao Zhang
Xinyue Li
Wei Sun
Jun Jia
Xiongkuo Min
...
Puyi Wang
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Guangtao Zhai
EGVM
68
5
0
31 Jul 2024
Fréchet Video Motion Distance: A Metric for Evaluating Motion Consistency in Videos
Jiahe Liu
Youran Qu
Qi Yan
Fangyin Wei
Lele Wang
Renjie Liao
VGen
EGVM
69
15
0
23 Jul 2024