2212.05199
MAGVIT: Masked Generative Video Transformer
10 December 2022
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
Huiwen Chang
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
Papers citing "MAGVIT: Masked Generative Video Transformer"
50 / 194 papers shown
Title
Discrete JEPA: Learning Discrete Token Representations without Reconstruction
Junyeob Baek
Hosung Lee
Christopher Hoang
Mengye Ren
Sungjin Ahn
17
0
0
17 Jun 2025
TARDIS STRIDE: A Spatio-Temporal Road Image Dataset and World Model for Autonomy
Héctor Carrión
Yutong Bai
Víctor A. Hernández Castro
Kishan Panaganti
Ayush Zenith
Matthew Trang
Tony Zhang
Pietro Perona
Jitendra Malik
VGen
17
0
0
12 Jun 2025
MapBERT: Bitwise Masked Modeling for Real-Time Semantic Mapping Generation
Yijie Deng
Shuaihang Yuan
Congcong Wen
Hao Huang
Anthony Tzes
Geeta Chandra Raju Bethala
Yi Fang
19
0
0
09 Jun 2025
FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity
Jinxi Li
Ziyang Song
Siyuan Zhou
Bo Yang
AI4CE
23
0
0
09 Jun 2025
Concept-Centric Token Interpretation for Vector-Quantized Generative Models
Tianze Yang
Yucheng Shi
Mengnan Du
Xuansheng Wu
Qiaoyu Tan
Jin Sun
Ninghao Liu
14
0
0
31 May 2025
Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots
Guangting Zheng
Yehao Li
Yingwei Pan
Jiajun Deng
Ting Yao
Yanyong Zhang
Tao Mei
DiffM
29
0
0
26 May 2025
Conditional Panoramic Image Generation via Masked Autoregressive Modeling
Chaoyang Wang
Xiangtai Li
Lu Qi
X. Lin
Jinbin Bai
Qianyu Zhou
Yunhai Tong
DiffM
87
1
0
22 May 2025
Consistent World Models via Foresight Diffusion
Yu Zhang
Xingzhuo Guo
Haoran Xu
Mingsheng Long
57
0
0
22 May 2025
Intentional Gesture: Deliver Your Intentions with Gestures for Speech
Pinxin Liu
Haiyang Liu
Luchuan Song
Chenliang Xu
SLR
66
1
0
21 May 2025
MSDformer: Multi-scale Discrete Transformer For Time Series Generation
Zhicheng Chen
Shibo Feng
Xi Xiao
Zhong Zhang
Qing Li
Xingyu Gao
Peilin Zhao
56
0
0
20 May 2025
MVAR: Visual Autoregressive Modeling with Scale and Spatial Markovian Conditioning
Jinhua Zhang
Wei Long
Minghao Han
Weiyi You
Shuhang Gu
BDL
85
0
0
19 May 2025
VTBench: Evaluating Visual Tokenizers for Autoregressive Image Generation
Huawei Lin
Tong Geng
Zhaozhuo Xu
Weijie Zhao
VLM
169
1
0
19 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
297
1
0
05 May 2025
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation
Tianwei Xiong
Jun Hao Liew
Zilong Huang
Jiashi Feng
Xihui Liu
87
1
0
11 Apr 2025
MotionDreamer: One-to-Many Motion Synthesis with Localized Generative Masked Transformer
Yilin Wang
Chuan Guo
Yuxuan Mu
Muhammad Gohar Javed
Wei Ji
Juwei Lu
Hai Jiang
Li Cheng
VGen
61
0
0
11 Apr 2025
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
Team Seaweed
Ceyuan Yang
Zhijie Lin
Yang Zhao
Shanchuan Lin
...
Zuquan Song
Zhenheng Yang
Jiashi Feng
Jianchao Yang
Lu Jiang
DiffM
184
22
0
11 Apr 2025
Beyond Static Scenes: Camera-controllable Background Generation for Human Motion
Mingshuai Yao
Mengting Chen
Qinye Zhou
Yize Zhang
Ming-Yu Liu
...
Chen Ju
Shuai Xiao
Qingwen Liu
Jinsong Lan
Wangmeng Zuo
DiffM
VGen
113
1
0
01 Apr 2025
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
Siyuan Li
Lefei Zhang
Zedong Wang
Juanxi Tian
Cheng Tan
...
Chang Yu
Qingsong Xie
Haonan Lu
Haoqian Wang
Zhen Lei
108
2
0
01 Apr 2025
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness
Dian Zheng
Ziqi Huang
Hongbo Liu
Kai Zou
Yinan He
...
Yize Zhang
Jingwen He
Wei-Shi Zheng
Yu Qiao
Ziwei Liu
EGVM
VGen
122
14
0
27 Mar 2025
Synthetic Video Enhances Physical Fidelity in Video Synthesis
Qi Zhao
Xingyu Ni
Ziyu Wang
Feng Cheng
Ziyan Yang
Lu Jiang
Bohan Wang
VGen
93
3
0
26 Mar 2025
Halton Scheduler For Masked Generative Image Transformer
Victor Besnier
Mickael Chen
David Hurych
Eduardo Valle
Matthieu Cord
101
3
0
21 Mar 2025
Position: Interactive Generative Video as Next-Generation Game Engine
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xintao Wang
Pengfei Wan
Di Zhang
Xihui Liu
VGen
100
4
0
21 Mar 2025
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
Yanjie Wang
Zhijie Lin
Yao Teng
Yuanzhi Zhu
Shuhuai Ren
Jiashi Feng
Xihui Liu
96
5
0
20 Mar 2025
Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction
Ziyao Guo
Kai Zhang
Michael Qizhe Shieh
62
0
0
20 Mar 2025
Fast Autoregressive Video Generation with Diagonal Decoding
Yang Ye
Junliang Guo
Haoyu Wu
Tianyu He
Tim Pearce
Tabish Rashid
Katja Hofmann
Li Zhao
DiffM
VGen
115
2
0
18 Mar 2025
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
175
0
0
14 Mar 2025
HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models
Ziqin Zhou
Yifan Yang
Yue Yang
Tianyu He
Houwen Peng
Kai Qiu
Qi Dai
Lili Qiu
Chong Luo
Lingqiao Liu
DiffM
VGen
82
1
0
14 Mar 2025
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
Yufan Deng
Xun Guo
Yanjie Wang
Jacob Zhiyuan Fang
Angtian Wang
Shenghai Yuan
Yiding Yang
Bo Liu
Haibin Huang
Chongyang Ma
DiffM
VGen
151
3
0
13 Mar 2025
UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?
Yuanxin Liu
Rui Zhu
Shuhuai Ren
Jiacong Wang
Haoyuan Guo
Xu Sun
Lu Jiang
372
1
0
13 Mar 2025
Autoregressive Image Generation with Randomized Parallel Decoding
Haopeng Li
Jinyue Yang
Guoqi Li
Huan Wang
98
1
0
13 Mar 2025
Long Context Tuning for Video Generation
Yuwei Guo
Ceyuan Yang
Ziyan Yang
Zhibei Ma
Zhijie Lin
Zhenheng Yang
Dahua Lin
Lu Jiang
DiffM
VGen
161
17
0
13 Mar 2025
AudioX: Diffusion Transformer for Anything-to-Audio Generation
Zeyue Tian
Yizhu Jin
Zhaoyang Liu
Ruibin Yuan
Xu Tan
Qifeng Chen
Wei Xue
Yu Guo
114
6
0
13 Mar 2025
Neighboring Autoregressive Modeling for Efficient Visual Generation
Yefei He
Yuanyu He
Shaoxuan He
Feng Chen
Hong Zhou
Kai Zhang
Bohan Zhuang
116
5
0
12 Mar 2025
Robust Latent Matters: Boosting Image Generation with Sampling Error Synthesis
Kai Qiu
Xianrui Li
Jason Kuen
Hong Chen
Xiaohao Xu
Jiuxiang Gu
Yinyi Luo
Bhiksha Raj
Zhe Lin
Marios Savvides
158
2
0
11 Mar 2025
V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation
Guiwei Zhang
Tianyu Zhang
Mohan Zhou
Yalong Bai
Biye Li
139
0
0
10 Mar 2025
Frequency Autoregressive Image Generation with Continuous Tokens
Hu Yu
Hao Luo
Hangjie Yuan
Yu Rong
Feng Zhao
VGen
94
10
0
07 Mar 2025
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
140
2
0
06 Mar 2025
Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models
Xingzhuo Guo
Yu Zhang
Baixu Chen
Haoran Xu
Jianmin Wang
Mingsheng Long
DiffM
AI4TS
145
2
0
02 Mar 2025
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
Jie Tian
Xiaoye Qu
Zhenyi Lu
Xiaoye Qu
Sichen Liu
Yu Cheng
DiffM
VGen
81
4
0
02 Mar 2025
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction
Siyu Jiao
Gengwei Zhang
Yinlong Qian
Jiancheng Huang
Yao Zhao
Humphrey Shi
Lin Ma
Y. X. Wei
Zequn Jie
VLM
101
6
0
27 Feb 2025
A Survey: Spatiotemporal Consistency in Video Generation
Zhiyu Yin
Kehai Chen
Xuefeng Bai
Ruili Jiang
Junlin Li
Hongdong Li
Jin Liu
Yang Xiang
Jun Yu
Min Zhang
EGVM
VGen
AI4TS
94
0
0
25 Feb 2025
VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing
Xiangpeng Yang
Linchao Zhu
Hehe Fan
Yi Yang
DiffM
VGen
128
7
0
24 Feb 2025
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling
Florent Bartoccioni
Elias Ramzi
Victor Besnier
Shashanka Venkataramanan
Tuan-Hung Vu
...
Mickael Chen
Éloi Zablocki
Andrei Bursuc
Eduardo Valle
Matthieu Cord
VGen
174
2
0
24 Feb 2025
SMITE: Segment Me In TimE
Amirhossein Alimohammadi
Sauradip Nag
Saeid Asgari Taghanaki
Andrea Tagliasacchi
Ghassan Hamarneh
Ali Mahdavi-Amiri
VLM
VOS
528
3
0
20 Feb 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM
VGen
115
0
0
18 Feb 2025
From Principles to Applications: A Comprehensive Survey of Discrete Tokenizers in Generation, Comprehension, Recommendation, and Information Retrieval
Jian Jia
Jingtong Gao
Ben Xue
Junhao Wang
Qingpeng Cai
Quan Chen
Xiangyu Zhao
Peng Jiang
Kun Gai
OffRL
140
2
0
18 Feb 2025
E-MD3C: Taming Masked Diffusion Transformers for Efficient Zero-Shot Object Customization
T. Pham
Zhang Kang
Ji Woo Hong
Xuran Zheng
Chang D. Yoo
133
0
0
13 Feb 2025
Pre-Trained Video Generative Models as World Simulators
Haoran He
Yang Zhang
Liang Lin
Zhihao Xu
Ling Pan
VGen
155
5
0
10 Feb 2025
History-Guided Video Diffusion
Kiwhan Song
Boyuan Chen
Max Simchowitz
Yilun Du
Russ Tedrake
Vincent Sitzmann
VGen
210
18
0
10 Feb 2025
Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach
Yunuo Chen
Junli Cao
Anil Kag
Vidit Goel
Sergei Korolev
Chenfanfu Jiang
Sergey Tulyakov
Jian Ren
DiffM
VGen
120
2
0
05 Feb 2025