Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2212.05199
Cited By
v1
v2 (latest)
MAGVIT: Masked Generative Video Transformer
10 December 2022
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
Huiwen Chang
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"MAGVIT: Masked Generative Video Transformer"
50 / 194 papers shown
Title
VILP: Imitation Learning with Latent Video Planning
Zhengtong Xu
Qiang Qiu
Yu She
VGen
168
1
0
03 Feb 2025
Visual Generation Without Guidance
Huayu Chen
Kai Jiang
Kaiwen Zheng
Jianfei Chen
Hang Su
Jun Zhu
164
2
0
28 Jan 2025
SoundSpring: Loss-Resilient Audio Transceiver with Dual-Functional Masked Language Modeling
Shengshi Yao
Jincheng Dai
Xiaoqi Qin
Sixian Wang
Siye Wang
K. Niu
Ping Zhang
131
0
0
22 Jan 2025
Taming Teacher Forcing for Masked Autoregressive Video Generation
Deyu Zhou
Quan Sun
Yuang Peng
Kun Yan
Runpei Dong
...
Zheng Ge
Nan Duan
Xiangyu Zhang
L. Ni
H. Shum
VGen
100
9
0
21 Jan 2025
CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation
Zheng Chong
Wenqing Zhang
Shiyue Zhang
Jun Zheng
Xiao Dong
Haoxiang Li
Yiling Wu
D. Jiang
Xiaodan Liang
DiffM
78
2
0
20 Jan 2025
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Dongwon Kim
Ju He
Qihang Yu
Chenglin Yang
Xiaohui Shen
Suha Kwak
Liang-Chieh Chen
VLM
130
11
0
13 Jan 2025
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
253
10
0
19 Dec 2024
Parallelized Autoregressive Visual Generation
Yanjie Wang
Shuhuai Ren
Zhijie Lin
Yujin Han
Haoyuan Guo
Zhenheng Yang
Difan Zou
Jiashi Feng
Xihui Liu
VGen
189
17
0
19 Dec 2024
E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling
Zhihang Yuan
Yuzhang Shang
Hao Zhang
Tongcheng Fang
Rui Xie
Bingxin Xu
Yan Yan
Shengen Yan
Guohao Dai
Yu Wang
DiffM
157
1
0
18 Dec 2024
Self-control: A Better Conditional Mechanism for Masked Autoregressive Model
Qiaoying Qu
Shiyu Shen
DiffM
137
0
0
18 Dec 2024
DINO-Foresight
\texttt{DINO-Foresight}
DINO-Foresight
: Looking into the Future with DINO
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
AI4CE
145
3
0
16 Dec 2024
LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity
Hongjie Wang
Chih-Yao Ma
Yen-Cheng Liu
Ji Hou
Tao Xu
...
Peizhao Zhang
Tingbo Hou
Peter Vajda
N. Jha
Xiaoliang Dai
LMTD
VGen
VLM
DiffM
197
11
0
13 Dec 2024
[MASK] is All You Need
Vincent Tao Hu
Bjorn Ommer
DiffM
214
5
0
09 Dec 2024
Navigation World Models
Amir Bar
G. Zhou
Danny Tran
Trevor Darrell
Yann LeCun
VGen
EgoV
210
33
0
04 Dec 2024
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders
Ziqi Pang
Tianyuan Zhang
Fujun Luan
Yunze Man
Hao Tan
Kai Zhang
William T. Freeman
Yu-Xiong Wang
VGen
132
20
0
02 Dec 2024
XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation
Xianrui Li
Kai Qiu
Hong Chen
Jason Kuen
Jiuxiang Gu
Jiadong Wang
Zhe Lin
Bhiksha Raj
VLM
213
9
0
02 Dec 2024
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
Florinel-Alin Croitoru
Andrei Iulian Hiji
Vlad Hondru
Nicolae-Cătălin Ristea
Paul Irofti
Marius Popescu
Cristian Rusu
Radu Tudor Ionescu
Fahad Shahbaz Khan
Mubarak Shah
135
5
0
29 Nov 2024
StableAnimator: High-Quality Identity-Preserving Human Image Animation
Shuyuan Tu
Zhen Xing
Xintong Han
Zhi-Qi Cheng
Qi Dai
Chong Luo
Zuxuan Wu
VGen
210
23
0
26 Nov 2024
Representation Collapsing Problems in Vector Quantization
Wenhao Zhao
Qiran Zou
Rushi Shah
Dianbo Liu
109
2
0
25 Nov 2024
Extending Video Masked Autoencoders to 128 frames
N. B. Gundavarapu
Luke Friedman
Raghav Goyal
Chaitra Hegde
Eirikur Agustsson
...
Mikhail Sirotenko
Ming-Hsuan Yang
Tobias Weyand
Boqing Gong
Leonid Sigal
118
1
0
20 Nov 2024
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation
Xiaofeng Wang
Kang Zhao
Fan Liu
Jiayu Wang
Guosheng Zhao
Xiaoyi Bao
Zheng Hua Zhu
Yingya Zhang
Xingang Wang
VGen
117
10
0
13 Nov 2024
World Models: The Safety Perspective
Zifan Zeng
Chongzhe Zhang
Feng Liu
Joseph Sifakis
Qunli Zhang
Shiming Liu
Peng Wang
KELM
LLMAG
78
2
0
12 Nov 2024
Improved Video VAE for Latent Video Diffusion Model
Pingyu Wu
Kai Zhu
Yu Liu
Liming Zhao
Wei-dong Zhai
Yang Cao
Zheng-jun Zha
VGen
DiffM
86
5
0
10 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
191
14
0
08 Nov 2024
Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation
Ke Fan
Jing Zhang
Ran Yi
Jingyu Gong
Yabiao Wang
Yating Wang
Xin Tan
Chengjie Wang
Lizhuang Ma
82
3
0
06 Nov 2024
Pre-trained Visual Dynamics Representations for Efficient Policy Learning
Hao Luo
Bohan Zhou
Zongqing Lu
68
2
0
05 Nov 2024
Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey
Ao Fu
Yi Zhou
Tao Zhou
Yue Yang
Bojun Gao
Qun Li
Guobin Wu
Ling Shao
VGen
100
3
0
05 Nov 2024
Randomized Autoregressive Visual Generation
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VGen
DiffM
138
40
1
01 Nov 2024
Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting
Chiu-Wai Yan
Shi Quan Foo
Van Hoan Trinh
Dit-Yan Yeung
Ka-Hing Wong
W. Wong
60
2
0
30 Oct 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
105
12
0
28 Oct 2024
Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion
Emiel Hoogeboom
Thomas Mensink
Jonathan Heek
Kay Lamerigts
Ruiqi Gao
Tim Salimans
465
13
0
25 Oct 2024
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
Chengyue Wu
Xiaokang Chen
Z. F. Wu
Yiyang Ma
Xingchao Liu
...
Wen Liu
Zhenda Xie
Xingkai Yu
Chong Ruan
Ping Luo
AI4TS
127
115
0
17 Oct 2024
Unlocking the Capabilities of Masked Generative Models for Image Synthesis via Self-Guidance
Jiwan Hur
Dong-Jae Lee
Gyojin Han
Jaehyun Choi
Yunho Jeon
Junmo Kim
DiffM
99
0
0
17 Oct 2024
Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling
Wenze Liu
Le Zhuo
Yi Xin
Sheng Xia
Peng Gao
Xiangyu Yue
125
9
0
14 Oct 2024
ElasticTok: Adaptive Tokenization for Image and Video
Wilson Yan
Matei A. Zaharia
Volodymyr Mnih
Pieter Abbeel
Aleksandra Faust
Hao Liu
VGen
103
11
0
10 Oct 2024
Masked Generative Priors Improve World Models Sequence Modelling Capabilities
Cristian Meo
Mircea Lica
Zarif Ikram
Akihiro Nakano
Vedant Shah
Aniket Didolkar
Dianbo Liu
Anirudh Goyal
Justin Dauwels
OffRL
228
0
0
10 Oct 2024
MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion
Onkar Susladkar
Jishu Sen Gupta
Chirag Sehgal
Sparsh Mittal
Rekha Singhal
DiffM
VGen
102
0
0
10 Oct 2024
ACDC: Autoregressive Coherent Multimodal Generation using Diffusion Correction
Hyungjin Chung
Dohun Lee
Jong Chul Ye
VGen
DiffM
63
2
0
07 Oct 2024
CAR: Controllable Autoregressive Modeling for Visual Generation
Ziyu Yao
Jialin Li
Yifeng Zhou
Yong Liu
Xi Jiang
Chengjie Wang
Feng Zheng
Yuexian Zou
Lei Li
DiffM
146
15
0
07 Oct 2024
ECHOPulse: ECG controlled echocardio-grams video generation
Yiwei Li
Sekeun Kim
Zihao Wu
Hanqi Jiang
Yi Pan
...
Sifan Song
Yucheng Shi
Tianming Liu
Quanzheng Li
Xiang Li
VGen
62
1
0
04 Oct 2024
Zebra: In-Context Generative Pretraining for Solving Parametric PDEs
Louis Serrano
Armand K. Koupai
Thomas X. Wang
Pierre Erbacher
Patrick Gallinari
AI4CE
107
5
0
04 Oct 2024
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
159
35
0
03 Oct 2024
ImageFolder: Autoregressive Image Generation with Folded Tokens
Xiang Li
Kai Qiu
Hao Chen
Jason Kuen
Jiuxiang Gu
Bhiksha Raj
Zhe Lin
VLM
101
30
0
02 Oct 2024
Denoising with a Joint-Embedding Predictive Architecture
Dengsheng Chen
Jie Hu
Xiaoming Wei
Enhua Wu
DiffM
170
3
0
02 Oct 2024
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Jie Cheng
Ruixi Qiao
Gang Xiong
Binhua Li
Yingwei Ma
Binhua Li
Yongbin Li
Yisheng Lv
OffRL
OnRL
LM&Ro
126
4
0
01 Oct 2024
From Vision to Audio and Beyond: A Unified Model for Audio-Visual Representation and Generation
Kun Su
Xiulong Liu
Eli Shlizerman
VGen
160
7
0
27 Sep 2024
MaskBit: Embedding-free Image Generation via Bit Tokens
Mark Weber
Lijun Yu
Qihang Yu
XueQing Deng
Xiaohui Shen
Daniel Cremers
Liang-Chieh Chen
DiffM
102
40
0
24 Sep 2024
Multi-Modal Generative AI: Multi-modal LLM, Diffusion and Beyond
Hong Chen
Xin Wang
Yuwei Zhou
Bin Huang
Yipeng Zhang
Wei Feng
Houlun Chen
Zeyang Zhang
Siao Tang
Wenwu Zhu
DiffM
124
9
0
23 Sep 2024
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation
Zhuoyan Luo
Fengyuan Shi
Yixiao Ge
Yujiu Yang
Limin Wang
Ying Shan
VLM
158
59
0
06 Sep 2024
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model
Liuhan Chen
Zongjian Li
Bin Lin
Bin Zhu
Qian Wang
Shenghai Yuan
X. Zhou
Xinhua Cheng
Li Yuan
DiffM
161
16
0
02 Sep 2024
Previous
1
2
3
4
Next