2212.05199
MAGVIT: Masked Generative Video Transformer
10 December 2022
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
Huiwen Chang
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
Papers citing
"MAGVIT: Masked Generative Video Transformer"
50 / 190 papers shown
Title
Image and Video Tokenization with Binary Spherical Quantization
Yue Zhao
Yuanjun Xiong
Philipp Krahenbuhl
45
17
0
11 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
66
227
0
10 Jun 2024
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
Zhen Xing
Qi Dai
Zejia Weng
Zuxuan Wu
Yu-Gang Jiang
VGen
49
14
0
10 Jun 2024
CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation
Dejia Xu
Weili Nie
Chao Liu
Sifei Liu
Jan Kautz
Zhangyang Wang
Arash Vahdat
DiffM
VGen
81
52
0
04 Jun 2024
CV-VAE: A Compatible Video VAE for Latent Generative Video Models
Sijie Zhao
Yong Zhang
Xiaodong Cun
Shaoshu Yang
Muyao Niu
Xiaoyu Li
Wenbo Hu
Ying Shan
DiffM
61
23
0
30 May 2024
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark
Haoxing Chen
Yan Hong
Zizheng Huang
Zhuoer Xu
Zhangxuan Gu
...
Jun Lan
Huijia Zhu
Jianfu Zhang
Weiqiang Wang
Huaxiong Li
Mamba
83
14
0
30 May 2024
Video Prediction Models as General Visual Encoders
James Maier
Nishanth Mohankumar
VGen
40
0
0
25 May 2024
iVideoGPT: Interactive VideoGPTs are Scalable World Models
Jialong Wu
Shaofeng Yin
Ningya Feng
Xu He
Dong Li
Haifeng Zhang
Mingsheng Long
VGen
49
22
0
24 May 2024
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
Shiqi Yang
Zhi-Wei Zhong
Mengjie Zhao
Shusuke Takahashi
Masato Ishii
Takashi Shibuya
Yuki Mitsufuji
43
3
0
23 May 2024
CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers
Andrew Marmon
Grant Schindler
José Lezama
Dan Kondratyuk
Bryan Seybold
Irfan Essa
VGen
ViT
DiffM
34
3
0
21 May 2024
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Siyuan Li
Zedong Wang
Zicheng Liu
Di Wu
Cheng Tan
Jiangbin Zheng
Yufei Huang
Stan Z. Li
40
7
0
13 May 2024
TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation
Hritik Bansal
Yonatan Bitton
Michal Yarom
Idan Szpektor
Aditya Grover
Kai-Wei Chang
DiffM
57
11
0
07 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
87
38
0
06 May 2024
Beyond Deepfake Images: Detecting AI-Generated Videos
Danial Samadi Vahdati
Tai D. Nguyen
Aref Azizpour
Matthew C. Stamm
63
11
0
24 Apr 2024
On the Content Bias in Fréchet Video Distance
Jason S. Hoffman
Aniruddha Mahapatra
Gaurav Parmar
Jun-Yan Zhu
Jia-Bin Huang
EGVM
50
15
0
18 Apr 2024
Predicting Long-horizon Futures by Conditioning on Geometry and Time
Tarasha Khurana
Deva Ramanan
AI4TS
55
0
0
17 Apr 2024
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Keyu Tian
Yi-Xin Jiang
Zehuan Yuan
Bingyue Peng
Liwei Wang
VGen
42
250
0
03 Apr 2024
CHAIN: Enhancing Generalization in Data-Efficient GANs via lipsCHitz continuity constrAIned Normalization
Yao Ni
Piotr Koniusz
AI4CE
GAN
40
1
0
31 Mar 2024
SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control
Binyuan Huang
Yuqing Wen
Yucheng Zhao
Yaosi Hu
Yingfei Liu
...
Tiancai Wang
Chi Zhang
Chang Wen Chen
Zhenzhong Chen
Xiangyu Zhang
46
15
0
28 Mar 2024
SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer
Rui Zhu
Yingwei Pan
Yehao Li
Ting Yao
Zhenglong Sun
Tao Mei
C. Chen
50
24
0
25 Mar 2024
EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing
Xiangpeng Yang
Linchao Zhu
Hehe Fan
Yi Yang
DiffM
VGen
22
9
0
24 Mar 2024
Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition
Sihyun Yu
Weili Nie
De-An Huang
Boyi Li
Jinwoo Shin
A. Anandkumar
VGen
DiffM
34
15
0
21 Mar 2024
Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation
Fu-Yun Wang
Xiaoshi Wu
Zhaoyang Huang
Xiaoyu Shi
Dazhong Shen
Guanglu Song
Yu Liu
Hongsheng Li
DiffM
37
13
0
20 Mar 2024
Generalized Predictive Model for Autonomous Driving
Jiazhi Yang
Shenyuan Gao
Yihang Qiu
Li Chen
Tianyu Li
...
Ping Luo
Jun Zhang
Andreas Geiger
Yu Qiao
Hongyang Li
VGen
73
57
0
14 Mar 2024
Pix2Gif: Motion-Guided Diffusion for GIF Generation
Hitesh Kandala
Jianfeng Gao
Jianwei Yang
VGen
DiffM
33
3
0
07 Mar 2024
UniVS: Unified and Universal Video Segmentation with Prompts as Queries
Ming-hui Li
Shuai Li
Xindong Zhang
Lei Zhang
VOS
47
16
0
28 Feb 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
...
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
VLM
VGen
EGVM
75
260
0
27 Feb 2024
Genie: Generative Interactive Environments
Jake Bruce
Michael Dennis
Ashley D. Edwards
Jack Parker-Holder
Yuge Shi
...
Konrad Zolna
Jeff Clune
Nando de Freitas
Satinder Singh
Tim Rocktaschel
VGen
VLM
74
144
0
23 Feb 2024
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
Willi Menapace
Aliaksandr Siarohin
Ivan Skorokhodov
Ekaterina Deyneka
Tsai-Shien Chen
...
Yuwei Fang
A. Stoliar
Elisa Ricci
Jian Ren
Sergey Tulyakov
VGen
42
57
0
22 Feb 2024
UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing
Jianhong Bai
Tianyu He
Yuchi Wang
Junliang Guo
Haoji Hu
Zuozhu Liu
Jiang Bian
VGen
31
26
0
20 Feb 2024
Rolling Diffusion Models
David Ruhe
Jonathan Heek
Tim Salimans
Emiel Hoogeboom
DiffM
35
32
0
12 Feb 2024
Cross-view Masked Diffusion Transformers for Person Image Synthesis
T. Pham
Zhang Kang
Chang D. Yoo
48
6
0
02 Feb 2024
AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning
Fu-Yun Wang
Zhaoyang Huang
Xiaoyu Shi
Weikang Bian
Guanglu Song
Yu Liu
Hongsheng Li
13
16
0
01 Feb 2024
ActAnywhere: Subject-Aware Video Background Generation
Boxiao Pan
Zhan Xu
Chun-Hao Paul Huang
Krishna Kumar Singh
Yang Zhou
Leonidas J. Guibas
Jimei Yang
VGen
DiffM
29
3
0
19 Jan 2024
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
Xiaofeng Wang
Zheng Zhu
Guan Huang
Boyuan Wang
Xinze Chen
Jiwen Lu
VGen
37
32
0
18 Jan 2024
Vlogger: Make Your Dream A Vlog
Shaobin Zhuang
Kunchang Li
Xinyuan Chen
Yaohui Wang
Ziwei Liu
Yu Qiao
Yali Wang
VGen
DiffM
38
35
0
17 Jan 2024
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Haoxin Chen
Yong Zhang
Xiaodong Cun
Menghan Xia
Xintao Wang
Chao-Liang Weng
Ying Shan
VGen
DiffM
126
277
0
17 Jan 2024
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation
Seung Hyun Lee
Yinxiao Li
Junjie Ke
Innfarn Yoo
Han Zhang
...
Junfeng He
Gang Li
Sangpil Kim
Irfan Essa
Feng Yang
EGVM
38
18
0
11 Jan 2024
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
39
14
0
31 Dec 2023
FlashVideo: A Framework for Swift Inference in Text-to-Video Generation
Bin Lei
Le Chen
Caiwen Ding
VGen
28
1
0
30 Dec 2023
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos
Xiang Wang
Shiwei Zhang
Hangjie Yuan
Zhiwu Qing
Biao Gong
Yingya Zhang
Yujun Shen
Changxin Gao
Nong Sang
DiffM
VGen
33
26
0
25 Dec 2023
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Dan Kondratyuk
Lijun Yu
Xiuye Gu
José Lezama
Jonathan Huang
...
Irfan Essa
Huisheng Wang
David A. Ross
Bryan Seybold
Lu Jiang
VGen
20
237
0
21 Dec 2023
MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers
Haoyu Ma
Shahin Mahdizadehaghdam
Bichen Wu
Zhipeng Fan
Yuchao Gu
Wenliang Zhao
Lior Shapira
Xiaohui Xie
DiffM
VGen
27
4
0
19 Dec 2023
LatentMan: Generating Consistent Animated Characters using Image Diffusion Models
Abdelrahman Eldesokey
Peter Wonka
26
4
0
12 Dec 2023
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Lijun Yu
Kihyuk Sohn
Xiuye Gu
Meera Hahn
Fei-Fei Li
Irfan Essa
Lu Jiang
José Lezama
VGen
59
174
0
11 Dec 2023
Counterfactual World Modeling for Physical Dynamics Understanding
Rahul Venkatesh
Honglin Chen
Kevin T. Feigelis
Daniel M. Bear
Khaled Jedoui
...
Wanhee Lee
Sherry Liu
Kevin A. Smith
Judith E. Fan
Daniel L. K. Yamins
VGen
40
1
0
11 Dec 2023
Free3D: Consistent Novel View Synthesis without 3D Representation
Chuanxia Zheng
Andrea Vedaldi
3DV
42
48
0
07 Dec 2023
MoMask: Generative Masked Modeling of 3D Human Motions
Chuan Guo
Yuxuan Mu
Muhammad Gohar Javed
Sen Wang
Li Cheng
VGen
37
120
0
29 Nov 2023
VBench: Comprehensive Benchmark Suite for Video Generative Models
Ziqi Huang
Yinan He
Jiashuo Yu
Fan Zhang
Chenyang Si
...
Xinyuan Chen
Limin Wang
Dahua Lin
Yu Qiao
Ziwei Liu
VGen
77
351
0
29 Nov 2023
SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models
Yuwei Guo
Ceyuan Yang
Anyi Rao
Maneesh Agrawala
Dahua Lin
Bo Dai
DiffM
VGen
28
114
0
28 Nov 2023