MAGVIT: Masked Generative Video Transformer

10 December 2022
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
Huiwen Chang
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
    DiffM, VGen

Papers citing "MAGVIT: Masked Generative Video Transformer"

50 / 194 papers shown
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
Yuancheng Wang
Haoyue Zhan
Liwei Liu
Ruihong Zeng
Haotian Guo
Jiachen Zheng
Qiang Zhang
Shunsi Zhang
Zhizheng Wu
129
61
0
01 Sep 2024
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations
Can Qin
Congying Xia
Krithika Ramakrishnan
Michael S Ryoo
Lifu Tu
...
Silvio Savarese
Juan Carlos Niebles
Zeyuan Chen
Ran Xu
Caiming Xiong
VGen, DiffM
134
3
0
22 Aug 2024
Masked Image Modeling: A Survey
Vlad Hondru
Florinel-Alin Croitoru
Shervin Minaee
Radu Tudor Ionescu
N. Sebe
172
8
0
13 Aug 2024
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang
Jiayan Teng
Wendi Zheng
Ming Ding
Shiyu Huang
...
Weihan Wang
Yean Cheng
Xiaotao Gu
Yuxiao Dong
Jie Tang
DiffM, VGen
284
565
0
12 Aug 2024
Tora: Trajectory-oriented Diffusion Transformer for Video Generation
Zhenghao Zhang
Junchao Liao
Menghao Li
Zuozhuo Dai
Bingxue Qiu
Hao Hu
Shaowei Cai
Weizhi Wang
VGen
167
57
0
31 Jul 2024
CrowdMAC: Masked Crowd Density Completion for Robust Crowd Density Forecasting
Ryoske Fujii
Ryo Hachiuma
Hideo Saito
127
1
0
20 Jul 2024
Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark
Tsung-Han Wu
Giscard Biamby
Jerome Quenum
Ritwik Gupta
Joseph E. Gonzalez
Trevor Darrell
David M. Chan
VLM
90
0
0
18 Jul 2024
Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
Wentao Zhang
Junliang Guo
Tianyu He
Li Zhao
Linli Xu
Jiang Bian
118
4
0
10 Jul 2024
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Yuang Zhang
Jiaxi Gu
L. Wang
Han Wang
Junqi Cheng
Yuefeng Zhu
Fangyuan Zou
VGen
161
85
0
28 Jun 2024
IRASim: Learning Interactive Real-Robot Action Simulators
Fangqi Zhu
Hongtao Wu
Song Guo
Yuxiao Liu
Chilam Cheang
Tao Kong
127
22
0
20 Jun 2024
Autoregressive Image Generation without Vector Quantization
Tianhong Li
Yonglong Tian
He Li
Mingyang Deng
Kaiming He
DiffM
156
238
0
17 Jun 2024
Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality
Tianle Zhang
Langtian Ma
Yuchen Yan
Yuchen Zhang
Kai Wang
...
Wenqi Shao
Yang You
Yu Qiao
Ping Luo
Kaipeng Zhang
VGen
140
2
0
13 Jun 2024
Hierarchical Patch Diffusion Models for High-Resolution Video Generation
Ivan Skorokhodov
Willi Menapace
Aliaksandr Siarohin
Sergey Tulyakov
VGen
79
10
0
12 Jun 2024
An Image is Worth 32 Tokens for Reconstruction and Generation
Qihang Yu
Mark Weber
XueQing Deng
Xiaohui Shen
Daniel Cremers
Liang-Chieh Chen
VLM, ViT
169
104
0
11 Jun 2024
Image and Video Tokenization with Binary Spherical Quantization
Yue Zhao
Yuanjun Xiong
Philipp Krahenbuhl
94
24
0
11 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
134
301
0
10 Jun 2024
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
Zhen Xing
Qi Dai
Zejia Weng
Zuxuan Wu
Yu-Gang Jiang
VGen
132
14
0
10 Jun 2024
CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation
Dejia Xu
Weili Nie
Chao Liu
Sifei Liu
Jan Kautz
Zhangyang Wang
Arash Vahdat
DiffM, VGen
134
59
0
04 Jun 2024
CV-VAE: A Compatible Video VAE for Latent Generative Video Models
Sijie Zhao
Yong Zhang
Xiaodong Cun
Shaoshu Yang
Muyao Niu
Xiaoyu Li
Wenbo Hu
Ying Shan
DiffM
123
28
0
30 May 2024
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark
Haoxing Chen
Yan Hong
Zizheng Huang
Zhuoer Xu
Zhangxuan Gu
...
Jun Lan
Huijia Zhu
Jianfu Zhang
Weiqiang Wang
Huaxiong Li
Mamba
162
21
0
30 May 2024
Video Prediction Models as General Visual Encoders
James Maier
Nishanth Mohankumar
VGen
42
0
0
25 May 2024
iVideoGPT: Interactive VideoGPTs are Scalable World Models
Jialong Wu
Shaofeng Yin
Ningya Feng
Xu He
Dong Li
Haifeng Zhang
Mingsheng Long
VGen
114
40
0
24 May 2024
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
Shiqi Yang
Zhi-Wei Zhong
Mengjie Zhao
Shusuke Takahashi
Masato Ishii
Takashi Shibuya
Yuki Mitsufuji
89
4
0
23 May 2024
CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers
Andrew Marmon
Grant Schindler
José Lezama
Dan Kondratyuk
Bryan Seybold
Irfan Essa
VGen, ViT, DiffM
66
3
0
21 May 2024
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Siyuan Li
Zedong Wang
Zicheng Liu
Di Wu
Cheng Tan
Jiangbin Zheng
Yufei Huang
Stan Z. Li
68
8
0
13 May 2024
TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation
Hritik Bansal
Yonatan Bitton
Michal Yarom
Idan Szpektor
Aditya Grover
Kai-Wei Chang
DiffM
94
12
0
07 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen, LM&Ro
172
48
0
06 May 2024
Beyond Deepfake Images: Detecting AI-Generated Videos
Danial Samadi Vahdati
Tai D. Nguyen
Aref Azizpour
Matthew C. Stamm
114
16
0
24 Apr 2024
On the Content Bias in Fréchet Video Distance
Jason S. Hoffman
Aniruddha Mahapatra
Gaurav Parmar
Jun-Yan Zhu
Jia-Bin Huang
EGVM
82
20
0
18 Apr 2024
Predicting Long-horizon Futures by Conditioning on Geometry and Time
Tarasha Khurana
Deva Ramanan
AI4TS
81
0
0
17 Apr 2024
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Keyu Tian
Yi Jiang
Zehuan Yuan
Bingyue Peng
Liwei Wang
VGen
124
347
0
03 Apr 2024
CHAIN: Enhancing Generalization in Data-Efficient GANs via lipsCHitz continuity constrAIned Normalization
Yao Ni
Piotr Koniusz
AI4CE, GAN
103
2
0
31 Mar 2024
SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control
Binyuan Huang
Yuqing Wen
Yucheng Zhao
Yaosi Hu
Yingfei Liu
...
Tiancai Wang
Chi Zhang
Chang Wen Chen
Zhenzhong Chen
Xiangyu Zhang
83
16
0
28 Mar 2024
SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer
Rui Zhu
Yingwei Pan
Yehao Li
Ting Yao
Zhenglong Sun
Tao Mei
C. Chen
117
26
0
25 Mar 2024
EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing
Xiangpeng Yang
Linchao Zhu
Hehe Fan
Yi Yang
DiffM, VGen
85
10
0
24 Mar 2024
Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition
Sihyun Yu
Weili Nie
De-An Huang
Boyi Li
Jinwoo Shin
A. Anandkumar
VGen, DiffM
100
19
0
21 Mar 2024
Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation
Fu-Yun Wang
Xiaoshi Wu
Zhaoyang Huang
Xiaoyu Shi
Dazhong Shen
Guanglu Song
Yu Liu
Hongsheng Li
DiffM
74
14
0
20 Mar 2024
Generalized Predictive Model for Autonomous Driving
Jiazhi Yang
Shenyuan Gao
Yihang Qiu
Li Chen
Tianyu Li
...
Ping Luo
Jun Zhang
Andreas Geiger
Yu Qiao
Hongyang Li
VGen
135
76
0
14 Mar 2024
Pix2Gif: Motion-Guided Diffusion for GIF Generation
Hitesh Kandala
Jianfeng Gao
Jianwei Yang
VGen, DiffM
76
3
0
07 Mar 2024
UniVS: Unified and Universal Video Segmentation with Prompts as Queries
Ming-hui Li
Shuai Li
Xindong Zhang
Lei Zhang
VOS
105
18
0
28 Feb 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
...
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
VLM, VGen, EGVM
180
300
0
27 Feb 2024
Genie: Generative Interactive Environments
Jake Bruce
Michael Dennis
Ashley D. Edwards
Jack Parker-Holder
Yuge Shi
...
Konrad Zolna
Jeff Clune
Nando de Freitas
Satinder Singh
Tim Rocktaschel
VGen, VLM
153
188
0
23 Feb 2024
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
Willi Menapace
Aliaksandr Siarohin
Ivan Skorokhodov
Ekaterina Deyneka
Tsai-Shien Chen
...
Yuwei Fang
A. Stoliar
Elisa Ricci
Jian Ren
Sergey Tulyakov
VGen
129
62
0
22 Feb 2024
UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing
Jianhong Bai
Tianyu He
Yuchi Wang
Junliang Guo
Haoji Hu
Zuozhu Liu
Jiang Bian
VGen
97
30
0
20 Feb 2024
Rolling Diffusion Models
David Ruhe
Jonathan Heek
Tim Salimans
Emiel Hoogeboom
DiffM
100
41
0
12 Feb 2024
Cross-view Masked Diffusion Transformers for Person Image Synthesis
T. Pham
Zhang Kang
Chang D. Yoo
108
6
0
02 Feb 2024
AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning
Fu-Yun Wang
Zhaoyang Huang
Xiaoyu Shi
Weikang Bian
Guanglu Song
Yu Liu
Hongsheng Li
62
16
0
01 Feb 2024
ActAnywhere: Subject-Aware Video Background Generation
Boxiao Pan
Zhan Xu
Chun-Hao Paul Huang
Krishna Kumar Singh
Yang Zhou
Leonidas Guibas
Jimei Yang
VGen, DiffM
58
3
0
19 Jan 2024
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
Xiaofeng Wang
Zheng Zhu
Guan Huang
Boyuan Wang
Xinze Chen
Jiwen Lu
VGen
76
41
0
18 Jan 2024
Vlogger: Make Your Dream A Vlog
Shaobin Zhuang
Kunchang Li
Xinyuan Chen
Yaohui Wang
Ziwei Liu
Yu Qiao
Yali Wang
VGen, DiffM
81
38
0
17 Jan 2024