2106.13195
FitVid: Overfitting in Pixel-Level Video Prediction
24 June 2021
Mohammad Babaeizadeh
M. Saffar
Suraj Nair
Sergey Levine
Chelsea Finn
D. Erhan
VLM
Papers citing
"FitVid: Overfitting in Pixel-Level Video Prediction"
32 / 32 papers shown
Title
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Yuchao Gu
Weijia Mao
Mike Zheng Shou
VGen
84
2
0
25 Mar 2025
Adapt2Reward: Adapting Video-Language Models to Generalizable Robotic Rewards via Failure Prompts
Yanting Yang
Minghao Chen
Qibo Qiu
Jiahao Wu
Wenxiao Wang
Binbin Lin
Ziyu Guan
Xiaofei He
LM&Ro
45
2
0
20 Jul 2024
iVideoGPT: Interactive VideoGPTs are Scalable World Models
Jialong Wu
Shaofeng Yin
Ningya Feng
Xu He
Dong Li
Haifeng Zhang
Mingsheng Long
VGen
49
22
0
24 May 2024
FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline
V.Ya. Arkhipkin
Zein Shaheen
Viacheslav Vasilev
E. Dakhova
Andrey Kuznetsov
Denis Dimitrov
DiffM
VGen
29
5
0
22 Nov 2023
USTEP: Spatio-Temporal Predictive Learning under A Unified View
Cheng Tan
Jue Wang
Zhangyang Gao
Siyuan Li
Stan Z. Li
38
1
0
09 Oct 2023
Structured World Models from Human Videos
Russell Mendonca
Shikhar Bahl
Deepak Pathak
LM&Ro
47
86
0
21 Aug 2023
Neural Foundations of Mental Simulation: Future Prediction of Latent Representations on Dynamic Scenes
Aran Nayebi
R. Rajalingham
M. Jazayeri
G. R. Yang
36
17
0
19 May 2023
3D-IntPhys: Towards More Generalized 3D-grounded Visual Intuitive Physics under Challenging Scenes
Haotian Xue
Antonio Torralba
J. Tenenbaum
Daniel L. K. Yamins
Yunzhu Li
H. Tung
PINN
VGen
AI4CE
61
8
0
22 Apr 2023
Multi-modal learning for geospatial vegetation forecasting
V. Benson
Claire Robin
C. Requena-Mesa
Lazaro Alonso
Nuno Carvalhais
José A. Cortés
Zhihan Gao
Nora Linscheid
M. Weynants
Markus Reichstein
30
11
0
28 Mar 2023
Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers
Jaehoon Yoo
Semin Kim
Doyup Lee
Chiheon Kim
Seunghoon Hong
31
3
0
20 Mar 2023
Long-horizon video prediction using a dynamic latent hierarchy
Alexey Zakharov
Qinghai Guo
Z. Fountas
31
4
0
29 Dec 2022
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
38
224
0
10 Dec 2022
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
Tsu-jui Fu
Licheng Yu
Ning Zhang
Cheng-Yang Fu
Jong-Chyi Su
William Yang Wang
Sean Bell
VGen
56
37
0
23 Nov 2022
Beyond the Field-of-View: Enhancing Scene Visibility and Perception with Clip-Recurrent Transformer
Haowen Shi
Zhijie Xu
Kailun Yang
Xiaoyue Yin
Ze Wang
Kaiwei Wang
ViT
43
5
0
21 Nov 2022
MagicVideo: Efficient Video Generation With Latent Diffusion Models
Daquan Zhou
Weimin Wang
Hanshu Yan
Weiwei Lv
Yizhe Zhu
Jiashi Feng
DiffM
VGen
39
372
0
20 Nov 2022
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
Ivan Kapelyukh
Vitalis Vosylius
Edward Johns
LM&Ro
DiffM
113
146
0
05 Oct 2022
Phenaki: Variable Length Video Generation From Open Domain Textual Description
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
DiffM
VGen
56
371
0
05 Oct 2022
Imagen Video: High Definition Video Generation with Diffusion Models
Jonathan Ho
William Chan
Chitwan Saharia
Jay Whang
Ruiqi Gao
...
Diederik P. Kingma
Ben Poole
Mohammad Norouzi
David J. Fleet
Tim Salimans
VGen
49
1,476
0
05 Oct 2022
Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning
Cheng Tan
Zhangyang Gao
Lirong Wu
Yongjie Xu
Jun Xia
Siyuan Li
Stan Z. Li
46
107
0
24 Jun 2022
MaskViT: Masked Visual Pre-Training for Video Prediction
Agrim Gupta
Stephen Tian
Yunzhi Zhang
Jiajun Wu
Roberto Martín-Martín
Li Fei-Fei
112
111
0
23 Jun 2022
Diffusion Models for Video Prediction and Infilling
Tobias Hoppe
Arash Mehrjou
Stefan Bauer
Didrik Nielsen
Andrea Dittadi
DiffM
VGen
27
131
0
15 Jun 2022
MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation
Vikram S. Voleti
Alexia Jolicoeur-Martineau
Christopher Pal
DiffM
VGen
13
291
0
19 May 2022
Video Diffusion Models
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM
VGen
44
1,514
0
07 Apr 2022
Reinforcement Learning with Action-Free Pre-Training from Videos
Younggyo Seo
Kimin Lee
Stephen James
Pieter Abbeel
SSL
OnRL
18
117
0
25 Mar 2022
Stochastic Video Prediction with Structure and Motion
Adil Kaan Akan
Sadra Safadoust
Fatma Guney
VGen
24
9
0
20 Mar 2022
Transframer: Arbitrary Frame Prediction with Generative Models
C. Nash
João Carreira
Jacob Walker
Iain Barr
Andrew Jaegle
Mateusz Malinowski
Peter W. Battaglia
ViT
27
37
0
17 Mar 2022
Diffusion Probabilistic Modeling for Video Generation
Ruihan Yang
Prakhar Srivastava
Stephan Mandt
DiffM
VGen
59
256
0
16 Mar 2022
Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks
Sihyun Yu
Jihoon Tack
Sangwoo Mo
Hyunsu Kim
Junho Kim
Jung-Woo Ha
Jinwoo Shin
DiffM
VGen
35
199
0
21 Feb 2022
VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan
Yunzhi Zhang
Pieter Abbeel
A. Srinivas
ViT
VGen
245
484
0
20 Apr 2021
ArrowGAN : Learning to Generate Videos by Learning Arrow of Time
Kibeom Hong
Youngjung Uh
H. Byun
GAN
128
9
0
11 Jan 2021
Transformation-based Adversarial Video Prediction on Large-Scale Data
Pauline Luc
Aidan Clark
Sander Dieleman
Diego de Las Casas
Yotam Doron
Albin Cassirer
Karen Simonyan
VGen
231
86
0
09 Mar 2020
Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction
Vincent Le Guen
Nicolas Thome
AI4CE
PINN
89
289
0
03 Mar 2020