AVID: Adapting Video Diffusion Models to World Models

1 October 2024
Marc Rigter
Tarun Gupta
Agrin Hilmkil
Chao Ma
    VGen
arXiv: 2410.12822

Papers cited by "AVID: Adapting Video Diffusion Models to World Models"

23 / 23 papers shown
Pandora: Towards General World Model with Natural Language Actions and Video States
Jiannan Xiang
Guangyi Liu
Yi Gu
Qiyue Gao
Yuting Ning
...
Shibo Hao
Yemin Shi
Zhengzhong Liu
Eric P. Xing
Zhiting Hu
VGen
82
38
0
12 Jun 2024
Learning to Act without Actions
Dominik Schmidt
Minqi Jiang
OffRL
75
32
0
17 Dec 2023
World Models via Policy-Guided Trajectory Diffusion
Marc Rigter
Jun Yamada
Ingmar Posner
66
21
0
13 Dec 2023
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
Shiwei Zhang
Jiayu Wang
Yingya Zhang
Kang Zhao
Hangjie Yuan
Zhan Qin
Xiang Wang
Deli Zhao
Jingren Zhou
DiffM
VGen
87
218
0
07 Nov 2023
GAIA-1: A Generative World Model for Autonomous Driving
Anthony Hu
Lloyd Russell
Hudson Yeo
Zak Murez
George Fedoseev
Alex Kendall
Jamie Shotton
Gianluca Corrado
VGen
72
230
0
29 Sep 2023
VideoControlNet: A Motion-Guided Video-to-Video Translation Framework by Using Diffusion Model with ControlNet
Zhihao Hu
Dong Xu
DiffM
VGen
64
65
0
26 Jul 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
531
13,788
0
15 Mar 2023
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Rinon Gal
Yuval Alaluf
Yuval Atzmon
Or Patashnik
Amit H. Bermano
Gal Chechik
Daniel Cohen-Or
84
1,837
0
02 Aug 2022
Classifier-Free Diffusion Guidance
Jonathan Ho
Tim Salimans
FaML
84
3,830
0
26 Jul 2022
Compositional Visual Generation with Composable Diffusion Models
Nan Liu
Shuang Li
Yilun Du
Antonio Torralba
J. Tenenbaum
DiffM
CoGe
118
510
0
03 Jun 2022
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
159
801
0
12 May 2022
Video Diffusion Models
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM
VGen
140
1,563
0
07 Apr 2022
Controllable and Compositional Generation with Latent-Space Energy-Based Models
Weili Nie
Arash Vahdat
Anima Anandkumar
49
79
0
21 Oct 2021
Cascaded Diffusion Models for High Fidelity Image Generation
Jonathan Ho
Chitwan Saharia
William Chan
David J. Fleet
Mohammad Norouzi
Tim Salimans
120
1,196
0
30 May 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
673
28,659
0
26 Feb 2021
Mastering Atari with Discrete World Models
Danijar Hafner
Timothy Lillicrap
Mohammad Norouzi
Jimmy Ba
DRL
78
834
0
05 Oct 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
498
41,106
0
28 May 2020
World Models
David R Ha
Jürgen Schmidhuber
SyDa
100
1,062
0
27 Mar 2018
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
Richard Y. Zhang
Phillip Isola
Alexei A. Efros
Eli Shechtman
Oliver Wang
EGVM
297
11,610
0
11 Jan 2018
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
199
7,961
0
22 May 2017
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
1.2K
76,547
0
18 May 2015
Going Deeper with Convolutions
Christian Szegedy
Wei Liu
Yangqing Jia
P. Sermanet
Scott E. Reed
Dragomir Anguelov
D. Erhan
Vincent Vanhoucke
Andrew Rabinovich
301
43,511
0
17 Sep 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
928
99,991
0
04 Sep 2014