ResearchTrend.AI
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion

24 November 2021
Chenfei Wu
Jian Liang
Lei Ji
Fan Yang
Yuejian Fang
Daxin Jiang
Nan Duan
    ViT
    VGen

Papers citing "NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion"

50 / 50 papers shown
Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos
Rundong Luo
Matthew Wallingford
Ali Farhadi
Noah Snavely
Wei-Chiu Ma
VGen
120
1
0
10 Apr 2025
FreqPrior: Improving Video Diffusion Models with Frequency Filtering Gaussian Noise
Yunlong Yuan
Yuanfan Guo
Chunwei Wang
Wei Zhang
Hang Xu
L. Zhang
DiffM
VGen
171
3
0
20 Feb 2025
Grid Diffusion Models for Text-to-Video Generation
Taegyeong Lee
Soyeong Kwon
Taehwan Kim
103
6
0
31 Dec 2024
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Zhichao Zhang
Wei Sun
Xinyue Li
Yunhao Li
Qihang Ge
...
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guangtao Zhai
EGVM
204
1
0
25 Nov 2024
Masked Generative Priors Improve World Models Sequence Modelling Capabilities
Cristian Meo
Mircea Lica
Zarif Ikram
Akihiro Nakano
Vedant Shah
Aniket Didolkar
Dianbo Liu
Anirudh Goyal
Justin Dauwels
OffRL
121
0
0
10 Oct 2024
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
Lin Geng Foo
Hossein Rahmani
Jing Liu
143
31
0
27 Aug 2023
CCVS: Context-aware Controllable Video Synthesis
G. L. Moing
Jean Ponce
Cordelia Schmid
76
79
0
16 Jul 2021
CogView: Mastering Text-to-Image Generation via Transformers
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
...
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
ViT
VLM
99
779
0
26 May 2021
GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions
Chenfei Wu
Lun Huang
Qianxi Zhang
Binyang Li
Lei Ji
Fan Yang
Guillermo Sapiro
Nan Duan
DiffM
VGen
67
240
0
30 Apr 2021
VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan
Yunzhi Zhang
Pieter Abbeel
A. Srinivas
ViT
VGen
295
498
0
20 Apr 2021
Paint by Word
A. Andonian
David Bau
Audrey Cui
YeonHwan Park
Ali Jahanian
Antonio Torralba
A. Oliva
DiffM
65
125
0
19 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
900
29,372
0
26 Feb 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
391
4,941
0
24 Feb 2021
Cross-Modal Contrastive Learning for Text-to-Image Generation
Han Zhang
Jing Yu Koh
Jason Baldridge
Honglak Lee
Yinfei Yang
GAN
125
363
0
12 Jan 2021
Taming Transformers for High-Resolution Image Synthesis
Patrick Esser
Robin Rombach
Bjorn Ommer
ViT
119
2,950
0
17 Dec 2020
Latent Video Transformer
Ruslan Rakhimov
Denis Volkhonskiy
Alexey Artemov
Denis Zorin
Evgeny Burnaev
VGen
88
120
0
18 Jun 2020
Transformation-based Adversarial Video Prediction on Large-Scale Data
Pauline Luc
Aidan Clark
Sander Dieleman
Diego de Las Casas
Yotam Doron
Albin Cassirer
Karen Simonyan
VGen
276
87
0
09 Mar 2020
Stochastic Latent Residual Video Prediction
Jean-Yves Franceschi
E. Delasalles
Mickaël Chen
Sylvain Lamprier
Patrick Gallinari
VGen
59
159
0
21 Feb 2020
Axial Attention in Multidimensional Transformers
Jonathan Ho
Nal Kalchbrenner
Dirk Weissenborn
Tim Salimans
107
530
0
20 Dec 2019
Segmentation Transformer: Object-Contextual Representations for Semantic Segmentation
Yuhui Yuan
Xiaokang Chen
Xilin Chen
Jingdong Wang
ViT
224
1,417
0
24 Sep 2019
Stand-Alone Self-Attention in Vision Models
Prajit Ramachandran
Niki Parmar
Ashish Vaswani
Irwan Bello
Anselm Levskaya
Jonathon Shlens
VLM
SLR
ViT
89
1,214
0
13 Jun 2019
What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark
Urvashi Khandelwal
Omer Levy
Christopher D. Manning
MILM
215
1,594
0
11 Jun 2019
Scaling Autoregressive Video Models
Dirk Weissenborn
Oscar Täckström
Jakob Uszkoreit
DiffM
VGen
85
201
0
06 Jun 2019
Generating Long Sequences with Sparse Transformers
R. Child
Scott Gray
Alec Radford
Ilya Sutskever
112
1,896
0
23 Apr 2019
VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research
Xin Eric Wang
Jiawei Wu
Junkun Chen
Lei Li
Yuan-fang Wang
William Yang Wang
93
549
0
06 Apr 2019
DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis
Minfeng Zhu
Pingbo Pan
Wei Chen
Yi Yang
GAN
52
580
0
02 Apr 2019
Semantic Image Synthesis with Spatially-Adaptive Normalization
Taesung Park
Ming-Yu Liu
Ting-Chun Wang
Jun-Yan Zhu
153
2,685
0
18 Mar 2019
VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation
Manoj Kumar
Mohammad Babaeizadeh
D. Erhan
Chelsea Finn
Sergey Levine
Laurent Dinh
Durk Kingma
VGen
84
132
0
04 Mar 2019
Towards Accurate Generative Models of Video: A New Metric & Challenges
Thomas Unterthiner
Sjoerd van Steenkiste
Karol Kurach
Raphaël Marinier
Marcin Michalski
Sylvain Gelly
EGVM
VGen
91
728
0
03 Dec 2018
The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale
Alina Kuznetsova
H. Rom
N. Alldrin
J. Uijlings
Ivan Krasin
...
S. Popov
Matteo Malloci
Alexander Kolesnikov
Tom Duerig
V. Ferrari
ObjD
VLM
96
1,348
0
02 Nov 2018
Large Scale GAN Training for High Fidelity Natural Image Synthesis
Andrew Brock
Jeff Donahue
Karen Simonyan
259
5,392
0
28 Sep 2018
Stochastic Adversarial Video Prediction
Alex X. Lee
Richard Y. Zhang
F. Ebert
Pieter Abbeel
Chelsea Finn
Sergey Levine
DRL
VGen
GAN
65
453
0
04 Apr 2018
Stochastic Video Generation with a Learned Prior
Emily L. Denton
Rob Fergus
VGen
80
526
0
21 Feb 2018
Image Transformer
Niki Parmar
Ashish Vaswani
Jakob Uszkoreit
Lukasz Kaiser
Noam M. Shazeer
Alexander Ku
Dustin Tran
ViT
131
1,679
0
15 Feb 2018
Moments in Time Dataset: one million videos for event understanding
Mathew Monfort
A. Andonian
Bolei Zhou
K. Ramakrishnan
Sarah Adel Bargal
...
L. Brown
Quanfu Fan
Dan Gutfreund
Carl Vondrick
A. Oliva
92
548
0
09 Jan 2018
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
Tao Xu
Pengchuan Zhang
Qiuyuan Huang
Han Zhang
Zhe Gan
Xiaolei Huang
Xiaodong He
GAN
ViT
108
1,716
0
28 Nov 2017
Neural Discrete Representation Learning
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
226
5,008
0
02 Nov 2017
Stochastic Variational Video Prediction
Mohammad Babaeizadeh
Chelsea Finn
D. Erhan
R. Campbell
Sergey Levine
DRL
VGen
73
542
0
30 Oct 2017
Self-Supervised Visual Planning with Temporal Skip Connections
F. Ebert
Chelsea Finn
Alex X. Lee
Sergey Levine
SSL
73
321
0
15 Oct 2017
MoCoGAN: Decomposing Motion and Content for Video Generation
Sergey Tulyakov
Ming-Yu Liu
Xiaodong Yang
Jan Kautz
GAN
129
1,147
0
17 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
687
131,526
0
12 Jun 2017
The Kinetics Human Action Video Dataset
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
...
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
250
3,802
0
19 May 2017
Conditional Image Generation with PixelCNN Decoders
Aaron van den Oord
Nal Kalchbrenner
Oriol Vinyals
L. Espeholt
Alex Graves
Koray Kavukcuoglu
VLM
202
2,509
0
16 Jun 2016
Improved Techniques for Training GANs
Tim Salimans
Ian Goodfellow
Wojciech Zaremba
Vicki Cheung
Alec Radford
Xi Chen
GAN
478
9,048
0
10 Jun 2016
Unsupervised Learning for Physical Interaction through Video Prediction
Chelsea Finn
Ian Goodfellow
Sergey Levine
70
1,043
0
23 May 2016
Pixel Recurrent Neural Networks
Aaron van den Oord
Nal Kalchbrenner
Koray Kavukcuoglu
SSeg
GAN
463
2,568
0
25 Jan 2016
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
Alec Radford
Luke Metz
Soumith Chintala
GAN
OOD
250
14,008
0
19 Nov 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.8K
150,039
0
22 Dec 2014
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
1.7K
39,525
0
01 Sep 2014
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
413
43,638
0
01 May 2014