1906.00446
Generating Diverse High-Fidelity Images with VQ-VAE-2
2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
DRL
BDL
Papers citing
"Generating Diverse High-Fidelity Images with VQ-VAE-2"
50 / 1,128 papers shown
Title
BELT-2: Bootstrapping EEG-to-Language representation alignment for multi-task brain decoding
Jinzhao Zhou
Yiqun Duan
Fred Chang
T. Do
Yu-Kai Wang
Chin-Teng Lin
78
5
0
28 Aug 2024
AEMLO: AutoEncoder-Guided Multi-Label Oversampling
Ao Zhou
Bin Liu
Jin Wang
K. Sun
Kelin Liu
SyDa
72
0
0
23 Aug 2024
A Grey-box Attack against Latent Diffusion Model-based Image Editing by Posterior Collapse
Zhongliang Guo
Lei Fang
Jingyu Lin
Yifei Qian
Shuai Zhao
Zeyu Wang
Cunjian Chen
Ognjen Arandjelović
Chun Pong Lau
DiffM
AAML
139
9
0
20 Aug 2024
FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance
Jiasong Feng
Ao Ma
Jing Wang
Bo Cheng
Xiaodan Liang
Dawei Leng
Yuhui Yin
DiffM
VGen
103
6
0
15 Aug 2024
DiffLoRA: Generating Personalized Low-Rank Adaptation Weights with Diffusion
Yujia Wu
Yiming Shi
Jiwei Wei
Chengwei Sun
Yuyang Zhou
Yang Yang
Heng Tao Shen
115
3
0
13 Aug 2024
Music2Latent: Consistency Autoencoders for Latent Audio Compression
Marco Pasini
Stefan Lattner
George Fazekas
73
8
0
12 Aug 2024
DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation
Jisoo Kim
Jungbin Cho
Joonho Park
Soonmin Hwang
Da Eun Kim
Geon Kim
Youngjae Yu
148
1
0
12 Aug 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Ping Luo
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
177
59
0
05 Aug 2024
PanoFree: Tuning-Free Holistic Multi-view Image Generation with Cross-view Self-Guidance
Aoming Liu
Zhong Li
Zhang Chen
Nannan Li
Yinghao Xu
Bryan A. Plummer
97
7
0
04 Aug 2024
LDFaceNet: Latent Diffusion-based Network for High-Fidelity Deepfake Generation
Dwij Mehta
Aditya Mehta
Pratik Narang
DiffM
94
0
0
04 Aug 2024
VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling
Qian Zhang
Xiangzi Dai
Ninghua Yang
Xiang An
Ziyong Feng
Xingyu Ren
VLM
CLIP
132
22
0
02 Aug 2024
Informed Correctors for Discrete Diffusion Models
Yixiu Zhao
Jiaxin Shi
F. Chen
Shaul Druckmann
Lester W. Mackey
Scott W. Linderman
150
15
0
30 Jul 2024
Revolutionizing Text-to-Image Retrieval as Autoregressive Token-to-Voken Generation
Chak Tou Leong
Hongru Cai
Wenjie Wang
Leigang Qu
Yinwei Wei
Wenjie Li
Liqiang Nie
Tat-Seng Chua
DiffM
79
1
0
24 Jul 2024
WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation
Zirui Shao
Feiyu Gao
Hangdi Xing
Zepeng Zhu
Zhi Yu
Jiajun Bu
Qi Zheng
Cong Yao
61
3
0
22 Jul 2024
Decomposed Vector-Quantized Variational Autoencoder for Human Grasp Generation
Zhe Zhao
Mengshi Qi
Huadong Ma
DRL
94
3
0
19 Jul 2024
SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow
Yuanzhi Zhu
Xingchao Liu
Qiang Liu
90
10
0
17 Jul 2024
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval
Han Zhou
Wei Dong
Xiaohong Liu
Shuaicheng Liu
Xiongkuo Min
Guangtao Zhai
Jun Chen
110
17
0
17 Jul 2024
Generating 3D House Wireframes with Semantics
Xueqi Ma
Yilin Liu
Wenjun Zhou
Ruowei Wang
Hui Huang
3DV
89
0
0
17 Jul 2024
Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data
Tim Elsner
Paula Usinger
Victor Czech
Gregor Kobsik
Yanjiang He
I. Lim
Leif Kobbelt
83
2
0
16 Jul 2024
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
Santiago Pascual
Chunghsin Yeh
Ioannis Tsiamas
Joan Serrà
DiffM
VGen
95
16
0
15 Jul 2024
RTMW: Real-Time Multi-Person 2D and 3D Whole-body Pose Estimation
Tao Jiang
Xinchen Xie
Yining Li
3DH
92
4
0
11 Jul 2024
Several questions of visual generation in 2024
Shuyang Gu
78
1
0
11 Jul 2024
Mobius: A High Efficient Spatial-Temporal Parallel Training Paradigm for Text-to-Video Generation Task
Yiran Yang
Jinchao Zhang
Ying Deng
Jie Zhou
DiffM
82
0
0
09 Jul 2024
Latent Space Imaging
Matheus Souza
Yidan Zheng
Kaizhang Kang
Yogeshwar Nath Mishra
Qiang Fu
Wolfgang Heidrich
156
0
0
09 Jul 2024
Balance of Number of Embedding and their Dimensions in Vector Quantization
Hang Chen
Sankepally Sainath Reddy
Ziwei Chen
Dianbo Liu
93
2
0
06 Jul 2024
NEBULA: Neural Empirical Bayes Under Latent Representations for Efficient and Controllable Design of Molecular Libraries
E. Nowara
Pedro H. O. Pinheiro
Sai Pooja Mahajan
Omar Mahmood
Andrew Watkins
Saeed Saremi
Michael R. Maser
BDL
DiffM
70
2
0
03 Jul 2024
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis
Dewei Zhou
Yuchen Li
Fan Ma
Zongxin Yang
Yue Yang
184
11
0
02 Jul 2024
Grouped Discrete Representation Guides Object-Centric Learning
Rongzhen Zhao
V. Wang
Arno Solin
Joni Pajarinen
OCL
95
1
0
01 Jul 2024
Deep learning for automated detection of breast cancer in deep ultraviolet fluorescence images with diffusion probabilistic model
Sepehr Salem Ghahfarokhi
Tyrell To
Julie M Jorns
T. Yen
Bing Yu
Dong Hye Ye
DiffM
MedIm
29
2
0
01 Jul 2024
Efficient World Models with Context-Aware Tokenization
Vincent Micheli
Eloi Alonso
François Fleuret
OffRL
VLM
82
6
0
27 Jun 2024
MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data
William Berman
A. Peysakhovich
91
4
0
26 Jun 2024
LatentExplainer: Explaining Latent Representations in Deep Generative Models with Multimodal Large Language Models
Mengdan Zhu
Raasikh Kanjiani
Jiahui Lu
Andrew Choi
Qirui Ye
Liang Zhao
DiffM
101
1
0
21 Jun 2024
StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images
Rushikesh Zawar
Shaurya Dewan
Andrew F. Luo
Margaret M. Henderson
Michael J. Tarr
Leila Wehbe
VGen
CoGe
78
1
0
19 Jun 2024
Autoregressive Image Generation without Vector Quantization
Tianhong Li
Yonglong Tian
He Li
Mingyang Deng
Kaiming He
DiffM
201
238
0
17 Jun 2024
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%
Lei Zhu
Fangyun Wei
Yanye Lu
Dong Chen
VLM
100
40
0
17 Jun 2024
Discriminative Hamiltonian Variational Autoencoder for Accurate Tumor Segmentation in Data-Scarce Regimes
Aghiles Kebaili
J. Lapuyade-Lahorgue
Pierre Vera
S. Ruan
MedIm
75
1
0
17 Jun 2024
STEVE Series: Step-by-Step Construction of Agent Systems in Minecraft
Zhonghan Zhao
Wenhao Chai
Xuan Wang
Ke Ma
Kewei Chen
Dongxu Guo
Tian Ye
Yanting Zhang
Hongwei Wang
Gaoang Wang
LLMAG
LM&Ro
99
7
0
17 Jun 2024
Graph Knowledge Distillation to Mixture of Experts
Pavel Rumiantsev
Mark Coates
98
0
0
17 Jun 2024
ControlVAR: Exploring Controllable Visual Autoregressive Modeling
Xiang Li
Kai Qiu
Hao Chen
Jason Kuen
Zhe Lin
Rita Singh
Bhiksha Raj
DiffM
97
27
0
14 Jun 2024
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models
Qihao Liu
Zhanpeng Zeng
Ju He
Qihang Yu
Xiaohui Shen
Liang-Chieh Chen
117
22
0
13 Jun 2024
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Junke Wang
Yi Jiang
Zehuan Yuan
Binyue Peng
Zuxuan Wu
Yu-Gang Jiang
ViT
VGen
129
46
0
13 Jun 2024
The Significance of Latent Data Divergence in Predicting System Degradation
Miguel Fernandes
Catarina Silva
Alberto Cardoso
Bernardete Ribeiro
33
0
0
13 Jun 2024
Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation
Raphael Tang
Xinyu Crystina Zhang
Lixinyu Xu
Yao Lu
Wenyan Li
Pontus Stenetorp
Jimmy Lin
Ferhan Ture
105
0
0
12 Jun 2024
Diffusion-Promoted HDR Video Reconstruction
Yuanshen Guan
Ruikang Xu
Mingde Yao
Ruisheng Gao
Lizhi Wang
Zhiwei Xiong
95
2
0
12 Jun 2024
To be Continuous, or to be Discrete, Those are Bits of Questions
Yiran Wang
Masao Utiyama
86
4
0
12 Jun 2024
An Image is Worth 32 Tokens for Reconstruction and Generation
Qihang Yu
Mark Weber
XueQing Deng
Xiaohui Shen
Daniel Cremers
Liang-Chieh Chen
VLM
ViT
182
104
0
11 Jun 2024
T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text
Aoxiong Yin
Haoyuan Li
Kai Shen
Siliang Tang
Yueting Zhuang
SLR
86
6
0
11 Jun 2024
Towards Realistic Data Generation for Real-World Super-Resolution
Long Peng
Wenbo Li
Renjing Pei
Jingjing Ren
Xueyang Fu
Yang Wang
Yang Cao
Zheng-Jun Zha
154
20
0
11 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
155
301
0
10 Jun 2024
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Shiji Song
Yuan Yao
Gao Huang
88
17
0
08 Jun 2024