1906.00446
Generating Diverse High-Fidelity Images with VQ-VAE-2
2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
DRL
BDL
Papers citing
"Generating Diverse High-Fidelity Images with VQ-VAE-2"
50 / 1,107 papers shown
Title
E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling
Zhihang Yuan
Yuzhang Shang
Hao Zhang
Tongcheng Fang
Rui Xie
Bingxin Xu
Yan Yan
Shengen Yan
Guohao Dai
Yu Wang
DiffM
108
1
0
18 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
Hongyu Chen
Zihan Wang
Xianrui Li
Xingchen Sun
Fangyi Chen
Jiang Liu
Jie Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
114
7
0
14 Dec 2024
A Decade of Deep Learning: A Survey on The Magnificent Seven
Dilshod Azizov
Muhammad Arslan Manzoor
Velibor Bojkovic
Yingxu Wang
Zihan Wang
...
Liang Li
Siwei Liu
Yu Zhong
Wei Liu
Shangsong Liang
OOD
AI4TS
MedIm
129
0
0
13 Dec 2024
OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs
Yuanzhi Zhu
R. Wang
Shilin Lu
Junnan Li
Hanshu Yan
Peng Sun
SupR
92
3
0
12 Dec 2024
Unsupervised Cross-Domain Regression for Fine-grained 3D Game Character Reconstruction
Qi Wen
Xiang Wen
Hao Jiang
Siqi Yang
Bingfeng Han
Tianlei Hu
Gang Chen
Shuang Li
3DH
78
0
0
11 Dec 2024
CoMA: Compositional Human Motion Generation with Multi-modal Agents
Shanlin Sun
Gabriel De Araujo
Jiaqi Xu
S. Kevin Zhou
Hanwen Zhang
Ziheng Huang
Chenyu You
Xiaohui Xie
102
4
0
10 Dec 2024
Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph based Question-Answering Agent
Ziyuan Qin
D. Cheng
Haoyu Wang
Huahui Yi
Yuting Shao
Zhiyuan Fan
Kang Li
Qicheng Lao
EGVM
MLLM
250
0
0
07 Dec 2024
LMDM: Latent Molecular Diffusion Model For 3D Molecule Generation
Xiang Chen
DiffM
81
0
0
05 Dec 2024
XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation
Xianrui Li
Kai Qiu
Hongyu Chen
Jason Kuen
Jiuxiang Gu
Jie Wang
Zhe-nan Lin
Bhiksha Raj
VLM
128
3
0
02 Dec 2024
3D-WAG: Hierarchical Wavelet-Guided Autoregressive Generation for High-Fidelity 3D Shapes
Tejaswini Medi
Arianna Rampini
Pradyumna Reddy
P. Jayaraman
M. Keuper
DiffM
84
0
0
28 Nov 2024
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads
Siqi Kou
Jiachun Jin
Chang Liu
Ye Ma
Jian Jia
Quan Chen
Peng Jiang
Zhijie Deng
DiffM
VGen
VLM
137
6
0
28 Nov 2024
Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling
J. Hyung
Kinam Kim
Susung Hong
M. Kim
Jaegul Choo
VGen
90
3
0
27 Nov 2024
Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency
Y. Wang
Jiajie Teng
Jiajiong Cao
Yuming Li
Chenguang Ma
Hongteng Xu
Dixin Luo
VGen
DiffM
79
0
0
25 Nov 2024
VQalAttent: a Transparent Speech Generation Pipeline based on Transformer-learned VQ-VAE Latent Space
Armani Rodriguez
S. Kokalj-Filipovic
75
0
0
22 Nov 2024
CDI: Copyrighted Data Identification in Diffusion Models
Jan Dubiński
Antoni Kowalczuk
Franziska Boenisch
Adam Dziedzic
77
1
0
19 Nov 2024
Multidimensional Byte Pair Encoding: Shortened Sequences for Improved Visual Data Generation
Tim Elsner
Paula Usinger
Julius Nehring-Wirxel
Gregor Kobsik
Victor Czech
Yanjiang He
I. Lim
Leif Kobbelt
39
1
0
15 Nov 2024
Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings
Aditya Sanghi
Aliasghar Khani
Pradyumna Reddy
Arianna Rampini
Derek Cheung
Kamal Rahimi Malekshan
Kanika Madan
Hooman Shayani
48
3
0
12 Nov 2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Yizeng Han
Jiayi Guo
Zhiyuan Liu
Yuan Yao
Gao Huang
63
4
0
11 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Min Zhang
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
53
9
0
08 Nov 2024
Analyzing The Language of Visual Tokens
David M. Chan
Rodolfo Corona
J. S. Park
Cheol Jun Cho
Yutong Bai
Trevor Darrell
28
2
0
07 Nov 2024
Image Understanding Makes for A Good Tokenizer for Image Generation
Luting Wang
Yang Zhao
Zijian Zhang
Jiashi Feng
Si Liu
Bingyi Kang
VLM
47
4
0
07 Nov 2024
DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning
G. Zhou
Hengkai Pan
Yann LeCun
Lerrel Pinto
VGen
LM&Ro
OffRL
35
15
0
07 Nov 2024
Training on test proteins improves fitness, structure, and function prediction
Anton Bushuiev
Roman Bushuiev
Nikola Zadorozhny
Raman Samusevich
Hannes Stärk
Jiri Sedlar
Tomáš Pluskal
Josef Sivic
31
0
0
04 Nov 2024
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
Yongxin Zhu
B. Li
Yifei Xin
Linli Xu
44
10
0
04 Nov 2024
Music Foundation Model as Generic Booster for Music Downstream Tasks
Weihsiang Liao
Yuhta Takida
Yukara Ikemiya
Zhi-Wei Zhong
Chieh-Hsin Lai
...
Stefan Uhlich
Taketo Akama
Woosung Choi
Yuichiro Koyama
Yuki Mitsufuji
56
0
0
02 Nov 2024
Randomized Autoregressive Visual Generation
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VGen
DiffM
59
31
1
01 Nov 2024
Optimizing Contextual Speech Recognition Using Vector Quantization for Efficient Retrieval
Nikolaos Flemotomos
Roger Hsiao
P. Swietojanski
Takaaki Hori
Dogan Can
Xiaodan Zhuang
51
0
0
01 Nov 2024
α-TCVAE: On the relationship between Disentanglement and Diversity
Cristian Meo
Louis Mahon
Anirudh Goyal
Justin Dauwels
DRL
67
8
0
01 Nov 2024
Diffusion-nested Auto-Regressive Synthesis of Heterogeneous Tabular Data
Hengrui Zhang
Liancheng Fang
Qitian Wu
Philip S. Yu
DiffM
LMTD
39
1
0
28 Oct 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
31
10
0
28 Oct 2024
FACTS: A Factored State-Space Framework For World Modelling
Li Nanbo
Firas Laakom
Yucheng Xu
Wenyi Wang
Jürgen Schmidhuber
AI4TS
220
0
0
28 Oct 2024
Equivariant Blurring Diffusion for Hierarchical Molecular Conformer Generation
Jiwoong Park
Yang Shen
DiffM
42
0
0
26 Oct 2024
Image Generation from Image Captioning -- Invertible Approach
Nandakishore S Menon
Chandramouli Kamanchi
Raghuram Bharadwaj Diddigi
14
0
0
26 Oct 2024
Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models
Liulei Li
Wenguan Wang
Yuqing Yang
42
7
0
26 Oct 2024
Augmenting Training Data with Vector-Quantized Variational Autoencoder for Classifying RF Signals
Srihari Kamesh Kompella
Kemal Davaslioglu
Y. Sagduyu
Sastry Kompella
21
1
0
23 Oct 2024
MotionGlot: A Multi-Embodied Motion Generation Model
Sudarshan Harithas
Srinath Sridhar
82
1
0
22 Oct 2024
Conjuring Semantic Similarity
Tian Yu Liu
Stefano Soatto
DiffM
32
0
0
21 Oct 2024
Elucidating the design space of language models for image generation
Xuantong Liu
Shaozhe Hao
Xianbiao Qi
Tianyang Hu
Jun Wang
Rong Xiao
Yuan Yao
VLM
37
3
0
21 Oct 2024
SeisLM: a Foundation Model for Seismic Waveforms
Tianlin Liu
Jannes Münchmeyer
Laura Laurenti
C. Marone
Maarten V. de Hoop
Ivan Dokmanić
VLM
28
4
0
21 Oct 2024
BrainECHO: Semantic Brain Signal Decoding through Vector-Quantized Spectrogram Reconstruction for Whisper-Enhanced Text Generation
Juntao Li
Zhenxi Song
Jiaqi Wang
Min Zhang
Honghai Liu
Min Zhang
Zhiguo Zhang
38
1
0
19 Oct 2024
SNAC: Multi-Scale Neural Audio Codec
Hubert Siuzdak
Florian Grötschla
Luca A. Lanzendörfer
27
10
0
18 Oct 2024
Improving Vector-Quantized Image Modeling with Latent Consistency-Matching Diffusion
Bac Nguyen
Chieh-Hsin Lai
Yuhta Takida
Naoki Murata
Toshimitsu Uesaka
Stefano Ermon
Yuki Mitsufuji
66
0
0
18 Oct 2024
A Complexity-Based Theory of Compositionality
Eric Elmoznino
Thomas Jiralerspong
Yoshua Bengio
Guillaume Lajoie
CoGe
64
5
0
18 Oct 2024
MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations
Liang Xu
Shaoyang Hua
Zili Lin
Yifan Liu
Feipeng Ma
Yichao Yan
Xin Jin
Xiaokang Yang
Wenjun Zeng
VGen
39
3
0
17 Oct 2024
Unlocking the Capabilities of Masked Generative Models for Image Synthesis via Self-Guidance
Jiwan Hur
Dong-Jae Lee
Gyojin Han
Jaehyun Choi
Yunho Jeon
Junmo Kim
DiffM
35
0
0
17 Oct 2024
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
Yongxin Zhu
B. Li
Hang Zhang
Xin Li
Linli Xu
Lidong Bing
DiffM
42
9
0
16 Oct 2024
Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling
Wenze Liu
Le Zhuo
Yi Xin
Sheng Xia
Peng Gao
Xiangyu Yue
42
6
0
14 Oct 2024
Gaussian Mixture Vector Quantization with Aggregated Categorical Posterior
Mingyuan Yan
Jiawei Wu
Rushi Shah
Dianbo Liu
28
0
0
14 Oct 2024
AuthFace: Towards Authentic Blind Face Restoration with Face-oriented Generative Diffusion Prior
Guoqiang Liang
Qingnan Fan
Bingtao Fu
Jinwei Chen
Hong Gu
Lin Wang
DiffM
34
0
0
13 Oct 2024
Towards Homogeneous Lexical Tone Decoding from Heterogeneous Intracranial Recordings
Di Wu
Siyuan Li
Chen Feng
Lu Cao
Yujie Zhang
Jie Yang
Mohamad Sawan
33
0
0
13 Oct 2024