1906.00446
Cited By
Generating Diverse High-Fidelity Images with VQ-VAE-2
2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
DRL
BDL
Papers citing
"Generating Diverse High-Fidelity Images with VQ-VAE-2"
50 / 1,155 papers shown
Title
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
301
17
0
19 Dec 2024
E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling
Zhihang Yuan
Yuzhang Shang
Hao Zhang
Tongcheng Fang
Rui Xie
Bingxin Xu
Yan Yan
Shengen Yan
Guohao Dai
Yu Wang
DiffM
222
3
0
18 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
Hong Chen
Zihan Wang
Xianrui Li
Xingwu Sun
Fangyi Chen
Jiang Liu
Jiadong Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
364
17
0
14 Dec 2024
A Decade of Deep Learning: A Survey on The Magnificent Seven
Dilshod Azizov
Muhammad Arslan Manzoor
Velibor Bojkovic
Yingxu Wang
Peng Wang
...
Liang Li
Siwei Liu
Yu Zhong
Wei Liu
Shangsong Liang
OOD
AI4TS
MedIm
208
0
0
13 Dec 2024
OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs
Yuanzhi Zhu
R. Wang
Shilin Lu
Junnan Li
Hanshu Yan
Peng Sun
SupR
242
8
0
12 Dec 2024
Unsupervised Cross-Domain Regression for Fine-grained 3D Game Character Reconstruction
Qi Wen
Xiang Wen
Hao Jiang
Siqi Yang
Bingfeng Han
Tianlei Hu
Gang Chen
Shuang Li
3DH
145
0
0
11 Dec 2024
CoMA: Compositional Human Motion Generation with Multi-modal Agents
Shanlin Sun
Gabriel De Araujo
Jiaqi Xu
S. Kevin Zhou
Hanwen Zhang
Ziheng Huang
Chenyu You
Xiaohui Xie
215
5
0
10 Dec 2024
Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph based Question-Answering Agent
Ziyuan Qin
D. Cheng
Haoyu Wang
Huahui Yi
Yuting Shao
Zhiyuan Fan
Kang Li
Qicheng Lao
EGVM
MLLM
582
2
0
07 Dec 2024
LMDM: Latent Molecular Diffusion Model For 3D Molecule Generation
Xiang Chen
DiffM
155
0
0
05 Dec 2024
XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation
Xianrui Li
Kai Qiu
Hong Chen
Jason Kuen
Jiuxiang Gu
Jiadong Wang
Zhe Lin
Bhiksha Raj
VLM
257
10
0
02 Dec 2024
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads
Siqi Kou
Jiachun Jin
Chang Liu
Ye Ma
Jian Jia
Quan Chen
Peng Jiang
Zhijie Deng
DiffM
VGen
VLM
334
16
0
28 Nov 2024
3D-WAG: Hierarchical Wavelet-Guided Autoregressive Generation for High-Fidelity 3D Shapes
Tejaswini Medi
Arianna Rampini
Pradyumna Reddy
P. Jayaraman
Margret Keuper
DiffM
238
1
0
28 Nov 2024
Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling
J. Hyung
Kinam Kim
Susung Hong
M. Kim
Jaegul Choo
VGen
186
6
0
27 Nov 2024
Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency
Y. Wang
Jiajie Teng
Jiajiong Cao
Yuming Li
Chenguang Ma
Hongteng Xu
Dixin Luo
VGen
DiffM
156
2
0
25 Nov 2024
VQalAttent: a Transparent Speech Generation Pipeline based on Transformer-learned VQ-VAE Latent Space
Armani Rodriguez
S. Kokalj-Filipovic
153
1
0
22 Nov 2024
CDI: Copyrighted Data Identification in Diffusion Models
Jan Dubiński
Antoni Kowalczuk
Franziska Boenisch
Adam Dziedzic
188
3
0
19 Nov 2024
Multidimensional Byte Pair Encoding: Shortened Sequences for Improved Visual Data Generation
Tim Elsner
Paula Usinger
Julius Nehring-Wirxel
Gregor Kobsik
Victor Czech
Yanjiang He
I. Lim
Leif Kobbelt
119
1
0
15 Nov 2024
Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings
Aditya Sanghi
Aliasghar Khani
Pradyumna Reddy
Arianna Rampini
Derek Cheung
Kamal Rahimi Malekshan
Kanika Madan
Hooman Shayani
147
5
0
12 Nov 2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Yizeng Han
Jiayi Guo
Zhiyuan Liu
Yuan Yao
Gao Huang
131
8
0
11 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
245
23
0
08 Nov 2024
Analyzing The Language of Visual Tokens
David M. Chan
Rodolfo Corona
J. S. Park
Cheol Jun Cho
Yutong Bai
Trevor Darrell
61
7
0
07 Nov 2024
Image Understanding Makes for A Good Tokenizer for Image Generation
Luting Wang
Yang Zhao
Zijian Zhang
Jiashi Feng
Si Liu
Bingyi Kang
VLM
103
7
0
07 Nov 2024
DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning
G. Zhou
Hengkai Pan
Yann LeCun
Lerrel Pinto
VGen
LM&Ro
OffRL
146
53
0
07 Nov 2024
Training on test proteins improves fitness, structure, and function prediction
Anton Bushuiev
Roman Bushuiev
Nikola Zadorozhny
Raman Samusevich
Hannes Stärk
Jiri Sedlar
Tomáš Pluskal
Josef Sivic
70
0
0
04 Nov 2024
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
Yongxin Zhu
Bing Li
Yifei Xin
Zhihua Xia
Linli Xu
174
25
0
04 Nov 2024
Music Foundation Model as Generic Booster for Music Downstream Tasks
Weihsiang Liao
Yuhta Takida
Yukara Ikemiya
Zhi-Wei Zhong
Chieh-Hsin Lai
...
Stefan Uhlich
Taketo Akama
Woosung Choi
Yuichiro Koyama
Yuki Mitsufuji
324
4
0
02 Nov 2024
Randomized Autoregressive Visual Generation
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VGen
DiffM
172
56
1
01 Nov 2024
Optimizing Contextual Speech Recognition Using Vector Quantization for Efficient Retrieval
Nikolaos Flemotomos
Roger Hsiao
P. Swietojanski
Takaaki Hori
Dogan Can
Xiaodan Zhuang
173
1
0
01 Nov 2024
α-TCVAE: On the relationship between Disentanglement and Diversity
Cristian Meo
Louis Mahon
Anirudh Goyal
Justin Dauwels
DRL
183
8
0
01 Nov 2024
Diffusion-nested Auto-Regressive Synthesis of Heterogeneous Tabular Data
Hengrui Zhang
Liancheng Fang
Qitian Wu
Philip S. Yu
DiffM
LMTD
86
4
0
28 Oct 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
158
17
0
28 Oct 2024
FACTS: A Factored State-Space Framework For World Modelling
Li Nanbo
Firas Laakom
Yucheng Xu
Wenyi Wang
Jürgen Schmidhuber
AI4TS
644
1
0
28 Oct 2024
Equivariant Blurring Diffusion for Hierarchical Molecular Conformer Generation
Jiwoong Park
Yang Shen
DiffM
141
2
0
26 Oct 2024
Image Generation from Image Captioning -- Invertible Approach
Nandakishore S Menon
Chandramouli Kamanchi
Raghuram Bharadwaj Diddigi
19
0
0
26 Oct 2024
Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models
Liulei Li
Wenguan Wang
Yue Yang
134
14
0
26 Oct 2024
Augmenting Training Data with Vector-Quantized Variational Autoencoder for Classifying RF Signals
Srihari Kamesh Kompella
Kemal Davaslioglu
Y. Sagduyu
Sastry Kompella
59
2
0
23 Oct 2024
MotionGlot: A Multi-Embodied Motion Generation Model
Sudarshan Harithas
Srinath Sridhar
226
3
0
22 Oct 2024
Conjuring Semantic Similarity
Tian Yu Liu
Stefano Soatto
DiffM
266
0
0
21 Oct 2024
Elucidating the design space of language models for image generation
Xuantong Liu
Shaozhe Hao
Xianbiao Qi
Tianyang Hu
Jun Wang
Rong Xiao
Yuan Yao
VLM
101
4
0
21 Oct 2024
SeisLM: a Foundation Model for Seismic Waveforms
Tianlin Liu
Jannes Münchmeyer
Laura Laurenti
C. Marone
Maarten V. de Hoop
Ivan Dokmanić
VLM
137
7
0
21 Oct 2024
BrainECHO: Semantic Brain Signal Decoding through Vector-Quantized Spectrogram Reconstruction for Whisper-Enhanced Text Generation
Jilong Li
Zhenxi Song
Jiaqi Wang
Meishan Zhang
Honghai Liu
Min Zhang
Zhiguo Zhang
153
2
0
19 Oct 2024
SNAC: Multi-Scale Neural Audio Codec
Hubert Siuzdak
Florian Grötschla
Luca A. Lanzendörfer
85
28
0
18 Oct 2024
Improving Vector-Quantized Image Modeling with Latent Consistency-Matching Diffusion
Bac Nguyen
Chieh-Hsin Lai
Yuhta Takida
Naoki Murata
Toshimitsu Uesaka
Stefano Ermon
Yuki Mitsufuji
145
1
0
18 Oct 2024
A Complexity-Based Theory of Compositionality
Eric Elmoznino
Thomas Jiralerspong
Yoshua Bengio
Guillaume Lajoie
CoGe
281
15
0
18 Oct 2024
MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations
Liang Xu
Shaoyang Hua
Zili Lin
Yifan Liu
Feipeng Ma
Yichao Yan
Xin Jin
Xiaokang Yang
Wenjun Zeng
VGen
140
11
0
17 Oct 2024
Unlocking the Capabilities of Masked Generative Models for Image Synthesis via Self-Guidance
Jiwan Hur
Dong-Jae Lee
Gyojin Han
Jaehyun Choi
Yunho Jeon
Junmo Kim
DiffM
138
0
0
17 Oct 2024
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
Yongxin Zhu
Bing Li
Hang Zhang
Xin Li
Linli Xu
Lidong Bing
DiffM
148
14
0
16 Oct 2024
From Real Artifacts to Virtual Reference: A Robust Framework for Translating Endoscopic Images
Junyang Wu
F. Xie
Jiayuan Sun
Yun Gu
Guang-Zhong Yang
MedIm
86
0
0
15 Oct 2024
Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling
Wenze Liu
Le Zhuo
Yi Xin
Sheng Xia
Peng Gao
Xiangyu Yue
159
13
0
14 Oct 2024
Gaussian Mixture Vector Quantization with Aggregated Categorical Posterior
Mingyuan Yan
Jiawei Wu
Rushi Shah
Dianbo Liu
68
1
0
14 Oct 2024