1906.00446
Cited By
Generating Diverse High-Fidelity Images with VQ-VAE-2
2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
DRL
BDL
Papers citing
"Generating Diverse High-Fidelity Images with VQ-VAE-2"
50 / 1,128 papers shown
Title
Randomized Autoregressive Visual Generation
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VGen
DiffM
151
40
1
01 Nov 2024
Optimizing Contextual Speech Recognition Using Vector Quantization for Efficient Retrieval
Nikolaos Flemotomos
Roger Hsiao
P. Swietojanski
Takaaki Hori
Dogan Can
Xiaodan Zhuang
143
1
0
01 Nov 2024
α-TCVAE: On the relationship between Disentanglement and Diversity
Cristian Meo
Louis Mahon
Anirudh Goyal
Justin Dauwels
DRL
159
8
0
01 Nov 2024
Diffusion-nested Auto-Regressive Synthesis of Heterogeneous Tabular Data
Hengrui Zhang
Liancheng Fang
Qitian Wu
Philip S. Yu
DiffM
LMTD
80
3
0
28 Oct 2024
FACTS: A Factored State-Space Framework For World Modelling
Li Nanbo
Firas Laakom
Yucheng Xu
Wenyi Wang
Jürgen Schmidhuber
AI4TS
538
1
0
28 Oct 2024
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Hanyu Wang
Saksham Suri
Yixuan Ren
Hao Chen
Abhinav Shrivastava
VGen
119
12
0
28 Oct 2024
Equivariant Blurring Diffusion for Hierarchical Molecular Conformer Generation
Jiwoong Park
Yang Shen
DiffM
112
1
0
26 Oct 2024
Image Generation from Image Captioning -- Invertible Approach
Nandakishore S Menon
Chandramouli Kamanchi
Raghuram Bharadwaj Diddigi
19
0
0
26 Oct 2024
Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models
Liulei Li
Wenguan Wang
Yue Yang
108
8
0
26 Oct 2024
Augmenting Training Data with Vector-Quantized Variational Autoencoder for Classifying RF Signals
Srihari Kamesh Kompella
Kemal Davaslioglu
Y. Sagduyu
Sastry Kompella
47
1
0
23 Oct 2024
MotionGlot: A Multi-Embodied Motion Generation Model
Sudarshan Harithas
Srinath Sridhar
187
2
0
22 Oct 2024
Conjuring Semantic Similarity
Tian Yu Liu
Stefano Soatto
DiffM
202
0
0
21 Oct 2024
Elucidating the design space of language models for image generation
Xuantong Liu
Shaozhe Hao
Xianbiao Qi
Tianyang Hu
Jun Wang
Rong Xiao
Yuan Yao
VLM
90
3
0
21 Oct 2024
SeisLM: a Foundation Model for Seismic Waveforms
Tianlin Liu
Jannes Münchmeyer
Laura Laurenti
C. Marone
Maarten V. de Hoop
Ivan Dokmanić
VLM
131
6
0
21 Oct 2024
BrainECHO: Semantic Brain Signal Decoding through Vector-Quantized Spectrogram Reconstruction for Whisper-Enhanced Text Generation
Jilong Li
Zhenxi Song
Jiaqi Wang
Meishan Zhang
Honghai Liu
Min Zhang
Zhiguo Zhang
119
2
0
19 Oct 2024
SNAC: Multi-Scale Neural Audio Codec
Hubert Siuzdak
Florian Grötschla
Luca A. Lanzendörfer
58
19
0
18 Oct 2024
Improving Vector-Quantized Image Modeling with Latent Consistency-Matching Diffusion
Bac Nguyen
Chieh-Hsin Lai
Yuhta Takida
Naoki Murata
Toshimitsu Uesaka
Stefano Ermon
Yuki Mitsufuji
123
0
0
18 Oct 2024
A Complexity-Based Theory of Compositionality
Eric Elmoznino
Thomas Jiralerspong
Yoshua Bengio
Guillaume Lajoie
CoGe
167
10
0
18 Oct 2024
MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations
Liang Xu
Shaoyang Hua
Zili Lin
Yifan Liu
Feipeng Ma
Yichao Yan
Xin Jin
Xiaokang Yang
Wenjun Zeng
VGen
111
4
0
17 Oct 2024
Unlocking the Capabilities of Masked Generative Models for Image Synthesis via Self-Guidance
Jiwan Hur
Dong-Jae Lee
Gyojin Han
Jaehyun Choi
Yunho Jeon
Junmo Kim
DiffM
116
0
0
17 Oct 2024
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
Yongxin Zhu
Bing Li
Hang Zhang
Xin Li
Linli Xu
Lidong Bing
DiffM
120
9
0
16 Oct 2024
Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling
Wenze Liu
Le Zhuo
Yi Xin
Sheng Xia
Peng Gao
Xiangyu Yue
137
9
0
14 Oct 2024
Gaussian Mixture Vector Quantization with Aggregated Categorical Posterior
Mingyuan Yan
Jiawei Wu
Rushi Shah
Dianbo Liu
60
0
0
14 Oct 2024
AuthFace: Towards Authentic Blind Face Restoration with Face-oriented Generative Diffusion Prior
Guoqiang Liang
Qingnan Fan
Bingtao Fu
Jinwei Chen
Hong Gu
Lin Wang
DiffM
75
1
0
13 Oct 2024
Towards Homogeneous Lexical Tone Decoding from Heterogeneous Intracranial Recordings
Di Wu
Siyuan Li
Chen Feng
Lu Cao
Yize Zhang
Jie Yang
Mohamad Sawan
102
1
0
13 Oct 2024
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation
Jiahao Cui
Hui Li
Yao Yao
Hao Zhu
Hanlin Shang
Kaihui Cheng
Hang Zhou
Siyu Zhu
Jingdong Wang
DiffM
VGen
110
29
0
10 Oct 2024
Improved Variational Inference in Discrete VAEs using Error Correcting Codes
María Martínez-García
Grace Villacrés
David Mitchell
Pablo M. Olmos
DRL
112
0
0
10 Oct 2024
ElasticTok: Adaptive Tokenization for Image and Video
Wilson Yan
Matei A. Zaharia
Volodymyr Mnih
Pieter Abbeel
Aleksandra Faust
Hao Liu
VGen
123
11
0
10 Oct 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
Rui Zhao
Hangjie Yuan
Yujie Wei
Shiwei Zhang
Yuchao Gu
...
Xiang Wang
Zhangjie Wu
Junhao Zhang
Yingya Zhang
Mike Zheng Shou
DiffM
VLM
124
4
0
09 Oct 2024
LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning
Zhe Li
Weihao Yuan
Yisheng He
Lingteng Qiu
Shenhao Zhu
Xiaodong Gu
Weichao Shen
Yuan Dong
Zilong Dong
Laurence T. Yang
109
10
0
09 Oct 2024
Geometric Representation Condition Improves Equivariant Molecule Generation
Zian Li
Cai Zhou
Xiyuan Wang
Xingang Peng
Muhan Zhang
132
2
0
04 Oct 2024
SGW-based Multi-Task Learning in Vision Tasks
Ruiyuan Zhang
Yuyao Chen
Yuchi Huo
Jiaxiang Liu
Dianbing Xi
Jie Liu
Chao Wu
88
1
0
03 Oct 2024
CaLMFlow: Volterra Flow Matching using Causal Language Models
Shiyang Zhang
Daniel Levine
Ivan Vrkic
Marco Francesco Bressana
David Zhang
S. Rizvi
Yangtian Zhang
E. Zappala
David van Dijk
64
1
0
03 Oct 2024
From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities
Wanpeng Zhang
Zilong Xie
Yicheng Feng
Yijiang Li
Xingrun Xing
Sipeng Zheng
Zongqing Lu
MLLM
126
1
0
03 Oct 2024
ImageFolder: Autoregressive Image Generation with Folded Tokens
Xiang Li
Kai Qiu
Hao Chen
Jason Kuen
Jiuxiang Gu
Bhiksha Raj
Zhe Lin
VLM
119
30
0
02 Oct 2024
Denoising with a Joint-Embedding Predictive Architecture
Dengsheng Chen
Jie Hu
Xiaoming Wei
Enhua Wu
DiffM
177
3
0
02 Oct 2024
Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces
Lilac Atassi
112
0
0
01 Oct 2024
Diverse Code Query Learning for Speech-Driven Facial Animation
Chunzhi Gu
Shigeru Kuriyama
Katsuya Hotta
DiffM
68
0
0
27 Sep 2024
Rejection Sampling IMLE: Designing Priors for Better Few-Shot Image Synthesis
Chirag Vashist
Shichong Peng
Ke Li
DiffM
98
1
0
26 Sep 2024
Exploring Semantic Clustering in Deep Reinforcement Learning for Video Games
Liang Zhang
Justin Lieffers
A. Pyarelal
122
0
0
25 Sep 2024
Single Image, Any Face: Generalisable 3D Face Generation
Wenqing Wang
Haosen Yang
Josef Kittler
Xiatian Zhu
3DH
161
0
0
25 Sep 2024
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
Yu Zhang
Ziyue Jiang
Ruiqi Li
Changhao Pan
Jinzheng He
Rongjie Huang
Chuxin Wang
Zhou Zhao
DiffM
VLM
192
8
0
24 Sep 2024
Multi-modal Generative AI: Multi-modal LLMs, Diffusions and the Unification
X. Wang
Yuwei Zhou
Bin Huang
Hong Chen
Wenwu Zhu
DiffM
177
9
0
23 Sep 2024
LASERS: LAtent Space Encoding for Representations with Sparsity for Generative Modeling
Xin Li
Anand Sarwate
58
0
0
16 Sep 2024
Beta-Sigma VAE: Separating beta and decoder variance in Gaussian variational autoencoder
Seunghwan Kim
Seungkyu Lee
DRL
72
0
0
14 Sep 2024
Detect Fake with Fake: Leveraging Synthetic Data-driven Representation for Synthetic Image Detection
Hina Otake
Yoshihiro Fukuhara
Yoshiki Kubotani
Shigeo Morishima
ViT
95
0
0
13 Sep 2024
GenCAD: Image-Conditioned Computer-Aided Design Generation with Transformer-Based Contrastive Representation and Diffusion Priors
Md Ferdous Alam
Faez Ahmed
DiffM
126
9
0
08 Sep 2024
Blended Latent Diffusion under Attention Control for Real-World Video Editing
Deyin Liu
Lin Yuanbo Wu
Xianghua Xie
DiffM
66
0
0
05 Sep 2024
VQ-Flow: Taming Normalizing Flows for Multi-Class Anomaly Detection via Hierarchical Vector Quantization
Yixuan Zhou
Xing Xu
Zhe Sun
Jingkuan Song
A. Cichocki
Heng Tao Shen
139
1
0
02 Sep 2024
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni
Yulin Wang
Renping Zhou
Rui Lu
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Yuan Yao
Gao Huang
113
8
0
31 Aug 2024