arXiv:1906.00446
Generating Diverse High-Fidelity Images with VQ-VAE-2
2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
DRL
BDL
Papers citing
"Generating Diverse High-Fidelity Images with VQ-VAE-2"
50 / 1,107 papers shown
Title
Balance of Number of Embedding and their Dimensions in Vector Quantization
Hang Chen
Sankepally Sainath Reddy
Ziwei Chen
Dianbo Liu
51
1
0
06 Jul 2024
NEBULA: Neural Empirical Bayes Under Latent Representations for Efficient and Controllable Design of Molecular Libraries
E. Nowara
Pedro H. O. Pinheiro
Sai Pooja Mahajan
Omar Mahmood
Andrew Watkins
Saeed Saremi
Michael R. Maser
BDL
DiffM
49
2
0
03 Jul 2024
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis
Dewei Zhou
Yuchen Li
Fan Ma
Zongxin Yang
Yue Yang
104
11
0
02 Jul 2024
Grouped Discrete Representation Guides Object-Centric Learning
Rongzhen Zhao
V. Wang
Arno Solin
Joni Pajarinen
OCL
41
2
0
01 Jul 2024
Deep learning for automated detection of breast cancer in deep ultraviolet fluorescence images with diffusion probabilistic model
Sepehr Salem Ghahfarokhi
Tyrell To
Julie M Jorns
T. Yen
Bing Yu
Dong Hye Ye
DiffM
MedIm
23
2
0
01 Jul 2024
Efficient World Models with Context-Aware Tokenization
Vincent Micheli
Eloi Alonso
François Fleuret
OffRL
VLM
34
6
0
27 Jun 2024
MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data
William Berman
A. Peysakhovich
39
4
0
26 Jun 2024
LatentExplainer: Explaining Latent Representations in Deep Generative Models with Multimodal Large Language Models
Mengdan Zhu
Raasikh Kanjiani
Jiahui Lu
Andrew Choi
Qirui Ye
Liang Zhao
DiffM
44
1
0
21 Jun 2024
StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images
Rushikesh Zawar
Shaurya Dewan
Andrew F. Luo
Margaret M. Henderson
Michael J. Tarr
Leila Wehbe
VGen
CoGe
46
1
0
19 Jun 2024
Autoregressive Image Generation without Vector Quantization
Tianhong Li
Yonglong Tian
He Li
Mingyang Deng
Kaiming He
DiffM
62
184
0
17 Jun 2024
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%
Lei Zhu
Fangyun Wei
Yanye Lu
Dong Chen
VLM
48
34
0
17 Jun 2024
Discriminative Hamiltonian Variational Autoencoder for Accurate Tumor Segmentation in Data-Scarce Regimes
Aghiles Kebaili
J. Lapuyade-Lahorgue
Pierre Vera
S. Ruan
MedIm
42
0
0
17 Jun 2024
STEVE Series: Step-by-Step Construction of Agent Systems in Minecraft
Zhonghan Zhao
Wenhao Chai
Xuan Wang
Ke Ma
Kewei Chen
Dongxu Guo
Tian Ye
Yanting Zhang
Hongwei Wang
Gaoang Wang
LLMAG
LM&Ro
39
6
0
17 Jun 2024
Graph Knowledge Distillation to Mixture of Experts
Pavel Rumiantsev
Mark Coates
28
0
0
17 Jun 2024
ControlVAR: Exploring Controllable Visual Autoregressive Modeling
Xiang Li
Kai Qiu
Hao Chen
Jason Kuen
Zhe-nan Lin
Rita Singh
Bhiksha Raj
DiffM
43
21
0
14 Jun 2024
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models
Qihao Liu
Zhanpeng Zeng
Ju He
Qihang Yu
Xiaohui Shen
Liang-Chieh Chen
53
21
0
13 Jun 2024
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Junke Wang
Yi-Xin Jiang
Zehuan Yuan
Binyue Peng
Zuxuan Wu
Yu-Gang Jiang
ViT
VGen
80
38
0
13 Jun 2024
The Significance of Latent Data Divergence in Predicting System Degradation
Miguel Fernandes
Catarina Silva
Alberto Cardoso
Bernardete Ribeiro
26
0
0
13 Jun 2024
Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation
Raphael Tang
Xinyu Crystina Zhang
Lixinyu Xu
Yao Lu
Wenyan Li
Pontus Stenetorp
Jimmy Lin
Ferhan Ture
39
0
0
12 Jun 2024
Diffusion-Promoted HDR Video Reconstruction
Yuanshen Guan
Ruikang Xu
Mingde Yao
Ruisheng Gao
Lizhi Wang
Zhiwei Xiong
48
2
0
12 Jun 2024
To be Continuous, or to be Discrete, Those are Bits of Questions
Yiran Wang
Masao Utiyama
53
2
0
12 Jun 2024
An Image is Worth 32 Tokens for Reconstruction and Generation
Qihang Yu
Mark Weber
XueQing Deng
Xiaohui Shen
Daniel Cremers
Liang-Chieh Chen
VLM
ViT
60
85
0
11 Jun 2024
T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text
Aoxiong Yin
Haoyuan Li
Kai Shen
Siliang Tang
Yueting Zhuang
SLR
58
2
0
11 Jun 2024
Towards Realistic Data Generation for Real-World Super-Resolution
Long Peng
Wenbo Li
Renjing Pei
Jingjing Ren
Xueyang Fu
Yang Wang
Yang Cao
Zheng-Jun Zha
40
17
0
11 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
68
230
0
10 Jun 2024
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Shiji Song
Yuan Yao
Gao Huang
37
14
0
08 Jun 2024
GenzIQA: Generalized Image Quality Assessment using Prompt-Guided Latent Diffusion Models
Diptanu De
Shankhanil Mitra
R. Soundararajan
47
2
0
07 Jun 2024
Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis
Marianna Ohanyan
Hayk Manukyan
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
DiffM
61
1
0
06 Jun 2024
Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning
Inwoo Hwang
Yunhyeok Kwak
Suhyung Choi
Byoung-Tak Zhang
Sanghack Lee
45
1
0
05 Jun 2024
VQUNet: Vector Quantization U-Net for Defending Adversarial Atacks by Regularizing Unwanted Noise
Zhixun He
Mukesh Singhal
33
1
0
05 Jun 2024
Phy-Diff: Physics-guided Hourglass Diffusion Model for Diffusion MRI Synthesis
Juanhua Zhang
Ruodan Yan
Alessandro Perelli
Xi Chen
Chao Li
MedIm
DiffM
58
5
0
05 Jun 2024
Tiny models from tiny data: Textual and null-text inversion for few-shot distillation
Erik Landolsi
Fredrik Kahl
DiffM
58
1
0
05 Jun 2024
Inpainting Pathology in Lumbar Spine MRI with Latent Diffusion
Colin Hansen
Simas Glinskis
Ashwin Raju
Micha Kornreich
JinHyeong Park
Jayashri Pawar
Richard Herzog
Li Zhang
Benjamin Odry
MedIm
DiffM
67
3
0
04 Jun 2024
CoNav: A Benchmark for Human-Centered Collaborative Navigation
Changhao Li
Xinyu Sun
Peihao Chen
Jugang Fan
Zixu Wang
Yanxia Liu
Jinhui Zhu
Chuang Gan
Mingkui Tan
58
1
0
04 Jun 2024
Edit Distance Robust Watermarks for Language Models
Noah Golowich
Ankur Moitra
AAML
WaLM
44
5
0
04 Jun 2024
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
Kengo Uchida
Takashi Shibuya
Yuhta Takida
Naoki Murata
Shusuke Takahashi
Yuki Mitsufuji
VGen
57
5
0
04 Jun 2024
Trajectory Forecasting through Low-Rank Adaptation of Discrete Latent Codes
Riccardo Benaglia
Angelo Porrello
Pietro Buzzega
Simone Calderara
Rita Cucchiara
20
0
0
31 May 2024
RapVerse: Coherent Vocals and Whole-Body Motions Generations from Text
Jiaben Chen
Xin Yan
Yihang Chen
Siyuan Cen
Qinwei Ma
Haoyu Zhen
Kaizhi Qian
Lie Lu
Chuang Gan
38
0
0
30 May 2024
Predicting Long-Term Human Behaviors in Discrete Representations via Physics-Guided Diffusion
Zhitian Zhang
Anjian Li
Angelica Lim
Mo Chen
41
3
0
29 May 2024
Self-Supervised Learning Based Handwriting Verification
Mihir Chauhan
Mohammad Abuzar Shaikh
Abhishek Satbhai
Mir Basheer Ali
B. Ramamurthy
Mingchen Gao
Siwei Lyu
Sargur Srihari
24
2
0
28 May 2024
BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics
Hao Wu
Xingjian Shi
Ziyue Huang
Penghao Zhao
Wei Xiong
Jinbao Xue
Yangyu Tao
Xiaomeng Huang
Weiyan Wang
AI4TS
63
1
0
27 May 2024
$\text{Di}^2\text{Pose}$: Discrete Diffusion Model for Occluded 3D Human Pose Estimation
Weiquan Wang
Jun Xiao
Chunping Wang
Wei Liu
Zhao Wang
Long Chen
DiffM
36
1
0
27 May 2024
Diffusion Bridge AutoEncoders for Unsupervised Representation Learning
Yeongmin Kim
Kwanghyeon Lee
Minsang Park
Byeonghu Na
Il-Chul Moon
DiffM
44
2
0
27 May 2024
Variational Offline Multi-agent Skill Discovery
Jiayu Chen
Bhargav Ganguly
Tian-Shing Lan
OffRL
69
3
0
26 May 2024
Hierarchical Uncertainty Exploration via Feedforward Posterior Trees
E. Nehme
Rotem Mulayoff
T. Michaeli
UQCV
50
2
0
24 May 2024
ParamReL: Learning Parameter Space Representation via Progressively Encoding Bayesian Flow Networks
Zhangkai Wu
Xuhui Fan
Jin Li
Zhilin Zhao
Hui Chen
LongBing Cao
57
2
0
24 May 2024
A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation
Gwanghyun Kim
Alonso Martinez
Yu-Chuan Su
Brendan Jou
José Lezama
...
Lijun Yu
Lu Jiang
A. Jansen
Jacob Walker
Krishna Somandepalli
32
8
0
22 May 2024
Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models
Xiyu Wang
Yufei Wang
Satoshi Tsutsui
Weisi Lin
Bihan Wen
Alex C. Kot
50
4
0
20 May 2024
Du-IN: Discrete units-guided mask modeling for decoding speech from Intracranial Neural signals
Hui Zheng
Haiteng Wang
Wei-Bang Jiang
Zhongtao Chen
Li He
Pei-Yang Lin
Peng-Hu Wei
Guo-Guang Zhao
Yun-Zhe Liu
52
1
0
19 May 2024
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Siyuan Li
Zedong Wang
Zicheng Liu
Di Wu
Cheng Tan
Jiangbin Zheng
Yufei Huang
Stan Z. Li
45
7
0
13 May 2024