Papers
Communities
Organizations
Events
Blog
Pricing
Search
Open menu
Home
Papers
1906.00446
Cited By
Generating Diverse High-Fidelity Images with VQ-VAE-2
2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
DRL
BDL
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Generating Diverse High-Fidelity Images with VQ-VAE-2"
50 / 1,128 papers shown
Title
DepthART: Monocular Depth Estimation as Autoregressive Refinement Task
Bulat Gabdullin
Nina Konovalova
Nikolay Patakin
Dmitry Senushkin
Anton Konushin
MDE
92
1
0
01 Jul 2025
Watermarking Autoregressive Image Generation
Nikola Jovanović
Ismail Labiad
Tomáš Souček
Martin Vechev
Pierre Fernandez
WIGM
50
0
0
19 Jun 2025
Enhancing Vector Quantization with Distributional Matching: A Theoretical and Empirical Study
Xianghong Fang
Litao Guo
Hengchao Chen
Yuxuan Zhang
XiaofanXia
...
Yexin Liu
Hao Wang
Harry Yang
Yuan Yuan
Qiang Sun
MQ
39
0
0
18 Jun 2025
Discrete JEPA: Learning Discrete Token Representations without Reconstruction
Junyeob Baek
Hosung Lee
Christopher Hoang
Mengye Ren
Sungjin Ahn
35
0
0
17 Jun 2025
ViSAGe: Video-to-Spatial Audio Generation
Jaeyeon Kim
Heeseung Yun
Gunhee Kim
VGen
46
2
0
13 Jun 2025
Dynamic Sparse Training of Diagonally Sparse Networks
Abhishek Tyagi
Arjun Iyer
William H Renninger
Christopher Kanan
Yuhao Zhu
17
0
0
13 Jun 2025
SpectralAR: Spectral Autoregressive Visual Generation
Yuanhui Huang
Weiliang Chen
Wenzhao Zheng
Yueqi Duan
Jie Zhou
Jiwen Lu
DiffM
VGen
134
0
0
12 Jun 2025
DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning
Dongxu Liu
Yuang Peng
Haomiao Tang
Yuwei Chen
Chunrui Han
Zheng Ge
Daxin Jiang
Mingxue Liao
DiffM
86
0
0
11 Jun 2025
VIVAT: Virtuous Improving VAE Training through Artifact Mitigation
Lev Novitskiy
Viacheslav Vasilev
Maria Kovaleva
V. Arkhipkin
Denis Dimitrov
VGen
23
0
0
09 Jun 2025
Highly Compressed Tokenizer Can Generate Without Training
Lukas Lao Beyer
T. Li
X. Chen
S. Karaman
K. He
DiffM
VLM
31
0
0
09 Jun 2025
LaTtE-Flow: Layerwise Timestep-Expert Flow-based Transformer
Ying Shen
Zhiyang Xu
Jiuhai Chen
Shizhe Diao
Jiaxin Zhang
Yuguang Yao
Joy Rimchala
Ismini Lourentzou
Lifu Huang
OffRL
37
0
0
08 Jun 2025
GGBall: Graph Generative Model on Poincaré Ball
Tianci Bu
Chuanrui Wang
Hao Ma
Haoren Zheng
Xin Lu
Tailin Wu
41
0
0
08 Jun 2025
Continuous Semi-Implicit Models
L. Yu
Jiajun Zha
Tong Yang
Tianyu Xie
Xiangyu Zhang
S.-H. Gary Chan
Cheng Zhang
DiffM
22
0
0
07 Jun 2025
HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
Hermann Kumbong
Xian Liu
Tsung-Yi Lin
Ming-Yu Liu
Xihui Liu
Ziwei Liu
Daniel Y. Fu
Christopher Ré
David W. Romero
DiffM
57
0
0
04 Jun 2025
TaxaDiffusion: Progressively Trained Diffusion Model for Fine-Grained Species Generation
Amin Karimi Monsefi
Mridul Khurana
R. Ramnath
Anuj Karpatne
Wei-Lun Chao
Cheng Zhang
79
0
0
02 Jun 2025
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
78
0
0
02 Jun 2025
Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues
Youngmin Kim
Jiwan Chung
Jisoo Kim
Sunghyun Lee
Sangkyu Lee
Junhyeok Kim
Cheoljong Yang
Youngjae Yu
VGen
46
0
0
01 Jun 2025
On Designing Diffusion Autoencoders for Efficient Generation and Representation Learning
Magdalena Proszewska
Nikolay Malkin
N. Siddharth
DiffM
46
0
0
30 May 2025
CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects
Huaijin Pi
Zhi Cen
Zhiyang Dou
Taku Komura
DiffM
77
0
0
27 May 2025
LlamaSeg: Image Segmentation via Autoregressive Mask Generation
Jiru Deng
Tengjin Weng
Tianyu Yang
Wenhan Luo
Zhiheng Li
Wenhao Jiang
VLM
169
0
0
26 May 2025
DiSA: Diffusion Step Annealing in Autoregressive Image Generation
Qinyu Zhao
Jaskirat Singh
Ming Xu
Akshay Asthana
Stephen Gould
Liang Zheng
DiffM
83
0
0
26 May 2025
Plug-and-Play Context Feature Reuse for Efficient Masked Generation
Xuejie Liu
Anji Liu
Guy Van den Broeck
Yitao Liang
67
0
0
25 May 2025
Imagine Beyond! Distributionally Robust Auto-Encoding for State Space Coverage in Online Reinforcement Learning
Nicolas Castanet
Olivier Sigaud
Sylvain Lamprier
OffRL
122
0
0
23 May 2025
FPQVAR: Floating Point Quantization for Visual Autoregressive Model with FPGA Hardware Co-design
Renjie Wei
Songqiang Xu
Qingyu Guo
Meng Li
MQ
91
0
0
22 May 2025
MARché: Fast Masked Autoregressive Image Generation with Cache-Aware Attention
Chaoyi Jiang
Sungwoo Kim
Lei Gao
Hossein Entezari Zarch
Won Woo Ro
Murali Annavaram
34
0
0
22 May 2025
Neighbour-Driven Gaussian Process Variational Autoencoders for Scalable Structured Latent Modelling
Xinxing Shi
Xiaoyu Jiang
Mauricio A. Álvarez
BDL
122
0
0
22 May 2025
MVAR: Visual Autoregressive Modeling with Scale and Spatial Markovian Conditioning
Jinhua Zhang
Wei Long
Minghao Han
Weiyi You
Shuhang Gu
BDL
89
0
0
19 May 2025
Context-Aware Autoregressive Models for Multi-Conditional Image Generation
Yixiao Chen
Zhiyuan Ma
Guoli Jia
Che Jiang
Jianjun Li
Bowen Zhou
DiffM
79
0
0
18 May 2025
MoCLIP: Motion-Aware Fine-Tuning and Distillation of CLIP for Human Motion Generation
Gabriel Maldonado
Armin Danesh Pazho
Ghazal Alinezhad Noghre
Vinit Katariya
Hamed Tabkhi
CLIP
VGen
128
0
0
16 May 2025
An Introduction to Discrete Variational Autoencoders
Alan Jeffares
Liyuan Liu
DRL
BDL
CML
71
0
0
15 May 2025
Text-driven Motion Generation: Overview, Challenges and Directions
Ali Rida Sahili
Najett Neji
Hedi Tabia
VGen
81
0
0
14 May 2025
Continuous Visual Autoregressive Generation via Score Maximization
Chenze Shao
Fandong Meng
Jie Zhou
DiffM
66
1
0
12 May 2025
H
3
^3
3
DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning
Yiyang Lu
Yufeng Tian
Zhecheng Yuan
Xinyu Wang
Pu Hua
Zhengrong Xue
Huazhe Xu
111
1
0
12 May 2025
Prompt to Polyp: Medical Text-Conditioned Image Synthesis with Diffusion Models
Mikhail Chaichuk
Sushant Gautam
Steven A. Hicks
Elena Tutubalina
DiffM
MedIm
130
0
0
08 May 2025
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Haokun Lin
Teng Wang
Yixiao Ge
Yuying Ge
Zhichao Lu
Ying Wei
Qingfu Zhang
Zhenan Sun
Ying Shan
MLLM
VLM
159
5
0
08 May 2025
ELGAR: Expressive Cello Performance Motion Generation for Audio Rendition
Zhiping Qiu
Yitong Jin
Yijiao Wang
Yi Shi
Changbo Wang
Chao Tan
Xiaobing Li
Feng Yu
Tao Yu
Qionghai Dai
78
0
0
07 May 2025
DeepSparse: A Foundation Model for Sparse-View CBCT Reconstruction
Yiqun Lin
Hualiang Wang
Jixiang Chen
Jiewen Yang
Jiarong Guo
Xuelong Li
370
1
0
05 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
360
1
0
05 May 2025
AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
Yunhao Li
Sijing Wu
Wei Sun
Zhichao Zhang
Yucheng Zhu
Zicheng Zhang
Huiyu Duan
Xiongkuo Min
Guangtao Zhai
EGVM
148
0
0
30 Apr 2025
CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation
Jianyu Wu
Yizhou Wang
Xiangyu Yue
Xinzhu Ma
Jinpei Guo
Dongzhan Zhou
Wanli Ouyang
Shixiang Tang
161
0
0
29 Apr 2025
EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation
Zhe Dong
Yuzhe Sun
Tianzhu Liu
Wangmeng Zuo
Yanfeng Gu
101
0
0
28 Apr 2025
Flow Along the K-Amplitude for Generative Modeling
Weitao Du
Shuning Chang
Jiasheng Tang
Yu Rong
F. Wang
Shengchao Liu
89
0
0
27 Apr 2025
DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks
Yinqi Li
Hong Chang
Ruibing Hou
Shiguang Shan
Xilin Chen
DiffM
103
0
0
24 Apr 2025
Fast Autoregressive Models for Continuous Latent Generation
Tiankai Hang
Jianmin Bao
Fangyun Wei
Dong Chen
DiffM
117
1
0
24 Apr 2025
Distilling semantically aware orders for autoregressive image generation
Rishav Pramanik
Antoine Poupon
Juan A. Rodriguez
Masih Aminbeidokhti
David Vazquez
Christopher Pal
Zhaozheng Yin
M. Pedersoli
98
0
0
23 Apr 2025
OccuEMBED: Occupancy Extraction Merged with Building Energy Disaggregation for Occupant-Responsive Operation at Scale
Yufei Zhang
Andrew Sonta
61
0
0
23 Apr 2025
POET: Supporting Prompting Creativity and Personalization with Automated Expansion of Text-to-Image Generation
Evans Xu Han
Alice Qian Zhang
Hong Shen
Haiyi Zhu
Paul Pu Liang
Jane Hsieh
85
0
0
18 Apr 2025
Image Editing with Diffusion Models: A Survey
Jia Wang
Jie Hu
Xiaoqi Ma
Hanghang Ma
Xiaoming Wei
Enhua Wu
156
1
0
17 Apr 2025
Hierarchical Vector Quantized Graph Autoencoder with Annealing-Based Code Selection
Long Zeng
Jianxiang Yu
Jiapeng Zhu
Qingsong Zhong
Xiang Li
82
0
0
17 Apr 2025
Wavelet-based Variational Autoencoders for High-Resolution Image Generation
Andrew Kiruluta
DiffM
76
0
0
16 Apr 2025
1
2
3
4
...
21
22
23
Next