Papers
Communities
Organizations
Events
Blog
Pricing
Search
Open menu
Home
Papers
1906.00446
Cited By
Generating Diverse High-Fidelity Images with VQ-VAE-2
2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
DRL
BDL
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Generating Diverse High-Fidelity Images with VQ-VAE-2"
50 / 1,128 papers shown
Title
Autoregressive Distillation of Diffusion Transformers
Yeongmin Kim
Sotiris Anagnostidis
Yuming Du
Edgar Schönfeld
Jonas Kohler
Markos Georgopoulos
Albert Pumarola
Ali K. Thabet
A. Sanakoyeu
92
0
0
15 Apr 2025
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing
Taihang Hu
Linxuan Li
Kai Wang
Yaxing Wang
Jian Yang
Ming-Ming Cheng
DiffM
VGen
101
0
0
14 Apr 2025
MotionDreamer: One-to-Many Motion Synthesis with Localized Generative Masked Transformer
Yilin Wang
Chuan Guo
Yuxuan Mu
Muhammad Gohar Javed
Wei Ji
Juwei Lu
Hai Jiang
Li Cheng
VGen
70
0
0
11 Apr 2025
LoRAX: LoRA eXpandable Networks for Continual Synthetic Image Attribution
Danielle Sullivan-Pao
Nicole Tian
Pooya Khorrami
CLL
98
0
0
10 Apr 2025
Vector Quantized-Elites: Unsupervised and Problem-Agnostic Quality-Diversity Optimization
Constantinos Tsakonas
Konstantinos Chatzilygeroudis
75
0
0
10 Apr 2025
Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation
Jiwoo Chung
Sangeek Hyun
Hyunjun Kim
Eunseo Koh
MinKyu Lee
Jae-Pil Heo
82
0
0
03 Apr 2025
Explainable and Interpretable Forecasts on Non-Smooth Multivariate Time Series for Responsible Gameplay
Hussain Jagirdar
Rukma Talwadker
Aditya Pareek
Pulkit Agrawal
Tridib Mukherjee
AI4TS
214
2
0
03 Apr 2025
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision
Xiaofeng Han
Shunpeng Chen
Zenghuang Fu
Zhe Feng
Lue Fan
...
Li Guo
Weiliang Meng
Xiaopeng Zhang
Rongtao Xu
Shibiao Xu
133
4
0
03 Apr 2025
MuTri: Multi-view Tri-alignment for OCT to OCTA 3D Image Translation
Zhaoyu Chen
Hualiang Wang
Chubin Ou
Xiaomeng Li
79
0
0
02 Apr 2025
A Theory of Machine Understanding via the Minimum Description Length Principle
Canlin Zhang
Xiuwen Liu
141
0
0
01 Apr 2025
ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion
Rana Muhammad Shahroz Khan
Dongwen Tang
Pingzhi Li
Xiaojiang Peng
Tianlong Chen
AI4CE
530
1
0
31 Mar 2025
Style Quantization for Data-Efficient GAN Training
Jian Wang
Xin Lan
Jizhe Zhou
Yuxin Tian
Jiancheng Lv
99
0
0
31 Mar 2025
HiPART: Hierarchical Pose AutoRegressive Transformer for Occluded 3D Human Pose Estimation
Hongwei Zheng
Han Li
Wenrui Dai
Ziyang Zheng
Chenglin Li
Junni Zou
Hongkai Xiong
3DH
94
1
0
30 Mar 2025
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning
Hang Guo
Yawei Li
Taolin Zhang
Jiadong Wang
Tao Dai
Shu-Tao Xia
Luca Benini
181
5
0
30 Mar 2025
SocialGen: Modeling Multi-Human Social Interaction with Language Models
Heng Yu
Juze Zhang
Changan Chen
Tiange Xiang
Yusu Fang
Juan Carlos Niebles
Ehsan Adeli
VGen
106
1
0
28 Mar 2025
LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis
Jike Zhong
Qilong Wu
Xinyue Li
Bo Zhang
Ming Li
...
Haoyang Li
Yu Qiao
Peng Gao
Bin Fu
Zhen Li
EGVM
92
1
0
27 Mar 2025
Disentangled Source-Free Personalization for Facial Expression Recognition with Neutral Target Data
Masoumeh Sharafi
Emma Ollivier
Muhammad Osama Zeeshan
Soufiane Belharbi
M. Pedersoli
A. L. Koerich
Simon L Bacon
EricGranger
108
2
0
26 Mar 2025
Scaling Down Text Encoders of Text-to-Image Diffusion Models
Lifu Wang
Daqing Liu
Xinchen Liu
Xiaodong He
VLM
141
0
0
25 Mar 2025
VTD-CLIP: Video-to-Text Discretization via Prompting CLIP
Wencheng Zhu
Yuexin Wang
Hongxuan Li
Pengfei Zhu
Q. Hu
CLIP
116
0
0
24 Mar 2025
CODA: Repurposing Continuous VAEs for Discrete Tokenization
Zeyu Liu
Zanlin Ni
Yeguo Hua
Xin Deng
Xiao Ma
Cheng Zhong
Gao Huang
90
0
0
22 Mar 2025
Halton Scheduler For Masked Generative Image Transformer
Victor Besnier
Mickael Chen
David Hurych
Eduardo Valle
Matthieu Cord
116
3
0
21 Mar 2025
Zero-Shot Styled Text Image Generation, but Make It Autoregressive
Vittorio Pippi
Fabio Quattrini
S. Cascianelli
Alessio Tonioni
Rita Cucchiara
81
1
0
21 Mar 2025
D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens
Panpan Wang
Liqiang Niu
Fandong Meng
Jinan Xu
Yufeng Chen
Jie Zhou
DiffM
128
0
0
21 Mar 2025
Position: Interactive Generative Video as Next-Generation Game Engine
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xintao Wang
Pengfei Wan
Di Zhang
Xihui Liu
VGen
126
4
0
21 Mar 2025
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
Yanjie Wang
Zhijie Lin
Yao Teng
Yuanzhi Zhu
Shuhuai Ren
Jiashi Feng
Xihui Liu
115
5
0
20 Mar 2025
Tokenize Image as a Set
Zigang Geng
Mengde Xu
Han Hu
Shuyang Gu
DiffM
92
0
0
20 Mar 2025
Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction
Ziyao Guo
Jianchao Tan
Michael Qizhe Shieh
77
0
0
20 Mar 2025
Towards Unified and Lossless Latent Space for 3D Molecular Latent Diffusion Modeling
Yanchen Luo
Zhiyuan Liu
Yi Zhao
Changhao Nai
Kenji Kawaguchi
Tat-Seng Chua
Xiang Wang
Yang Zhang
Xiang Wang
MedIm
164
0
0
19 Mar 2025
CAM-Seg: A Continuous-valued Embedding Approach for Semantic Image Generation
Masud Ahmed
Zahid Hasan
Syed Arefinul Haque
A. Faridee
S. Purushotham
Suya You
Nirmalya Roy
199
0
0
19 Mar 2025
DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies
Wei Song
Yansen Wang
Zijia Song
Yadong Li
Haoze Sun
Xin Wu
Guosheng Dong
Jianhua Xu
Jiaqi Wang
Kaicheng Yu
142
4
0
18 Mar 2025
Next-Scale Autoregressive Models are Zero-Shot Single-Image Object View Synthesizers
Shiran Yuan
Hao Zhao
DiffM
127
0
0
17 Mar 2025
Versatile Physics-based Character Control with Hybrid Latent Representation
Jinseok Bae
Jungdam Won
Donggeun Lim
I. Hwang
Y. Kim
92
0
0
17 Mar 2025
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
Tao Wang
Changxu Cheng
Lingfeng Wang
Senda Chen
Wuyue Zhao
VLM
110
1
0
17 Mar 2025
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
190
0
0
14 Mar 2025
HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models
Ziqin Zhou
Yifan Yang
Yue Yang
Tianyu He
Houwen Peng
Kai Qiu
Qi Dai
Lili Qiu
Chong Luo
Lingqiao Liu
DiffM
VGen
87
1
0
14 Mar 2025
Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective
Xiaoming Zhao
Alexander Schwing
FaML
105
1
0
13 Mar 2025
Dual Codebook VQ: Enhanced Image Reconstruction with Reduced Codebook Size
Parisa Boodaghi Malidarreh
Jillur Rahman Saurav
T. Pham
Amir Hajighasemi
Anahita Samadi
Saurabh Shrinivas Maydeo
M. Nasr
Jacob M. Luber
105
0
0
13 Mar 2025
Robust Latent Matters: Boosting Image Generation with Sampling Error Synthesis
Kai Qiu
Xianrui Li
Jason Kuen
Hong Chen
Xiaohao Xu
Jiuxiang Gu
Yinyi Luo
Bhiksha Raj
Zhe Lin
Marios Savvides
190
2
0
11 Mar 2025
ADROIT: A Self-Supervised Framework for Learning Robust Representations for Active Learning
S. Banerjee
Vinay Kumar Verma
SSL
113
0
0
10 Mar 2025
Temporal Triplane Transformers as Occupancy World Models
Haoran Xu
Peixi Peng
Guang Tan
Yiqian Chang
Yisen Zhao
Yonghong Tian
196
2
0
10 Mar 2025
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
Xing Xie
Jiawei Liu
Ziyue Lin
Huijie Fan
Zhi Han
Yandong Tang
Liangqiong Qu
124
0
0
10 Mar 2025
Removing Averaging: Personalized Lip-Sync Driven Characters Based on Identity Adapter
Yanyu Zhu
Licheng Bai
Jintao Xu
Jiwei Tang
Wanshi Xu
83
0
0
09 Mar 2025
Removing Multiple Hybrid Adverse Weather in Video via a Unified Model
Yecong Wan
Mingwen Shao
Yuanshuo Cheng
Jun Shu
Shuigen Wang
100
0
0
08 Mar 2025
Frequency Autoregressive Image Generation with Continuous Tokens
Hu Yu
Hao Luo
Hangjie Yuan
Yu Rong
Feng Zhao
VGen
108
10
0
07 Mar 2025
PathoPainter: Augmenting Histopathology Segmentation via Tumor-aware Inpainting
Hong Liu
Haosen Yang
Evi M. C. Huijben
Mark Schuiveling
Ruisheng Su
J. Pluim
M. Veta
MedIm
109
1
0
06 Mar 2025
Undertrained Image Reconstruction for Realistic Degradation in Blind Image Super-Resolution
Ru Ito
Supatta Viriyavisuthisakul
K. Kawamoto
Hiroshi Kera
118
0
0
04 Mar 2025
ARINAR: Bi-Level Autoregressive Feature-by-Feature Generative Models
Qinyu Zhao
Stephen Gould
Liang Zheng
DiffM
GAN
VGen
VLM
105
1
0
04 Mar 2025
DLF: Extreme Image Compression with Dual-generative Latent Fusion
Naifu Xue
Zhaoyang Jia
Jiahao Li
Bin Li
Yuan Zhang
Yan Lu
113
2
0
03 Mar 2025
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations
Ziyang Zhang
Yang Yu
Yucheng Chen
Xulei Yang
S. Yeo
MedIm
190
2
0
02 Mar 2025
Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation
Zhi Cen
Huaijin Pi
Sida Peng
Qing Shuai
Yujun Shen
Hujun Bao
Xiaowei Zhou
Ruizhen Hu
VGen
OffRL
144
3
0
27 Feb 2025
Previous
1
2
3
4
5
...
21
22
23
Next