1906.00446
Cited By
Generating Diverse High-Fidelity Images with VQ-VAE-2
2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
DRL
BDL
Papers citing
"Generating Diverse High-Fidelity Images with VQ-VAE-2"
50 / 1,105 papers shown
Title
MoCLIP: Motion-Aware Fine-Tuning and Distillation of CLIP for Human Motion Generation
Gabriel Maldonado
Armin Danesh Pazho
Ghazal Alinezhad Noghre
Vinit Katariya
Hamed Tabkhi
CLIP
VGen
30
0
0
16 May 2025
Diffusion Model in Hyperspectral Image Processing and Analysis: A Review
Xing Hu
Xiangcheng Liu
Qianqian Duan
Danfeng Hong
Dawei Zhang
DiffM
26
0
0
16 May 2025
An Introduction to Discrete Variational Autoencoders
Alan Jeffares
Liyuan Liu
DRL
BDL
CML
41
0
0
15 May 2025
Text-driven Motion Generation: Overview, Challenges and Directions
Ali Rida Sahili
Najett Neji
Hedi Tabia
VGen
38
0
0
14 May 2025
Continuous Visual Autoregressive Generation via Score Maximization
Chenze Shao
Fandong Meng
Jie Zhou
DiffM
31
1
0
12 May 2025
H³DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning
Yiyang Lu
Yufeng Tian
Zhecheng Yuan
Xueliang Wang
Pu Hua
Zhengrong Xue
Huazhe Xu
31
0
0
12 May 2025
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Haokun Lin
Teng Wang
Yixiao Ge
Yuying Ge
Zhichao Lu
Ying Wei
Qingfu Zhang
Zhenan Sun
Ying Shan
MLLM
VLM
70
0
0
08 May 2025
Prompt to Polyp: Medical Text-Conditioned Image Synthesis with Diffusion Models
Mikhail Chaichuk
Sushant Gautam
Steven A. Hicks
Elena Tutubalina
DiffM
MedIm
55
0
0
08 May 2025
ELGAR: Expressive Cello Performance Motion Generation for Audio Rendition
Zhiping Qiu
Yitong Jin
Yufei Wang
Yi Shi
Changbo Wang
Chao Tan
Xiaobing Li
Feng Yu
Tao Yu
Qionghai Dai
34
0
0
07 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Jiahui Geng
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
74
0
0
05 May 2025
DeepSparse: A Foundation Model for Sparse-View CBCT Reconstruction
Yiqun Lin
Hualiang Wang
Jixiang Chen
Jiewen Yang
Jiarong Guo
Xuelong Li
151
0
0
05 May 2025
AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
Yunhao Li
Sijing Wu
Wei Sun
Zhichao Zhang
Yucheng Zhu
Zicheng Zhang
Huiyu Duan
Xiongkuo Min
Guangtao Zhai
EGVM
93
0
0
30 Apr 2025
CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation
Jianyu Wu
Yizhou Wang
Xiangyu Yue
Xinzhu Ma
J. Guo
Dongzhan Zhou
Wanli Ouyang
Shixiang Tang
75
0
0
29 Apr 2025
EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation
Zhe Dong
Yuzhe Sun
Tianzhu Liu
Wangmeng Zuo
Yanfeng Gu
57
0
0
28 Apr 2025
Flow Along the K-Amplitude for Generative Modeling
Weitao Du
Shuning Chang
Jiasheng Tang
Yu Rong
F. Wang
Shengchao Liu
51
0
0
27 Apr 2025
Fast Autoregressive Models for Continuous Latent Generation
Tiankai Hang
Jianmin Bao
Fangyun Wei
Dong Chen
DiffM
80
0
0
24 Apr 2025
DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks
Yinqi Li
Hong Chang
Ruibing Hou
Shiguang Shan
Xilin Chen
DiffM
57
0
0
24 Apr 2025
Distilling semantically aware orders for autoregressive image generation
Rishav Pramanik
Antoine Poupon
Juan A. Rodriguez
Masih Aminbeidokhti
David Vazquez
Christopher Pal
Zhaozheng Yin
M. Pedersoli
31
0
0
23 Apr 2025
OccuEMBED: Occupancy Extraction Merged with Building Energy Disaggregation for Occupant-Responsive Operation at Scale
Yufei Zhang
Andrew Sonta
34
0
0
23 Apr 2025
POET: Supporting Prompting Creativity and Personalization with Automated Expansion of Text-to-Image Generation
Evans Xu Han
Alice Qian Zhang
Hong Shen
Haiyi Zhu
Paul Pu Liang
Jane Hsieh
40
0
0
18 Apr 2025
Hierarchical Vector Quantized Graph Autoencoder with Annealing-Based Code Selection
Long Zeng
Jianxiang Yu
Jiapeng Zhu
Qingsong Zhong
Xiang Li
34
0
0
17 Apr 2025
Image Editing with Diffusion Models: A Survey
Jia Wang
Jie Hu
Xiaoqi Ma
Hanghang Ma
Xiaoming Wei
Enhua Wu
74
0
0
17 Apr 2025
Wavelet-based Variational Autoencoders for High-Resolution Image Generation
Andrew Kiruluta
DiffM
37
0
0
16 Apr 2025
Autoregressive Distillation of Diffusion Transformers
Yeongmin Kim
Sotiris Anagnostidis
Yuming Du
Edgar Schönfeld
Jonas Kohler
Markos Georgopoulos
Albert Pumarola
Ali K. Thabet
A. Sanakoyeu
32
0
0
15 Apr 2025
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing
Taihang Hu
Linxuan Li
Kai Wang
Yaxing Wang
Jian Yang
Ming-Ming Cheng
DiffM
VGen
23
0
0
14 Apr 2025
MotionDreamer: One-to-Many Motion Synthesis with Localized Generative Masked Transformer
Yilin Wang
Chuan Guo
Yuxuan Mu
Muhammad Gohar Javed
Wei Ji
Juwei Lu
Hai Jiang
Li Cheng
VGen
35
0
0
11 Apr 2025
LoRAX: LoRA eXpandable Networks for Continual Synthetic Image Attribution
Danielle Sullivan-Pao
Nicole Tian
Pooya Khorrami
CLL
62
0
0
10 Apr 2025
Vector Quantized-Elites: Unsupervised and Problem-Agnostic Quality-Diversity Optimization
Constantinos Tsakonas
Konstantinos Chatzilygeroudis
34
0
0
10 Apr 2025
Explainable and Interpretable Forecasts on Non-Smooth Multivariate Time Series for Responsible Gameplay
Hussain Jagirdar
Rukma Talwadker
Aditya Pareek
Pulkit Agrawal
Tridib Mukherjee
AI4TS
43
1
0
03 Apr 2025
Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation
Jiwoo Chung
Sangeek Hyun
Hyunjun Kim
Eunseo Koh
MinKyu Lee
Jae-Pil Heo
33
0
0
03 Apr 2025
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision
Xiaofeng Han
Shunpeng Chen
Zenghuang Fu
Zhe Feng
Lue Fan
...
Li Guo
Weiliang Meng
Xiaopeng Zhang
Rongtao Xu
Shibiao Xu
74
1
0
03 Apr 2025
MuTri: Multi-view Tri-alignment for OCT to OCTA 3D Image Translation
Zhaoyu Chen
Hualiang Wang
Chubin Ou
Xiaomeng Li
46
0
0
02 Apr 2025
Style Quantization for Data-Efficient GAN Training
Jian Wang
Xin Lan
Jizhe Zhou
Yuxin Tian
Jiancheng Lv
51
0
0
31 Mar 2025
ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion
Rana Muhammad Shahroz Khan
Dongwen Tang
Pingzhi Li
Kai Wang
Tianlong Chen
AI4CE
199
0
0
31 Mar 2025
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning
Hang Guo
Yawei Li
Taolin Zhang
Jie Wang
Tao Dai
Shu-Tao Xia
Luca Benini
72
2
0
30 Mar 2025
HiPART: Hierarchical Pose AutoRegressive Transformer for Occluded 3D Human Pose Estimation
Hongwei Zheng
Han Li
Wenrui Dai
Ziyang Zheng
Chenglin Li
Junni Zou
Hongkai Xiong
3DH
60
0
0
30 Mar 2025
SocialGen: Modeling Multi-Human Social Interaction with Language Models
Heng Yu
Juze Zhang
Changan Chen
Tiange Xiang
Yusu Fang
Juan Carlos Niebles
Ehsan Adeli
VGen
54
0
0
28 Mar 2025
LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis
Jike Zhong
Qilong Wu
Xinyue Li
Bo Zhang
Ming Li
...
Yiming Li
Yu Qiao
Peng Gao
Bin Fu
Zhen Li
EGVM
50
0
0
27 Mar 2025
Disentangled Source-Free Personalization for Facial Expression Recognition with Neutral Target Data
Masoumeh Sharafi
Emma Ollivier
Muhammad Osama Zeeshan
Soufiane Belharbi
M. Pedersoli
A. L. Koerich
Simon L Bacon
Eric Granger
79
1
0
26 Mar 2025
Scaling Down Text Encoders of Text-to-Image Diffusion Models
Lifu Wang
Daqing Liu
Xinchen Liu
Xiaodong He
VLM
49
0
0
25 Mar 2025
VTD-CLIP: Video-to-Text Discretization via Prompting CLIP
Wencheng Zhu
Yuexin Wang
Hongxuan Li
Pengfei Zhu
Q. Hu
CLIP
57
0
0
24 Mar 2025
CODA: Repurposing Continuous VAEs for Discrete Tokenization
Zeyu Liu
Zanlin Ni
Yeguo Hua
Xin Deng
Xiao Ma
Cheng Zhong
Gao Huang
47
0
0
22 Mar 2025
Halton Scheduler For Masked Generative Image Transformer
Victor Besnier
Mickael Chen
David Hurych
Eduardo Valle
Matthieu Cord
52
1
0
21 Mar 2025
Position: Interactive Generative Video as Next-Generation Game Engine
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xintao Wang
Pengfei Wan
Di Zhang
Xihui Liu
VGen
45
1
0
21 Mar 2025
Zero-Shot Styled Text Image Generation, but Make It Autoregressive
Vittorio Pippi
Fabio Quattrini
S. Cascianelli
Alessio Tonioni
Rita Cucchiara
42
1
0
21 Mar 2025
D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens
Panpan Wang
Liqiang Niu
Fandong Meng
Jinan Xu
Yufeng Chen
Jie Zhou
DiffM
52
0
0
21 Mar 2025
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
Yanjie Wang
Zhijie Lin
Yao Teng
Yuanzhi Zhu
Shuhuai Ren
Jiashi Feng
Xihui Liu
53
0
0
20 Mar 2025
Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction
Ziyao Guo
Kaipeng Zhang
Michael Qizhe Shieh
43
0
0
20 Mar 2025
Tokenize Image as a Set
Zigang Geng
Mengde Xu
Han Hu
Shuyang Gu
DiffM
55
0
0
20 Mar 2025
Towards Unified Latent Space for 3D Molecular Latent Diffusion Modeling
Yanchen Luo
Zhiyuan Liu
Yi Zhao
Sihang Li
Kenji Kawaguchi
Tat-Seng Chua
Xuben Wang
MedIm
69
0
0
19 Mar 2025