1906.00446
Generating Diverse High-Fidelity Images with VQ-VAE-2
2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
DRL
BDL
Papers citing
"Generating Diverse High-Fidelity Images with VQ-VAE-2"
50 / 1,154 papers shown
Title
CAM-Seg: A Continuous-valued Embedding Approach for Semantic Image Generation
Masud Ahmed
Zahid Hasan
Syed Arefinul Haque
A. Faridee
S. Purushotham
Suya You
Nirmalya Roy
237
0
0
19 Mar 2025
Towards Unified and Lossless Latent Space for 3D Molecular Latent Diffusion Modeling
Yanchen Luo
Zhiyuan Liu
Yi Zhao
Changhao Nai
Kenji Kawaguchi
Tat-Seng Chua
Xiang Wang
Yang Zhang
Xiang Wang
MedIm
211
0
0
19 Mar 2025
DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies
Wei Song
Longji Xu
Zijia Song
Yadong Li
Haoze Sun
Xin Wu
Guosheng Dong
Jianhua Xu
Jiaqi Wang
Kaicheng Yu
183
8
0
18 Mar 2025
Versatile Physics-based Character Control with Hybrid Latent Representation
Jinseok Bae
Jungdam Won
Donggeun Lim
I. Hwang
Y. Kim
146
2
0
17 Mar 2025
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
Tao Wang
Changxu Cheng
Lingfeng Wang
Senda Chen
Wuyue Zhao
VLM
153
4
0
17 Mar 2025
Next-Scale Autoregressive Models are Zero-Shot Single-Image Object View Synthesizers
Shiran Yuan
Hao Zhao
DiffM
147
0
0
17 Mar 2025
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
215
2
0
14 Mar 2025
HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models
Ziqin Zhou
Yifan Yang
Yue Yang
Tianyu He
Houwen Peng
Kai Qiu
Qi Dai
Lili Qiu
Chong Luo
Lingqiao Liu
DiffM
VGen
111
4
0
14 Mar 2025
Dual Codebook VQ: Enhanced Image Reconstruction with Reduced Codebook Size
Parisa Boodaghi Malidarreh
Jillur Rahman Saurav
T. Pham
Amir Hajighasemi
Anahita Samadi
Saurabh Shrinivas Maydeo
M. Nasr
Jacob M. Luber
127
0
0
13 Mar 2025
Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective
Xiaoming Zhao
Alexander Schwing
FaML
131
1
0
13 Mar 2025
Robust Latent Matters: Boosting Image Generation with Sampling Error Synthesis
Kai Qiu
Xianrui Li
Jason Kuen
Hong Chen
Xiaohao Xu
Jiuxiang Gu
Yinyi Luo
Bhiksha Raj
Zhe Lin
Marios Savvides
303
4
0
11 Mar 2025
Temporal Triplane Transformers as Occupancy World Models
Haoran Xu
Peixi Peng
Guang Tan
Yiqian Chang
Yisen Zhao
Yonghong Tian
245
3
0
10 Mar 2025
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
Xing Xie
Jiawei Liu
Ziyue Lin
Huijie Fan
Zhi Han
Yandong Tang
Liangqiong Qu
178
0
0
10 Mar 2025
ADROIT: A Self-Supervised Framework for Learning Robust Representations for Active Learning
S. Banerjee
Vinay Kumar Verma
SSL
142
0
0
10 Mar 2025
NFIG: Autoregressive Image Generation with Next-Frequency Prediction
Zhihao Huang
Xi Qiu
Yukuo Ma
Yifu Zhou
Junjie Chen
Xuelong Li
Fangqiu Yi
Xuelong Li
VLM
170
0
0
10 Mar 2025
Removing Averaging: Personalized Lip-Sync Driven Characters Based on Identity Adapter
Yanyu Zhu
Licheng Bai
Jintao Xu
Jiwei Tang
121
0
0
09 Mar 2025
Removing Multiple Hybrid Adverse Weather in Video via a Unified Model
Yecong Wan
Mingwen Shao
Yuanshuo Cheng
Jun Shu
Shuigen Wang
131
0
0
08 Mar 2025
Frequency Autoregressive Image Generation with Continuous Tokens
Hu Yu
Hao Luo
Hangjie Yuan
Yu Rong
Feng Zhao
VGen
137
13
0
07 Mar 2025
PathoPainter: Augmenting Histopathology Segmentation via Tumor-aware Inpainting
Hong Liu
Haosen Yang
Evi M. C. Huijben
Mark Schuiveling
Ruisheng Su
J. Pluim
M. Veta
MedIm
133
1
0
06 Mar 2025
ARINAR: Bi-Level Autoregressive Feature-by-Feature Generative Models
Qinyu Zhao
Stephen Gould
Liang Zheng
DiffM
GAN
VGen
VLM
126
1
0
04 Mar 2025
Undertrained Image Reconstruction for Realistic Degradation in Blind Image Super-Resolution
Ru Ito
Supatta Viriyavisuthisakul
K. Kawamoto
Hiroshi Kera
130
0
0
04 Mar 2025
DLF: Extreme Image Compression with Dual-generative Latent Fusion
Naifu Xue
Zhaoyang Jia
Jiahao Li
Bin Li
Yuan Zhang
Yan Lu
143
2
0
03 Mar 2025
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations
Ziyang Zhang
Yang Yu
Yucheng Chen
Xulei Yang
S. Yeo
MedIm
236
3
0
02 Mar 2025
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction
Siyu Jiao
Gengwei Zhang
Yinlong Qian
Jiancheng Huang
Yao Zhao
Humphrey Shi
Lin Ma
Y. X. Wei
Zequn Jie
VLM
138
9
0
27 Feb 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
324
18
0
27 Feb 2025
Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation
Zhi Cen
Huaijin Pi
Sida Peng
Qing Shuai
Yujun Shen
Hujun Bao
Xiaowei Zhou
Ruizhen Hu
VGen
OffRL
181
7
0
27 Feb 2025
Diffusion Models for conditional MRI generation
Miguel Herencia García del Castillo
Ricardo Moya Garcia
Manuel Jesús Cerezo Mazón
Ekaitz Arriola Garcia
Pablo Menéndez Fernández-Miranda
MedIm
106
0
0
25 Feb 2025
Predicting the Energy Landscape of Stochastic Dynamical System via Physics-informed Self-supervised Learning
Ruikun Li
Huandong Wang
Qingmin Liao
Yong Li
71
2
0
24 Feb 2025
Machine learning and high dimensional vector search
Matthijs Douze
167
0
0
24 Feb 2025
SpecDM: Hyperspectral Dataset Synthesis with Pixel-level Semantic Annotations
Wen Liu
Pei Yang
Wenhui Hong
Xiaoguang Mei
Jiayi Ma
DiffM
132
1
0
24 Feb 2025
Mojito: LLM-Aided Motion Instructor with Jitter-Reduced Inertial Tokens
Ziwei Shan
Yaoyu He
Chengfeng Zhao
Jiashen Du
Jingyan Zhang
Qixuan Zhang
Jingyi Yu
Lan Xu
161
1
0
22 Feb 2025
CoDiff: Conditional Diffusion Model for Collaborative 3D Object Detection
Zhe Huang
Shuo Wang
Yanjie Wang
Lei Wang
DiffM
548
1
0
17 Feb 2025
MoFM: A Large-Scale Human Motion Foundation Model
Mohammadreza Baharani
Ghazal Alinezhad Noghre
Armin Danesh Pazho
Gabriel Maldonado
Hamed Tabkhi
AI4CE
595
1
0
08 Feb 2025
High-Fidelity Simultaneous Speech-To-Speech Translation
Tom Labiausse
Laurent Mazaré
Edouard Grave
P. Pérez
Alexandre Défossez
Neil Zeghidour
593
3
0
05 Feb 2025
BRIDLE: Generalized Self-supervised Learning with Quantization
Hoang M. Nguyen
Satya Narayan Shukla
Qiang Zhang
Hanchao Yu
Sreya D. Roy
Taipeng Tian
Lingjiong Zhu
Yuchen Liu
SSL
MQ
180
0
0
04 Feb 2025
End-to-end Training for Text-to-Image Synthesis using Dual-Text Embeddings
Yeruru Asrar Ahmed
Anurag Mittal
DiffM
167
0
0
03 Feb 2025
ConceptVAE: Self-Supervised Fine-Grained Concept Disentanglement from 2D Echocardiographies
C. Ciușdel
Alex Serban
Tiziano Passerini
CoGe
149
1
0
03 Feb 2025
Unveiling Discrete Clues: Superior Healthcare Predictions for Rare Diseases
Chuang Zhao
Hui Tang
Jiheng Zhang
Xiaomeng Li
111
1
0
23 Jan 2025
Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
135
2
0
14 Jan 2025
AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder
Samir Sadok
Simon Leglaive
Laurent Girin
Gaël Richard
Xavier Alameda-Pineda
147
4
0
10 Jan 2025
EditAR: Unified Conditional Generation with Autoregressive Models
Jiteng Mu
Nuno Vasconcelos
Xinyu Wang
DiffM
114
14
0
08 Jan 2025
Circuit Complexity Bounds for Visual Autoregressive Model
Yekun Ke
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao Song
143
10
0
08 Jan 2025
Human Grasp Generation for Rigid and Deformable Objects with Decomposed VQ-VAE
Mengshi Qi
Zhe Zhao
Huadong Ma
154
1
0
08 Jan 2025
Learning the Language of Protein Structure
Benoit Gaujac
Jérémie Donà
Liviu Copoiu
Timothy Atkinson
Thomas Pierrot
Thomas D. Barrett
130
12
0
08 Jan 2025
Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks
Théophane Vallaeys
Matthew Muckley
Jakob Verbeek
Matthijs Douze
MQ
127
3
0
06 Jan 2025
Neural Network Diffusion
Kaili Wang
Dongwen Tang
Boya Zeng
Yida Yin
Zhaopan Xu
Yukun Zhou
Zelin Zang
Trevor Darrell
Zhuang Liu
Yang You
DiffM
226
31
0
03 Jan 2025
Towards Human-AI Synergy in UI Design: Enhancing Multi-Agent Based UI Generation with Intent Clarification and Alignment
M. Yuan
Jieshan Chen
Yongquan Hu
Sidong Feng
Mulong Xie
Gelareh Mohammadi
Zhenchang Xing
Aaron Quigley
LLMAG
105
1
0
28 Dec 2024
Hierarchical Vector Quantization for Unsupervised Action Segmentation
Federico Spurio
Emad Bahrami
Gianpiero Francesca
Juergen Gall
170
5
0
23 Dec 2024
When Worse is Better: Navigating the compression-generation tradeoff in visual tokenization
Vivek Ramanujan
Kushal Tirumala
Armen Aghajanyan
Luke Zettlemoyer
Ali Farhadi
DiffM
151
3
0
20 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
301
16
0
19 Dec 2024