1906.00446
Cited By
Generating Diverse High-Fidelity Images with VQ-VAE-2
2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
DRL
BDL
Papers citing
"Generating Diverse High-Fidelity Images with VQ-VAE-2"
50 / 1,128 papers shown
Title
GenzIQA: Generalized Image Quality Assessment using Prompt-Guided Latent Diffusion Models
Diptanu De
Shankhanil Mitra
R. Soundararajan
93
3
0
07 Jun 2024
Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis
Marianna Ohanyan
Hayk Manukyan
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
DiffM
108
2
0
06 Jun 2024
Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning
Inwoo Hwang
Yunhyeok Kwak
Jaein Kim
Byoung-Tak Zhang
Sanghack Lee
119
1
0
05 Jun 2024
VQUNet: Vector Quantization U-Net for Defending Adversarial Atacks by Regularizing Unwanted Noise
Zhixun He
Mukesh Singhal
79
1
0
05 Jun 2024
Phy-Diff: Physics-guided Hourglass Diffusion Model for Diffusion MRI Synthesis
Juanhua Zhang
Ruodan Yan
Alessandro Perelli
Xi Chen
Chao Li
MedIm
DiffM
143
6
0
05 Jun 2024
Tiny models from tiny data: Textual and null-text inversion for few-shot distillation
Erik Landolsi
Fredrik Kahl
DiffM
134
1
0
05 Jun 2024
Inpainting Pathology in Lumbar Spine MRI with Latent Diffusion
Colin Hansen
Simas Glinskis
Ashwin Raju
Micha Kornreich
JinHyeong Park
Jayashri Pawar
Richard Herzog
Li Zhang
Benjamin Odry
MedIm
DiffM
91
4
0
04 Jun 2024
CoNav: A Benchmark for Human-Centered Collaborative Navigation
Changhao Li
Xinyu Sun
Peihao Chen
Jugang Fan
Zixu Wang
Yanxia Liu
Jinhui Zhu
Chuang Gan
Mingkui Tan
133
1
0
04 Jun 2024
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
Kengo Uchida
Takashi Shibuya
Yuhta Takida
Naoki Murata
Shusuke Takahashi
Yuki Mitsufuji
VGen
170
5
0
04 Jun 2024
Trajectory Forecasting through Low-Rank Adaptation of Discrete Latent Codes
Riccardo Benaglia
Angelo Porrello
Pietro Buzzega
Simone Calderara
Rita Cucchiara
62
0
0
31 May 2024
RapVerse: Coherent Vocals and Whole-Body Motions Generations from Text
Jiaben Chen
Xin Yan
Yihang Chen
Siyuan Cen
Qinwei Ma
Haoyu Zhen
Kaizhi Qian
Lie Lu
Chuang Gan
75
0
0
30 May 2024
Predicting Long-Term Human Behaviors in Discrete Representations via Physics-Guided Diffusion
Zhitian Zhang
Anjian Li
Angelica Lim
Mo Chen
84
3
0
29 May 2024
Self-Supervised Learning Based Handwriting Verification
Mihir Chauhan
Mohammad Abuzar Shaikh
Abhishek Satbhai
Mir Basheer Ali
B. Ramamurthy
Mingchen Gao
Siwei Lyu
Sargur Srihari
83
2
0
28 May 2024
BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics
Hao Wu
Xingjian Shi
Ziyue Huang
Penghao Zhao
Wei Xiong
Jinbao Xue
Yangyu Tao
Xiaomeng Huang
Weiyan Wang
AI4TS
108
2
0
27 May 2024
Di²Pose: Discrete Diffusion Model for Occluded 3D Human Pose Estimation
Weiquan Wang
Jun Xiao
Chunping Wang
Wei Liu
Zhao Wang
Long Chen
DiffM
108
1
0
27 May 2024
Diffusion Bridge AutoEncoders for Unsupervised Representation Learning
Yeongmin Kim
Kwanghyeon Lee
Minsang Park
Byeonghu Na
Il-Chul Moon
DiffM
140
2
0
27 May 2024
Variational Offline Multi-agent Skill Discovery
Jiayu Chen
Bhargav Ganguly
Tian-Shing Lan
OffRL
145
3
0
26 May 2024
Hierarchical Uncertainty Exploration via Feedforward Posterior Trees
E. Nehme
Rotem Mulayoff
T. Michaeli
UQCV
89
2
0
24 May 2024
ParamReL: Learning Parameter Space Representation via Progressively Encoding Bayesian Flow Networks
Zhangkai Wu
Xuhui Fan
Jin Li
Zhilin Zhao
Hui Chen
LongBing Cao
87
2
0
24 May 2024
A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation
Gwanghyun Kim
Alonso Martinez
Yu-Chuan Su
Brendan Jou
José Lezama
...
Lijun Yu
Lu Jiang
A. Jansen
Jacob Walker
Krishna Somandepalli
85
9
0
22 May 2024
Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models
Xiyu Wang
Yufei Wang
Satoshi Tsutsui
Weisi Lin
Bihan Wen
Alex C. Kot
117
6
0
20 May 2024
Du-IN: Discrete units-guided mask modeling for decoding speech from Intracranial Neural signals
Hui Zheng
Haiteng Wang
Wei-Bang Jiang
Zhongtao Chen
Li He
Pei-Yang Lin
Peng-Hu Wei
Guo-Guang Zhao
Yun-Zhe Liu
88
2
0
19 May 2024
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Siyuan Li
Zedong Wang
Zicheng Liu
Di Wu
Cheng Tan
Jiangbin Zheng
Yufei Huang
Stan Z. Li
85
8
0
13 May 2024
A Demographic-Conditioned Variational Autoencoder for fMRI Distribution Sampling and Removal of Confounds
Anton Orlichenko
Gang Qu
Ziyu Zhou
Anqi Liu
Hong-Wen Deng
Zhengming Ding
Julia M. Stephen
Tony W. Wilson
Vince D. Calhoun
Yu-Ping Wang
47
0
0
13 May 2024
Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data
Mahdi Morafah
M. Reisser
Bill Lin
Christos Louizos
FedML
88
6
0
13 May 2024
Generating Human Motion in 3D Scenes from Text Descriptions
Zhi Cen
Huaijin Pi
Sida Peng
Zehong Shen
Minghui Yang
Shuai Zhu
Hujun Bao
Xiaowei Zhou
86
21
0
13 May 2024
MAxPrototyper: A Multi-Agent Generation System for Interactive User Interface Prototyping
Mingyue Yuan
Jieshan Chen
Aaron Quigley
LLMAG
78
6
0
12 May 2024
Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
Shengyuan Liu
Bo Wang
Ye Ma
Te Yang
Xipeng Cao
Quan Chen
Han Li
Di Dong
Peng Jiang
EGVM
85
2
0
11 May 2024
Controllable Image Generation With Composed Parallel Token Prediction
Jamie Stirling
Noura Al-Moubayed
89
0
0
10 May 2024
Detecting music deepfakes is easy but actually hard
Darius Afchar
Gabriel Meseguer-Brocal
Romain Hennequin
115
9
0
07 May 2024
MVDiff: Scalable and Flexible Multi-View Diffusion for 3D Object Reconstruction from Single-View
Emmanuelle Bourigault
Pauline Bourigault
73
3
0
06 May 2024
Generated Contents Enrichment
Mahdi Naseri
Jiayan Qiu
Zhou Wang
124
0
0
06 May 2024
Towards Real-world Video Face Restoration: A New Benchmark
Ziyan Chen
Jingwen He
Xinqi Lin
Yu Qiao
Chao Dong
116
4
0
30 Apr 2024
Assessing Image Quality Using a Simple Generative Representation
Simon Raviv
Gal Chechik
85
0
0
28 Apr 2024
TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
Sai Kumar Dwivedi
Yu Sun
Priyanka Patel
Yao Feng
Michael J. Black
3DH
116
34
0
25 Apr 2024
HybridFlow: Infusing Continuity into Masked Codebook for Extreme Low-Bitrate Image Compression
Lei Lu
Yanyue Xie
Wei Jiang
Wei Wang
Xue Lin
Yanzhi Wang
86
5
0
20 Apr 2024
Lazy Diffusion Transformer for Interactive Image Editing
Yotam Nitzan
Zongze Wu
Richard Zhang
Eli Shechtman
Daniel Cohen-Or
Taesung Park
Michael Gharbi
90
11
0
18 Apr 2024
MIDGET: Music Conditioned 3D Dance Generation
Jinwu Wang
Wei Mao
Miaomiao Liu
81
0
0
18 Apr 2024
Large Language Models: From Notes to Musical Form
Lilac Atassi
100
0
0
18 Apr 2024
Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent
Wei Chen
Zhiyuan Li
LLMAG
58
5
0
17 Apr 2024
Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption
Buzhen Huang
Chen Li
Chongyang Xu
Liang Pan
Yangang Wang
Gim Hee Lee
90
6
0
17 Apr 2024
Personalized Heart Disease Detection via ECG Digital Twin Generation
Yaojun Hu
Jintai Chen
Lianting Hu
Dantong Li
Jiahuan Yan
Haochao Ying
Huiying Liang
Jian Wu
74
5
0
17 Apr 2024
MaSkel: A Model for Human Whole-body X-rays Generation from Human Masking Images
Yingjie Xi
Boyuan Cheng
Jingyao Cai
Jian Jun Zhang
Xiaosong Yang
MedIm
104
1
0
13 Apr 2024
Adapting LLaMA Decoder to Vision Transformer
Jiahao Wang
Wenqi Shao
Mengzhao Chen
Chengyue Wu
Yong Liu
Taiqiang Wu
Kaipeng Zhang
Songyang Zhang
Kai-xiang Chen
Ping Luo
MLLM
99
4
0
10 Apr 2024
Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance
Dazhong Shen
Guanglu Song
Zeyue Xue
Fu-Yun Wang
Yu Liu
DiffM
91
18
0
08 Apr 2024
Gull: A Generative Multifunctional Audio Codec
Yi Luo
Jianwei Yu
Hangting Chen
Rongzhi Gu
Chao Weng
AuLLM
94
3
0
07 Apr 2024
Do We Really Need a Complex Agent System? Distill Embodied Agent into a Single Model
Zhonghan Zhao
Ke Ma
Wenhao Chai
Xuan Wang
Kewei Chen
Dongxu Guo
Yanting Zhang
Hongwei Wang
Gaoang Wang
90
20
0
06 Apr 2024
SemGrasp: Semantic Grasp Generation via Language Aligned Discretization
Kailin Li
Jingbo Wang
Lixin Yang
Cewu Lu
Bo Dai
108
18
0
04 Apr 2024
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Keyu Tian
Yi Jiang
Zehuan Yuan
Bingyue Peng
Liwei Wang
VGen
145
349
0
03 Apr 2024
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech
Jaehyeon Kim
Keon Lee
Seungjun Chung
Jaewoong Cho
135
44
0
03 Apr 2024