arXiv: 1906.00446
Generating Diverse High-Fidelity Images with VQ-VAE-2
2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
DRL
BDL
Papers citing "Generating Diverse High-Fidelity Images with VQ-VAE-2"
50 / 1,128 papers shown
DreamTuner: Single Image is Enough for Subject-Driven Generation
Miao Hua
Jiawei Liu
Fei Ding
Wei Liu
Jie Wu
Qian He
72
31
0
21 Dec 2023
Sign Language Production with Latent Motion Transformer
Pan Xie
Taiying Peng
Yao Du
Qipeng Zhang
SLR
79
5
0
20 Dec 2023
Unlocking Pre-trained Image Backbones for Semantic Image Synthesis
Tariq Berrada
Jakob Verbeek
Camille Couprie
Alahari Karteek
93
9
0
20 Dec 2023
RadEdit: stress-testing biomedical vision models via diffusion image editing
Fernando Pérez-García
Sam Bond-Taylor
Pedro P. Sanchez
B. V. Breugel
Daniel Coelho De Castro
...
M. Lungren
A. Nori
Javier Alvarez-Valle
Ozan Oktay
Maximilian Ilse
MedIm
146
11
0
20 Dec 2023
All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models
Seunghoo Hong
Juhun Lee
Simon S. Woo
120
20
0
20 Dec 2023
Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models
Shweta Mahajan
Tanzila Rahman
Kwang Moo Yi
Leonid Sigal
DiffM
108
20
0
19 Dec 2023
IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition
Xiaomeng Yang
Zhi Qiao
Yu Zhou
DiffM
198
1
0
19 Dec 2023
Multiple Hypothesis Dropout: Estimating the Parameters of Multi-Modal Output Distributions
David D. Nguyen
David Liebowitz
Surya Nepal
S. Kanhere
OOD
UQCV
62
0
0
18 Dec 2023
Topic-VQ-VAE: Leveraging Latent Codebooks for Flexible Topic-Guided Document Generation
YoungJoon Yoo
Jongwon Choi
BDL
82
2
0
15 Dec 2023
Towards Equipping Transformer with the Ability of Systematic Compositionality
Chen Huang
Peixin Qin
Wenqiang Lei
Jiancheng Lv
95
2
0
12 Dec 2023
Equivariant Flow Matching with Hybrid Probability Transport
Yuxuan Song
Jingjing Gong
Minkai Xu
Ziyao Cao
Yanyan Lan
Stefano Ermon
Hao Zhou
Wei-Ying Ma
DiffM
100
57
0
12 Dec 2023
Implicit Shape Modeling for Anatomical Structure Refinement of Volumetric Medical Images
Minghui Zhang
Shanshan Shi
Renao Yan
Liang Zhu
Tian Guan
MedIm
77
1
0
11 Dec 2023
A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing
Maomao Li
Yu Li
Tianyu Yang
Yunfei Liu
Dongxu Yue
Zhihui Lin
Dong Xu
VGen
50
9
0
10 Dec 2023
Free3D: Consistent Novel View Synthesis without 3D Representation
Chuanxia Zheng
Andrea Vedaldi
3DV
138
50
0
07 Dec 2023
Cache Me if You Can: Accelerating Diffusion Models through Block Caching
Felix Wimbauer
Bichen Wu
Edgar Schoenfeld
Xiaoliang Dai
Ji Hou
...
Jonas Kohler
Christian Rupprecht
Zorah Lähner
Peter Vajda
Jialiang Wang
DiffM
110
78
0
06 Dec 2023
GIVT: Generative Infinite-Vocabulary Transformers
Michael Tschannen
Cian Eastwood
Fabian Mentzer
110
41
0
04 Dec 2023
ArtAdapter: Text-to-Image Style Transfer using Multi-Level Style Encoder and Explicit Adaptation
Dar-Yen Chen
Hamish Tennent
Ching-Wen Hsu
DiffM
120
27
0
04 Dec 2023
Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training
Runze He
Shaofei Huang
Xuecheng Nie
Tianrui Hui
Luoqi Liu
Jiao Dai
Jizhong Han
Guanbin Li
Si Liu
DiffM
69
8
0
04 Dec 2023
Enhancing Diffusion Models with 3D Perspective Geometry Constraints
Rishi Upadhyay
Howard Zhang
Yunhao Ba
Ethan Yang
Blake Gella
Sicheng Jiang
Alex Wong
A. Kadambi
92
11
0
01 Dec 2023
Simple Transferability Estimation for Regression Tasks
Cuong N. Nguyen
Phong Tran
L. Ho
Vu C. Dinh
Anh Tran
Tal Hassner
Cuong V Nguyen
96
2
0
01 Dec 2023
SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers
Ioannis Kakogeorgiou
Spyros Gidaris
Konstantinos Karantzalos
N. Komodakis
ViT
OCL
135
16
0
01 Dec 2023
AV-RIR: Audio-Visual Room Impulse Response Estimation
Anton Ratnarajah
Sreyan Ghosh
Sonal Kumar
Purva Chiniya
Dinesh Manocha
78
15
0
30 Nov 2023
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation
Yanhui Wang
Jianmin Bao
Wenming Weng
Ruoyu Feng
Dacheng Yin
...
Yuhui Yuan
Chuanxin Tang
Xiaoyan Sun
Chong Luo
Baining Guo
DiffM
VGen
133
17
0
30 Nov 2023
DanceMeld: Unraveling Dance Phrases with Hierarchical Latent Codes for Music-to-Dance Synthesis
Xin Gao
Liucheng Hu
Peng Zhang
Bang Zhang
Liefeng Bo
DiffM
108
4
0
30 Nov 2023
Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding
Jin-Chuan Shi
Miao Wang
Hao-Bin Duan
Shao-Hua Guan
3DGS
125
96
0
30 Nov 2023
Beyond Two-Tower Matching: Learning Sparse Retrievable Cross-Interactions for Recommendation
Liangcai Su
Fan Yan
Jieming Zhu
Xi Xiao
Haoyi Duan
Zhou Zhao
Zhenhua Dong
Ruiming Tang
72
10
0
30 Nov 2023
Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications
Karren D. Yang
Anurag Ranjan
Jen-Hao Rick Chang
Raviteja Vemulapalli
Oncel Tuzel
108
9
0
30 Nov 2023
Do text-free diffusion models learn discriminative visual representations?
Soumik Mukhopadhyay
M. Gwilliam
Yosuke Yamaguchi
Vatsal Agarwal
Namitha Padmanabhan
Archana Swaminathan
Dinesh Manocha
Abhinav Shrivastava
DiffM
145
14
1
29 Nov 2023
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation
Yuhui Zhang
Brandon McKinzie
Zhe Gan
Vaishaal Shankar
Alexander Toshev
57
3
0
27 Nov 2023
Unlearning via Sparse Representations
Vedant Shah
Frederik Trauble
Ashish Malik
Hugo Larochelle
Michael C. Mozer
Sanjeev Arora
Yoshua Bengio
Anirudh Goyal
MU
77
9
0
26 Nov 2023
Self-Supervised Music Source Separation Using Vector-Quantized Source Category Estimates
Marco Pasini
Stefan Lattner
George Fazekas
77
1
0
21 Nov 2023
FrePolad: Frequency-Rectified Point Latent Diffusion for Point Cloud Generation
Chenliang Zhou
Fangcheng Zhong
Param Hanji
Zhilin Guo
Kyle Fogarty
Alejandro Sztrajman
Hongyun Gao
Cengiz Öztireli
101
3
0
20 Nov 2023
hvEEGNet: exploiting hierarchical VAEs on EEG data for neuroscience applications
Giulia Cisotto
Alberto Zancanaro
I. Zoppis
Sara Manzoni
75
3
0
20 Nov 2023
SeaDSC: A video-based unsupervised method for dynamic scene change detection in unmanned surface vehicles
L. Trinh
Ali Anwar
Siegfried Mercelis
100
4
0
20 Nov 2023
Compact and Intuitive Airfoil Parameterization Method through Physics-aware Variational Autoencoder
Yu-Eop Kang
Dawoon Lee
K. Yee
68
0
0
18 Nov 2023
UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation
Yuwen Xiong
Wei-Chiu Ma
Jingkang Wang
R. Urtasun
87
47
0
02 Nov 2023
Cheating Depth: Enhancing 3D Surface Anomaly Detection via Depth Simulation
Vitjan Zavrtanik
Matej Kristan
D. Skočaj
86
16
0
02 Nov 2023
POS: A Prompts Optimization Suite for Augmenting Text-to-Video Generation
Shijie Ma
Huayi Xu
Mengjian Li
Weidong Geng
Yaxiong Wang
Meng Wang
DiffM
VGen
53
0
0
02 Nov 2023
De-Diffusion Makes Text a Strong Cross-Modal Interface
Chen Wei
Chenxi Liu
Siyuan Qiao
Zhishuai Zhang
Alan Yuille
Jiahui Yu
VLM
DiffM
115
11
0
01 Nov 2023
Adaptive Latent Diffusion Model for 3D Medical Image to Image Translation: Multi-modal Magnetic Resonance Imaging Study
Jonghun Kim
Hyunjin Park
MedIm
131
40
0
01 Nov 2023
Generative Neural Fields by Mixtures of Neural Implicit Functions
Tackgeun You
Mijeong Kim
Jungtaek Kim
Bohyung Han
DiffM
69
6
0
30 Oct 2023
Blind Image Super-resolution with Rich Texture-Aware Codebooks
Rui Qin
Ming Sun
Fangyuan Zhang
Xingsen Wen
Bin Wang
68
7
0
26 Oct 2023
FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models
Lihe Yang
Xiaogang Xu
Bingyi Kang
Yinghuan Shi
Hengshuang Zhao
95
46
0
23 Oct 2023
Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules
Zhiyuan Liu
Yaorui Shi
An Zhang
Enzhi Zhang
Kenji Kawaguchi
Xiang Wang
Tat-Seng Chua
AI4CE
103
40
0
23 Oct 2023
Hierarchical Vector Quantized Transformer for Multi-class Unsupervised Anomaly Detection
Ruiying Lu
YuJie Wu
Long Tian
Dongsheng Wang
Bo Chen
Xiyang Liu
Ruimin Hu
113
43
0
22 Oct 2023
Learning Invariant Molecular Representation in Latent Discrete Space
Zhuang Xiang
Qiang Zhang
Keyan Ding
Yatao Bian
Xiao Wang
Jingsong Lv
Hongyang Chen
Huajun Chen
OOD
106
20
0
22 Oct 2023
HumanTOMATO: Text-aligned Whole-body Motion Generation
Shunlin Lu
Ling-Hao Chen
Ailing Zeng
Jing Lin
Ruimao Zhang
Lei Zhang
H. Shum
VGen
107
67
0
19 Oct 2023
ViPE: Visualise Pretty-much Everything
Hassan Shahmohammadi
Adhiraj Ghosh
Hendrik P. A. Lensch
DiffM
79
1
0
16 Oct 2023
Real-Fake: Effective Training Data Synthesis Through Distribution Matching
Jianhao Yuan
Jie Zhang
Shuyang Sun
Philip Torr
Bo Zhao
87
27
0
16 Oct 2023
Scene Graph Conditioning in Latent Diffusion
Frank Fundel
DiffM
61
0
0
16 Oct 2023