ResearchTrend.AI
Generating Diverse High-Fidelity Images with VQ-VAE-2

2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
    DRL BDL
arXiv:1906.00446

Papers citing "Generating Diverse High-Fidelity Images with VQ-VAE-2"

50 / 1,128 papers shown
Title
DreamTuner: Single Image is Enough for Subject-Driven Generation
Miao Hua
Jiawei Liu
Fei Ding
Wei Liu
Jie Wu
Qian He
72
31
0
21 Dec 2023
Sign Language Production with Latent Motion Transformer
Pan Xie
Taiying Peng
Yao Du
Qipeng Zhang
SLR
79
5
0
20 Dec 2023
Unlocking Pre-trained Image Backbones for Semantic Image Synthesis
Tariq Berrada
Jakob Verbeek
Camille Couprie
Alahari Karteek
93
9
0
20 Dec 2023
RadEdit: stress-testing biomedical vision models via diffusion image editing
Fernando Pérez-García
Sam Bond-Taylor
Pedro P. Sanchez
B. V. Breugel
Daniel Coelho De Castro
...
M. Lungren
A. Nori
Javier Alvarez-Valle
Ozan Oktay
Maximilian Ilse
MedIm
146
11
0
20 Dec 2023
All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models
Seunghoo Hong
Juhun Lee
Simon S. Woo
120
20
0
20 Dec 2023
Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models
Shweta Mahajan
Tanzila Rahman
Kwang Moo Yi
Leonid Sigal
DiffM
108
20
0
19 Dec 2023
IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition
Xiaomeng Yang
Zhi Qiao
Yu Zhou
DiffM
198
1
0
19 Dec 2023
Multiple Hypothesis Dropout: Estimating the Parameters of Multi-Modal Output Distributions
David D. Nguyen
David Liebowitz
Surya Nepal
S. Kanhere
OOD UQCV
62
0
0
18 Dec 2023
Topic-VQ-VAE: Leveraging Latent Codebooks for Flexible Topic-Guided Document Generation
YoungJoon Yoo
Jongwon Choi
BDL
82
2
0
15 Dec 2023
Towards Equipping Transformer with the Ability of Systematic Compositionality
Chen Huang
Peixin Qin
Wenqiang Lei
Jiancheng Lv
95
2
0
12 Dec 2023
Equivariant Flow Matching with Hybrid Probability Transport
Yuxuan Song
Jingjing Gong
Minkai Xu
Ziyao Cao
Yanyan Lan
Stefano Ermon
Hao Zhou
Wei-Ying Ma
DiffM
100
57
0
12 Dec 2023
Implicit Shape Modeling for Anatomical Structure Refinement of Volumetric Medical Images
Minghui Zhang
Shanshan Shi
Renao Yan
Liang Zhu
Tian Guan
MedIm
77
1
0
11 Dec 2023
A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing
Maomao Li
Yu Li
Tianyu Yang
Yunfei Liu
Dongxu Yue
Zhihui Lin
Dong Xu
VGen
50
9
0
10 Dec 2023
Free3D: Consistent Novel View Synthesis without 3D Representation
Chuanxia Zheng
Andrea Vedaldi
3DV
138
50
0
07 Dec 2023
Cache Me if You Can: Accelerating Diffusion Models through Block Caching
Felix Wimbauer
Bichen Wu
Edgar Schoenfeld
Xiaoliang Dai
Ji Hou
...
Jonas Kohler
Christian Rupprecht
Zorah Lähner
Peter Vajda
Jialiang Wang
DiffM
110
78
0
06 Dec 2023
GIVT: Generative Infinite-Vocabulary Transformers
Michael Tschannen
Cian Eastwood
Fabian Mentzer
110
41
0
04 Dec 2023
ArtAdapter: Text-to-Image Style Transfer using Multi-Level Style Encoder and Explicit Adaptation
Dar-Yen Chen
Hamish Tennent
Ching-Wen Hsu
DiffM
120
27
0
04 Dec 2023
Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training
Runze He
Shaofei Huang
Xuecheng Nie
Tianrui Hui
Luoqi Liu
Jiao Dai
Jizhong Han
Guanbin Li
Si Liu
DiffM
69
8
0
04 Dec 2023
Enhancing Diffusion Models with 3D Perspective Geometry Constraints
Rishi Upadhyay
Howard Zhang
Yunhao Ba
Ethan Yang
Blake Gella
Sicheng Jiang
Alex Wong
A. Kadambi
92
11
0
01 Dec 2023
Simple Transferability Estimation for Regression Tasks
Cuong N. Nguyen
Phong Tran
L. Ho
Vu C. Dinh
Anh Tran
Tal Hassner
Cuong V Nguyen
96
2
0
01 Dec 2023
SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers
Ioannis Kakogeorgiou
Spyros Gidaris
Konstantinos Karantzalos
N. Komodakis
ViT OCL
135
16
0
01 Dec 2023
AV-RIR: Audio-Visual Room Impulse Response Estimation
Anton Ratnarajah
Sreyan Ghosh
Sonal Kumar
Purva Chiniya
Dinesh Manocha
78
15
0
30 Nov 2023
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation
Yanhui Wang
Jianmin Bao
Wenming Weng
Ruoyu Feng
Dacheng Yin
...
Yuhui Yuan
Chuanxin Tang
Xiaoyan Sun
Chong Luo
Baining Guo
DiffM VGen
133
17
0
30 Nov 2023
DanceMeld: Unraveling Dance Phrases with Hierarchical Latent Codes for Music-to-Dance Synthesis
Xin Gao
Liucheng Hu
Peng Zhang
Bang Zhang
Liefeng Bo
DiffM
108
4
0
30 Nov 2023
Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding
Jin-Chuan Shi
Miao Wang
Hao-Bin Duan
Shao-Hua Guan
3DGS
125
96
0
30 Nov 2023
Beyond Two-Tower Matching: Learning Sparse Retrievable Cross-Interactions for Recommendation
Liangcai Su
Fan Yan
Jieming Zhu
Xi Xiao
Haoyi Duan
Zhou Zhao
Zhenhua Dong
Ruiming Tang
72
10
0
30 Nov 2023
Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications
Karren D. Yang
Anurag Ranjan
Jen-Hao Rick Chang
Raviteja Vemulapalli
Oncel Tuzel
108
9
0
30 Nov 2023
Do text-free diffusion models learn discriminative visual representations?
Soumik Mukhopadhyay
M. Gwilliam
Yosuke Yamaguchi
Vatsal Agarwal
Namitha Padmanabhan
Archana Swaminathan
Dinesh Manocha
Abhinav Shrivastava
DiffM
145
14
1
29 Nov 2023
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation
Yuhui Zhang
Brandon McKinzie
Zhe Gan
Vaishaal Shankar
Alexander Toshev
57
3
0
27 Nov 2023
Unlearning via Sparse Representations
Vedant Shah
Frederik Trauble
Ashish Malik
Hugo Larochelle
Michael C. Mozer
Sanjeev Arora
Yoshua Bengio
Anirudh Goyal
MU
77
9
0
26 Nov 2023
Self-Supervised Music Source Separation Using Vector-Quantized Source Category Estimates
Marco Pasini
Stefan Lattner
George Fazekas
77
1
0
21 Nov 2023
FrePolad: Frequency-Rectified Point Latent Diffusion for Point Cloud Generation
Chenliang Zhou
Fangcheng Zhong
Param Hanji
Zhilin Guo
Kyle Fogarty
Alejandro Sztrajman
Hongyun Gao
Cengiz Öztireli
101
3
0
20 Nov 2023
hvEEGNet: exploiting hierarchical VAEs on EEG data for neuroscience applications
Giulia Cisotto
Alberto Zancanaro
I. Zoppis
Sara Manzoni
75
3
0
20 Nov 2023
SeaDSC: A video-based unsupervised method for dynamic scene change detection in unmanned surface vehicles
L. Trinh
Ali Anwar
Siegfried Mercelis
100
4
0
20 Nov 2023
Compact and Intuitive Airfoil Parameterization Method through Physics-aware Variational Autoencoder
Yu-Eop Kang
Dawoon Lee
K. Yee
68
0
0
18 Nov 2023
UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation
Yuwen Xiong
Wei-Chiu Ma
Jingkang Wang
R. Urtasun
87
47
0
02 Nov 2023
Cheating Depth: Enhancing 3D Surface Anomaly Detection via Depth Simulation
Vitjan Zavrtanik
Matej Kristan
D. Skočaj
86
16
0
02 Nov 2023
POS: A Prompts Optimization Suite for Augmenting Text-to-Video Generation
Shijie Ma
Huayi Xu
Mengjian Li
Weidong Geng
Yaxiong Wang
Meng Wang
DiffM VGen
53
0
0
02 Nov 2023
De-Diffusion Makes Text a Strong Cross-Modal Interface
Chen Wei
Chenxi Liu
Siyuan Qiao
Zhishuai Zhang
Alan Yuille
Jiahui Yu
VLM DiffM
115
11
0
01 Nov 2023
Adaptive Latent Diffusion Model for 3D Medical Image to Image Translation: Multi-modal Magnetic Resonance Imaging Study
Jonghun Kim
Hyunjin Park
MedIm
131
40
0
01 Nov 2023
Generative Neural Fields by Mixtures of Neural Implicit Functions
Tackgeun You
Mijeong Kim
Jungtaek Kim
Bohyung Han
DiffM
69
6
0
30 Oct 2023
Blind Image Super-resolution with Rich Texture-Aware Codebooks
Rui Qin
Ming Sun
Fangyuan Zhang
Xingsen Wen
Bin Wang
68
7
0
26 Oct 2023
FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models
Lihe Yang
Xiaogang Xu
Bingyi Kang
Yinghuan Shi
Hengshuang Zhao
95
46
0
23 Oct 2023
Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules
Zhiyuan Liu
Yaorui Shi
An Zhang
Enzhi Zhang
Kenji Kawaguchi
Xiang Wang
Tat-Seng Chua
AI4CE
103
40
0
23 Oct 2023
Hierarchical Vector Quantized Transformer for Multi-class Unsupervised Anomaly Detection
Ruiying Lu
YuJie Wu
Long Tian
Dongsheng Wang
Bo Chen
Xiyang Liu
Ruimin Hu
113
43
0
22 Oct 2023
Learning Invariant Molecular Representation in Latent Discrete Space
Zhuang Xiang
Qiang Zhang
Keyan Ding
Yatao Bian
Xiao Wang
Jingsong Lv
Hongyang Chen
Huajun Chen
OOD
106
20
0
22 Oct 2023
HumanTOMATO: Text-aligned Whole-body Motion Generation
Shunlin Lu
Ling-Hao Chen
Ailing Zeng
Jing Lin
Ruimao Zhang
Lei Zhang
H. Shum
VGen
107
67
0
19 Oct 2023
ViPE: Visualise Pretty-much Everything
Hassan Shahmohammadi
Adhiraj Ghosh
Hendrik P. A. Lensch
DiffM
79
1
0
16 Oct 2023
Real-Fake: Effective Training Data Synthesis Through Distribution Matching
Jianhao Yuan
Jie Zhang
Shuyang Sun
Philip Torr
Bo Zhao
87
27
0
16 Oct 2023
Scene Graph Conditioning in Latent Diffusion
Frank Fundel
DiffM
61
0
0
16 Oct 2023