Generating Diverse High-Fidelity Images with VQ-VAE-2

2 June 2019
Ali Razavi
Aaron van den Oord
Oriol Vinyals
    DRL
    BDL

Papers citing "Generating Diverse High-Fidelity Images with VQ-VAE-2"

50 / 1,107 papers shown
Protect Before Generate: Error Correcting Codes within Discrete Deep Generative Models
María Martínez-García
Grace Villacrés
David Mitchell
Pablo Martínez Olmos
DRL
21
0
0
10 Oct 2024
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation
Jiahao Cui
Hui Li
Yao Yao
Hao Zhu
Hanlin Shang
Kaihui Cheng
Hang Zhou
Siyu Zhu
Jingdong Wang
DiffM
VGen
46
22
0
10 Oct 2024
ElasticTok: Adaptive Tokenization for Image and Video
Wilson Yan
Matei A. Zaharia
Volodymyr Mnih
Pieter Abbeel
Aleksandra Faust
Hao Liu
VGen
54
6
0
10 Oct 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
Rui Zhao
Hangjie Yuan
Yujie Wei
Shiwei Zhang
Yuchao Gu
...
Xiang Wang
Zhangjie Wu
Junhao Zhang
Yingya Zhang
Mike Zheng Shou
DiffM
VLM
55
4
0
09 Oct 2024
LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning
Zhe Li
Weihao Yuan
Yisheng He
Lingteng Qiu
Shenhao Zhu
Xiaodong Gu
Weichao Shen
Yuan Dong
Zilong Dong
Laurence T. Yang
31
8
0
09 Oct 2024
Geometric Representation Condition Improves Equivariant Molecule Generation
Zian Li
Cai Zhou
Xiyuan Wang
Xingang Peng
Muhan Zhang
50
2
0
04 Oct 2024
SGW-based Multi-Task Learning in Vision Tasks
Ruiyuan Zhang
Yuyao Chen
Yuchi Huo
Jiaxiang Liu
Dianbing Xi
Jie Liu
Chao Wu
30
1
0
03 Oct 2024
CaLMFlow: Volterra Flow Matching using Causal Language Models
Shiyang Zhang
Daniel Levine
Ivan Vrkic
Marco Francesco Bressana
David Zhang
S. Rizvi
Yangtian Zhang
E. Zappala
David van Dijk
27
0
0
03 Oct 2024
From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities
Wanpeng Zhang
Zilong Xie
Yicheng Feng
Yijiang Li
Xingrun Xing
Sipeng Zheng
Zongqing Lu
MLLM
30
0
0
03 Oct 2024
ImageFolder: Autoregressive Image Generation with Folded Tokens
Xiang Li
Kai Qiu
Hao Chen
Jason Kuen
Jiuxiang Gu
Bhiksha Raj
Zhe-nan Lin
VLM
44
18
0
02 Oct 2024
Denoising with a Joint-Embedding Predictive Architecture
Dengsheng Chen
Jie Hu
Xiaoming Wei
Enhua Wu
DiffM
52
2
0
02 Oct 2024
Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces
Lilac Atassi
49
0
0
01 Oct 2024
Diverse Code Query Learning for Speech-Driven Facial Animation
Chunzhi Gu
Shigeru Kuriyama
Katsuya Hotta
DiffM
33
0
0
27 Sep 2024
Rejection Sampling IMLE: Designing Priors for Better Few-Shot Image Synthesis
Chirag Vashist
Shichong Peng
Ke Li
DiffM
44
1
0
26 Sep 2024
Exploring Semantic Clustering in Deep Reinforcement Learning for Video Games
Liang Zhang
Justin Lieffers
A. Pyarelal
29
0
0
25 Sep 2024
Single Image, Any Face: Generalisable 3D Face Generation
Wenqing Wang
Haosen Yang
Josef Kittler
Xiatian Zhu
3DH
78
0
0
25 Sep 2024
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
Yu Zhang
Ziyue Jiang
Ruiqi Li
Changhao Pan
Jinzheng He
Rongjie Huang
Chuxin Wang
Zhou Zhao
DiffM
VLM
52
5
0
24 Sep 2024
DepthART: Monocular Depth Estimation as Autoregressive Refinement Task
Bulat Gabdullin
Nina Konovalova
Nikolay Patakin
Dmitry Senushkin
Anton Konushin
MDE
40
0
0
23 Sep 2024
Multi-Modal Generative AI: Multi-modal LLM, Diffusion and Beyond
Hong Chen
Xin Wang
Yuwei Zhou
Bin Huang
Yipeng Zhang
Wei Feng
Houlun Chen
Zeyang Zhang
Siao Tang
Wenwu Zhu
DiffM
55
7
0
23 Sep 2024
LASERS: LAtent Space Encoding for Representations with Sparsity for Generative Modeling
Xin Li
Anand Sarwate
37
0
0
16 Sep 2024
Beta-Sigma VAE: Separating beta and decoder variance in Gaussian variational autoencoder
Seunghwan Kim
Seungkyu Lee
DRL
36
0
0
14 Sep 2024
Detect Fake with Fake: Leveraging Synthetic Data-driven Representation for Synthetic Image Detection
Hina Otake
Yoshihiro Fukuhara
Yoshiki Kubotani
Shigeo Morishima
ViT
56
0
0
13 Sep 2024
GenCAD: Image-Conditioned Computer-Aided Design Generation with Transformer-Based Contrastive Representation and Diffusion Priors
Md Ferdous Alam
Faez Ahmed
DiffM
41
6
0
08 Sep 2024
Blended Latent Diffusion under Attention Control for Real-World Video Editing
Deyin Liu
Lin Yuanbo Wu
Xianghua Xie
DiffM
51
0
0
05 Sep 2024
VQ-Flow: Taming Normalizing Flows for Multi-Class Anomaly Detection via Hierarchical Vector Quantization
Yixuan Zhou
Xing Xu
Zhe Sun
Jingkuan Song
A. Cichocki
Heng Tao Shen
58
1
0
02 Sep 2024
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni
Yulin Wang
Renping Zhou
Rui Lu
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Yuan Yao
Gao Huang
42
7
0
31 Aug 2024
BELT-2: Bootstrapping EEG-to-Language representation alignment for multi-task brain decoding
Jinzhao Zhou
Yiqun Duan
Fred Chang
T. Do
Yu-Kai Wang
Chin-Teng Lin
30
2
0
28 Aug 2024
AEMLO: AutoEncoder-Guided Multi-Label Oversampling
Ao Zhou
Bin Liu
Jin Wang
K. Sun
Kelin Liu
SyDa
26
0
0
23 Aug 2024
A Grey-box Attack against Latent Diffusion Model-based Image Editing by Posterior Collapse
Zhongliang Guo
Lei Fang
Jingyu Lin
Yifei Qian
Shuai Zhao
Zeyu Wang
Zeyu Wang
Cunjian Chen
Ognjen Arandjelović
Chun Pong Lau
DiffM
AAML
45
7
0
20 Aug 2024
FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance
Jiasong Feng
Ao Ma
Jing Wang
Bo Cheng
Xiaodan Liang
Dawei Leng
Yuhui Yin
DiffM
VGen
39
6
0
15 Aug 2024
DiffLoRA: Generating Personalized Low-Rank Adaptation Weights with Diffusion
Yujia Wu
Yiming Shi
Jiwei Wei
Chengwei Sun
Yuyang Zhou
Yang Yang
Heng Tao Shen
48
3
0
13 Aug 2024
Music2Latent: Consistency Autoencoders for Latent Audio Compression
Marco Pasini
Stefan Lattner
George Fazekas
24
6
0
12 Aug 2024
DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation
Jisoo Kim
Jungbin Cho
Joonho Park
Soonmin Hwang
Da Eun Kim
Geon Kim
Youngjae Yu
62
1
0
12 Aug 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Ping Luo
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
76
48
0
05 Aug 2024
PanoFree: Tuning-Free Holistic Multi-view Image Generation with Cross-view Self-Guidance
Aoming Liu
Zhong Li
Zhang Chen
Nannan Li
Yinghao Xu
Bryan A. Plummer
42
4
0
04 Aug 2024
LDFaceNet: Latent Diffusion-based Network for High-Fidelity Deepfake Generation
Dwij Mehta
Aditya Mehta
Pratik Narang
DiffM
53
0
0
04 Aug 2024
VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling
Qian Zhang
Xiangzi Dai
Ninghua Yang
Xiang An
Ziyong Feng
Xingyu Ren
VLM
CLIP
43
17
0
02 Aug 2024
Informed Correctors for Discrete Diffusion Models
Yixiu Zhao
Jiaxin Shi
Lester W. Mackey
Scott W. Linderman
59
9
0
30 Jul 2024
Revolutionizing Text-to-Image Retrieval as Autoregressive Token-to-Voken Generation
Yongqi Li
Hongru Cai
Wenjie Wang
Leigang Qu
Yinwei Wei
Wenjie Li
Liqiang Nie
Tat-Seng Chua
DiffM
40
1
0
24 Jul 2024
WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation
Zirui Shao
Feiyu Gao
Hangdi Xing
Zepeng Zhu
Zhi Yu
Jiajun Bu
Qi Zheng
Cong Yao
31
2
0
22 Jul 2024
Decomposed Vector-Quantized Variational Autoencoder for Human Grasp Generation
Zhe Zhao
Mengshi Qi
Huadong Ma
DRL
44
2
0
19 Jul 2024
SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow
Yuanzhi Zhu
Xingchao Liu
Qiang Liu
46
9
0
17 Jul 2024
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval
Han Zhou
Wei Dong
Xiaohong Liu
Shuaicheng Liu
Xiongkuo Min
Guangtao Zhai
Jun Chen
66
13
0
17 Jul 2024
Generating 3D House Wireframes with Semantics
Xueqi Ma
Yilin Liu
Wenjun Zhou
Ruowei Wang
Hui Huang
3DV
41
0
0
17 Jul 2024
Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data
Tim Elsner
Paula Usinger
Victor Czech
Gregor Kobsik
Yanjiang He
I. Lim
Leif Kobbelt
46
1
0
16 Jul 2024
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
Santiago Pascual
Chunghsin Yeh
Ioannis Tsiamas
Joan Serrà
DiffM
VGen
47
15
0
15 Jul 2024
RTMW: Real-Time Multi-Person 2D and 3D Whole-body Pose Estimation
Tao Jiang
Xinchen Xie
Yining Li
3DH
46
2
0
11 Jul 2024
Several questions of visual generation in 2024
Shuyang Gu
40
1
0
11 Jul 2024
Mobius: A High Efficient Spatial-Temporal Parallel Training Paradigm for Text-to-Video Generation Task
Yiran Yang
Jinchao Zhang
Ying Deng
Jie Zhou
DiffM
31
0
0
09 Jul 2024
Latent Space Imaging
Matheus Souza
Yidan Zheng
Kaizhang Kang
Yogeshwar Nath Mishra
Qiang Fu
Wolfgang Heidrich
65
0
0
09 Jul 2024