ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2111.14822
  4. Cited By
Vector Quantized Diffusion Model for Text-to-Image Synthesis

Vector Quantized Diffusion Model for Text-to-Image Synthesis

29 November 2021
Shuyang Gu
Dong Chen
Jianmin Bao
Fang Wen
Bo Zhang
Dongdong Chen
Lu Yuan
B. Guo
    DiffM
ArXivPDFHTML

Papers citing "Vector Quantized Diffusion Model for Text-to-Image Synthesis"

50 / 566 papers shown
Title
GraVITON: Graph based garment warping with attention guided inversion
  for Virtual-tryon
GraVITON: Graph based garment warping with attention guided inversion for Virtual-tryon
Sanhita Pathak
V. Kaushik
Brejesh Lall
DiffM
45
0
0
04 Jun 2024
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
Kengo Uchida
Takashi Shibuya
Yuhta Takida
Naoki Murata
Shusuke Takahashi
Shusuke Takahashi
Yuki Mitsufuji
VGen
51
5
0
04 Jun 2024
Layout-Agnostic Scene Text Image Synthesis with Diffusion Models
Layout-Agnostic Scene Text Image Synthesis with Diffusion Models
Qilong Zhangli
Jindong Jiang
Di Liu
Licheng Yu
Xiaoliang Dai
Ankit Ramchandani
Guan Pang
Dimitris N. Metaxas
Praveen Krishnan
DiffM
45
8
0
03 Jun 2024
Mixed Diffusion for 3D Indoor Scene Synthesis
Mixed Diffusion for 3D Indoor Scene Synthesis
Siyi Hu
Diego Martin Arroyo
Stephanie Debats
Fabian Manhardt
Luca Carlone
Federico Tombari
DiffM
35
4
0
31 May 2024
Trajectory Forecasting through Low-Rank Adaptation of Discrete Latent
  Codes
Trajectory Forecasting through Low-Rank Adaptation of Discrete Latent Codes
Riccardo Benaglia
Angelo Porrello
Pietro Buzzega
Simone Calderara
Rita Cucchiara
20
0
0
31 May 2024
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
Lianghui Zhu
Zilong Huang
Bencheng Liao
Jun Hao Liew
Hanshu Yan
Jiashi Feng
Xinggang Wang
70
13
0
28 May 2024
Text Modality Oriented Image Feature Extraction for Detecting
  Diffusion-based DeepFake
Text Modality Oriented Image Feature Extraction for Detecting Diffusion-based DeepFake
Di Yang
Yihao Huang
Qing-Wu Guo
Felix Juefei Xu
Xiaojun Jia
Run Wang
G. Pu
Yang Liu
DiffM
34
0
0
28 May 2024
AttenCraft: Attention-guided Disentanglement of Multiple Concepts for
  Text-to-Image Customization
AttenCraft: Attention-guided Disentanglement of Multiple Concepts for Text-to-Image Customization
Junjie Shentu
Matthew Watson
Noura Al Moubayed
DiffM
49
0
0
28 May 2024
Training-free Editioning of Text-to-Image Models
Training-free Editioning of Text-to-Image Models
Jinqi Wang
Yunfei Fu
Zhangcan Ding
Bailin Deng
Yu-Kun Lai
Yipeng Qin
DiffM
VLM
39
0
0
27 May 2024
$\text{Di}^2\text{Pose}$: Discrete Diffusion Model for Occluded 3D Human
  Pose Estimation
Di2Pose\text{Di}^2\text{Pose}Di2Pose: Discrete Diffusion Model for Occluded 3D Human Pose Estimation
Weiquan Wang
Jun Xiao
Chunping Wang
Wei Liu
Zhao Wang
Long Chen
DiffM
36
1
0
27 May 2024
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
Harshit Varma
Dheeraj M. Nagaraj
Karthikeyan Shanmugam
VLM
64
2
0
27 May 2024
Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
Lateralization MLP: A Simple Brain-inspired Architecture for Diffusion
Zizhao Hu
Mohammad Rostami
34
0
0
25 May 2024
Learning to Discretize Denoising Diffusion ODEs
Learning to Discretize Denoising Diffusion ODEs
Vinh Tong
Anji Liu
Trung-Dung Hoang
Guy Van den Broeck
Mathias Niepert
DiffM
41
4
0
24 May 2024
SoundLoCD: An Efficient Conditional Discrete Contrastive Latent
  Diffusion Model for Text-to-Sound Generation
SoundLoCD: An Efficient Conditional Discrete Contrastive Latent Diffusion Model for Text-to-Sound Generation
Xinlei Niu
Jing Zhang
Christian J. Walder
Charles Patrick Martin
19
2
0
24 May 2024
Enhancing Text-to-Image Editing via Hybrid Mask-Informed Fusion
Enhancing Text-to-Image Editing via Hybrid Mask-Informed Fusion
Aoxue Li
Mingyang Yi
Zhenguo Li
DiffM
48
0
0
24 May 2024
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
Shiqi Yang
Zhi-Wei Zhong
Mengjie Zhao
Shusuke Takahashi
Masato Ishii
Takashi Shibuya
Yuki Mitsufuji
43
2
0
23 May 2024
How to Trace Latent Generative Model Generated Images without Artificial
  Watermark?
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
Zhenting Wang
Vikash Sehwag
Chen Chen
Lingjuan Lyu
Dimitris N. Metaxas
Shiqing Ma
WIGM
38
5
0
22 May 2024
Curriculum Direct Preference Optimization for Diffusion and Consistency Models
Curriculum Direct Preference Optimization for Diffusion and Consistency Models
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
N. Sebe
Mubarak Shah
EGVM
89
6
0
22 May 2024
Beyond Traditional Single Object Tracking: A Survey
Beyond Traditional Single Object Tracking: A Survey
Omar Abdelaziz
Mohamed Shehata
Mohamed Mohamed
35
0
0
16 May 2024
VisioBlend: Sketch and Stroke-Guided Denoising Diffusion Probabilistic
  Model for Realistic Image Generation
VisioBlend: Sketch and Stroke-Guided Denoising Diffusion Probabilistic Model for Realistic Image Generation
Harshkumar Devmurari
Gautham Kuckian
Prajjwal Vishwakarma
Krunali Vartak
DiffM
28
0
0
15 May 2024
Training-free Subject-Enhanced Attention Guidance for Compositional
  Text-to-image Generation
Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
Shengyuan Liu
Bo Wang
Ye Ma
Te Yang
Xipeng Cao
Quan Chen
Han Li
Di Dong
Peng Jiang
EGVM
44
2
0
11 May 2024
FlexEControl: Flexible and Efficient Multimodal Control for
  Text-to-Image Generation
FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation
Xuehai He
Jian Zheng
Jacob Zhiyuan Fang
Robinson Piramuthu
Mohit Bansal
Vicente Ordonez
Gunnar A. Sigurdsson
Nanyun Peng
Xin Eric Wang
DiffM
45
1
0
08 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World
  Models and Beyond
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
84
37
0
06 May 2024
A Survey on Diffusion Models for Time Series and Spatio-Temporal Data
A Survey on Diffusion Models for Time Series and Spatio-Temporal Data
Yiyuan Yang
Ming Jin
Haomin Wen
Chaoli Zhang
Yuxuan Liang
...
Bin Yang
Zenglin Xu
Jiang Bian
Shirui Pan
Qingsong Wen
DiffM
AI4TS
SyDa
37
38
0
29 Apr 2024
MuseumMaker: Continual Style Customization without Catastrophic
  Forgetting
MuseumMaker: Continual Style Customization without Catastrophic Forgetting
Chenxi Liu
Gan Sun
Wenqi Liang
Jiahua Dong
Can Qin
Yang Cong
DiffM
50
3
0
25 Apr 2024
DeepFeatureX Net: Deep Features eXtractors based Network for
  discriminating synthetic from real images
DeepFeatureX Net: Deep Features eXtractors based Network for discriminating synthetic from real images
Orazio Pontorno
Luca Guarnera
Sebastiano Battiato
30
4
0
24 Apr 2024
HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts
HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts
Xinlei Niu
Jing Zhang
Charles Patrick Martin
26
2
0
24 Apr 2024
Sketch-guided Image Inpainting with Partial Discrete Diffusion Process
Sketch-guided Image Inpainting with Partial Discrete Diffusion Process
Nakul Sharma
Aditay Tripathi
Anirban Chakraborty
Anand Mishra
DiffM
33
3
0
18 Apr 2024
A Data-Driven Representation for Sign Language Production
A Data-Driven Representation for Sign Language Production
Harry Walsh
Abolfazl Ravanshad
Mariam Rahmani
Richard Bowden
SLR
21
3
0
17 Apr 2024
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
Mude Hui
Siwei Yang
Bingchen Zhao
Yichun Shi
Heng Wang
Peng Wang
Yuyin Zhou
Cihang Xie
35
54
0
15 Apr 2024
in2IN: Leveraging individual Information to Generate Human INteractions
in2IN: Leveraging individual Information to Generate Human INteractions
Pablo Ruiz-Ponce
Germán Barquero
Cristina Palmero
Sergio Escalera
Jose J. García Rodríguez
VGen
DiffM
51
7
0
15 Apr 2024
E3: Ensemble of Expert Embedders for Adapting Synthetic Image Detectors
  to New Generators Using Limited Data
E3: Ensemble of Expert Embedders for Adapting Synthetic Image Detectors to New Generators Using Limited Data
Aref Azizpour
Tai D. Nguyen
Manil Shrestha
Kaidi Xu
Edward Kim
Matthew C. Stamm
34
4
0
12 Apr 2024
Latent Guard: a Safety Framework for Text-to-image Generation
Latent Guard: a Safety Framework for Text-to-image Generation
Runtao Liu
Ashkan Khakzar
Jindong Gu
Qifeng Chen
Philip H. S. Torr
Fabio Pizzati
23
23
0
11 Apr 2024
An Overview of Diffusion Models: Applications, Guided Generation,
  Statistical Rates and Optimization
An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization
Minshuo Chen
Song Mei
Jianqing Fan
Mengdi Wang
VLM
MedIm
DiffM
37
48
0
11 Apr 2024
StoryImager: A Unified and Efficient Framework for Coherent Story
  Visualization and Completion
StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
Ming Tao
Bing-Kun Bao
Hao Tang
Yaowei Wang
Changsheng Xu
DiffM
44
5
0
09 Apr 2024
Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot
  Editing of Text-to-Video Diffusion Models
Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models
Saman Motamed
Wouter Van Gansbeke
Luc Van Gool
VGen
DiffM
37
1
0
08 Apr 2024
Mixture of Low-rank Experts for Transferable AI-Generated Image
  Detection
Mixture of Low-rank Experts for Transferable AI-Generated Image Detection
Zihan Liu
Hanyi Wang
Yaoyu Kang
Shilin Wang
MoE
41
12
0
07 Apr 2024
Light the Night: A Multi-Condition Diffusion Framework for Unpaired
  Low-Light Enhancement in Autonomous Driving
Light the Night: A Multi-Condition Diffusion Framework for Unpaired Low-Light Enhancement in Autonomous Driving
Jinlong Li
Baolu Li
Zhengzhong Tu
Xinyu Liu
Qing-Wu Guo
Felix Juefei Xu
Runsheng Xu
Hongkai Yu
DiffM
47
18
0
07 Apr 2024
Which Model Generated This Image? A Model-Agnostic Approach for Origin
  Attribution
Which Model Generated This Image? A Model-Agnostic Approach for Origin Attribution
Fengyuan Liu
Haochen Luo
Yiming Li
Philip H. S. Torr
Jindong Gu
VLM
26
5
0
03 Apr 2024
A Unified and Interpretable Emotion Representation and Expression
  Generation
A Unified and Interpretable Emotion Representation and Expression Generation
Reni Paskaleva
Mykyta Holubakha
Andela Ilic
Saman Motamed
Luc Van Gool
D. Paudel
41
2
0
01 Apr 2024
Transformer based Pluralistic Image Completion with Reduced Information
  Loss
Transformer based Pluralistic Image Completion with Reduced Information Loss
Qiankun Liu
Yuqi Jiang
Zhentao Tan
Dongdong Chen
Ying Fu
Qi Chu
Gang Hua
Nenghai Yu
ViT
68
11
0
31 Mar 2024
Relation Rectification in Diffusion Model
Relation Rectification in Diffusion Model
Yinwei Wu
Xingyi Yang
Xinchao Wang
28
6
0
29 Mar 2024
Attention Calibration for Disentangled Text-to-Image Personalization
Attention Calibration for Disentangled Text-to-Image Personalization
Yanbing Zhang
Mengping Yang
Qin Zhou
Zhe Wang
29
15
0
27 Mar 2024
LayoutFlow: Flow Matching for Layout Generation
LayoutFlow: Flow Matching for Layout Generation
Julian Jorge Andrade Guerreiro
Naoto Inoue
Kento Masui
Mayu Otani
Hideki Nakayama
DiffM
36
7
0
27 Mar 2024
Fake or JPEG? Revealing Common Biases in Generated Image Detection
  Datasets
Fake or JPEG? Revealing Common Biases in Generated Image Detection Datasets
Patrick Grommelt
Louis Weiss
Franz-Josef Pfreundt
J. Keuper
34
18
0
26 Mar 2024
Refining Text-to-Image Generation: Towards Accurate Training-Free
  Glyph-Enhanced Image Generation
Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation
Sanyam Lakhanpal
Shivang Chopra
Vinija Jain
Aman Chadha
Man Luo
32
9
0
25 Mar 2024
CLIP-VQDiffusion : Langauge Free Training of Text To Image generation
  using CLIP and vector quantized diffusion model
CLIP-VQDiffusion : Langauge Free Training of Text To Image generation using CLIP and vector quantized diffusion model
S. Han
Joohee Kim
DiffM
CLIP
34
1
0
22 Mar 2024
DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified &
  Accurate Image Editing
DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing
Yueru Jia
Yuhui Yuan
Aosong Cheng
Chuke Wang
Ji Li
Huizhu Jia
Shanghang Zhang
DiffM
31
7
0
21 Mar 2024
Open Knowledge Base Canonicalization with Multi-task Learning
Open Knowledge Base Canonicalization with Multi-task Learning
Bingchen Liu
Huang Peng
Weixin Zeng
Xiang Zhao
Shijun Liu
Li Pan
24
0
0
21 Mar 2024
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video
  Object Segmentation
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
Zixin Zhu
Xuelu Feng
Dongdong Chen
Junsong Yuan
Chunming Qiao
Gang Hua
DiffM
42
7
0
18 Mar 2024
Previous
12345...101112
Next