2301.00704
Cited By
Muse: Text-To-Image Generation via Masked Generative Transformers
2 January 2023
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
Lu Jiang
Ming-Hsuan Yang
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
Papers citing "Muse: Text-To-Image Generation via Masked Generative Transformers"
50 / 142 papers shown
Title
DepthART: Monocular Depth Estimation as Autoregressive Refinement Task
Bulat Gabdullin
Nina Konovalova
Nikolay Patakin
Dmitry Senushkin
Anton Konushin
MDE
86
1
0
01 Jul 2025
Auto-Connect: Connectivity-Preserving RigFormer with Direct Preference Optimization
Jingfeng Guo
Jian Liu
Jinnan Chen
Shiwei Mao
Changrong Hu
...
Jing Xu
Qi Liu
Lixin Xu
Zhuo Chen
Chunchao Guo
48
0
0
13 Jun 2025
Only-Style: Stylistic Consistency in Image Generation without Content Leakage
Tilemachos Aravanis
P. Filntisis
Petros Maragos
George Retsinas
83
0
0
11 Jun 2025
MapBERT: Bitwise Masked Modeling for Real-Time Semantic Mapping Generation
Yijie Deng
Shuaihang Yuan
Congcong Wen
Hao Huang
Anthony Tzes
Geeta Chandra Raju Bethala
Yi Fang
33
0
0
09 Jun 2025
Noise Consistency Regularization for Improved Subject-Driven Image Synthesis
Yao Ni
Song Wen
Piotr Koniusz
A. Cherian
28
0
0
06 Jun 2025
UniRes: Universal Image Restoration for Complex Degradations
Mo Zhou
Keren Ye
M. Delbracio
P. Milanfar
Vishal M. Patel
Hossein Talebi
43
0
0
05 Jun 2025
Native-Resolution Image Synthesis
Zidong Wang
Lei Bai
Xiangyu Yue
Wanli Ouyang
Yiyuan Zhang
81
0
0
03 Jun 2025
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
63
0
0
02 Jun 2025
A Survey of Generative Categories and Techniques in Multimodal Large Language Models
Longzhen Han
Awes Mubarak
Almas Baimagambetov
Nikolaos Polatidis
Thar Baker
LRM
72
0
0
29 May 2025
Semantics-Aware Human Motion Generation from Audio Instructions
Zi-An Wang
Shihao Zou
Shiyao Yu
Mingyuan Zhang
Chao Dong
VGen
39
0
0
29 May 2025
Conditional Diffusion Models with Classifier-Free Gibbs-like Guidance
Badr Moufad
Yazid Janati
Alain Durmus
Ahmed Ghorbel
Eric Moulines
Jimmy Olsson
DiffM
85
0
0
27 May 2025
ConsiStyle: Style Diversity in Training-Free Consistent T2I Generation
Yohai Mazuz
Janna Bruner
Lior Wolf
DiffM
66
0
0
27 May 2025
ReDDiT: Rehashing Noise for Discrete Visual Generation
Tianren Ma
Xiaosong Zhang
Boyu Yang
Junlan Feng
QiXiang Ye
DiffM
126
0
0
26 May 2025
Adaptive Diffusion Guidance via Stochastic Optimal Control
Iskander Azangulov
Peter Potaptchik
Qinyu Li
Eddie Aamari
George Deligiannidis
Judith Rousseau
29
0
0
25 May 2025
Learning Flexible Forward Trajectories for Masked Molecular Diffusion
Hyunjin Seo
Taewon Kim
Sihyun Yu
SungSoo Ahn
DiffM
AI4CE
181
0
0
22 May 2025
MARché: Fast Masked Autoregressive Image Generation with Cache-Aware Attention
Chaoyi Jiang
Sungwoo Kim
Lei Gao
Hossein Entezari Zarch
Won Woo Ro
Murali Annavaram
34
0
0
22 May 2025
Output Scaling: YingLong-Delayed Chain of Thought in a Large Pretrained Time Series Forecasting Model
Xue Wang
Tian Zhou
Jinyang Gao
Bolin Ding
Jingren Zhou
AI4TS
AI4CE
LRM
19
0
0
20 May 2025
Where's the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
Haoyue Bai
Yiyou Sun
Wei Cheng
Haifeng Chen
AAML
98
0
0
02 May 2025
The Dual Power of Interpretable Token Embeddings: Jailbreaking Attacks and Defenses for Diffusion Model Unlearning
Siyi Chen
Yimeng Zhang
Sijia Liu
Q. Qu
AAML
437
0
0
30 Apr 2025
EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation
Xiangyue Zhang
Jianfang Li
Jiaxu Zhang
Jianqiang Ren
Liefeng Bo
Zhigang Tu
89
0
0
12 Apr 2025
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning
Hang Guo
Yawei Li
Taolin Zhang
Jiadong Wang
Tao Dai
Shu-Tao Xia
Luca Benini
172
5
0
30 Mar 2025
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Size Wu
Wentao Zhang
Lumin Xu
Sheng Jin
Zhonghua Wu
Qingyi Tao
Wentao Liu
Wei Li
Chen Change Loy
VGen
468
6
0
27 Mar 2025
MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation
Jinnan Chen
Lingting Zhu
Zeyu Hu
Shengju Qian
Yuxiao Chen
Xin Wang
G. Lee
210
2
0
26 Mar 2025
TDRI: Two-Phase Dialogue Refinement and Co-Adaptation for Interactive Image Generation
Yuheng Feng
Jianhui Wang
Kun Li
Sida Li
Tianyu Shi
Haoyue Han
Miao Zhang
Xueqian Wang
DiffM
496
0
0
22 Mar 2025
CRCE: Coreference-Retention Concept Erasure in Text-to-Image Diffusion Models
Yuyang Xue
Edward Moroshko
Feng Chen
Jingyu Sun
Steven McDonagh
Sotirios A. Tsaftaris
129
2
0
18 Mar 2025
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
185
0
0
14 Mar 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Ju He
Qihang Yu
Qihao Liu
Liang-Chieh Chen
153
1
0
13 Mar 2025
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment
Xing Xie
Jiawei Liu
Ziyue Lin
Huijie Fan
Zhi Han
Yandong Tang
Liangqiong Qu
117
0
0
10 Mar 2025
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data
Haoxin Li
Boyang Li
CoGe
203
1
0
03 Mar 2025
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang
Aaron Hertzmann
Alexei A. Efros
Jun-Yan Zhu
Richard Zhang
TDI
214
3
0
21 Feb 2025
Large Language Diffusion Models
Shen Nie
Fengqi Zhu
Zebin You
Xiaolu Zhang
Jingyang Ou
Jun Hu
Jun Zhou
Yankai Lin
Ji-Rong Wen
Chongxuan Li
291
55
0
14 Feb 2025
E-MD3C: Taming Masked Diffusion Transformers for Efficient Zero-Shot Object Customization
T. Pham
Zhang Kang
Ji Woo Hong
Xuran Zheng
Chang D. Yoo
144
0
0
13 Feb 2025
UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths
Weijia Mao
Zhiyong Yang
Mike Zheng Shou
MoE
214
1
0
10 Feb 2025
Visual Generation Without Guidance
Huayu Chen
Kai Jiang
Kaiwen Zheng
Jianfei Chen
Hang Su
Jun Zhu
168
2
0
28 Jan 2025
CE-SDWV: Effective and Efficient Concept Erasure for Text-to-Image Diffusion Models via a Semantic-Driven Word Vocabulary
Jiahang Tu
Qian Feng
Chufan Chen
Jiahua Dong
Hanbin Zhao
Chao Zhang
Hui Qian
121
4
0
28 Jan 2025
Taming Teacher Forcing for Masked Autoregressive Video Generation
Deyu Zhou
Quan Sun
Yuang Peng
Kun Yan
Runpei Dong
...
Zheng Ge
Nan Duan
Xiangyu Zhang
L. Ni
H. Shum
VGen
105
9
0
21 Jan 2025
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Dongwon Kim
Ju He
Qihang Yu
Chenglin Yang
Xiaohui Shen
Suha Kwak
Liang-Chieh Chen
VLM
150
11
0
13 Jan 2025
Beyond Flat Text: Dual Self-inherited Guidance for Visual Text Generation
Minxing Luo
Zixun Xia
L. Chen
Zhenhang Li
Weichao Zeng
Jinqiao Wang
Wentao Cheng
Yaxing Wang
Yu Zhou
Jian Yang
DiffM
151
1
0
10 Jan 2025
Learning the Language of Protein Structure
Benoit Gaujac
Jérémie Donà
Liviu Copoiu
Timothy Atkinson
Thomas Pierrot
Thomas D. Barrett
103
12
0
08 Jan 2025
Large Language Models for Video Surveillance Applications
Ulindu De Silva
Leon Fernando
Billy Lau Pik Lik
Zann Koh
Sam Conrad Joyce
Belinda Yuen
Chau Yuen
62
1
0
06 Jan 2025
Hierarchical Vision-Language Alignment for Text-to-Image Generation via Diffusion Models
Emily Johnson
Noah Wilson
VLM
143
0
0
03 Jan 2025
TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions
Vriksha Srihari
R. Bhavya
Shruti Jayaraman
V. Mary Anita Rajam
DiffM
VGen
137
0
0
02 Jan 2025
VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis
Zhipeng Chen
Lan Yang
Yonggang Qi
Honggang Zhang
Kaiyue Pang
Ke Li
Yi-Zhe Song
DiffM
206
0
0
31 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
255
10
0
19 Dec 2024
Mojito: Motion Trajectory and Intensity Control for Video Generation
Xuehai He
Shuohang Wang
Jianwei Yang
Xiaoxia Wu
Yansen Wang
Kuan-Chieh Wang
Z. Zhan
Olatunji Ruwase
Yelong Shen
Xinze Wang
VGen
242
2
0
12 Dec 2024
AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models
Xinghui Li
Qichao Sun
Pengze Zhang
Fulong Ye
Zhichao Liao
Wanquan Feng
Mingcong Liu
Qian He
DiffM
144
3
0
05 Dec 2024
Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis
Anton Voronov
Denis Kuznedelev
Mikhail Khoroshikh
Valentin Khrulkov
Dmitry Baranchuk
269
4
0
02 Dec 2024
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders
Ziqi Pang
Tianyuan Zhang
Fujun Luan
Yunze Man
Hao Tan
Kai Zhang
William T. Freeman
Yu-Xiong Wang
VGen
137
20
0
02 Dec 2024
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
188
1
0
28 Nov 2024
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
Chaehun Shin
Jooyoung Choi
Heeseung Kim
Sungroh Yoon
DiffM
189
13
0
23 Nov 2024