Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2206.10789
Cited By
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"
50 / 899 papers shown
Title
Magic Insert: Style-Aware Drag-and-Drop
Nataniel Ruiz
Yuanzhen Li
Neal Wadhwa
Yael Pritch
Michael Rubinstein
David E. Jacobs
Shlomi Fruchter
DiffM
93
8
0
02 Jul 2024
Meta 3D TextureGen: Fast and Consistent Texture Generation for 3D Objects
Raphael Bensadoun
Yanir Kleiman
Idan Azuri
Omri Harosh
Andrea Vedaldi
Natalia Neverova
Oran Gafni
105
30
0
02 Jul 2024
SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules
Suyi Li
Lingyun Yang
Xiaoxiao Jiang
Hanfeng Lu
Zhipeng Di
...
Tao Lan
Guodong Yang
Lin Qu
Liping Zhang
Wei Wang
64
4
0
02 Jul 2024
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
Seyedmorteza Sadat
Manuel Kansy
Otmar Hilliges
Romann M. Weber
94
14
0
02 Jul 2024
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis
Dewei Zhou
Yuchen Li
Fan Ma
Zongxin Yang
Yue Yang
169
11
0
02 Jul 2024
Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space
Core Francisco Park
Maya Okawa
Andrew Lee
Ekdeep Singh Lubana
Hidenori Tanaka
110
21
0
27 Jun 2024
On Discrete Prompt Optimization for Diffusion Models
Ruochen Wang
Ting Liu
Cho-Jui Hsieh
Boqing Gong
DiffM
89
8
0
27 Jun 2024
MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data
William Berman
A. Peysakhovich
91
4
0
26 Jun 2024
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Shenghai Yuan
Jinfa Huang
Yongqi Xu
Yaoyang Liu
Shaofeng Zhang
Yujun Shi
Ruijie Zhu
Xinhua Cheng
Jiebo Luo
Li Yuan
EGVM
VGen
144
40
0
26 Jun 2024
Aligning Diffusion Models with Noise-Conditioned Perception
Alexander Gambashidze
Anton Kulikov
Yuriy Sosnin
Ilya Makarov
116
5
0
25 Jun 2024
Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation
Katherine M. Collins
Najoung Kim
Yonatan Bitton
Verena Rieser
Shayegan Omidshafiei
...
Gang Li
Adrian Weller
Junfeng He
Deepak Ramachandran
Krishnamurthy Dvijotham
EGVM
84
3
0
24 Jun 2024
EVALALIGN: Supervised Fine-Tuning Multimodal LLMs with Human-Aligned Data for Evaluating Text-to-Image Models
Zhiyu Tan
Xiaomeng Yang
Luozheng Qin
Mengping Yang
Cheng Zhang
Hao Li
106
8
0
24 Jun 2024
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
Yuang Peng
Yuxin Cui
Haomiao Tang
Zekun Qi
Runpei Dong
Jing Bai
Chunrui Han
Zheng Ge
Xiangyu Zhang
Shu-Tao Xia
EGVM
185
39
0
24 Jun 2024
Reading Is Believing: Revisiting Language Bottleneck Models for Image Classification
Honori Udo
Takafumi Koshinaka
VLM
71
0
0
22 Jun 2024
MetaGreen: Meta-Learning Inspired Transformer Selection for Green Semantic Communication
Shubhabrata Mukherjee
Cory Beard
Sejun Song
68
0
0
22 Jun 2024
Evaluating Numerical Reasoning in Text-to-Image Models
Ivana Kajić
Olivia Wiles
Isabela Albuquerque
Matthias Bauer
Su Wang
Jordi Pont-Tuset
Aida Nematzadeh
EGVM
ReLM
201
2
0
20 Jun 2024
GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation
Baiqi Li
Zhiqiu Lin
Deepak Pathak
Jiayao Li
Yixin Fei
...
Tiffany Ling
Xide Xia
Pengchuan Zhang
Graham Neubig
Deva Ramanan
EGVM
138
39
0
19 Jun 2024
Leveraging Large Language Models for Patient Engagement: The Power of Conversational AI in Digital Health
Bo Wen
R. Norel
Julia Liu
Thaddeus Stappenbeck
F. Zulkernine
Huamin Chen
AI4MH
LM&MA
73
4
0
19 Jun 2024
ARTIST: Improving the Generation of Text-rich Images by Disentanglement
Jianyi Zhang
Yufan Zhou
Jiuxiang Gu
Curtis Wigington
Tong Yu
Yiran Chen
Tong Sun
Ruiyi Zhang
125
0
0
17 Jun 2024
Large Scale Transfer Learning for Tabular Data via Language Modeling
Josh Gardner
Juan C. Perdomo
Ludwig Schmidt
LMTD
107
24
0
17 Jun 2024
Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models
Alireza Ganjdanesh
Reza Shirkavand
Shangqian Gao
Heng Huang
DiffM
VLM
149
5
0
17 Jun 2024
A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing
Ming Meng
Yufei Zhao
Bo Zhang
Yonggui Zhu
Weimin Shi
Maxwell Wen
Zhaoxin Fan
VGen
99
2
0
15 Jun 2024
Composing Parts for Expressive Object Generation
Harsh Rangwani
Aishwarya Agarwal
Kuldeep Kulkarni
R. Venkatesh Babu
Srikrishna Karanam
DiffM
110
2
0
14 Jun 2024
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
Roman Bachmann
Oğuzhan Fatih Kar
David Mizrahi
Ali Garjani
Mingfei Gao
David Griffiths
Jiaming Hu
Afshin Dehghan
Amir Zamir
MoE
VLM
MLLM
113
17
0
13 Jun 2024
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Junke Wang
Yi Jiang
Zehuan Yuan
Binyue Peng
Zuxuan Wu
Yu-Gang Jiang
ViT
VGen
122
46
0
13 Jun 2024
Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation
Yufan Zhou
Ruiyi Zhang
Kaizhi Zheng
Nanxuan Zhao
Jiuxiang Gu
Zichao Wang
Xin Eric Wang
Tong Sun
DiffM
46
2
0
13 Jun 2024
TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation
Weixi Feng
Jiachen Li
Michael Stephen Saxon
Tsu-Jui Fu
Wenhu Chen
William Yang Wang
EGVM
VGen
73
10
0
12 Jun 2024
FakeInversion: Learning to Detect Images from Unseen Text-to-Image Models by Inverting Stable Diffusion
George Cazenavette
Avneesh Sud
Thomas Leung
Ben Usman
DiffM
80
22
0
12 Jun 2024
What If We Recaption Billions of Web Images with LLaMA-3?
Xianhang Li
Haoqin Tu
Mude Hui
Zeyu Wang
Bingchen Zhao
...
Jieru Mei
Qing Liu
Huangjie Zheng
Yuyin Zhou
Cihang Xie
VLM
MLLM
92
42
0
12 Jun 2024
PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences
Daiwei Chen
Yi Chen
Aniket Rege
Ramya Korlakai Vinayak
114
23
0
12 Jun 2024
An Image is Worth 32 Tokens for Reconstruction and Generation
Qihang Yu
Mark Weber
XueQing Deng
Xiaohui Shen
Daniel Cremers
Liang-Chieh Chen
VLM
ViT
172
104
0
11 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
134
301
0
10 Jun 2024
Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization
Yi Gu
Zhendong Wang
Yueqin Yin
Yujia Xie
Mingyuan Zhou
98
17
0
10 Jun 2024
OmniControlNet: Dual-stage Integration for Conditional Image Generation
Yilin Wang
Haiyang Xu
Xiang Zhang
Zeyuan Chen
Zhizhou Sha
Zirui Wang
Zhuowen Tu
VLM
88
15
0
09 Jun 2024
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Shiji Song
Yuan Yao
Gao Huang
84
17
0
08 Jun 2024
AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation
Lianyu Pang
Jian Yin
Baoquan Zhao
Feize Wu
Fu Lee Wang
Qing Li
Xudong Mao
DiffM
97
1
0
07 Jun 2024
MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models
Sanjoy Chowdhury
Sayan Nag
K. J. Joseph
Balaji Vasan Srinivasan
Dinesh Manocha
DiffM
89
8
0
07 Jun 2024
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
Yang Sui
Yanyu Li
Anil Kag
Yerlan Idelbayev
Junli Cao
Ju Hu
Dhritiman Sagar
Bo Yuan
Sergey Tulyakov
Jian Ren
MQ
94
22
0
06 Jun 2024
ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization
L. Eyring
Shyamgopal Karthik
Karsten Roth
Alexey Dosovitskiy
Zeynep Akata
164
28
0
06 Jun 2024
Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis
Marianna Ohanyan
Hayk Manukyan
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
DiffM
103
2
0
06 Jun 2024
A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
Zicheng Zhang
H. Wu
Chunyi Li
Yingjie Zhou
Wei Sun
Xiongkuo Min
Zijian Chen
Xiaohong Liu
Weisi Lin
Guangtao Zhai
EGVM
145
18
0
05 Jun 2024
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
Kengo Uchida
Takashi Shibuya
Yuhta Takida
Naoki Murata
Shusuke Takahashi
Shusuke Takahashi
Yuki Mitsufuji
VGen
141
5
0
04 Jun 2024
Δ
Δ
Δ
-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers
Pengtao Chen
Mingzhu Shen
Peng Ye
Jianjian Cao
Chongjun Tu
C. Bouganis
Yiren Zhao
Tao Chen
127
44
0
03 Jun 2024
Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
Jiatao Gu
Ying Shen
Shuangfei Zhai
Yizhe Zhang
Navdeep Jaitly
J. Susskind
123
10
0
31 May 2024
Text Guided Image Editing with Automatic Concept Locating and Forgetting
Jia Li
Lijie Hu
Zhixian He
Jingfeng Zhang
Tianhang Zheng
Di Wang
DiffM
89
9
0
30 May 2024
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities
Vicky Zayats
Peter Chen
Melissa Ferrari
Dirk Padfield
AI4CE
77
1
0
29 May 2024
Why are Visually-Grounded Language Models Bad at Image Classification?
Yuhui Zhang
Alyssa Unell
Xiaohan Wang
Dhruba Ghosh
Yuchang Su
Ludwig Schmidt
Serena Yeung-Levy
VLM
94
37
0
28 May 2024
Training-free Editioning of Text-to-Image Models
Jinqi Wang
Yunfei Fu
Zhangcan Ding
Bailin Deng
Yu-Kun Lai
Yipeng Qin
DiffM
VLM
66
0
0
27 May 2024
EM Distillation for One-step Diffusion Models
Sirui Xie
Zhisheng Xiao
Diederik P. Kingma
Tingbo Hou
Ying Nian Wu
Kevin Patrick Murphy
Tim Salimans
Ben Poole
Ruiqi Gao
VLM
DiffM
115
30
0
27 May 2024
TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing
Xinyu Zhang
Mengxue Kang
Fei Wei
Shuang Xu
Yuhe Liu
Lin Ma
MLLM
DiffM
75
2
0
27 May 2024
Previous
1
2
3
...
5
6
7
...
16
17
18
Next