2206.10789
Cited By
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
Papers citing
"Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"
50 / 899 papers shown
Learning 3D Photography Videos via Self-supervised Diffusion on Single Images
Xiaodong Wang
Chenfei Wu
S. Yin
Minheng Ni
Jianfeng Wang
...
Fan Yang
Lijuan Wang
Zicheng Liu
Yuejian Fang
Nan Duan
VGen
DiffM
88
9
0
21 Feb 2023
Composer: Creative and Controllable Image Synthesis with Composable Conditions
Lianghua Huang
Di Chen
Yu Liu
Yujun Shen
Deli Zhao
Jingren Zhou
DiffM
110
292
0
20 Feb 2023
Grimm in Wonderland: Prompt Engineering with Midjourney to Illustrate Fairytales
Martin Ruskov
DiffM
72
17
0
17 Feb 2023
Text-driven Visual Synthesis with Latent Diffusion Prior
Tingbo Liao
Songwei Ge
Yiran Xu
Yao-Chih Lee
Badour Albahar
Jia-Bin Huang
DiffM
78
6
0
16 Feb 2023
Exploring the Representation Manifolds of Stable Diffusion Through the Lens of Intrinsic Dimension
Henry Kvinge
Davis Brown
Charles Godfrey
DiffM
47
6
0
16 Feb 2023
DIFUSCO: Graph-based Diffusion Solvers for Combinatorial Optimization
Zhiqing Sun
Yiming Yang
DiffM
103
133
0
16 Feb 2023
PRedItOR: Text Guided Image Editing with Diffusion Prior
Hareesh Ravi
Sachin Kelkar
Midhun Harikumar
Ajinkya Kale
DiffM
107
12
0
15 Feb 2023
Self-Organising Neural Discrete Representation Learning à la Kohonen
Kazuki Irie
Róbert Csordás
Jürgen Schmidhuber
SSL
77
1
0
15 Feb 2023
From paintbrush to pixel: A review of deep neural networks in AI-generated art
Anne-Sofie Maerten
Derya Soydaner
80
25
0
14 Feb 2023
VQ3D: Learning a 3D-Aware Generative Model on ImageNet
Kyle Sargent
Jing Yu Koh
Han Zhang
Huiwen Chang
Charles Herrmann
Pratul P. Srinivasan
Jiajun Wu
Deqing Sun
106
31
0
14 Feb 2023
Multi-modal Machine Learning in Engineering Design: A Review and Future Directions
Binyang Song
Ruilin Zhou
Faez Ahmed
AI4CE
144
46
0
14 Feb 2023
MaskSketch: Unpaired Structure-guided Masked Image Generation
D. Bashkirova
José Lezama
Kihyuk Sohn
Kate Saenko
Irfan Essa
DiffM
60
25
0
10 Feb 2023
Scaling Vision Transformers to 22 Billion Parameters
Mostafa Dehghani
Josip Djolonga
Basil Mustafa
Piotr Padlewski
Jonathan Heek
...
Mario Lučić
Xiaohua Zhai
Daniel Keysers
Jeremiah Harmsen
N. Houlsby
MLLM
184
614
0
10 Feb 2023
Noise2Music: Text-conditioned Music Generation with Diffusion Models
Qingqing Huang
Daniel S. Park
Tao Wang
Timo I. Denk
Andy Ly
...
Jesse Engel
Quoc V. Le
William Chan
Zhifeng Chen
Wei Han
MGen
DiffM
115
202
0
08 Feb 2023
Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion Models
Hyeonho Jeong
Gihyun Kwon
Jong Chul Ye
77
23
0
08 Feb 2023
Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness
Felix Friedrich
Manuel Brack
Lukas Struppek
Dominik Hintersdorf
P. Schramowski
Sasha Luccioni
Kristian Kersting
139
126
0
07 Feb 2023
Zero-shot Image-to-Image Translation
Gaurav Parmar
Krishna Kumar Singh
Richard Y. Zhang
Yijun Li
Jingwan Lu
Jun-Yan Zhu
DiffM
124
454
0
06 Feb 2023
Structure and Content-Guided Video Synthesis with Diffusion Models
Patrick Esser
Johnathan Chiu
Parmida Atighehchian
Jonathan Granskog
Anastasis Germanidis
DiffM
VGen
191
539
0
06 Feb 2023
Eliminating Contextual Prior Bias for Semantic Image Editing via Dual-Cycle Diffusion
Zuopeng Yang
Tianshu Chu
Xin Lin
Erdun Gao
Daqing Liu
J. Yang
Chaoyue Wang
DiffM
72
21
0
05 Feb 2023
Dreamix: Video Diffusion Models are General Video Editors
Eyal Molad
Eliahu Horwitz
Dani Valevski
Alex Rav-Acha
Yossi Matias
Yael Pritch
Yaniv Leviathan
Yedid Hoshen
DiffM
VGen
131
188
0
02 Feb 2023
Language Quantized AutoEncoders: Towards Unsupervised Text-Image Alignment
Hao Liu
Wilson Yan
Pieter Abbeel
99
25
0
02 Feb 2023
Grounding Language Models to Images for Multimodal Inputs and Outputs
Jing Yu Koh
Ruslan Salakhutdinov
Daniel Fried
MLLM
129
123
0
31 Jan 2023
Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models
Hila Chefer
Yuval Alaluf
Yael Vinker
Lior Wolf
Daniel Cohen-Or
DiffM
179
519
0
31 Jan 2023
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
Ming Tao
Bingkun Bao
Hao Tang
Changsheng Xu
DiffM
VLM
117
109
0
30 Jan 2023
MusicLM: Generating Music From Text
A. Agostinelli
Timo I. Denk
Zalan Borsos
Jesse Engel
Mauro Verzetti
...
Adam Roberts
Marco Tagliasacchi
Matthew Sharifi
Neil Zeghidour
Christian Frank
MGen
152
451
0
26 Jan 2023
Text-To-4D Dynamic Scene Generation
Uriel Singer
Shelly Sheynin
Adam Polyak
Oron Ashual
Iurii Makarov
...
Naman Goyal
Andrea Vedaldi
Devi Parikh
Justin Johnson
Yaniv Taigman
DiffM
105
156
0
26 Jan 2023
Simple diffusion: End-to-end diffusion for high resolution images
Emiel Hoogeboom
Jonathan Heek
Tim Salimans
108
268
0
26 Jan 2023
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
Axel Sauer
Tero Karras
S. Laine
Andreas Geiger
Timo Aila
96
218
0
23 Jan 2023
Regeneration Learning: A Learning Paradigm for Data Generation
Xu Tan
Tao Qin
Jiang Bian
Tie-Yan Liu
Yoshua Bengio
GAN
64
15
0
21 Jan 2023
GLIGEN: Open-Set Grounded Text-to-Image Generation
Yuheng Li
Haotian Liu
Qingyang Wu
Fangzhou Mu
Jianwei Yang
Jianfeng Gao
Chunyuan Li
Yong Jae Lee
VLM
148
603
1
17 Jan 2023
Open-vocabulary Object Segmentation with Diffusion Models
Ziyi Li
Qinye Zhou
Xiaoyun Zhang
Ya Zhang
Yanfeng Wang
Weidi Xie
VLM
145
65
0
12 Jan 2023
Latent Autoregressive Source Separation
Emilian Postolache
Giorgio Mariani
Michele Mancusi
Andrea Santilli
Luca Cosmo
Emanuele Rodolà
BDL
DRL
63
10
0
09 Jan 2023
MAQA: A Multimodal QA Benchmark for Negation
Judith Yue Li
Aren Jansen
Qingqing Huang
Joonseok Lee
Ravi Ganti
Dima Kuzmin
81
5
0
09 Jan 2023
Self-Supervised Video Forensics by Audio-Visual Anomaly Detection
Chao Feng
Ziyang Chen
Andrew Owens
85
78
0
04 Jan 2023
Attribute-Centric Compositional Text-to-Image Generation
Yuren Cong
Martin Renqiang Min
Erran L. Li
Bodo Rosenhahn
M. Yang
114
13
0
04 Jan 2023
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
278
560
0
02 Jan 2023
Multi-Realism Image Compression with a Conditional Generator
E. Agustsson
David C. Minnen
G. Toderici
Fabian Mentzer
101
75
0
28 Dec 2022
Do DALL-E and Flamingo Understand Each Other?
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
82
12
0
23 Dec 2022
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Jay Zhangjie Wu
Yixiao Ge
Xintao Wang
Weixian Lei
Yuchao Gu
Yufei Shi
Wynne Hsu
Ying Shan
Xiaohu Qie
Mike Zheng Shou
VGen
177
752
0
22 Dec 2022
Character-Aware Models Improve Visual Text Rendering
Rosanne Liu
Daniel H Garrette
Chitwan Saharia
William Chan
Adam Roberts
Sharan Narang
Irina Blok
R. Mical
Mohammad Norouzi
Noah Constant
VLM
117
74
0
20 Dec 2022
Benchmarking Spatial Relationships in Text-to-Image Generation
Tejas Gokhale
Hamid Palangi
Besmira Nushi
Vibhav Vineet
Eric Horvitz
Ece Kamar
Chitta Baral
Yezhou Yang
EGVM
116
72
0
20 Dec 2022
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
161
2,439
0
19 Dec 2022
Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Alex Nichol
Heewoo Jun
Prafulla Dhariwal
Pamela Mishkin
Mark Chen
DiffM
139
613
0
16 Dec 2022
CLIPPO: Image-and-Language Understanding from Pixels Only
Michael Tschannen
Basil Mustafa
N. Houlsby
CLIP
VLM
102
49
0
15 Dec 2022
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
Su Wang
Chitwan Saharia
Ceslee Montgomery
Jordi Pont-Tuset
Shai Noy
...
Radu Soricut
Jason Baldridge
Mohammad Norouzi
Peter Anderson
William Chan
98
188
0
13 Dec 2022
Elixir: Train a Large Language Model on a Small GPU Cluster
Haichen Huang
Jiarui Fang
Hongxin Liu
Shenggui Li
Yang You
VLM
79
7
0
10 Dec 2022
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
121
248
0
10 Dec 2022
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Weixi Feng
Xuehai He
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
P. Narayana
Sugato Basu
Xinze Wang
William Yang Wang
CoGe
191
318
0
09 Dec 2022
Multi-Concept Customization of Text-to-Image Diffusion
Nupur Kumari
Bin Zhang
Richard Y. Zhang
Eli Shechtman
Jun-Yan Zhu
233
877
0
08 Dec 2022
Diffusion Guided Domain Adaptation of Image Generators
Kunpeng Song
Ligong Han
Bingchen Liu
Dimitris N. Metaxas
Ahmed Elgammal
DiffM
120
35
0
08 Dec 2022