arXiv:2206.10789
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
Papers citing
"Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"
50 / 870 papers shown
Title
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Yuhao Xu
Tao Gu
Weifeng Chen
Chengcai Chen
DiffM
35
52
0
04 Mar 2024
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Ekaterina Deyneka
Hsiang-wei Chao
...
Yuwei Fang
Hsin-Ying Lee
Jian Ren
Ming-Hsuan Yang
Sergey Tulyakov
VGen
89
180
0
29 Feb 2024
DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models
Shyam Marjit
Harshit Singh
Nityanand Mathur
Sayak Paul
Chia-Mu Yu
Pin-Yu Chen
DiffM
47
2
0
27 Feb 2024
Disentangled 3D Scene Generation with Layout Learning
Dave Epstein
Ben Poole
B. Mildenhall
Alexei A. Efros
Aleksander Holynski
CoGe
OCL
3DV
48
20
0
26 Feb 2024
Contextualized Diffusion Models for Text-Guided Image and Video Generation
Ling Yang
Zhilong Zhang
Zhaochen Yu
Jingwei Liu
Minkai Xu
Stefano Ermon
Tengjiao Wang
44
4
0
26 Feb 2024
Generative AI in Vision: A Survey on Models, Metrics and Applications
Gaurav Raut
Apoorv Singh
VLM
MedIm
43
6
0
26 Feb 2024
Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition
Chun-Hsiao Yeh
Ta-Ying Cheng
He-Yen Hsieh
Chuan-En Lin
Yi Ma
Andrew Markham
Niki Trigoni
H. T. Kung
Yubei Chen
DiffM
25
3
0
23 Feb 2024
ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation
Yi Zhang
Yun Tang
Wenjie Ruan
Xiaowei Huang
Siddartha Khastgir
P. Jennings
Xingyu Zhao
AAML
35
4
0
23 Feb 2024
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
Willi Menapace
Aliaksandr Siarohin
Ivan Skorokhodov
Ekaterina Deyneka
Tsai-Shien Chen
...
Yuwei Fang
A. Stoliar
Elisa Ricci
Jian Ren
Sergey Tulyakov
VGen
44
57
0
22 Feb 2024
Contrastive Prompts Improve Disentanglement in Text-to-Image Diffusion Models
C. Wu
Fernando de la Torre
DiffM
29
2
0
21 Feb 2024
CoFRIDA: Self-Supervised Fine-Tuning for Human-Robot Co-Painting
Peter Schaldenbrand
Gaurav Parmar
Jun-Yan Zhu
James McCann
Jean Oh
29
13
0
21 Feb 2024
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation
Huizhuo Yuan
Zixiang Chen
Kaixuan Ji
Quanquan Gu
65
24
0
15 Feb 2024
L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects
Yutaro Yamada
Khyathi Raghavi Chandu
Yuchen Lin
Jack Hessel
Ilker Yildirim
Yejin Choi
AI4CE
23
12
0
14 Feb 2024
World Model on Million-Length Video And Language With Blockwise RingAttention
Hao Liu
Wilson Yan
Matei A. Zaharia
Pieter Abbeel
VGen
31
62
0
13 Feb 2024
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Mateusz Lajszczak
Guillermo Cámbara
Yang Li
Fatih Beyhan
Arent van Korlaar
...
Bartosz Putrycz
Soledad López Gambino
Kayeon Yoo
Elena Sokolova
Thomas Drugman
LM&MA
38
72
0
12 Feb 2024
Rolling Diffusion Models
David Ruhe
Jonathan Heek
Tim Salimans
Emiel Hoogeboom
DiffM
35
32
0
12 Feb 2024
Human Aesthetic Preference-Based Large Text-to-Image Model Personalization: Kandinsky Generation as an Example
Aven Le Zhou
Yu-Ao Wang
Wei Wu
Kang Zhang
26
1
0
09 Feb 2024
MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis
Dewei Zhou
You Li
Fan Ma
Zongxin Yang
Yi Yang
DiffM
28
57
0
08 Feb 2024
CapHuman: Capture Your Moments in Parallel Universes
Chao Liang
Fan Ma
Linchao Zhu
Yingying Deng
Yi Yang
DiffM
23
22
0
01 Feb 2024
Spatial-Aware Latent Initialization for Controllable Image Generation
Wenqiang Sun
Tengtao Li
Zehong Lin
Jun Zhang
44
10
0
29 Jan 2024
BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models
Senthil Purushwalkam
Akash Gokul
Chenyu You
Nikhil Naik
DiffM
39
17
0
25 Jan 2024
Large-scale Reinforcement Learning for Diffusion Models
Yinan Zhang
Eric Tzeng
Yilun Du
Dmitry Kislyuk
VLM
37
31
0
20 Jan 2024
Make-A-Shape: a Ten-Million-scale 3D Shape Model
Ka-Hei Hui
Aditya Sanghi
Arianna Rampini
Kamal Rahimi Malekshan
Zhengzhe Liu
Hooman Shayani
Chi-Wing Fu
DiffM
29
17
0
20 Jan 2024
MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer
Changyao Tian
Xizhou Zhu
Yuwen Xiong
Weiyun Wang
Zhe Chen
...
Tong Lu
Jie Zhou
Hongsheng Li
Yu Qiao
Jifeng Dai
AuLLM
85
42
0
18 Jan 2024
DiffusionGPT: LLM-Driven Text-to-Image Generation System
Jie Qin
Jie Wu
Weifeng Chen
Yuxi Ren
Huixian Li
Hefeng Wu
Xuefeng Xiao
Rui Wang
S. Wen
DiffM
62
25
0
18 Jan 2024
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
Xiaofeng Wang
Zheng Zhu
Guan Huang
Boyuan Wang
Xinze Chen
Jiwen Lu
VGen
40
32
0
18 Jan 2024
HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation
Antoine Mercier
Ramin Nakhli
Mahesh Reddy
R. Yasarla
Hong Cai
Fatih Porikli
Guillaume Berger
DiffM
35
15
0
15 Jan 2024
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation
Seung Hyun Lee
Yinxiao Li
Junjie Ke
Innfarn Yoo
Han Zhang
...
Junfeng He
Gang Li
Sangpil Kim
Irfan Essa
Feng Yang
EGVM
41
18
0
11 Jan 2024
Concept Alignment
Sunayana Rane
Polyphony J. Bruna
Ilia Sucholutsky
Christopher Kello
Thomas Griffiths
CVBM
39
7
0
09 Jan 2024
Uni3D-LLM: Unifying Point Cloud Perception, Generation and Editing with Large Language Models
Dingning Liu
Xiaoshui Huang
Yuenan Hou
Zhihui Wang
Zhen-fei Yin
Yongshun Gong
Peng Gao
Wanli Ouyang
29
8
0
09 Jan 2024
Improving Diffusion-Based Image Synthesis with Context Prediction
Ling Yang
Jingwei Liu
Shenda Hong
Zhilong Zhang
Zhilin Huang
Zheming Cai
Wentao Zhang
Tengjiao Wang
DiffM
46
33
0
04 Jan 2024
Instruct-Imagen: Image Generation with Multi-modal Instruction
Hexiang Hu
Kelvin C. K. Chan
Yu-Chuan Su
Wenhu Chen
Yandong Li
...
Xue Ben
Boqing Gong
William W. Cohen
Ming-Wei Chang
Xuhui Jia
MLLM
46
43
0
03 Jan 2024
Image Sculpting: Precise Object Editing with 3D Geometry Control
Jiraphon Yenphraphai
Xichen Pan
Sainan Liu
Daniele Panozzo
Saining Xie
34
19
0
02 Jan 2024
Improving Image Restoration through Removing Degradations in Textual Representations
Jingbo Lin
Zhilu Zhang
Yuxiang Wei
Dongwei Ren
Dongsheng Jiang
Wangmeng Zuo
29
26
0
28 Dec 2023
ZONE: Zero-Shot Instruction-Guided Local Editing
Shanglin Li
Bo-Wen Zeng
Yutang Feng
Sicheng Gao
Xuhui Liu
...
Li Lin
Xu Tang
Yao Hu
Jianzhuang Liu
Baochang Zhang
DiffM
36
30
0
28 Dec 2023
Semantic Guidance Tuning for Text-To-Image Diffusion Models
Hyun Kang
Dohae Lee
Myungjin Shin
In-Kwon Lee
32
1
0
26 Dec 2023
Cross Initialization for Personalized Text-to-Image Generation
Lianyu Pang
Jian Yin
Haoran Xie
Qiping Wang
Qing Li
Xudong Mao
DiffM
35
7
0
26 Dec 2023
Emage: Non-Autoregressive Text-to-Image Generation
Zhangyin Feng
Runyi Hu
Liangxin Liu
Fan Zhang
Duyu Tang
Yong Dai
Xiaocheng Feng
Jiwei Li
Bing Qin
Shuming Shi
DiffM
VLM
20
0
0
22 Dec 2023
Generative AI Beyond LLMs: System Implications of Multi-Modal Generation
Alicia Golden
Samuel Hsia
Fei Sun
Bilge Acun
Basil Hosmer
...
Zachary DeVito
Jeff Johnson
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
VLM
DiffM
37
8
0
22 Dec 2023
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Dan Kondratyuk
Lijun Yu
Xiuye Gu
José Lezama
Jonathan Huang
...
Irfan Essa
Huisheng Wang
David A. Ross
Bryan Seybold
Lu Jiang
VGen
20
241
0
21 Dec 2023
DreamTuner: Single Image is Enough for Subject-Driven Generation
Miao Hua
Jiawei Liu
Fei Ding
Wei Liu
Jie Wu
Qian He
28
28
0
21 Dec 2023
StarVector: Generating Scalable Vector Graphics Code from Images
Juan A. Rodriguez
Shubham Agarwal
I. Laradji
Pau Rodríguez
David Vazquez
Christopher Pal
M. Pedersoli
51
6
0
17 Dec 2023
Rich Human Feedback for Text-to-Image Generation
Youwei Liang
Junfeng He
Gang Li
Peizhao Li
Arseniy Klimovskiy
...
Yiwen Luo
Yang Li
Kai Kohlhoff
Deepak Ramachandran
Vidhya Navalpakkam
EGVM
29
68
0
15 Dec 2023
HeadArtist: Text-conditioned 3D Head Generation with Self Score Distillation
Hongyu Liu
Xuan Wang
Bo Liu
Yujun Shen
Yibing Song
Jing Liao
Qifeng Chen
DiffM
48
17
0
12 Dec 2023
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Lijun Yu
Kihyuk Sohn
Xiuye Gu
Meera Hahn
Fei-Fei Li
Irfan Essa
Lu Jiang
José Lezama
VGen
59
177
0
11 Dec 2023
4M: Massively Multimodal Masked Modeling
David Mizrahi
Roman Bachmann
Oğuzhan Fatih Kar
Teresa Yeo
Mingfei Gao
Afshin Dehghan
Amir Zamir
MLLM
50
64
0
11 Dec 2023
ControlNet-XS: Designing an Efficient and Effective Architecture for Controlling Text-to-Image Diffusion Models
Denis Zavadski
Johann-Friedrich Feiden
Carsten Rother
DiffM
48
10
0
11 Dec 2023
Stellar: Systematic Evaluation of Human-Centric Personalized Text-to-Image Methods
Panos Achlioptas
Alexandros Benetatos
Iordanis Fostiropoulos
Dimitris Skourtis
29
8
0
11 Dec 2023
TabMT: Generating tabular data with masked transformers
Manbir Gulati
Paul F. Roysdon
LMTD
50
33
0
11 Dec 2023
CONFORM: Contrast is All You Need For High-Fidelity Text-to-Image Diffusion Models
Tuna Han Salih Meral
Enis Simsar
Federico Tombari
Pinar Yanardag
DiffM
VLM
44
26
0
11 Dec 2023