ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2206.10789
  4. Cited By
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
    EGVM
ArXivPDFHTML

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 870 papers shown
Title
Muse: Text-To-Image Generation via Masked Generative Transformers
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
197
521
0
02 Jan 2023
Multi-Realism Image Compression with a Conditional Generator
Multi-Realism Image Compression with a Conditional Generator
E. Agustsson
David C. Minnen
G. Toderici
Fabian Mentzer
17
68
0
28 Dec 2022
Do DALL-E and Flamingo Understand Each Other?
Do DALL-E and Flamingo Understand Each Other?
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
21
12
0
23 Dec 2022
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for
  Text-to-Video Generation
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Jay Zhangjie Wu
Yixiao Ge
Xintao Wang
Weixian Lei
Yuchao Gu
Yufei Shi
W. Hsu
Ying Shan
Xiaohu Qie
Mike Zheng Shou
VGen
62
692
0
22 Dec 2022
Character-Aware Models Improve Visual Text Rendering
Character-Aware Models Improve Visual Text Rendering
Rosanne Liu
Daniel H Garrette
Chitwan Saharia
William Chan
Adam Roberts
Sharan Narang
Irina Blok
R. Mical
Mohammad Norouzi
Noah Constant
VLM
31
71
0
20 Dec 2022
Benchmarking Spatial Relationships in Text-to-Image Generation
Benchmarking Spatial Relationships in Text-to-Image Generation
Tejas Gokhale
Hamid Palangi
Besmira Nushi
Vibhav Vineet
Eric Horvitz
Ece Kamar
Chitta Baral
Yezhou Yang
EGVM
51
66
0
20 Dec 2022
Scalable Diffusion Models with Transformers
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
40
2,024
0
19 Dec 2022
Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Alex Nichol
Heewoo Jun
Prafulla Dhariwal
Pamela Mishkin
Mark Chen
DiffM
47
586
0
16 Dec 2022
CLIPPO: Image-and-Language Understanding from Pixels Only
CLIPPO: Image-and-Language Understanding from Pixels Only
Michael Tschannen
Basil Mustafa
N. Houlsby
CLIP
VLM
32
47
0
15 Dec 2022
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image
  Inpainting
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
Su Wang
Chitwan Saharia
Ceslee Montgomery
Jordi Pont-Tuset
Shai Noy
...
Radu Soricut
Jason Baldridge
Mohammad Norouzi
Peter Anderson
William Chan
35
176
0
13 Dec 2022
Elixir: Train a Large Language Model on a Small GPU Cluster
Elixir: Train a Large Language Model on a Small GPU Cluster
Haichen Huang
Jiarui Fang
Hongxin Liu
Shenggui Li
Yang You
VLM
24
7
0
10 Dec 2022
MAGVIT: Masked Generative Video Transformer
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
38
228
0
10 Dec 2022
Training-Free Structured Diffusion Guidance for Compositional
  Text-to-Image Synthesis
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Weixi Feng
Xuehai He
Tsu-jui Fu
Varun Jampani
Arjun Reddy Akula
P. Narayana
Sugato Basu
Qing Guo
William Yang Wang
CoGe
51
299
0
09 Dec 2022
Multi-Concept Customization of Text-to-Image Diffusion
Multi-Concept Customization of Text-to-Image Diffusion
Nupur Kumari
Bin Zhang
Richard Y. Zhang
Eli Shechtman
Jun-Yan Zhu
67
828
0
08 Dec 2022
Diffusion Guided Domain Adaptation of Image Generators
Diffusion Guided Domain Adaptation of Image Generators
Kunpeng Song
Ligong Han
Bingchen Liu
Dimitris N. Metaxas
Ahmed Elgammal
DiffM
28
34
0
08 Dec 2022
SINE: SINgle Image Editing with Text-to-Image Diffusion Models
SINE: SINgle Image Editing with Text-to-Image Diffusion Models
Zhixing Zhang
Ligong Han
Arna Ghosh
Dimitris N. Metaxas
Jian Ren
DiffM
51
154
0
08 Dec 2022
X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using
  CLIP and StableDiffusion
X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion
Hanqing Zhao
Dianmo Sheng
Jianmin Bao
Dongdong Chen
Dong Chen
...
Ce Liu
Wenbo Zhou
Qi Chu
Weiming Zhang
Neng H. Yu
VLM
DiffM
38
39
0
07 Dec 2022
Rethinking the Objectives of Vector-Quantized Tokenizers for Image
  Synthesis
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis
Yuchao Gu
Xintao Wang
Yixiao Ge
Ying Shan
Xiaohu Qie
Mike Zheng Shou
DiffM
32
20
0
06 Dec 2022
Image Inpainting via Iteratively Decoupled Probabilistic Modeling
Image Inpainting via Iteratively Decoupled Probabilistic Modeling
Wenbo Li
Xin Yu
Kun Zhou
Yibing Song
Zhe-nan Lin
Jiaya Jia
DiffM
40
11
0
06 Dec 2022
M-VADER: A Model for Diffusion with Multimodal Context
M-VADER: A Model for Diffusion with Multimodal Context
Samuel Weinbach
Marco Bellagente
C. Eichenberg
Andrew M. Dai
R. Baldock
Souradeep Nanda
Bjorn Deiseroth
Koen Oostermeijer
H. Teufel
Andres Felipe Cruz Salinas
DiffM
35
11
0
06 Dec 2022
3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models
3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models
Gimin Nam
Mariem Khlifi
Andrew Rodriguez
Alberto Tono
Linqi Zhou
Paul Guerrero
DiffM
37
68
0
01 Dec 2022
CLIPascene: Scene Sketching with Different Types and Levels of
  Abstraction
CLIPascene: Scene Sketching with Different Types and Levels of Abstraction
Yael Vinker
Yuval Alaluf
Daniel Cohen-Or
Ariel Shamir
CLIP
29
54
0
30 Nov 2022
Fast Inference from Transformers via Speculative Decoding
Fast Inference from Transformers via Speculative Decoding
Yaniv Leviathan
Matan Kalman
Yossi Matias
LRM
44
628
0
30 Nov 2022
High-Fidelity Guided Image Synthesis with Latent Diffusion Models
High-Fidelity Guided Image Synthesis with Latent Diffusion Models
Jaskirat Singh
Stephen Gould
Liang Zheng
DiffM
41
40
0
30 Nov 2022
Continuous diffusion for categorical data
Continuous diffusion for categorical data
Sander Dieleman
Laurent Sartran
Arman Roshannai
Nikolay Savinov
Yaroslav Ganin
...
Conor Durkan
Curtis Hawthorne
Rémi Leblond
Will Grathwohl
J. Adler
DiffM
29
98
0
28 Nov 2022
Unified Discrete Diffusion for Simultaneous Vision-Language Generation
Unified Discrete Diffusion for Simultaneous Vision-Language Generation
Minghui Hu
Chuanxia Zheng
Heliang Zheng
Tat-Jen Cham
Chaoyue Wang
Zuopeng Yang
Dacheng Tao
Ponnuthurai Nagaratnam Suganthan
DiffM
20
23
0
27 Nov 2022
SpaText: Spatio-Textual Representation for Controllable Image Generation
SpaText: Spatio-Textual Representation for Controllable Image Generation
Omri Avrahami
Thomas Hayes
Oran Gafni
Sonal Gupta
Yaniv Taigman
Devi Parikh
Dani Lischinski
Ohad Fried
Xiaoyue Yin
DiffM
40
203
0
25 Nov 2022
3DDesigner: Towards Photorealistic 3D Object Generation and Editing with
  Text-guided Diffusion Models
3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models
Gang Li
Heliang Zheng
Chaoyue Wang
Chang Li
C. Zheng
Dacheng Tao
DiffM
26
59
0
25 Nov 2022
Shifted Diffusion for Text-to-image Generation
Shifted Diffusion for Text-to-image Generation
Yufan Zhou
Bingchen Liu
Yizhe Zhu
Xiao Yang
Changyou Chen
Jinhui Xu
DiffM
24
40
0
24 Nov 2022
Paint by Example: Exemplar-based Image Editing with Diffusion Models
Paint by Example: Exemplar-based Image Editing with Diffusion Models
Binxin Yang
Shuyang Gu
Bo Zhang
Ting Zhang
Xuejin Chen
Xiaoyan Sun
Dong Chen
Fang Wen
DiffM
60
403
0
23 Nov 2022
Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors
Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors
R. Burgert
Kanchana Ranasinghe
Xiang Li
Michael S. Ryoo
DiffM
VLM
34
37
0
23 Nov 2022
ReCo: Region-Controlled Text-to-Image Generation
ReCo: Region-Controlled Text-to-Image Generation
Zhengyuan Yang
Jianfeng Wang
Zhe Gan
Linjie Li
Kevin Qinghong Lin
...
Nan Duan
Zicheng Liu
Ce Liu
Michael Zeng
Lijuan Wang
DiffM
56
140
0
23 Nov 2022
Inversion-Based Style Transfer with Diffusion Models
Inversion-Based Style Transfer with Diffusion Models
Yuxin Zhang
Nisha Huang
Fan Tang
Haibin Huang
Chongyang Ma
Weiming Dong
Changsheng Xu
DiffM
38
257
0
23 Nov 2022
Tell Me What Happened: Unifying Text-guided Video Completion via
  Multimodal Masked Video Generation
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
Tsu-jui Fu
Licheng Yu
Ning Zhang
Cheng-Yang Fu
Jong-Chyi Su
William Yang Wang
Sean Bell
VGen
56
37
0
23 Nov 2022
Retrieval-Augmented Multimodal Language Modeling
Retrieval-Augmented Multimodal Language Modeling
Michihiro Yasunaga
Armen Aghajanyan
Weijia Shi
Rich James
J. Leskovec
Percy Liang
M. Lewis
Luke Zettlemoyer
Wen-tau Yih
RALM
22
95
0
22 Nov 2022
Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark
Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark
Vitali Petsiuk
Alexander E. Siemenn
Saisamrit Surbehera
Zad Chin
Keith Tyser
...
Ori Kerret
Tonio Buonassisi
Kate Saenko
Armando Solar-Lezama
Iddo Drori
VLM
31
35
0
22 Nov 2022
SceneComposer: Any-Level Semantic Image Synthesis
SceneComposer: Any-Level Semantic Image Synthesis
Yu Zeng
Zhe-nan Lin
Jianming Zhang
Qing Liu
John Collomosse
Jason Kuen
Vishal M. Patel
DiffM
25
49
0
21 Nov 2022
VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models
VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models
Ajay Jain
Amber Xie
Pieter Abbeel
DiffM
38
89
0
21 Nov 2022
MagicVideo: Efficient Video Generation With Latent Diffusion Models
MagicVideo: Efficient Video Generation With Latent Diffusion Models
Daquan Zhou
Weimin Wang
Hanshu Yan
Weiwei Lv
Yizhe Zhu
Jiashi Feng
DiffM
VGen
39
373
0
20 Nov 2022
Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
Xichen Pan
Pengda Qin
Yuhong Li
Hui Xue
Wenhu Chen
DiffM
23
62
0
20 Nov 2022
Visual Programming: Compositional visual reasoning without training
Visual Programming: Compositional visual reasoning without training
Tanmay Gupta
Aniruddha Kembhavi
ReLM
VLM
LRM
94
405
0
18 Nov 2022
Is the Elephant Flying? Resolving Ambiguities in Text-to-Image
  Generative Models
Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models
Ninareh Mehrabi
Palash Goyal
Apurv Verma
Jwala Dhamala
Varun Kumar
Qian Hu
Kai-Wei Chang
R. Zemel
Aram Galstyan
Rahul Gupta
28
6
0
17 Nov 2022
Will Large-scale Generative Models Corrupt Future Datasets?
Will Large-scale Generative Models Corrupt Future Datasets?
Ryuichiro Hataya
Han Bao
Hiromi Arai
22
52
0
15 Nov 2022
Cross-Reality Re-Rendering: Manipulating between Digital and Physical
  Realities
Cross-Reality Re-Rendering: Manipulating between Digital and Physical Realities
Siddhartha Datta
33
0
0
15 Nov 2022
A Novel Sampling Scheme for Text- and Image-Conditional Image Synthesis
  in Quantized Latent Spaces
A Novel Sampling Scheme for Text- and Image-Conditional Image Synthesis in Quantized Latent Spaces
Dominic Rampas
Pablo Pernias
Marc Aubreville
DiffM
19
11
0
14 Nov 2022
Large-Scale Bidirectional Training for Zero-Shot Image Captioning
Large-Scale Bidirectional Training for Zero-Shot Image Captioning
Taehoon Kim
Mark A Marsden
Pyunghwan Ahn
Sangyun Kim
Sihaeng Lee
Alessandra Sala
S. Kim
VLM
32
4
0
13 Nov 2022
SSGVS: Semantic Scene Graph-to-Video Synthesis
SSGVS: Semantic Scene Graph-to-Video Synthesis
Yuren Cong
Jinhui Yi
Bodo Rosenhahn
M. Yang
67
7
0
11 Nov 2022
Rickrolling the Artist: Injecting Backdoors into Text Encoders for
  Text-to-Image Synthesis
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis
Lukas Struppek
Dominik Hintersdorf
Kristian Kersting
SILM
22
36
0
04 Nov 2022
Evaluating a Synthetic Image Dataset Generated with Stable Diffusion
Evaluating a Synthetic Image Dataset Generated with Stable Diffusion
Andreas Stöckl
24
21
0
03 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert
  Denoisers
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLM
MoE
61
804
0
02 Nov 2022
Previous
123...15161718
Next