ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2206.10789
  4. Cited By
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
    EGVM
ArXivPDFHTML

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 874 papers shown
Title
ControlNet-XS: Designing an Efficient and Effective Architecture for
  Controlling Text-to-Image Diffusion Models
ControlNet-XS: Designing an Efficient and Effective Architecture for Controlling Text-to-Image Diffusion Models
Denis Zavadski
Johann-Friedrich Feiden
Carsten Rother
DiffM
48
10
0
11 Dec 2023
Stellar: Systematic Evaluation of Human-Centric Personalized
  Text-to-Image Methods
Stellar: Systematic Evaluation of Human-Centric Personalized Text-to-Image Methods
Panos Achlioptas
Alexandros Benetatos
Iordanis Fostiropoulos
Dimitris Skourtis
29
8
0
11 Dec 2023
TabMT: Generating tabular data with masked transformers
TabMT: Generating tabular data with masked transformers
Manbir Gulati
Paul F. Roysdon
LMTD
50
33
0
11 Dec 2023
CONFORM: Contrast is All You Need For High-Fidelity Text-to-Image
  Diffusion Models
CONFORM: Contrast is All You Need For High-Fidelity Text-to-Image Diffusion Models
Tuna Han Salih Meral
Enis Simsar
Federico Tombari
Pinar Yanardag
DiffM
VLM
44
26
0
11 Dec 2023
Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion
  Models
Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion Models
Zhipeng Bao
Yijun Li
Krishna Kumar Singh
Yu-Xiong Wang
Martial Hebert
35
1
0
10 Dec 2023
A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization
  Inversion for Zero-Shot Video Editing
A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing
Maomao Li
Yu Li
Tianyu Yang
Yunfei Liu
Dongxu Yue
Zhihui Lin
Dong Xu
VGen
31
8
0
10 Dec 2023
SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained
  Object Insertion and Layout Control
SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control
Jaskirat Singh
Jianming Zhang
Qing Liu
Cameron Smith
Zhe-nan Lin
Liang Zheng
DiffM
34
11
0
08 Dec 2023
Scaling Laws of Synthetic Images for Model Training ... for Now
Scaling Laws of Synthetic Images for Model Training ... for Now
Lijie Fan
Kaifeng Chen
Dilip Krishnan
Dina Katabi
Phillip Isola
Yonglong Tian
CLIP
VLM
49
62
0
07 Dec 2023
Gen2Det: Generate to Detect
Gen2Det: Generate to Detect
Saksham Suri
Fanyi Xiao
Animesh Sinha
Sean Culatana
Raghuraman Krishnamoorthi
Chenchen Zhu
Abhinav Shrivastava
VLM
DiffM
29
9
0
07 Dec 2023
GenTron: Diffusion Transformers for Image and Video Generation
GenTron: Diffusion Transformers for Image and Video Generation
Shoufa Chen
Mengmeng Xu
Jiawei Ren
Yuren Cong
Sen He
Yanping Xie
Animesh Sinha
Ping Luo
Tao Xiang
Juan-Manuel Perez-Rua
VGen
39
38
0
07 Dec 2023
Generating Illustrated Instructions
Generating Illustrated Instructions
Sachit Menon
Ishan Misra
Rohit Girdhar
DiffM
34
4
0
07 Dec 2023
KOALA: Empirical Lessons Toward Memory-Efficient and Fast Diffusion
  Models for Text-to-Image Synthesis
KOALA: Empirical Lessons Toward Memory-Efficient and Fast Diffusion Models for Text-to-Image Synthesis
Youngwan Lee
Kwanyong Park
Yoorhim Cho
Yong-Ju Lee
Sung Ju Hwang
VLM
29
4
0
07 Dec 2023
Diffusion Illusions: Hiding Images in Plain Sight
Diffusion Illusions: Hiding Images in Plain Sight
R. Burgert
Xiang Li
Abe Leite
Kanchana Ranasinghe
Michael S. Ryoo
55
17
0
06 Dec 2023
Language-Informed Visual Concept Learning
Language-Informed Visual Concept Learning
Sharon Lee
Yunzhi Zhang
Shangzhe Wu
Jiajun Wu
CoGe
29
9
0
06 Dec 2023
Make-A-Storyboard: A General Framework for Storyboard with Disentangled
  and Merged Control
Make-A-Storyboard: A General Framework for Storyboard with Disentangled and Merged Control
Sitong Su
Litao Guo
Lianli Gao
Hengtao Shen
Jingkuan Song
DiffM
35
3
0
06 Dec 2023
Cache Me if You Can: Accelerating Diffusion Models through Block Caching
Cache Me if You Can: Accelerating Diffusion Models through Block Caching
Felix Wimbauer
Bichen Wu
Edgar Schoenfeld
Xiaoliang Dai
Ji Hou
...
Jonas Kohler
Christian Rupprecht
Daniel Cremers
Peter Vajda
Jialiang Wang
DiffM
38
58
0
06 Dec 2023
FERGI: Automatic Annotation of User Preferences for Text-to-Image
  Generation from Spontaneous Facial Expression Reaction
FERGI: Automatic Annotation of User Preferences for Text-to-Image Generation from Spontaneous Facial Expression Reaction
Shuangquan Feng
Junhua Ma
Virginia R. de Sa
EGVM
29
0
0
05 Dec 2023
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
Brian Gordon
Yonatan Bitton
Yonatan Shafir
Roopal Garg
Xi Chen
Dani Lischinski
Daniel Cohen-Or
Idan Szpektor
44
11
0
05 Dec 2023
Describing Differences in Image Sets with Natural Language
Describing Differences in Image Sets with Natural Language
Lisa Dunlap
Yuhui Zhang
Xiaohan Wang
Ruiqi Zhong
Trevor Darrell
Jacob Steinhardt
Joseph E. Gonzalez
Serena Yeung-Levy
CoGe
VLM
32
30
0
05 Dec 2023
MagicStick: Controllable Video Editing via Control Handle
  Transformations
MagicStick: Controllable Video Editing via Control Handle Transformations
Yue Ma
Xiaodong Cun
Yin-Yin He
Chenyang Qi
Xintao Wang
Ying Shan
Xiu Li
Qifeng Chen
VGen
14
24
0
05 Dec 2023
Customization Assistant for Text-to-image Generation
Customization Assistant for Text-to-image Generation
Yufan Zhou
Ruiyi Zhang
Jiuxiang Gu
Tongfei Sun
DiffM
30
11
0
05 Dec 2023
Stable Diffusion Exposed: Gender Bias from Prompt to Image
Stable Diffusion Exposed: Gender Bias from Prompt to Image
Yankun Wu
Yuta Nakashima
Noa Garcia
28
16
0
05 Dec 2023
Orthogonal Adaptation for Modular Customization of Diffusion Models
Orthogonal Adaptation for Modular Customization of Diffusion Models
Ryan Po
Guandao Yang
Kfir Aberman
Gordon Wetzstein
DiffM
33
26
0
05 Dec 2023
A Contrastive Compositional Benchmark for Text-to-Image Synthesis: A
  Study with Unified Text-to-Image Fidelity Metrics
A Contrastive Compositional Benchmark for Text-to-Image Synthesis: A Study with Unified Text-to-Image Fidelity Metrics
Xiangru Zhu
Penglei Sun
Chengyu Wang
Jingping Liu
Zhixu Li
Yanghua Xiao
Jun Huang
CoGe
112
5
0
04 Dec 2023
InstructBooth: Instruction-following Personalized Text-to-Image
  Generation
InstructBooth: Instruction-following Personalized Text-to-Image Generation
Daewon Chae
Nokyung Park
Jinkyu Kim
Kimin Lee
DiffM
27
11
0
04 Dec 2023
GIVT: Generative Infinite-Vocabulary Transformers
GIVT: Generative Infinite-Vocabulary Transformers
Michael Tschannen
Cian Eastwood
Fabian Mentzer
31
34
0
04 Dec 2023
DiverseDream: Diverse Text-to-3D Synthesis with Augmented Text Embedding
DiverseDream: Diverse Text-to-3D Synthesis with Augmented Text Embedding
Uy Dieu Tran
Minh Luu
P. Nguyen
K. Nguyen
Binh-Son Hua
40
1
0
02 Dec 2023
Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting
  Activations to 3D
Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D
Karran Pandey
Paul Guerrero
Matheus Gadelha
Yannick Hold-Geoffroy
Karan Singh
Niloy Mitra
DiffM
34
31
0
02 Dec 2023
Enhancing Diffusion Models with 3D Perspective Geometry Constraints
Enhancing Diffusion Models with 3D Perspective Geometry Constraints
Rishi Upadhyay
Howard Zhang
Yunhao Ba
Ethan Yang
Blake Gella
Sicheng Jiang
Alex Wong
A. Kadambi
24
11
0
01 Dec 2023
Segment and Caption Anything
Segment and Caption Anything
Xiaoke Huang
Jianfeng Wang
Yansong Tang
Zheng Zhang
Han Hu
Jiwen Lu
Lijuan Wang
Zicheng Liu
MLLM
VLM
34
18
0
01 Dec 2023
DeepCache: Accelerating Diffusion Models for Free
DeepCache: Accelerating Diffusion Models for Free
Xinyin Ma
Gongfan Fang
Xinchao Wang
29
124
0
01 Dec 2023
SPOT: Self-Training with Patch-Order Permutation for Object-Centric
  Learning with Autoregressive Transformers
SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers
Ioannis Kakogeorgiou
Spyros Gidaris
Konstantinos Karantzalos
N. Komodakis
ViT
OCL
17
12
0
01 Dec 2023
Rethinking FID: Towards a Better Evaluation Metric for Image Generation
Rethinking FID: Towards a Better Evaluation Metric for Image Generation
Sadeep Jayasumana
Srikumar Ramalingam
Andreas Veit
Daniel Glasner
Ayan Chakrabarti
Sanjiv Kumar
EGVM
24
128
0
30 Nov 2023
One-step Diffusion with Distribution Matching Distillation
One-step Diffusion with Distribution Matching Distillation
Tianwei Yin
Michael Gharbi
Richard Zhang
Eli Shechtman
Frédo Durand
William T. Freeman
Taesung Park
DiffM
135
226
0
30 Nov 2023
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling
Sherwin Bahmani
Ivan Skorokhodov
Victor Rong
Gordon Wetzstein
Leonidas J. Guibas
Peter Wonka
Sergey Tulyakov
Jeong Joon Park
Andrea Tagliasacchi
David B. Lindell
DiffM
54
103
0
29 Nov 2023
Rethinking Image Editing Detection in the Era of Generative AI
  Revolution
Rethinking Image Editing Detection in the Era of Generative AI Revolution
Zhihao Sun
Haipeng Fang
Xinying Zhao
Danding Wang
Juan Cao
36
8
0
29 Nov 2023
DreamSync: Aligning Text-to-Image Generation with Image Understanding
  Feedback
DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback
Jiao Sun
Deqing Fu
Yushi Hu
Su Wang
Royi Rassin
...
Dana Alon
Charles Herrmann
Sjoerd van Steenkiste
Ranjay Krishna
Cyrus Rashtchian
EGVM
46
40
0
29 Nov 2023
Adversarial Diffusion Distillation
Adversarial Diffusion Distillation
Axel Sauer
Dominik Lorenz
A. Blattmann
Robin Rombach
138
337
0
28 Nov 2023
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
Yutong Feng
Biao Gong
Di Chen
Yujun Shen
Yu Liu
Jingren Zhou
DiffM
34
43
0
28 Nov 2023
Reason out Your Layout: Evoking the Layout Master from Large Language
  Models for Text-to-Image Synthesis
Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis
Xiaohui Chen
Yongfei Liu
Yingxiang Yang
Jianbo Yuan
Quanzeng You
Liping Liu
Hongxia Yang
DiffM
56
11
0
28 Nov 2023
Efficient Multimodal Diffusion Models Using Joint Data Infilling with
  Partially Shared U-Net
Efficient Multimodal Diffusion Models Using Joint Data Infilling with Partially Shared U-Net
Zizhao Hu
Shaochong Jia
Mohammad Rostami
DiffM
MedIm
24
0
0
28 Nov 2023
Text-Driven Image Editing via Learnable Regions
Text-Driven Image Editing via Learnable Regions
Yuanze Lin
Yi-Wen Chen
Yi-Hsuan Tsai
Lu Jiang
Ming-Hsuan Yang
DiffM
34
16
0
28 Nov 2023
Manifold Preserving Guided Diffusion
Manifold Preserving Guided Diffusion
Yutong He
Naoki Murata
Chieh-Hsin Lai
Yuhta Takida
Toshimitsu Uesaka
...
Wei-Hsiang Liao
Yuki Mitsufuji
J. Zico Kolter
Ruslan Salakhutdinov
Stefano Ermon
DiffM
116
66
0
28 Nov 2023
IG Captioner: Information Gain Captioners are Strong Zero-shot
  Classifiers
IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers
Chenglin Yang
Siyuan Qiao
Yuan Cao
Yu Zhang
Tao Zhu
Alan Yuille
Jiahui Yu
VLM
18
3
0
27 Nov 2023
Learning Disentangled Identifiers for Action-Customized Text-to-Image
  Generation
Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation
Siteng Huang
Biao Gong
Yutong Feng
Xi Chen
Yu Fu
Yu Liu
Donglin Wang
DiffM
29
13
0
27 Nov 2023
Check, Locate, Rectify: A Training-Free Layout Calibration System for
  Text-to-Image Generation
Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation
Biao Gong
Siteng Huang
Yutong Feng
Shiwei Zhang
Yuyuan Li
Yu Liu
DiffM
33
11
0
27 Nov 2023
Reinforcement Learning from Diffusion Feedback: Q* for Image Search
Reinforcement Learning from Diffusion Feedback: Q* for Image Search
Aboli Rajan Marathe
VLM
47
0
0
27 Nov 2023
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image
  Generation
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation
Yuhui Zhang
Brandon McKinzie
Zhe Gan
Vaishaal Shankar
Alexander Toshev
31
3
0
27 Nov 2023
Paragraph-to-Image Generation with Information-Enriched Diffusion Model
Paragraph-to-Image Generation with Information-Enriched Diffusion Model
Weijia Wu
Zhuang Li
Yefei He
Mike Zheng Shou
Chunhua Shen
Lele Cheng
Yan Li
Tingting Gao
Di Zhang
VLM
141
24
0
24 Nov 2023
Steal My Artworks for Fine-tuning? A Watermarking Framework for
  Detecting Art Theft Mimicry in Text-to-Image Models
Steal My Artworks for Fine-tuning? A Watermarking Framework for Detecting Art Theft Mimicry in Text-to-Image Models
Ge Luo
Junqiang Huang
Manman Zhang
Zhenxing Qian
Sheng Li
Xinpeng Zhang
WIGM
25
9
0
22 Nov 2023
Previous
123...8910...161718
Next