ResearchTrend.AI

arXiv: 2206.10789
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation


22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
    EGVM

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 899 papers shown
Better Generalization with Semantic IDs: A Case Study in Ranking for Recommendations
Anima Singh
Trung Vu
Nikhil Mehta
Raghunandan H. Keshavan
M. Sathiamoorthy
...
Lukasz Heldt
Li Wei
Devansh Tandon
Ed H. Chi
Xinyang Yi
84
24
0
13 Jun 2023
Dynamically Masked Discriminator for Generative Adversarial Networks
Wentian Zhang
Haozhe Liu
Bing Li
Jinheng Xie
Yawen Huang
Yuexiang Li
Yefeng Zheng
Guohao Li
TTA
85
2
0
13 Jun 2023
Paste, Inpaint and Harmonize via Denoising: Subject-Driven Image Editing with Pre-Trained Diffusion Model
Xinyu Zhang
Jiaxian Guo
Paul D. Yoo
Yutaka Matsuo
Yusuke Iwasawa
DiffM
110
22
0
13 Jun 2023
Controlling Text-to-Image Diffusion by Orthogonal Finetuning
Zeju Qiu
Wei-yu Liu
Haiwen Feng
Yuxuan Xue
Yao Feng
Zhen Liu
Dan Zhang
Adrian Weller
Bernhard Schölkopf
DiffM
126
158
0
12 Jun 2023
Fill-Up: Balancing Long-Tailed Data with Generative Models
Joonghyuk Shin
Minguk Kang
Jaesik Park
109
33
0
12 Jun 2023
Face0: Instantaneously Conditioning a Text-to-Image Model on a Face
Dani Valevski
Danny Lumen
Yossi Matias
Yaniv Leviathan
DiffM, VLM
77
77
0
11 Jun 2023
High-Fidelity Audio Compression with Improved RVQGAN
Rithesh Kumar
Prem Seetharaman
Alejandro Luebs
I. Kumar
Kundan Kumar
126
338
0
11 Jun 2023
Image Vectorization: a Review
Maria Dziuba
Ivan Jarsky
Valeria Efimova
Andrey Filchenkov
3DV, DiffM
64
10
0
10 Jun 2023
Grounded Text-to-Image Synthesis with Attention Refocusing
Quynh Phung
Songwei Ge
Jia-Bin Huang
DiffM
117
113
0
08 Jun 2023
Improving Tuning-Free Real Image Editing with Proximal Guidance
Ligong Han
Song Wen
Qi Chen
Zhixing Zhang
Kunpeng Song
...
Qilong Zhangli
Jindong Jiang
Zhaoyang Xia
Akash Srivastava
Dimitris N. Metaxas
DiffM
115
63
0
08 Jun 2023
AGIQA-3K: An Open Database for AI-Generated Image Quality Assessment
Chunyi Li
Zicheng Zhang
Haoning Wu
Wei Sun
Xiongkuo Min
Xiaohong Liu
Guangtao Zhai
Weisi Lin
EGVM
82
124
0
07 Jun 2023
A survey of Generative AI Applications
Roberto Gozalo-Brizuela
Eduardo C. Garrido-Merchán
3DV, MedIm
106
91
0
05 Jun 2023
Efficient Text-Guided 3D-Aware Portrait Generation with Score Distillation Sampling on Distribution
Yiji Cheng
Fei Yin
Xiaoke Huang
Xintong Yu
Jiaxiang Liu
Shi Feng
Yujiu Yang
Yansong Tang
DiffM
76
5
0
03 Jun 2023
Probabilistic Adaptation of Text-to-Video Models
Mengjiao Yang
Yilun Du
Bo Dai
Dale Schuurmans
J. Tenenbaum
Pieter Abbeel
VGen, DiffM
137
26
0
02 Jun 2023
GANs Settle Scores!
Siddarth Asokan
Nishanth Shetty
Aadithya Srikanth
C. Seelamantula
81
0
0
02 Jun 2023
KL-Divergence Guided Temperature Sampling
Chung-Ching Chang
David Reitter
Renat Aksitov
Yun-hsuan Sung
HILM
63
7
0
02 Jun 2023
Diffusion Self-Guidance for Controllable Image Generation
Dave Epstein
Allan Jabri
Ben Poole
Alexei A. Efros
Aleksander Holynski
111
266
0
01 Jun 2023
StyleDrop: Text-to-Image Generation in Any Style
Kihyuk Sohn
Nataniel Ruiz
Kimin Lee
Daniel Castro Chin
Irina Blok
...
Yuanzhen Li
Yuan Hao
Irfan Essa
Michael Rubinstein
Dilip Krishnan
70
152
0
01 Jun 2023
StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners
Yonglong Tian
Lijie Fan
Phillip Isola
Huiwen Chang
Dilip Krishnan
VLM, DiffM
152
153
0
01 Jun 2023
SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds
Yanyu Li
Huan Wang
Qing Jin
Ju Hu
Pavlo Chemerys
Yun Fu
Yanzhi Wang
Sergey Tulyakov
Jian Ren
VLM
134
165
0
01 Jun 2023
ViCo: Plug-and-play Visual Condition for Personalized Text-to-image Generation
Shaozhe Hao
Kai Han
Shihao Zhao
Kwan-Yee K. Wong
88
10
0
01 Jun 2023
The Hidden Language of Diffusion Models
Hila Chefer
Oran Lang
Mor Geva
Volodymyr Polosukhin
Assaf Shocher
Michal Irani
Inbar Mosseri
Lior Wolf
DiffM
117
27
0
01 Jun 2023
T2IAT: Measuring Valence and Stereotypical Biases in Text-to-Image Generation
Jialu Wang
Xinyue Liu
Zonglin Di
Yongxu Liu
Xin Eric Wang
65
36
0
01 Jun 2023
Learning Disentangled Prompts for Compositional Image Synthesis
Kihyuk Sohn
Albert Eaton Shaw
Yuan Hao
Han Zhang
Luisa F. Polanía
Huiwen Chang
Lu Jiang
Irfan Essa
VLM
63
8
0
01 Jun 2023
Wuerstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models
Pablo Pernias
Dominic Rampas
Mats L. Richter
Christopher Pal
Marc Aubreville
DiffM, VLM
111
45
0
01 Jun 2023
Cones 2: Customizable Image Synthesis with Multiple Subjects
Zhiheng Liu
Yifei Zhang
Yujun Shen
Kecheng Zheng
Kai Zhu
Ruili Feng
Yu Liu
Deli Zhao
Jingren Zhou
Yang Cao
DiffM
104
81
0
30 May 2023
SAVE: Spectral-Shift-Aware Adaptation of Image Diffusion Models for Text-driven Video Editing
Nazmul Karim
Umar Khalid
M. Joneidi
Chen Chen
Nazanin Rahnavard
DiffM, VGen
70
5
0
30 May 2023
Controllable Text-to-Image Generation with GPT-4
Tianjun Zhang
Yi Zhang
Vibhav Vineet
Neel Joshi
Xin Eric Wang
DiffM
150
44
0
29 May 2023
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
Yuchao Gu
Xintao Wang
Jay Zhangjie Wu
Yujun Shi
Yunpeng Chen
...
Shuning Chang
Wei Wu
Yixiao Ge
Ying Shan
Mike Zheng Shou
DiffM
144
177
0
29 May 2023
Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
Fu Lee Wang
Wenshuo Chen
Guanglu Song
Han-Jia Ye
Yu Liu
Hongsheng Li
VGen, DiffM
117
93
0
29 May 2023
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
Jinqi Xiao
Miao Yin
Yu Gong
Xiao Zang
Jian Ren
Bo Yuan
VLM, ViT
115
9
0
26 May 2023
Generating Images with Multimodal Language Models
Jing Yu Koh
Daniel Fried
Ruslan Salakhutdinov
MLLM
162
259
0
26 May 2023
High-Fidelity Image Compression with Score-based Generative Models
Emiel Hoogeboom
E. Agustsson
Fabian Mentzer
Luca Versari
G. Toderici
Lucas Theis
DiffM
93
44
0
26 May 2023
Improved Visual Story Generation with Adaptive Context Modeling
Zhangyin Feng
Yuchen Ren
Xinmiao Yu
Xiaocheng Feng
Duyu Tang
Shuming Shi
Bing Qin
DiffM
81
15
0
26 May 2023
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Shihao Zhao
Dongdong Chen
Yen-Chun Chen
Jianmin Bao
Shaozhe Hao
Lu Yuan
Kwan-Yee K. Wong
115
268
0
25 May 2023
Image as First-Order Norm+Linear Autoregression: Unveiling Mathematical Invariance
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Lu Yuan
Zicheng Liu
Youzuo Lin
101
2
0
25 May 2023
Break-A-Scene: Extracting Multiple Concepts from a Single Image
Omri Avrahami
Kfir Aberman
Ohad Fried
Daniel Cohen-Or
Dani Lischinski
VLM, DiffM
106
178
0
25 May 2023
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models
Xingqian Xu
Jiayi Guo
Zhangyang Wang
Gao Huang
Irfan Essa
Humphrey Shi
VLM, DiffM
127
62
0
25 May 2023
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
Ibrahim Ethem Hamamci
Sezgin Er
Anjany Sekuboyina
Enis Simsar
A. Tezcan
...
Hadrien Reynaud
Sarthak Pati
Christian Bluethgen
M. K. Özdemir
Bjoern Menze
DiffM, MedIm
121
24
0
25 May 2023
A Neural Space-Time Representation for Text-to-Image Personalization
Yuval Alaluf
Elad Richardson
G. Metzer
Daniel Cohen-Or
DiffM
104
100
0
24 May 2023
Visual Programming for Text-to-Image Generation and Evaluation
Jaemin Cho
Abhaysinh Zala
Joey Tianyi Zhou
MLLM
119
51
0
24 May 2023
I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors
Tuhin Chakrabarty
Arkadiy Saakyan
Olivia Winn
Artemis Panagopoulou
Yue Yang
Marianna Apidianaki
Smaranda Muresan
DiffM
76
44
0
24 May 2023
Vision + Language Applications: A Survey
Yutong Zhou
N. Shimada
VLM
117
7
0
24 May 2023
Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence
Grace Luo
Lisa Dunlap
Dong Huk Park
Aleksander Holynski
Trevor Darrell
133
136
0
23 May 2023
Training Transitive and Commutative Multimodal Transformers with LoReTTa
Manuel Tran
Yashin Dicente Cid
Amal Lahiani
Fabian J. Theis
Tingying Peng
Eldad Klaiman
56
2
0
23 May 2023
Training Priors Predict Text-To-Image Model Performance
Charles Lovering
Ellie Pavlick
CoGe
78
3
0
23 May 2023
Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach
Yufan Zhou
Ruiyi Zhang
Tongfei Sun
Jinhui Xu
DiffM
144
40
0
23 May 2023
If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection
Shyamgopal Karthik
Karsten Roth
Massimiliano Mancini
Zeynep Akata
91
21
0
22 May 2023
ControlVideo: Training-free Controllable Text-to-Video Generation
Yabo Zhang
Yuxiang Wei
Dongsheng Jiang
Xiaopeng Zhang
W. Zuo
Qi Tian
VGen, DiffM
124
254
0
22 May 2023
Textually Pretrained Speech Language Models
Michael Hassid
Tal Remez
Tu Nguyen
Itai Gat
Alexis Conneau
...
Alexandre Défossez
Gabriel Synnaeve
Emmanuel Dupoux
Roy Schwartz
Yossi Adi
VLM, SyDa
129
61
0
22 May 2023