ResearchTrend.AI
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
    EGVM

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 870 papers shown
ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization
L. Eyring
Shyamgopal Karthik
Karsten Roth
Alexey Dosovitskiy
Zeynep Akata
88
17
0
06 Jun 2024
Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis
Marianna Ohanyan
Hayk Manukyan
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
DiffM
61
1
0
06 Jun 2024
A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
Zicheng Zhang
H. Wu
Chunyi Li
Yingjie Zhou
Wei Sun
Xiongkuo Min
Zijian Chen
Xiaohong Liu
Weisi Lin
Guangtao Zhai
EGVM
72
16
0
05 Jun 2024
Edit Distance Robust Watermarks for Language Models
Noah Golowich
Ankur Moitra
AAML
WaLM
39
5
0
04 Jun 2024
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
Kengo Uchida
Takashi Shibuya
Yuhta Takida
Naoki Murata
Shusuke Takahashi
Yuki Mitsufuji
VGen
54
5
0
04 Jun 2024
Δ-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers
Pengtao Chen
Mingzhu Shen
Peng Ye
Jianjian Cao
Chongjun Tu
C. Bouganis
Yiren Zhao
Tao Chen
60
28
0
03 Jun 2024
Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
Jiatao Gu
Ying Shen
Shuangfei Zhai
Yizhe Zhang
Navdeep Jaitly
J. Susskind
57
9
0
31 May 2024
Text Guided Image Editing with Automatic Concept Locating and Forgetting
Jia Li
Lijie Hu
Zhixian He
Jingfeng Zhang
Tianhang Zheng
Di Wang
DiffM
49
9
0
30 May 2024
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities
Vicky Zayats
Peter Chen
Melissa Ferrari
Dirk Padfield
AI4CE
38
0
0
29 May 2024
Why are Visually-Grounded Language Models Bad at Image Classification?
Yuhui Zhang
Alyssa Unell
Xiaohan Wang
Dhruba Ghosh
Yuchang Su
Ludwig Schmidt
Serena Yeung-Levy
VLM
35
27
0
28 May 2024
Training-free Editioning of Text-to-Image Models
Jinqi Wang
Yunfei Fu
Zhangcan Ding
Bailin Deng
Yu-Kun Lai
Yipeng Qin
DiffM
VLM
42
0
0
27 May 2024
EM Distillation for One-step Diffusion Models
Sirui Xie
Zhisheng Xiao
Diederik P. Kingma
Tingbo Hou
Ying Nian Wu
Kevin Patrick Murphy
Tim Salimans
Ben Poole
Ruiqi Gao
VLM
DiffM
42
24
0
27 May 2024
TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing
Xinyu Zhang
Mengxue Kang
Fei Wei
Shuang Xu
Yuhe Liu
Lin Ma
MLLM
DiffM
34
2
0
27 May 2024
Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models
C. N. Vasconcelos
Abdullah Rashwan
Austin Waters
Trevor Walker
Keyang Xu
Jimmy Yan
...
Wenlei Zhou
Kevin Swersky
David J. Fleet
Jason Baldridge
Oliver Wang
46
3
0
27 May 2024
Ensembling Diffusion Models via Adaptive Feature Aggregation
Cong Wang
Kuan Tian
Yonghang Guan
Jun Zhang
Zhiwei Jiang
Fei Shen
Xiao Han
44
5
0
27 May 2024
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
Harshit Varma
Dheeraj M. Nagaraj
Karthikeyan Shanmugam
VLM
67
2
0
27 May 2024
Towards Black-Box Membership Inference Attack for Diffusion Models
Jingwei Li
Jingyi Dong
Tianxing He
Jingzhao Zhang
38
3
0
25 May 2024
Towards Understanding the Working Mechanism of Text-to-Image Diffusion Model
Mingyang Yi
Aoxue Li
Yi Xin
Zhenguo Li
DiffM
45
12
0
24 May 2024
Enhancing Text-to-Image Editing via Hybrid Mask-Informed Fusion
Aoxue Li
Mingyang Yi
Zhenguo Li
DiffM
48
0
0
24 May 2024
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception
Run Luo
Yunshui Li
Longze Chen
Wanwei He
Ting-En Lin
...
Zikai Song
Xiaobo Xia
Tongliang Liu
Min Yang
Binyuan Hui
VLM
DiffM
75
15
0
24 May 2024
Improved Distribution Matching Distillation for Fast Image Synthesis
Tianwei Yin
Michael Gharbi
Taesung Park
Richard Zhang
Eli Shechtman
Frédo Durand
William T. Freeman
DiffM
44
98
0
23 May 2024
EditWorld: Simulating World Dynamics for Instruction-Following Image Editing
Ling Yang
Bo-Wen Zeng
Jiaming Liu
Hong Li
Minghao Xu
Wentao Zhang
Shuicheng Yan
DiffM
39
10
0
23 May 2024
Learning Multi-dimensional Human Preference for Text-to-Image Generation
Sixian Zhang
Bohan Wang
Junqiang Wu
Yan Li
Tingting Gao
Di Zhang
Zhongyuan Wang
EGVM
53
30
0
23 May 2024
RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance
Zhicheng Sun
Zhenhao Yang
Yang Jin
Haozhe Chi
Kun Xu
...
Hao Jiang
Di Zhang
Yang Song
Kun Gai
Yadong Mu
37
3
0
23 May 2024
Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models
Katherine Xu
Lingzhi Zhang
Jianbo Shi
46
12
0
23 May 2024
Robust Disaster Assessment from Aerial Imagery Using Text-to-Image Synthetic Data
Tarun Kalluri
Jihyeon Janel Lee
Kihyuk Sohn
Sahil Singla
Manmohan Chandraker
Joseph Z. Xu
Jeremiah Liu
49
1
0
22 May 2024
A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation
Gwanghyun Kim
Alonso Martinez
Yu-Chuan Su
Brendan Jou
José Lezama
...
Lijun Yu
Lu Jiang
A. Jansen
Jacob Walker
Krishna Somandepalli
32
8
0
22 May 2024
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
Zhenting Wang
Vikash Sehwag
Chen Chen
Lingjuan Lyu
Dimitris N. Metaxas
Shiqing Ma
WIGM
38
5
0
22 May 2024
Computational Tradeoffs in Image Synthesis: Diffusion, Masked-Token, and Next-Token Prediction
Maciej Kilian
Varun Jampani
Luke Zettlemoyer
DiffM
32
8
0
21 May 2024
OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models
Zhaojian Yu
Yinghao Wu
Zhuotao Deng
Yansong Tang
Xiao-Ping Zhang
52
2
0
21 May 2024
UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers
Duo Peng
Qi Ke
Jun Liu
30
4
0
18 May 2024
AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA
Weitao Feng
Wenbo Zhou
Jiyan He
Jie Zhang
Tianyi Wei
Guanlin Li
Tianwei Zhang
Weiming Zhang
Neng H. Yu
34
18
0
18 May 2024
Compositional Text-to-Image Generation with Dense Blob Representations
Weili Nie
Sifei Liu
Morteza Mardani
Chao Liu
Benjamin Eckart
Arash Vahdat
DiffM
86
17
0
14 May 2024
Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
Shengyuan Liu
Bo Wang
Ye Ma
Te Yang
Xipeng Cao
Quan Chen
Han Li
Di Dong
Peng Jiang
EGVM
44
2
0
11 May 2024
Distilling Diffusion Models into Conditional GANs
Minguk Kang
Richard Zhang
Connelly Barnes
Sylvain Paris
Suha Kwak
Jaesik Park
Eli Shechtman
Jun-Yan Zhu
Taesung Park
46
37
0
09 May 2024
FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation
Xuehai He
Jian Zheng
Jacob Zhiyuan Fang
Robinson Piramuthu
Mohit Bansal
Vicente Ordonez
Gunnar A. Sigurdsson
Nanyun Peng
Xin Eric Wang
DiffM
53
1
0
08 May 2024
Generated Contents Enrichment
Mahdi Naseri
Jiayan Qiu
Zhou Wang
37
0
0
06 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
87
38
0
06 May 2024
Auto-Encoding Morph-Tokens for Multimodal LLM
Kaihang Pan
Siliang Tang
Juncheng Li
Zhaoyu Fan
Wei Chow
Shuicheng Yan
Tat-Seng Chua
Yueting Zhuang
Hanwang Zhang
MLLM
35
18
0
03 May 2024
Customizing Text-to-Image Models with a Single Image Pair
Maxwell Jones
Sheng-Yu Wang
Nupur Kumari
David Bau
Jun-Yan Zhu
DiffM
25
19
0
02 May 2024
Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance
Kelvin C. K. Chan
Yang Zhao
Xuhui Jia
Ming-Hsuan Yang
Huisheng Wang
22
3
0
02 May 2024
DOCCI: Descriptions of Connected and Contrasting Images
Yasumasa Onoe
Sunayana Rane
Zachary Berger
Yonatan Bitton
Jaemin Cho
...
Zarana Parekh
Jordi Pont-Tuset
Garrett Tanzer
Su Wang
Jason Baldridge
41
48
0
30 Apr 2024
Stylus: Automatic Adapter Selection for Diffusion Models
Michael Luo
Justin Wong
Brandon Trabucco
Yanping Huang
Joseph E. Gonzalez
Zhifeng Chen
Ruslan Salakhutdinov
Ion Stoica
DiffM
45
6
0
29 Apr 2024
SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes
Georgia Baltsou
Ioannis Sarridis
C. Koutlis
Symeon Papadopoulos
34
2
0
26 Apr 2024
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings
Olivia Wiles
Chuhan Zhang
Isabela Albuquerque
Ivana Kajić
Su Wang
...
Anant Nawalgaria
Jordi Pont-Tuset
Aida Nematzadeh
EGVM
135
14
0
25 Apr 2024
From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation
Zehuan Huang
Hongxing Fan
Lipeng Wang
Lu Sheng
DiffM
42
10
0
23 Apr 2024
Do not think pink elephant!
Kyomin Hwang
Suyoung Kim
Junhoo Lee
Nojun Kwak
19
1
0
22 Apr 2024
Accelerating Image Generation with Sub-path Linear Approximation Model
Chen Xu
Tian-Shu Song
Weixin Feng
Xubin Li
Tiezheng Ge
Bo Zheng
Limin Wang
42
11
0
22 Apr 2024
Iteratively Prompting Multimodal LLMs to Reproduce Natural and AI-Generated Images
Ali Naseh
Katherine Thai
Mohit Iyyer
Amir Houmansadr
47
6
0
21 Apr 2024
GenVideo: One-shot Target-image and Shape Aware Video Editing using T2I Diffusion Models
Sai Sree Harsha
Ambareesh Revanur
Dhwanit Agarwal
Shradha Agrawal
VGen
DiffM
48
3
0
18 Apr 2024