ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2206.10789
  4. Cited By
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
    EGVM
ArXiv (abs)PDFHTML

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

50 / 899 papers shown
Title
Efficient Online Inference of Vision Transformers by Training-Free Tokenization
Efficient Online Inference of Vision Transformers by Training-Free Tokenization
Leonidas Gee
Wing Yan Li
V. Sharmanska
Novi Quadrianto
ViT
201
0
0
01 Jul 2025
How to Train your Text-to-Image Model: Evaluating Design Choices for Synthetic Training Captions
How to Train your Text-to-Image Model: Evaluating Design Choices for Synthetic Training Captions
Manuel Brack
Sudeep Katakol
Felix Friedrich
P. Schramowski
Hareesh Ravi
Kristian Kersting
Ajinkya Kale
20
0
0
20 Jun 2025
Reward-Agnostic Prompt Optimization for Text-to-Image Diffusion Models
Reward-Agnostic Prompt Optimization for Text-to-Image Diffusion Models
Semin Kim
Yeonwoo Cha
Jaehoon Yoo
Seunghoon Hong
EGVM
30
0
0
20 Jun 2025
Watermarking Autoregressive Image Generation
Watermarking Autoregressive Image Generation
Nikola Jovanović
Ismail Labiad
Tomáš Souček
Martin Vechev
Pierre Fernandez
WIGM
40
0
0
19 Jun 2025
Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model
Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model
Anirud Aggarwal
Abhinav Shrivastava
M. Gwilliam
52
0
0
18 Jun 2025
FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space
FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space
Black Forest Labs
Stephen Batifol
A. Blattmann
Frederic Boesel
Saksham Consul
...
Dustin Podell
Robin Rombach
Harry Saini
Axel Sauer
Luke Smith
DiffM
25
0
0
17 Jun 2025
ASMR: Augmenting Life Scenario using Large Generative Models for Robotic Action Reflection
ASMR: Augmenting Life Scenario using Large Generative Models for Robotic Action Reflection
Shang-Chi Tsai
Seiya Kawano
Angel García Contreras
Koichiro Yoshino
Yun-Nung Chen
LM&Ro
38
2
0
16 Jun 2025
SpectralAR: Spectral Autoregressive Visual Generation
SpectralAR: Spectral Autoregressive Visual Generation
Yuanhui Huang
Weiliang Chen
Wenzhao Zheng
Yueqi Duan
Jie Zhou
Jiwen Lu
DiffMVGen
117
0
0
12 Jun 2025
LeVo: High-Quality Song Generation with Multi-Preference Alignment
LeVo: High-Quality Song Generation with Multi-Preference Alignment
Shun Lei
Yaoxun Xu
Zhiwei Lin
Huaicheng Zhang
Wei Tan
...
Chenyu Yang
Haina Zhu
Shuai Wang
Zhiyong Wu
Dong Yu
47
0
0
09 Jun 2025
OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
Jingjing Chang
Yixiao Fang
Peng Xing
Shuhan Wu
Wei Cheng
Rui Wang
Xianfang Zeng
Gang Yu
H. Chen
EGVMVLM
32
0
0
09 Jun 2025
CuRe: Cultural Gaps in the Long Tail of Text-to-Image Systems
Aniket Rege
Zinnia Nie
Mahesh Ramesh
Unmesh Raskar
Zhuoran Yu
Aditya Kusupati
Yong Jae Lee
Ramya Korlakai Vinayak
26
0
0
09 Jun 2025
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Jiatao Gu
Tianrong Chen
David Berthelot
Huangjie Zheng
Yuyang Wang
Ruixiang Zhang
Laurent Dinh
Miguel Angel Bautista
Josh Susskind
Shuangfei Zhai
45
0
0
06 Jun 2025
Improving AI-generated music with user-guided training
Vishwa Mohan Singh
Sai Anirudh Aryasomayajula
Ahan Chatterjee
Beste Aydemir
Rifat Mehreen Amin
94
0
0
05 Jun 2025
How Far Are We from Predicting Missing Modalities with Foundation Models?
Guanzhou Ke
Yi Xie
Xiaoli Wang
Guoqing Chao
Bo Wang
Shengfeng He
VLM
105
0
0
04 Jun 2025
HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
Hermann Kumbong
Xian Liu
Tsung-Yi Lin
Ming-Yu Liu
Xihui Liu
Ziwei Liu
Daniel Y. Fu
Christopher Ré
David W. Romero
DiffM
50
0
0
04 Jun 2025
Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences
Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences
Yunhong Lu
Qichao Wang
H. Cao
Xiaoyin Xu
Min Zhang
49
0
0
03 Jun 2025
EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models
EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models
Mingzhe Li
Gehao Zhang
Zhenting Wang
Shiqing Ma
Siqi Pan
Richard Cartwright
Juan Zhai
DiffM
54
0
0
03 Jun 2025
Native-Resolution Image Synthesis
Native-Resolution Image Synthesis
Zidong Wang
Lei Bai
Xiangyu Yue
Wanli Ouyang
Yiyuan Zhang
74
0
0
03 Jun 2025
Flexiffusion: Training-Free Segment-Wise Neural Architecture Search for Efficient Diffusion Models
Flexiffusion: Training-Free Segment-Wise Neural Architecture Search for Efficient Diffusion Models
Hongtao Huang
Xiaojun Chang
L. Yao
MedIm
55
0
0
03 Jun 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Hyojin Bahng
Caroline Chan
F. Durand
Phillip Isola
EGVM
29
0
0
02 Jun 2025
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
52
0
0
02 Jun 2025
One-Way Ticket:Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models
One-Way Ticket:Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models
S. Li
Lei Wang
Kai Wang
Tao Liu
J. Xie
Joost van de Weijer
Fahad Shahbaz Khan
Shiqi Yang
Yaxing Wang
Jian Yang
64
0
0
28 May 2025
LlamaSeg: Image Segmentation via Autoregressive Mask Generation
LlamaSeg: Image Segmentation via Autoregressive Mask Generation
Jiru Deng
Tengjin Weng
Tianyu Yang
Wenhan Luo
Zhiheng Li
Wenhao Jiang
VLM
149
0
0
26 May 2025
MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models
MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models
Hang Hua
Ziyun Zeng
Yizhi Song
Yunlong Tang
Liu He
Daniel G. Aliaga
Wei Xiong
Jiebo Luo
EGVM
88
0
0
26 May 2025
Harnessing the Power of Training-Free Techniques in Text-to-2D Generation for Text-to-3D Generation via Score Distillation Sampling
Harnessing the Power of Training-Free Techniques in Text-to-2D Generation for Text-to-3D Generation via Score Distillation Sampling
Junhong Lee
Seungwook Kim
Minsu Cho
DiffM
51
0
0
26 May 2025
Align Beyond Prompts: Evaluating World Knowledge Alignment in Text-to-Image Generation
Align Beyond Prompts: Evaluating World Knowledge Alignment in Text-to-Image Generation
Wenchao Zhang
Jiahe Tian
Runze He
Jizhong Han
Jiao Dai
Miaomiao Feng
Wei Mi
Xiaodan Zhang
111
0
0
24 May 2025
Rethinking Direct Preference Optimization in Diffusion Models
Rethinking Direct Preference Optimization in Diffusion Models
Junyong Kang
Seohyun Lim
Kyungjune Baek
Hyunjung Shim
777
0
0
24 May 2025
A Minimalist Method for Fine-tuning Text-to-Image Diffusion Models
A Minimalist Method for Fine-tuning Text-to-Image Diffusion Models
Yanting Miao
William Loh
Suraj Kothawade
Pacal Poupart
45
0
0
23 May 2025
DetailMaster: Can Your Text-to-Image Model Handle Long Prompts?
DetailMaster: Can Your Text-to-Image Model Handle Long Prompts?
Qirui Jiao
Daoyuan Chen
Yilun Huang
Xika Lin
Ying Shen
Yaliang Li
VLM
66
0
0
22 May 2025
MARché: Fast Masked Autoregressive Image Generation with Cache-Aware Attention
MARché: Fast Masked Autoregressive Image Generation with Cache-Aware Attention
Chaoyi Jiang
Sungwoo Kim
Lei Gao
Hossein Entezari Zarch
Won Woo Ro
Murali Annavaram
20
0
0
22 May 2025
Conditional Panoramic Image Generation via Masked Autoregressive Modeling
Conditional Panoramic Image Generation via Masked Autoregressive Modeling
Chaoyang Wang
Xiangtai Li
Lu Qi
X. Lin
Jinbin Bai
Qianyu Zhou
Yunhai Tong
DiffM
87
1
0
22 May 2025
MSDformer: Multi-scale Discrete Transformer For Time Series Generation
MSDformer: Multi-scale Discrete Transformer For Time Series Generation
Zhicheng Chen
Shibo Feng
Xi Xiao
Zhong Zhang
Qing Li
Xingyu Gao
Peilin Zhao
58
0
0
20 May 2025
AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings
AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings
Yilin Ye
Junchao Huang
Xingchen Zeng
Jiazhi Xia
Wei Zeng
149
0
0
20 May 2025
Replace in Translation: Boost Concept Alignment in Counterfactual Text-to-Image
Replace in Translation: Boost Concept Alignment in Counterfactual Text-to-Image
Sifan Li
Ming Tao
Hao Zhao
Ling Shao
Hao Tang
DiffM
54
0
0
20 May 2025
Few-Step Diffusion via Score identity Distillation
Few-Step Diffusion via Score identity Distillation
Mingyuan Zhou
Yi Gu
Zhendong Wang
96
1
0
19 May 2025
Context-Aware Autoregressive Models for Multi-Conditional Image Generation
Context-Aware Autoregressive Models for Multi-Conditional Image Generation
Yixiao Chen
Zhiyuan Ma
Guoli Jia
Che Jiang
Jianjun Li
Bowen Zhou
DiffM
67
0
0
18 May 2025
LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation
LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation
Jiarui Wang
Huiyu Duan
Ziheng Jia
Yu Zhao
Woo Yi Yang
...
Zhongfu Chen
Juntong Wang
Yuke Xing
Guangtao Zhai
Xiongkuo Min
VGen
84
1
0
17 May 2025
Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration
Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration
Haipeng Fang
Sheng Tang
Juan Cao
Enshuo Zhang
Fan Tang
Tong-Yee Lee
98
0
0
16 May 2025
One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework
One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework
Feiran Li
Qianqian Xu
Shilong Bao
Zhiyong Yang
Xiaochun Cao
Qingming Huang
DiffM
131
0
0
16 May 2025
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Jae-Won Chung
Jiachen Liu
Jeff J. Ma
Ruofan Wu
Oh Jun Kweon
Yuxuan Xia
Zhiyu Wu
Mosharaf Chowdhury
78
0
0
09 May 2025
Diffusion Model Quantization: A Review
Diffusion Model Quantization: A Review
Qian Zeng
Chenggong Hu
Mingli Song
Jie Song
MQ
100
0
0
08 May 2025
A Preliminary Study for GPT-4o on Image Restoration
A Preliminary Study for GPT-4o on Image Restoration
Hao Yang
Yiran Yang
Ruikun Zhang
Liyuan Pan
103
1
0
08 May 2025
Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models
Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models
Kuofeng Gao
Yufei Zhu
Yiming Li
Jiawang Bai
Yong-Liang Yang
Zerui Li
Shu-Tao Xia
86
0
0
05 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
303
1
0
05 May 2025
Multi-Modal Language Models as Text-to-Image Model Evaluators
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
105
0
0
01 May 2025
A Survey of Interactive Generative Video
A Survey of Interactive Generative Video
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
Xinyu Wang
Pengfei Wan
Di Zhang
Kun Gai
Hao Chen
Xihui Liu
VGen
109
3
0
30 Apr 2025
The Dual Power of Interpretable Token Embeddings: Jailbreaking Attacks and Defenses for Diffusion Model Unlearning
The Dual Power of Interpretable Token Embeddings: Jailbreaking Attacks and Defenses for Diffusion Model Unlearning
Siyi Chen
Yimeng Zhang
Sijia Liu
Q. Qu
AAML
422
0
0
30 Apr 2025
Open-set Anomaly Segmentation in Complex Scenarios
Open-set Anomaly Segmentation in Complex Scenarios
Song Xia
Yi Yu
Henghui Ding
Wenhan Yang
Shixuan Liu
Alex C. Kot
Xudong Jiang
DiffM
85
0
0
28 Apr 2025
Masked Language Prompting for Generative Data Augmentation in Few-shot Fashion Style Recognition
Masked Language Prompting for Generative Data Augmentation in Few-shot Fashion Style Recognition
Yuki Hirakawa
Ryotaro Shimizu
106
0
0
28 Apr 2025
Fast Autoregressive Models for Continuous Latent Generation
Fast Autoregressive Models for Continuous Latent Generation
Tiankai Hang
Jianmin Bao
Fangyun Wei
Dong Chen
DiffM
106
1
0
24 Apr 2025
1234...161718
Next