ResearchTrend.AI
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

22 June 2022
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
Zirui Wang
Vijay Vasudevan
Alexander Ku
Yinfei Yang
Burcu Karagol Ayan
Ben Hutchinson
Wei Han
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
    EGVM

Papers citing "Scaling Autoregressive Models for Content-Rich Text-to-Image Generation"

49 / 899 papers shown
DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models
Zeyang Sha
Zheng Li
Ning Yu
Yang Zhang
DiffM
109
135
0
13 Oct 2022
Underspecification in Scene Description-to-Depiction Tasks
Ben Hutchinson
Jason Baldridge
Vinodkumar Prabhakaran
DiffM
128
34
0
11 Oct 2022
Markup-to-Image Diffusion Models with Scheduled Sampling
Yuntian Deng
Noriyuki Kojima
Alexander M. Rush
DiffM
86
4
0
11 Oct 2022
Can Artificial Intelligence Reconstruct Ancient Mosaics?
Fernando Moral-Andrés
Elena Merino-Gómez
Pedro Reviriego
Fabrizio Lombardi
34
7
0
07 Oct 2022
On Distillation of Guided Diffusion Models
Chenlin Meng
Robin Rombach
Ruiqi Gao
Diederik P. Kingma
Stefano Ermon
Jonathan Ho
Tim Salimans
VLM, DiffM
89
536
0
06 Oct 2022
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning
Aishwarya Kamath
Peter Anderson
Su Wang
Jing Yu Koh
Alexander Ku
Austin Waters
Yinfei Yang
Jason Baldridge
Zarana Parekh
LM&Ro
104
48
0
06 Oct 2022
Phenaki: Variable Length Video Generation From Open Domain Textual Description
Ruben Villegas
Mohammad Babaeizadeh
Pieter-Jan Kindermans
Hernan Moraldo
Han Zhang
M. Saffar
Santiago Castro
Julius Kunze
D. Erhan
DiffM, VGen
157
396
0
05 Oct 2022
Imagen Video: High Definition Video Generation with Diffusion Models
Jonathan Ho
William Chan
Chitwan Saharia
Jay Whang
Ruiqi Gao
...
Diederik P. Kingma
Ben Poole
Mohammad Norouzi
David J. Fleet
Tim Salimans
VGen
181
1,548
0
05 Oct 2022
Progressive Text-to-Image Generation
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
156
4
0
05 Oct 2022
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn
Yuan Hao
José Lezama
Luisa F. Polanía
Huiwen Chang
Han Zhang
Irfan Essa
Lu Jiang
VPVLM, VLM
161
89
0
03 Oct 2022
Membership Inference Attacks Against Text-to-image Generation Models
Yixin Wu
Ning Yu
Zheng Li
Michael Backes
Yang Zhang
DiffM
79
68
0
03 Oct 2022
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
132
309
0
30 Sep 2022
Understanding Pure CLIP Guidance for Voxel Grid NeRF Models
Han-Hung Lee
Angel X. Chang
82
63
0
30 Sep 2022
DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole
Ajay Jain
Jonathan T. Barron
B. Mildenhall
220
2,445
0
29 Sep 2022
Make-A-Video: Text-to-Video Generation without Text-Video Data
Uriel Singer
Adam Polyak
Thomas Hayes
Xiaoyue Yin
Jie An
...
Oron Ashual
Oran Gafni
Devi Parikh
Sonal Gupta
Yaniv Taigman
DiffM, VGen
97
1,439
0
29 Sep 2022
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Wenhu Chen
Hexiang Hu
Chitwan Saharia
William W. Cohen
VLM
212
178
0
29 Sep 2022
Learning to Learn with Generative Models of Neural Network Checkpoints
William S. Peebles
Ilija Radosavovic
Tim Brooks
Alexei A. Efros
Jitendra Malik
UQCV
156
69
0
26 Sep 2022
All are Worth Words: A ViT Backbone for Diffusion Models
Fan Bao
Shen Nie
Kaiwen Xue
Yue Cao
Chongxuan Li
Hang Su
Jun Zhu
VLM
185
365
0
25 Sep 2022
Extremely Simple Activation Shaping for Out-of-Distribution Detection
Andrija Djurisic
Nebojsa Bozanic
Arjun Ashok
Rosanne Liu
OODD
232
166
0
20 Sep 2022
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis
Lukas Struppek
Dominik Hintersdorf
Felix Friedrich
Manuel Brack
P. Schramowski
Kristian Kersting
130
33
0
19 Sep 2022
Does CLIP Know My Face?
Dominik Hintersdorf
Lukas Struppek
Manuel Brack
Felix Friedrich
P. Schramowski
Kristian Kersting
VLM
60
11
0
15 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
163
616
0
07 Sep 2022
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz
Yuanzhen Li
Varun Jampani
Yael Pritch
Michael Rubinstein
Kfir Aberman
354
2,906
0
25 Aug 2022
Text to Image Generation: Leaving no Language Behind
Pedro Reviriego
Elena Merino-Gómez
VLM
49
13
0
19 Aug 2022
Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines
Patrick Flynn
T. Vanderbruggen
C. Liao
Pei-Hung Lin
M. Emani
Xipeng Shen
80
4
0
11 Aug 2022
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP
Thao Nguyen
Gabriel Ilharco
Mitchell Wortsman
Sewoong Oh
Ludwig Schmidt
CLIP, VLM
180
108
0
10 Aug 2022
Adversarial Attacks on Image Generation With Made-Up Words
Raphael Milliere
90
39
0
04 Aug 2022
DALLE-URBAN: Capturing the urban design expertise of large text to image transformers
Sachith Seneviratne
Damith A. Senanayake
Sanka Rasnayaka
Rajith Vidanaarachchi
Jason Thompson
ViT
107
22
0
03 Aug 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
247
1,796
0
02 Aug 2022
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Rinon Gal
Yuval Alaluf
Yuval Atzmon
Or Patashnik
Amit H. Bermano
Gal Chechik
Daniel Cohen-Or
176
1,903
0
02 Aug 2022
Lighting (In)consistency of Paint by Text
Hany Farid
73
32
0
27 Jul 2022
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models
Robin Rombach
A. Blattmann
Bjorn Ommer
DiffM
83
71
0
26 Jul 2022
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
Chenfei Wu
Jian Liang
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Lijuan Wang
Zicheng Liu
Yuejian Fang
Nan Duan
VGen
89
74
0
20 Jul 2022
Perspective (In)consistency of Paint by Text
Hany Farid
DiffM
77
37
0
27 Jun 2022
Worldwide AI Ethics: a review of 200 guidelines and recommendations for AI governance
N. Corrêa
Camila Galvão
J. Santos
C. Pino
Edson Pontes Pinto
...
Diogo Massmann
Rodrigo Mambrini
Luiza Galvao
Edmund Terem
Nythamar Fernandes de Oliveira
138
99
0
23 Jun 2022
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
Jiasen Lu
Christopher Clark
Rowan Zellers
Roozbeh Mottaghi
Aniruddha Kembhavi
ObjD, VLM, MLLM
171
412
0
17 Jun 2022
Write and Paint: Generative Vision-Language Models are Unified Modal Learners
Shizhe Diao
Wangchunshu Zhou
Xinsong Zhang
Jiawei Wang
MLLM, AI4CE
95
17
0
15 Jun 2022
Blended Latent Diffusion
Omri Avrahami
Ohad Fried
Dani Lischinski
DiffM
174
393
0
06 Jun 2022
Parallel Synthesis for Autoregressive Speech Generation
Po-Chun Hsu
Da-Rong Liu
Andy T. Liu
Hung-yi Lee
74
5
0
25 Apr 2022
Opal: Multimodal Image Generation for News Illustration
Vivian Liu
Han Qiao
Lydia B. Chilton
118
104
0
19 Apr 2022
KNN-Diffusion: Image Generation via Large-Scale Retrieval
Shelly Sheynin
Oron Ashual
Adam Polyak
Uriel Singer
Oran Gafni
Eliya Nachmani
Yaniv Taigman
VLM, SyDa, DiffM
85
124
0
06 Apr 2022
An Introduction to Neural Data Compression
Yibo Yang
Stephan Mandt
Lucas Theis
145
125
0
14 Feb 2022
NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN
Minheng Ni
Chenfei Wu
Haoyang Huang
Daxin Jiang
W. Zuo
Nan Duan
67
19
0
10 Feb 2022
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models
Jaemin Cho
Abhaysinh Zala
Joey Tianyi Zhou
ViT
243
193
0
08 Feb 2022
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
198
51
0
27 Dec 2021
Exploration into Translation-Equivariant Image Quantization
W. Shin
Gyubok Lee
Jiyoung Lee
Eun-Young Lyou
Joonseok Lee
Edward Choi
95
7
0
01 Dec 2021
EdiBERT, a generative model for image editing
Thibaut Issenhuth
Ugo Tanielian
Jérémie Mary
David Picard
DiffM
100
12
0
30 Nov 2021
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
Gaurav Menghani
VLM, MedIm
110
387
0
16 Jun 2021
Neural Distributed Source Coding
Jay Whang
Alliot Nagle
Anish Acharya
Hyeji Kim
A. Dimakis
85
21
0
05 Jun 2021