2411.15245
AnyText2: Visual Text Generation and Editing With Customizable Attributes
22 November 2024
Yuxiang Tuo
Yifeng Geng
Liefeng Bo
VLM
Github (105★)
Papers citing "AnyText2: Visual Text Generation and Editing With Customizable Attributes"
41 / 41 papers shown
Title
STRICT: Stress Test of Rendering Images Containing Text
Tianyu Zhang
Xinyu Wang
Zhenghan Tai
Lu Li
Jijun Chi
Jingrui Tian
Hailin He
Suyuchen Wang
57
0
0
25 May 2025
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
Bowen Jiang
Yuan Yuan
Xinyi Bai
Zhuoqun Hao
Alyson Yin
Yaojie Hu
Wenyu Liao
Lyle Ungar
Camillo J Taylor
DiffM
110
2
0
16 Feb 2025
GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models
Jian Ma
Yonglin Deng
Chen Chen
H. Lu
Zhenyu Yang
VLM
DiffM
149
10
0
02 Jul 2024
Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering
Zeyu Liu
Weicong Liang
Yiming Zhao
Bohan Chen
Lin Liang
Lijuan Wang
Ji Li
Yuhui Yuan
62
21
0
14 Jun 2024
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering
Zeyu Liu
Weicong Liang
Zhanhao Liang
Chong Luo
Ji Li
Gao Huang
Yuhui Yuan
DiffM
92
33
0
14 Mar 2024
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
...
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
291
1,388
0
05 Mar 2024
InstantID: Zero-shot Identity-Preserving Generation in Seconds
Qixun Wang
Xu Bai
Haofan Wang
Zekui Qin
Anthony Chen
Huaxia Li
Xu Tang
Feng-Long Xie
81
257
0
15 Jan 2024
Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model
Lingjun Zhang
Xinyuan Chen
Yaohui Wang
Yue Lu
Yu Qiao
DiffM
60
35
0
19 Dec 2023
FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning
Zhenhua Yang
Dezhi Peng
Yuxin Kong
Yuyi Zhang
Cong Yao
Lianwen Jin
VLM
DiffM
78
45
0
19 Dec 2023
UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models
Yiming Zhao
Zhouhui Lian
108
30
0
08 Dec 2023
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
Zhen Li
Mingdeng Cao
Xintao Wang
Zhongang Qi
Ming-Ming Cheng
Ying Shan
DiffM
104
200
0
07 Dec 2023
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
Liucheng Hu
Xin Gao
Peng Zhang
Ke Sun
Bang Zhang
Liefeng Bo
DiffM
VGen
106
393
0
28 Nov 2023
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering
Jingye Chen
Yupan Huang
Tengchao Lv
Lei Cui
Qifeng Chen
Furu Wei
DiffM
91
70
0
28 Nov 2023
AnyText: Multilingual Visual Text Generation And Editing
Yuxiang Tuo
Wangmeng Xiang
Jun-Yan He
Yifeng Geng
Xuansong Xie
DiffM
96
83
0
06 Nov 2023
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
Jinze Bai
Shuai Bai
Shusheng Yang
Shijie Wang
Sinan Tan
Peng Wang
Junyang Lin
Chang Zhou
Jingren Zhou
MLLM
VLM
ObjD
151
932
0
24 Aug 2023
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
Hu Ye
Jun Zhang
Siyi Liu
Xiao Han
Wei Yang
DiffM
105
808
0
13 Aug 2023
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Dustin Podell
Zion English
Kyle Lacey
A. Blattmann
Tim Dockhorn
Jonas Muller
Joe Penna
Robin Rombach
255
2,447
0
04 Jul 2023
GlyphControl: Glyph Conditional Control for Visual Text Generation
Yukang Yang
Dongnan Gui
Yuhui Yuan
Weicong Liang
Haisong Ding
Hang-Rui Hu
Kai Chen
DiffM
79
85
0
29 May 2023
TextDiffuser: Diffusion Models as Text Painters
Jingye Chen
Yupan Huang
Tengchao Lv
Lei Cui
Qifeng Chen
Furu Wei
110
126
0
18 May 2023
Improving Diffusion Models for Scene Text Editing with Dual Encoders
Jiabao Ji
Guanhua Zhang
Zhaowen Wang
Bairu Hou
Zhifei Zhang
Brian L. Price
Shiyu Chang
DiffM
63
31
0
12 Apr 2023
Composer: Creative and Controllable Image Synthesis with Composable Conditions
Lianghua Huang
Di Chen
Yu Liu
Yujun Shen
Deli Zhao
Jingren Zhou
DiffM
66
289
0
20 Feb 2023
T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
Chong Mou
Xintao Wang
Liangbin Xie
Yanze Wu
Shuai Liu
Zhongang Qi
Ying Shan
Xiaohu Qie
DiffM
135
1,030
0
16 Feb 2023
Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang
Anyi Rao
Maneesh Agrawala
AI4CE
182
4,175
1
10 Feb 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
429
4,642
0
30 Jan 2023
Character-Aware Models Improve Visual Text Rendering
Rosanne Liu
Daniel H Garrette
Chitwan Saharia
William Chan
Adam Roberts
Sharan Narang
Irina Blok
R. Mical
Mohammad Norouzi
Noah Constant
VLM
93
74
0
20 Dec 2022
Diff-Font: Diffusion Model for Robust One-Shot Font Generation
Haibin He
Xinyuan Chen
Chaoyue Wang
Juhua Liu
Bo Du
Dacheng Tao
Yu Qiao
DiffM
92
38
0
12 Dec 2022
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz
Yuanzhen Li
Varun Jampani
Yael Pritch
Michael Rubinstein
Kfir Aberman
279
2,891
0
25 Aug 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
206
1,789
0
02 Aug 2022
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Rinon Gal
Yuval Alaluf
Yuval Atzmon
Or Patashnik
Amit H. Bermano
Gal Chechik
Daniel Cohen-Or
164
1,897
0
02 Aug 2022
PP-OCRv3: More Attempts for the Improvement of Ultra Lightweight OCR System
Chenxia Li
Weiwei Liu
Ruoyu Guo
Xiaoyue Yin
Kaitao Jiang
...
Lingfeng Zhu
Baohua Lai
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
90
113
0
07 Jun 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
413
6,916
0
13 Apr 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
496
15,768
0
20 Dec 2021
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
VLM
MLLM
CLIP
243
1,444
0
03 Nov 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Jack Hessel
Ari Holtzman
Maxwell Forbes
Ronan Le Bras
Yejin Choi
CLIP
169
1,588
0
18 Apr 2021
Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts
Song Park
Sanghyuk Chun
Junbum Cha
Bado Lee
Hyunjung Shim
80
66
0
02 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
978
29,871
0
26 Feb 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
418
5,000
0
24 Feb 2021
Few-shot Font Generation with Localized Style Representations and Factorization
Song Park
Sanghyuk Chun
Junbum Cha
Bado Lee
Hyunjung Shim
68
76
0
23 Sep 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
721
18,364
0
19 Jun 2020
Decision-Making with Auto-Encoding Variational Bayes
Romain Lopez
Pierre Boyeau
Nir Yosef
Michael I. Jordan
Jeffrey Regier
BDL
521
10,591
0
17 Feb 2020
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
1.9K
77,441
0
18 May 2015