ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2311.16465
  4. Cited By
TextDiffuser-2: Unleashing the Power of Language Models for Text
  Rendering

TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering

28 November 2023
Jingye Chen
Yupan Huang
Tengchao Lv
Lei Cui
Qifeng Chen
Furu Wei
    DiffM
ArXiv (abs)PDFHTML

Papers citing "TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering"

50 / 62 papers shown
Title
STRICT: Stress Test of Rendering Images Containing Text
STRICT: Stress Test of Rendering Images Containing Text
Tianyu Zhang
Xinyu Wang
Zhenghan Tai
Lu Li
Jijun Chi
Jingrui Tian
Hailin He
Suyuchen Wang
62
0
0
25 May 2025
Syn3DTxt: Embedding 3D Cues for Scene Text Generation
Syn3DTxt: Embedding 3D Cues for Scene Text Generation
Li-Syun Hsiung
Jun-Kai Tu
Kuan-Wu Chu
Yu-Hsuan Chiu
Yan-Tsung Peng
Sheng-Luen Chung
Gee-Sern Jison Hsu
48
0
0
24 May 2025
LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images?
LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images?
Maoyuan Ye
Jing Zhang
Juhua Liu
Bo Du
Dacheng Tao
LRM
170
0
0
18 May 2025
ViMo: A Generative Visual GUI World Model for App Agents
ViMo: A Generative Visual GUI World Model for App Agents
Dezhao Luo
Bohan Tang
Kang Li
Georgios Papoudakis
Jifei Song
S. Gong
Haifeng Zhang
Jun Wang
Kun Shao
LM&RoVGen
167
1
0
15 Apr 2025
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
Bowen Jiang
Yuan Yuan
Xinyi Bai
Zhuoqun Hao
Alyson Yin
Yaojie Hu
Wenyu Liao
Lyle Ungar
Camillo J Taylor
DiffM
110
2
0
16 Feb 2025
Beyond Flat Text: Dual Self-inherited Guidance for Visual Text Generation
Beyond Flat Text: Dual Self-inherited Guidance for Visual Text Generation
Minxing Luo
Zixun Xia
L. Chen
Zhenhang Li
Weichao Zeng
Jinqiao Wang
Wentao Cheng
Yaxing Wang
Yu Zhou
Jian Yang
DiffM
132
1
0
10 Jan 2025
Type-R: Automatically Retouching Typos for Text-to-Image Generation
Type-R: Automatically Retouching Typos for Text-to-Image Generation
Wataru Shimoda
Naoto Inoue
Daichi Haraguchi
Hayato Mitani
S. Uchida
Kota Yamaguchi
DiffM
205
0
0
27 Nov 2024
JoyType: A Robust Design for Multilingual Visual Text Creation
JoyType: A Robust Design for Multilingual Visual Text Creation
Chao Li
Chen Jiang
Xiaolong Liu
Jun Zhao
Guoxin Wang
DiffM
109
7
0
26 Sep 2024
GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models
GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models
Jian Ma
Yonglin Deng
Chen Chen
H. Lu
Zhenyu Yang
Zhenyu Yang
VLMDiffM
162
10
0
02 Jul 2024
TextCenGen: Attention-Guided Text-Centric Background Adaptation for Text-to-Image Generation
TextCenGen: Attention-Guided Text-Centric Background Adaptation for Text-to-Image Generation
Tianyi Liang
Jiangqi Liu
Sicheng Song
Shiqi Jiang
Yifei Huang
Changbo Wang
Chenhui Li
138
0
0
18 Apr 2024
LayoutPrompter: Awaken the Design Ability of Large Language Models
LayoutPrompter: Awaken the Design Ability of Large Language Models
Jiawei Lin
Jiaqi Guo
Shizhao Sun
Z. Yang
Jian-Guang Lou
Dongmei Zhang
VLM
75
25
0
11 Nov 2023
Ferret: Refer and Ground Anything Anywhere at Any Granularity
Ferret: Refer and Ground Anything Anywhere at Any Granularity
Haoxuan You
Haotian Zhang
Zhe Gan
Xianzhi Du
Bowen Zhang
Zirui Wang
Liangliang Cao
Shih-Fu Chang
Yinfei Yang
ObjDMLLMVLM
113
328
0
11 Oct 2023
PixArt-$α$: Fast Training of Diffusion Transformer for
  Photorealistic Text-to-Image Synthesis
PixArt-ααα: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Junsong Chen
Jincheng Yu
Chongjian Ge
Lewei Yao
Enze Xie
...
Zhongdao Wang
James T. Kwok
Ping Luo
Huchuan Lu
Zhenguo Li
DiffM
107
460
0
30 Sep 2023
Kosmos-2.5: A Multimodal Literate Model
Kosmos-2.5: A Multimodal Literate Model
Tengchao Lv
Yupan Huang
Jingye Chen
Lei Cui
Shuming Ma
...
Weiyao Luo
Shaoxiang Wu
Guoxin Wang
Cha Zhang
Furu Wei
VLMMLLM
114
66
0
20 Sep 2023
Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through
  Image-IDS Aligning
Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning
Haiyang Yu
Xiaocong Wang
Bin Li
Xiangyang Xue
VLM
76
20
0
03 Sep 2023
Position-Enhanced Visual Instruction Tuning for Multimodal Large
  Language Models
Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models
Chi Chen
Ruoyu Qin
Ziyue Wang
Xiaoyue Mi
Peng Li
Maosong Sun
Yang Liu
MLLMVLM
71
45
0
25 Aug 2023
RegionBLIP: A Unified Multi-modal Pre-training Framework for Holistic
  and Regional Comprehension
RegionBLIP: A Unified Multi-modal Pre-training Framework for Holistic and Regional Comprehension
Qiang-feng Zhou
Chaohui Yu
Shaofeng Zhang
Sitong Wu
Zhibin Wang
Fan Wang
74
27
0
03 Aug 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MHALM
413
12,076
0
18 Jul 2023
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Shilong Zhang
Pei Sun
Shoufa Chen
Min Xiao
Wenqi Shao
Wenwei Zhang
Yu Liu
Kai-xiang Chen
Ping Luo
MLLMVLM
154
238
0
07 Jul 2023
SDXL: Improving Latent Diffusion Models for High-Resolution Image
  Synthesis
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Dustin Podell
Zion English
Kyle Lacey
A. Blattmann
Tim Dockhorn
Jonas Muller
Joe Penna
Robin Rombach
264
2,450
0
04 Jul 2023
Kosmos-2: Grounding Multimodal Large Language Models to the World
Kosmos-2: Grounding Multimodal Large Language Models to the World
Zhiliang Peng
Wenhui Wang
Li Dong
Y. Hao
Shaohan Huang
Shuming Ma
Furu Wei
MLLMObjDVLM
121
764
0
26 Jun 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALMOSLMELM
452
4,444
0
09 Jun 2023
GlyphControl: Glyph Conditional Control for Visual Text Generation
GlyphControl: Glyph Conditional Control for Visual Text Generation
Yukang Yang
Dongnan Gui
Yuhui Yuan
Weicong Liang
Haisong Ding
Hang-Rui Hu
Kai Chen
DiffM
81
85
0
29 May 2023
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Shihao Zhao
Dongdong Chen
Yen-Chun Chen
Jianmin Bao
Shaozhe Hao
Lu Yuan
Kwan-Yee K. Wong
109
267
0
25 May 2023
LayoutGPT: Compositional Visual Planning and Generation with Large
  Language Models
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Weixi Feng
Wanrong Zhu
Tsu-Jui Fu
Varun Jampani
Arjun Reddy Akula
Xuehai He
Sugato Basu
Xinze Wang
William Yang Wang
MLLM
86
179
0
24 May 2023
TextDiffuser: Diffusion Models as Text Painters
TextDiffuser: Diffusion Models as Text Painters
Jingye Chen
Yupan Huang
Tengchao Lv
Lei Cui
Qifeng Chen
Furu Wei
139
126
0
18 May 2023
Diffusion-based Document Layout Generation
Diffusion-based Document Layout Generation
Liu He
Yijuan Lu
John Corring
D. Florêncio
Cha Zhang
DiffM
50
22
0
19 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALMPILM
1.5K
13,472
0
27 Feb 2023
Adding Conditional Control to Text-to-Image Diffusion Models
Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang
Anyi Rao
Maneesh Agrawala
AI4CE
184
4,180
1
10 Feb 2023
GLIGEN: Open-Set Grounded Text-to-Image Generation
GLIGEN: Open-Set Grounded Text-to-Image Generation
Yuheng Li
Haotian Liu
Qingyang Wu
Fangzhou Mu
Jianwei Yang
Jianfeng Gao
Chunyuan Li
Yong Jae Lee
VLM
131
602
1
17 Jan 2023
Character-Aware Models Improve Visual Text Rendering
Character-Aware Models Improve Visual Text Rendering
Rosanne Liu
Daniel H Garrette
Chitwan Saharia
William Chan
Adam Roberts
Sharan Narang
Irina Blok
R. Mical
Mohammad Norouzi
Noah Constant
VLM
105
74
0
20 Dec 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert
  Denoisers
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLMMoE
177
831
0
02 Nov 2022
Scene Text Recognition with Permuted Autoregressive Sequence Models
Scene Text Recognition with Permuted Autoregressive Sequence Models
Darwin Bautista
Rowel Atienza
105
173
0
14 Jul 2022
A Unified Sequence Interface for Vision Tasks
A Unified Sequence Interface for Vision Tasks
Ting-Li Chen
Saurabh Saxena
Lala Li
Nayeon Lee
David J. Fleet
Geoffrey E. Hinton
VLMMLLM
74
151
0
15 Jun 2022
Discovering the Hidden Vocabulary of DALLE-2
Discovering the Hidden Vocabulary of DALLE-2
Giannis Daras
A. Dimakis
183
68
0
01 Jun 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language
  Understanding
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
466
6,083
0
23 May 2022
Benchmarking Chinese Text Recognition: Datasets, Baselines, and an
  Empirical Study
Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study
Haiyang Yu
Jingye Chen
Bin Li
Jianqi Ma
Mengnan Guan
Xixi Xu
Xiaocong Wang
Shaobo Qu
Xiangyang Xue
89
56
0
30 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
505
15,788
0
20 Dec 2021
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Shuyang Gu
Dong Chen
Jianmin Bao
Fang Wen
Bo Zhang
Dongdong Chen
Lu Yuan
B. Guo
DiffM
169
799
0
29 Nov 2021
Palette: Image-to-Image Diffusion Models
Palette: Image-to-Image Diffusion Models
Chitwan Saharia
William Chan
Huiwen Chang
Chris A. Lee
Jonathan Ho
Tim Salimans
David J. Fleet
Mohammad Norouzi
DiffMVLM
486
1,654
0
10 Nov 2021
Pix2seq: A Language Modeling Framework for Object Detection
Pix2seq: A Language Modeling Framework for Object Detection
Ting-Li Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLMViTVLM
279
350
0
22 Sep 2021
TrOCR: Transformer-based Optical Character Recognition with Pre-trained
  Models
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
Minghao Li
Tengchao Lv
Jingye Chen
Lei Cui
Yijuan Lu
D. Florêncio
Cha Zhang
Zhoujun Li
Furu Wei
ViT
246
372
0
21 Sep 2021
ByT5: Towards a token-free future with pre-trained byte-to-byte models
ByT5: Towards a token-free future with pre-trained byte-to-byte models
Linting Xue
Aditya Barua
Noah Constant
Rami Al-Rfou
Sharan Narang
Mihir Kale
Adam Roberts
Colin Raffel
104
506
0
28 May 2021
MOST: A Multi-Oriented Scene Text Detector with Localization Refinement
MOST: A Multi-Oriented Scene Text Detector with Localization Refinement
Minghang He
Minghui Liao
Zhibo Yang
Humen Zhong
Jun Tang
Wenqing Cheng
Cong Yao
Yongpan Wang
X. Bai
87
75
0
02 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIPVLM
1.0K
29,926
0
26 Feb 2021
Denoising Diffusion Implicit Models
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLMDiffM
304
7,500
0
06 Oct 2020
LayoutTransformer: Layout Generation and Completion with Self-attention
LayoutTransformer: Layout Generation and Completion with Self-attention
Kamal Gupta
Justin Lazarow
Alessandro Achille
Larry S. Davis
Vijay Mahadevan
Abhinav Shrivastava
ViT
91
137
0
25 Jun 2020
Denoising Diffusion Probabilistic Models
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
759
18,408
0
19 Jun 2020
Real-time Scene Text Detection with Differentiable Binarization
Real-time Scene Text Detection with Differentiable Binarization
Minghui Liao
Zhaoyi Wan
Cong Yao
Kai Chen
X. Bai
83
683
0
20 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text
  Transformer
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
503
20,376
0
23 Oct 2019
12
Next