ResearchTrend.AI
Zero-Shot Text-to-Image Generation

24 February 2021
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
    VLM

Papers citing "Zero-Shot Text-to-Image Generation"

50 / 212 papers shown
ZETA: Leveraging Z-order Curves for Efficient Top-k Attention
Qiuhao Zeng
Jerry Huang
Peng Lu
Gezheng Xu
Boxing Chen
Charles Ling
Boyu Wang
146
3
0
24 Jan 2025
LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps
Andrey Palaev
Adil Mehmood Khan
S. M. Ahsan Kazmi
DiffM
105
0
0
23 Jan 2025
Parameter-Efficient Fine-Tuning for Foundation Models
Dan Zhang
Tao Feng
Lilong Xue
Yuandong Wang
Yuxiao Dong
J. Tang
191
11
0
23 Jan 2025
Slot-BERT: Self-supervised Object Discovery in Surgical Video
Guiqiu Liao
M. Jogan
Marcel Hussing
Kenta Nakahashi
Kazuhiro Yasufuku
Amin Madani
Eric Eaton
Daniel A. Hashimoto
419
0
0
21 Jan 2025
Owls are wise and foxes are unfaithful: Uncovering animal stereotypes in vision-language models
Tabinda Aman
Mohammad Nadeem
S. Sohail
Mohammad Anas
Min Zhang
VLM
145
1
0
21 Jan 2025
Nested Annealed Training Scheme for Generative Adversarial Networks
Chang Wan
Ming-Hsuan Yang
Minglu Li
Yunliang Jiang
Zhonglong Zheng
GAN
96
0
0
20 Jan 2025
ProKeR: A Kernel Perspective on Few-Shot Adaptation of Large Vision-Language Models
Yassir Bendou
Amine Ouasfi
Vincent Gripon
A. Boukhayma
VLM
135
0
0
19 Jan 2025
Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP
J. Park
Jungbeom Lee
Jongyoon Song
Sangwon Yu
Dahuin Jung
Sungroh Yoon
83
2
0
19 Jan 2025
Diffusion Models in Recommendation Systems: A Survey
Ting-Ruen Wei
Yi Fang
181
2
0
17 Jan 2025
A Comprehensive Survey of Foundation Models in Medicine
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE
LM&MA
VLM
269
26
0
17 Jan 2025
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Dongwon Kim
Ju He
Qihang Yu
Chenglin Yang
Xiaohui Shen
Suha Kwak
Liang-Chieh Chen
VLM
113
7
0
13 Jan 2025
BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs
Sheng Zhang
Yanbo Xu
Naoto Usuyama
Hanwen Xu
J. Bagga
...
Carlo Bifulco
M. Lungren
Tristan Naumann
Sheng Wang
Hoifung Poon
LM&MA
MedIm
199
229
0
10 Jan 2025
Beyond Flat Text: Dual Self-inherited Guidance for Visual Text Generation
Minxing Luo
Zixun Xia
L. Chen
Zhenhang Li
Weichao Zeng
Jinqiao Wang
Wentao Cheng
Yaxing Wang
Yu Zhou
Jian Yang
DiffM
122
1
0
10 Jan 2025
INFELM: In-depth Fairness Evaluation of Large Text-To-Image Models
Di Jin
Xing Liu
Yu Liu
Jia Qing Yap
Andrea Wong
Adriana Crespo
Qi Lin
Zhiyuan Yin
Qiang Yan
Ryan Ye
EGVM
VLM
445
0
0
10 Jan 2025
Text2midi: Generating Symbolic Music from Captions
Keshav Bhandari
Abhinaba Roy
Kyra Wang
Geeta Puri
Simon Colton
Dorien Herremans
120
5
0
03 Jan 2025
RealCustom++: Representing Images as Real-Word for Real-Time Customization
Zhendong Mao
Mengqi Huang
Fei Ding
Mingcong Liu
Qian He
Xiaojun Chang
DiffM
132
6
0
03 Jan 2025
A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls
Sheikh Shafayat
Dongkeun Yoon
Woori Jang
Jiwoo Choi
Alice Oh
Seohyon Jung
173
1
0
03 Jan 2025
Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models
Yuzhu Cai
Sheng Yin
Yuxi Wei
Chenxin Xu
Weibo Mao
Felix Juefei Xu
Siheng Chen
Yanfeng Wang
EGVM
152
3
0
03 Jan 2025
Multimodal Human-Autonomous Agents Interaction Using Pre-Trained Language and Visual Foundation Models
Linus Nwankwo
Elmar Rueckert
115
2
0
31 Dec 2024
Grid Diffusion Models for Text-to-Video Generation
Taegyeong Lee
Soyeong Kwon
Taehwan Kim
117
7
0
31 Dec 2024
VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis
Zhipeng Chen
Lan Yang
Yonggang Qi
Honggang Zhang
Kaiyue Pang
Ke Li
Yi-Zhe Song
DiffM
137
0
0
31 Dec 2024
Multi-Agent Planning Using Visual Language Models
Michele Brienza
F. Argenziano
Vincenzo Suriani
D. Bloisi
Daniele Nardi
LM&Ro
LLMAG
117
4
0
31 Dec 2024
Is Your Image a Good Storyteller?
Xiujie Song
Xiaoyi Pang
Haifeng Tang
Mengyue Wu
Kenny Q. Zhu
81
0
0
29 Dec 2024
Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation
Quan Dao
Hao Phung
T. Dao
Dimitris Metaxas
Anh Tran
147
1
0
22 Dec 2024
Parallelized Autoregressive Visual Generation
Yanjie Wang
Shuhuai Ren
Zhijie Lin
Yujin Han
Haoyuan Guo
Zhenheng Yang
Difan Zou
Jiashi Feng
Xihui Liu
VGen
163
12
0
19 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
238
10
0
19 Dec 2024
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
Hong Chen
Zihan Wang
Xianrui Li
Xingwu Sun
Fangyi Chen
Jiang Liu
Jiadong Wang
Bhiksha Raj
Zicheng Liu
Emad Barsoum
VLM
204
8
0
14 Dec 2024
Mojito: Motion Trajectory and Intensity Control for Video Generation
Xuehai He
Shuohang Wang
Jianwei Yang
Xiaoxia Wu
Yansen Wang
Kuan-Chieh Wang
Z. Zhan
Olatunji Ruwase
Yelong Shen
Xinze Wang
VGen
203
2
0
12 Dec 2024
DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation
Qu He
Jinlong Peng
P. Xu
Boyuan Jiang
Xiaobin Hu
...
Yang Liu
Yun Wang
Chengjie Wang
Xuelong Li
Jing Zhang
DiffM
169
1
0
04 Dec 2024
Seamless Optical Cloud Computing across Edge-Metro Network for Generative AI
Sizhe Xing
Aolong Sun
Chengxi Wang
Yizhi Wang
Boyu Dong
...
Xi Xiao
R. Penty
Qixiang Cheng
Nan Chi
Junwen Zhang
149
0
0
04 Dec 2024
SerialGen: Personalized Image Generation by First Standardization Then Personalization
Cong Xie
Han Zou
Ruiqi Yu
Yan Zhang
Zhenpeng Zhan
124
1
0
02 Dec 2024
IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models
Khaled Abud
Sergey Lavrushkin
Alexey Kirillov
D. Vatolin
156
0
0
02 Dec 2024
Improving Object Detection by Modifying Synthetic Data with Explainable AI
Nitish Mital
Simon Malzard
Richard Walters
Celso M. De Melo
Raghuveer Rao
Victoria Nockles
123
0
0
02 Dec 2024
Continuous Concepts Removal in Text-to-image Diffusion Models
Tingxu Han
Weisong Sun
Yanrong Hu
Chunrong Fang
Yonglong Zhang
Shiqing Ma
Tao Zheng
Zhenyu Chen
Zhenting Wang
DiffM
166
3
0
30 Nov 2024
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
131
1
0
28 Nov 2024
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads
Siqi Kou
Jiachun Jin
Chang Liu
Ye Ma
Jian Jia
Quan Chen
Peng Jiang
Zhijie Deng
DiffM
VGen
VLM
194
9
0
28 Nov 2024
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis
Bo Liu
K. Zou
Liming Zhan
Zexin Lu
Xiaoyu Dong
Yidi Chen
Chengqiang Xie
Jiannong Cao
Xiao-Ming Wu
Huazhu Fu
174
2
0
25 Nov 2024
SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE
Yongwei Chen
Yushi Lan
Shangchen Zhou
Tengfei Wang
Xingang Pan
193
6
0
25 Nov 2024
TKG-DM: Training-free Chroma Key Content Generation Diffusion Model
Ryugo Morita
Stanislav Frolov
Brian B. Moser
Takahiro Shirakawa
Ko Watanabe
Andreas Dengel
Jinjia Zhou
DiffM
129
0
0
23 Nov 2024
Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward
Zhiwei Jia
Yuesong Nan
Huixi Zhao
Gengdai Liu
EGVM
157
1
0
22 Nov 2024
Few-Shot Task Learning through Inverse Generative Modeling
Aviv Netanyahu
Yilun Du
Antonia Bronars
Jyothish Pari
J. Tenenbaum
Tianmin Shu
Pulkit Agrawal
115
4
0
07 Nov 2024
TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation
Sunjae Yoon
Gwanhyeong Koo
Younghwan Lee
Chang D. Yoo
VGen
117
5
0
31 Oct 2024
Diffusion Beats Autoregressive: An Evaluation of Compositional Generation in Text-to-Image Models
Arash Marioriyad
Parham Rezaei
M. Baghshah
M. Rohban
CoGe
415
0
0
30 Oct 2024
One Prompt to Verify Your Models: Black-Box Text-to-Image Models Verification via Non-Transferable Adversarial Attacks
Ji Guo
Wenbo Jiang
Rui Zhang
Guoming Lu
Hongwei Li
AAML
103
0
0
30 Oct 2024
Bringing NeRFs to the Latent Space: Inverse Graphics Autoencoder
Antoine Schnepf
Karim Kassab
Jean-Yves Franceschi
Laurent Caraffa
Flavian Vasile
Jeremie Mary
Andrew Comport
Valérie Gouet-Brunet
129
2
0
30 Oct 2024
Attention Overlap Is Responsible for The Entity Missing Problem in Text-to-image Diffusion Models!
Arash Marioriyad
Mohammadali Banayeeanzade
Reza Abbasi
M. Rohban
M. Baghshah
DiffM
109
2
0
28 Oct 2024
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
Shuhao Gu
Jialing Zhang
Siyuan Zhou
Kevin Yu
Zhaohu Xing
...
Yufeng Cui
Xinlong Wang
Yaoqi Liu
Fangxiang Feng
Guang Liu
SyDa
VLM
MLLM
81
26
0
24 Oct 2024
Structure Language Models for Protein Conformation Generation
Jiarui Lu
Xiaoyin Chen
Stephen Zhewen Lu
Chence Shi
Hongyu Guo
Yoshua Bengio
Xiangbo Shu
DiffM
77
3
0
24 Oct 2024
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Shansan Gong
Shivam Agarwal
Yizhe Zhang
Jiacheng Ye
Lin Zheng
...
Peilin Zhao
W. Bi
Jiawei Han
Hao Peng
Dianbo Sui
AI4CE
111
25
0
23 Oct 2024
TopoDiffusionNet: A Topology-aware Diffusion Model
Saumya Gupta
Dimitris Samaras
Chong Chen
DiffM
113
4
0
22 Oct 2024