ResearchTrend.AI
Zero-Shot Text-to-Image Generation

24 February 2021
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
    VLM

Papers citing "Zero-Shot Text-to-Image Generation"

50 / 212 papers shown
Conjuring Semantic Similarity
Tian Yu Liu
Stefano Soatto
DiffM
151
0
0
21 Oct 2024
Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation
Anh-Vu Bui
L. Vuong
Khanh Doan
Trung Le
Paul Montague
Tamas Abraham
Dinh Q. Phung
KELM
DiffM
76
12
0
21 Oct 2024
Triplane Grasping: Efficient 6-DoF Grasping with Single RGB Images
Yiming Li
Hanchi Ren
Yue Yang
Jingjing Deng
Xianghua Xie
95
0
0
21 Oct 2024
FoMo: A Foundation Model for Mobile Traffic Forecasting with Diffusion Model
Haoye Chai
Shiyuan Zhang
Xiaoqian Qi
Yong Li
130
1
0
20 Oct 2024
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models
Donghao Zhou
Jiancheng Huang
J. Bai
Jiaze Wang
Hao Chen
Guangyong Chen
Xiaowei Hu
Pheng Ann Heng
84
5
0
17 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Mitsuhiko Nakamoto
Oier Mees
Aviral Kumar
Sergey Levine
OffRL
115
16
0
17 Oct 2024
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
130
29
0
03 Oct 2024
ControlAR: Controllable Image Generation with Autoregressive Models
Zongming Li
Tianheng Cheng
Shoufa Chen
Peize Sun
Haocheng Shen
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
DiffM
197
19
0
03 Oct 2024
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Yao Teng
Han Shi
Xian Liu
Xuefei Ning
Guohao Dai
Yu Wang
Zhenguo Li
Xihui Liu
91
14
0
02 Oct 2024
KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models
Pouyan Navard
Amin Karimi Monsefi
Mengxi Zhou
Wei-Lun Chao
Alper Yilmaz
R. Ramnath
DiffM
95
3
0
02 Oct 2024
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Jing He
Haodong Li
Wei Yin
Yixun Liang
Leheng Li
Kaiqiang Zhou
Hongbo Zhang
Bingbing Liu
Ying-Cong Chen
DiffM
VLM
157
52
0
26 Sep 2024
StackGen: Generating Stable Structures from Silhouettes via Diffusion
Luzhe Sun
Takuma Yoneda
Samuel Wheeler
Tianchong Jiang
Matthew R. Walter
DiffM
147
1
0
26 Sep 2024
GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design
Phillip Mueller
Sebastian Mueller
Lars Mikelsons
65
2
0
25 Sep 2024
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin
Xinyu Wei
Renrui Zhang
Le Zhuo
Shitian Zhao
...
Junlin Xie
Yu Qiao
Peng Gao
Hongsheng Li
MLLM
DiffM
146
13
0
23 Sep 2024
JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images
Zhecan Wang
Junzhang Liu
Chia-Wei Tang
Hani Alomari
Anushka Sivakumar
...
Haoxuan You
A. Ishmam
Kai-Wei Chang
Shih-Fu Chang
Chris Thomas
CoGe
VLM
127
2
0
19 Sep 2024
Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning
Amin Karimi Monsefi
Mengxi Zhou
Nastaran Karimi Monsefi
Ser-Nam Lim
Wei-Lun Chao
R. Ramnath
109
1
0
16 Sep 2024
TextureDiffusion: Target Prompt Disentangled Editing for Various Texture Transfer
Zihan Su
Junhao Zhuang
Chun Yuan
DiffM
91
0
0
15 Sep 2024
Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?
Yiwen Guan
V. Trinh
Vivek Voleti
Jacob Whitehill
89
1
0
13 Sep 2024
LT3SD: Latent Trees for 3D Scene Diffusion
Quan Meng
Lei Li
Matthias Nießner
Angela Dai
138
13
0
12 Sep 2024
What to align in multimodal contrastive learning?
Benoit Dufumier
J. Castillo-Navarro
D. Tuia
Jean-Philippe Thiran
118
4
0
11 Sep 2024
GenCAD: Image-Conditioned Computer-Aided Design Generation with Transformer-Based Contrastive Representation and Diffusion Priors
Md Ferdous Alam
Faez Ahmed
DiffM
91
7
0
08 Sep 2024
GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers
Lorenza Prospero
Abdullah Hamdi
João F. Henriques
Christian Rupprecht
3DGS
73
3
0
06 Sep 2024
An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation
Peiming Guo
Sinuo Liu
Yanzhao Zhang
Dingkun Long
Pengjun Xie
Meishan Zhang
Hao Fei
DiffM
123
1
0
16 Aug 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Ping Luo
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
141
54
0
05 Aug 2024
Neural Network Emulator for Atmospheric Chemical ODE
Zhi-Song Liu
Petri S. Clusius
Michael Boy
92
3
0
03 Aug 2024
Exploring the Potentials and Challenges of Deep Generative Models in Product Design Conception
Phillip Mueller
Lars Mikelsons
AI4CE
95
3
0
15 Jul 2024
GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization
Yuxiao Chen
Xiaolin Huang
Quan Zhang
Wei Li
Mingjian Zhu
...
Hanting Chen
Hailin Hu
J. Yang
Wen Liu
Jie Hu
EGVM
100
7
0
24 Jun 2024
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
Yuang Peng
Yuxin Cui
Haomiao Tang
Zekun Qi
Runpei Dong
Jing Bai
Chunrui Han
Zheng Ge
Xiangyu Zhang
Shu-Tao Xia
EGVM
131
35
0
24 Jun 2024
Blind Baselines Beat Membership Inference Attacks for Foundation Models
Debeshee Das
Jie Zhang
Florian Tramèr
MIALM
145
36
1
23 Jun 2024
Generative Topological Networks
Alona Levy-Jurgenson
Z. Yakhini
80
0
0
21 Jun 2024
Adding Conditional Control to Diffusion Models with Reinforcement Learning
Yulai Zhao
Masatoshi Uehara
Gabriele Scalia
Tommaso Biancalani
Sergey Levine
Ehsan Hajiramezanali
AI4CE
105
6
0
17 Jun 2024
We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs
Joseph Spracklen
Raveen Wijewickrama
A. H. M. N. Sakib
Anindya Maiti
Murtuza Jadliwala
105
12
0
12 Jun 2024
Interpreting the Second-Order Effects of Neurons in CLIP
Yossi Gandelsman
Alexei A. Efros
Jacob Steinhardt
MILM
110
22
0
06 Jun 2024
Training-efficient density quantum machine learning
Brian Coyle
El Amine Cherrat
Nishant Jain
Natansh Mathur
Snehal Raj
Skander Kazdaghli
Iordanis Kerenidis
89
5
0
30 May 2024
Learning diverse attacks on large language models for robust red-teaming and safety tuning
Seanie Lee
Minsu Kim
Lynn Cherif
David Dobre
Juho Lee
...
Kenji Kawaguchi
Gauthier Gidel
Yoshua Bengio
Nikolay Malkin
Moksh Jain
AAML
114
19
0
28 May 2024
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
Harshit Varma
Dheeraj M. Nagaraj
Karthikeyan Shanmugam
VLM
143
3
0
27 May 2024
Ensembling Diffusion Models via Adaptive Feature Aggregation
Cong Wang
Kuan Tian
Yonghang Guan
Jun Zhang
Zhiwei Jiang
Fei Shen
Xiao Han
110
5
0
27 May 2024
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception
Run Luo
Yunshui Li
Longze Chen
Wanwei He
Ting-En Lin
...
Zikai Song
Xiaobo Xia
Tongliang Liu
Min Yang
Binyuan Hui
VLM
DiffM
104
22
0
24 May 2024
LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models
Seyedmorteza Sadat
Jakob Buhmann
Derek Bradley
Otmar Hilliges
Romann M. Weber
106
9
0
23 May 2024
TerDiT: Ternary Diffusion Models with Transformers
Xudong Lu
Aojun Zhou
Ziyi Lin
Qi Liu
Yuhui Xu
Renrui Zhang
Yafei Wen
Shuai Ren
Peng Gao
Junchi Yan
MQ
92
3
0
23 May 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
182
309
0
16 May 2024
Modeling Caption Diversity in Contrastive Vision-Language Pretraining
Samuel Lavoie
Polina Kirichenko
Mark Ibrahim
Mahmoud Assran
Andrew Gordon Wilson
Aaron Courville
Nicolas Ballas
CLIP
VLM
109
23
0
30 Apr 2024
Towards a Foundation Model for Partial Differential Equations: Multi-Operator Learning and Extrapolation
Jingmin Sun
Yuxuan Liu
Zecheng Zhang
Hayden Schaeffer
AI4CE
90
20
0
18 Apr 2024
GazeHTA: End-to-end Gaze Target Detection with Head-Target Association
Zhi-Yi Lin
Jouh Yeong Chew
Jan van Gemert
Xucong Zhang
136
3
0
16 Apr 2024
RankCLIP: Ranking-Consistent Language-Image Pretraining
Yiming Zhang
Zhuokai Zhao
Zhaorun Chen
Zhili Feng
Zenghui Ding
Yining Sun
SSL
VLM
90
7
0
15 Apr 2024
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Shenghai Yuan
Jinfa Huang
Yujun Shi
Yongqi Xu
Ruijie Zhu
Bin Lin
Xinhua Cheng
Li-xin Yuan
Jiebo Luo
VGen
145
35
0
07 Apr 2024
Faster Diffusion via Temporal Attention Decomposition
Haozhe Liu
Wentian Zhang
Jinheng Xie
Francesco Faccio
Mengmeng Xu
Tao Xiang
Mike Zheng Shou
Juan-Manuel Perez-Rua
Jürgen Schmidhuber
DiffM
124
22
0
03 Apr 2024
FaceXFormer: A Unified Transformer for Facial Analysis
Kartik Narayan
VS Vibashan
Rama Chellappa
Vishal M. Patel
ViT
79
13
0
19 Mar 2024
Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt
Yongqi Wang
Ruofan Hu
Rongjie Huang
Zhiqing Hong
Ruiqi Li
Wenrui Liu
Fuming You
Tao Jin
Zhou Zhao
82
12
0
18 Mar 2024
Specification Overfitting in Artificial Intelligence
Benjamin Roth
Pedro Henrique Luz de Araujo
Yuxi Xia
Saskia Kaltenbrunner
Christoph Korab
184
1
0
13 Mar 2024