2105.13290
Cited By
CogView: Mastering Text-to-Image Generation via Transformers
26 May 2021
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
Da Yin
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
ViT
VLM
Papers citing
"CogView: Mastering Text-to-Image Generation via Transformers"
50 / 540 papers shown
Title
FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis
Linjiang Huang
Rongyao Fang
Aiping Zhang
Guanglu Song
Si Liu
Yu Liu
Hongsheng Li
DiffM
38
22
0
19 Mar 2024
LASPA: Latent Spatial Alignment for Fast Training-free Single Image Editing
Yazeed Alharbi
Peter Wonka
DiffM
40
0
0
19 Mar 2024
Can AI Outperform Human Experts in Creating Social Media Creatives?
Eunkyung Park
Raymond K. Wong
Junbum Kwon
44
0
0
19 Mar 2024
Just Say the Name: Online Continual Learning with Category Names Only via Data Generation
Minhyuk Seo
Diganta Misra
Seongwon Cho
Minjae Lee
Jonghyun Choi
CLL
44
7
0
16 Mar 2024
Codebook Transfer with Part-of-Speech for Vector-Quantized Image Modeling
Baoquan Zhang
Huaibin Wang
Chuyao Luo
Xutao Li
Guotao Liang
Yunming Ye
Xiaochen Qi
Yao He
40
11
0
15 Mar 2024
Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
Shihao Zhao
Shaozhe Hao
Bojia Zi
Huaizhe Xu
Kwan-Yee K. Wong
DiffM
VLM
68
8
0
12 Mar 2024
Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers
Subhadeep Koley
A. Bhunia
Aneeshan Sain
Pinaki Nath Chowdhury
Tao Xiang
Yi-Zhe Song
DiffM
54
7
0
12 Mar 2024
FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation
Pengchong Qiao
Lei Shang
Chang-Shu Liu
Baigui Sun
Xiang Ji
Jie Chen
CVBM
38
3
0
11 Mar 2024
CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion
Wendi Zheng
Jiayan Teng
Zhuoyi Yang
Weihan Wang
Jidong Chen
Xiaotao Gu
Yuxiao Dong
Ming Ding
Jie Tang
DiffM
35
35
0
08 Mar 2024
Pix2Gif: Motion-Guided Diffusion for GIF Generation
Hitesh Kandala
Jianfeng Gao
Jianwei Yang
VGen
DiffM
33
4
0
07 Mar 2024
Discriminative Probing and Tuning for Text-to-Image Generation
Leigang Qu
Wenjie Wang
Yongqi Li
Hanwang Zhang
Liqiang Nie
Tat-Seng Chua
46
7
0
07 Mar 2024
PromptCharm: Text-to-Image Generation through Multi-modal Prompting and Refinement
Zhijie Wang
Yuheng Huang
Da Song
Lei Ma
Tianyi Zhang
DiffM
50
57
0
06 Mar 2024
Text-guided Explorable Image Super-resolution
Kanchana Vaishnavi Gandikota
Paramanand Chandramouli
48
7
0
02 Mar 2024
LLMBind: A Unified Modality-Task Integration Framework
Bin Zhu
Munan Ning
Peng Jin
Bin Lin
Jinfa Huang
...
Junwu Zhang
Zhenyu Tang
Mingjun Pan
Xing Zhou
Li-ming Yuan
MLLM
40
6
0
22 Feb 2024
Contrastive Prompts Improve Disentanglement in Text-to-Image Diffusion Models
C. Wu
Fernando de la Torre
DiffM
29
2
0
21 Feb 2024
MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis
Dewei Zhou
You Li
Fan Ma
Zongxin Yang
Yi Yang
DiffM
30
57
0
08 Feb 2024
λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
Maitreya Patel
Sangmin Jung
Chitta Baral
Yezhou Yang
VLM
31
29
0
07 Feb 2024
Text2Street: Controllable Text-to-image Generation for Street Views
Jinming Su
Songen Gu
Yiting Duan
Xing-zhen Chen
Junfeng Luo
DiffM
58
5
0
07 Feb 2024
A Survey on Generative AI and LLM for Video Generation, Understanding, and Streaming
Pengyuan Zhou
Lin Wang
Zhi Liu
Yanbin Hao
Pan Hui
Sasu Tarkoma
J. Kangasharju
VGen
48
26
0
30 Jan 2024
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Weijiao Zhang
Jindong Han
Zhao Xu
Hang Ni
Hao Liu
Hui Xiong
AI4CE
79
15
0
30 Jan 2024
Image-Text Out-Of-Context Detection Using Synthetic Multimodal Misinformation
Fatma Shalabi
H. Nguyen
Hichem Felouat
Ching-Chun Chang
Isao Echizen
40
5
0
29 Jan 2024
Taiyi-Diffusion-XL: Advancing Bilingual Text-to-Image Generation with Large Vision-Language Model Support
Xiaojun Wu
Di Zhang
Ruyi Gan
Junyu Lu
Ziwei Wu
Renliang Sun
Jiaxing Zhang
Pingjian Zhang
Yan Song
VLM
34
6
0
26 Jan 2024
Generative Human Motion Stylization in Latent Space
Chuan Guo
Yuxuan Mu
Wei Ji
Peng Dai
Youliang Yan
Juwei Lu
Li Cheng
VGen
38
10
0
24 Jan 2024
Benchmarking Large Multimodal Models against Common Corruptions
Jiawei Zhang
Tianyu Pang
Chao Du
Yi Ren
Bo-wen Li
Min Lin
MLLM
35
14
0
22 Jan 2024
Text-to-Image Cross-Modal Generation: A Systematic Review
Maciej Żelaszczyk
Jacek Mańdziuk
35
3
0
21 Jan 2024
MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer
Changyao Tian
Xizhou Zhu
Yuwen Xiong
Weiyun Wang
Zhe Chen
...
Tong Lu
Jie Zhou
Hongsheng Li
Yu Qiao
Jifeng Dai
AuLLM
85
42
0
18 Jan 2024
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
Xiaofeng Wang
Zheng Zhu
Guan Huang
Boyuan Wang
Xinze Chen
Jiwen Lu
VGen
40
32
0
18 Jan 2024
A New Creative Generation Pipeline for Click-Through Rate with Stable Diffusion Model
Hao Yang
Jianxin Yuan
Shuai Yang
Linhe Xu
Shuo Yuan
Yifan Zeng
28
11
0
17 Jan 2024
Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data
Yuhui Zhang
Elaine Sui
Serena Yeung-Levy
44
9
0
16 Jan 2024
ModaVerse: Efficiently Transforming Modalities with LLMs
Xinyu Wang
Bohan Zhuang
Qi Wu
14
11
0
12 Jan 2024
xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein
Bo Chen
Xingyi Cheng
Pan Li
Yangli-ao Geng
Jing Gong
...
Chiming Liu
Aohan Zeng
Yuxiao Dong
Jie Tang
Leo T. Song
42
101
0
11 Jan 2024
Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy
Weijian Mai
Jian Zhang
Pengfei Fang
Zhijun Zhang
56
9
0
31 Dec 2023
Improving Image Restoration through Removing Degradations in Textual Representations
Jingbo Lin
Zhilu Zhang
Yuxiang Wei
Dongwei Ren
Dongsheng Jiang
Wangmeng Zuo
34
26
0
28 Dec 2023
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
Jiasen Lu
Christopher Clark
Sangho Lee
Zichen Zhang
Savya Khosla
Ryan Marten
Derek Hoiem
Aniruddha Kembhavi
VLM
MLLM
40
147
0
28 Dec 2023
Spike No More: Stabilizing the Pre-training of Large Language Models
Sho Takase
Shun Kiyono
Sosuke Kobayashi
Jun Suzuki
20
14
0
28 Dec 2023
ChartBench: A Benchmark for Complex Visual Reasoning in Charts
Zhengzhuo Xu
Sinan Du
Yiyan Qi
Chengjin Xu
Chun Yuan
Jian Guo
42
36
0
26 Dec 2023
Cross Initialization for Personalized Text-to-Image Generation
Lianyu Pang
Jian Yin
Haoran Xie
Qiping Wang
Qing Li
Xudong Mao
DiffM
41
7
0
26 Dec 2023
Tuning-Free Inversion-Enhanced Control for Consistent Image Editing
Xiaoyue Duan
Shuhao Cui
Guoliang Kang
Baochang Zhang
Zhengcong Fei
Mingyuan Fan
Junshi Huang
DiffM
39
8
0
22 Dec 2023
Emage: Non-Autoregressive Text-to-Image Generation
Zhangyin Feng
Runyi Hu
Liangxin Liu
Fan Zhang
Duyu Tang
Yong Dai
Xiaocheng Feng
Jiwei Li
Bing Qin
Shuming Shi
DiffM
VLM
28
0
0
22 Dec 2023
Generative AI Beyond LLMs: System Implications of Multi-Modal Generation
Alicia Golden
Samuel Hsia
Fei Sun
Bilge Acun
Basil Hosmer
...
Zachary DeVito
Jeff Johnson
Gu-Yeon Wei
David Brooks
Carole-Jean Wu
VLM
DiffM
37
8
0
22 Dec 2023
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Dan Kondratyuk
Lijun Yu
Xiuye Gu
José Lezama
Jonathan Huang
...
Irfan Essa
Huisheng Wang
David A. Ross
Bryan Seybold
Lu Jiang
VGen
20
240
0
21 Dec 2023
DreamTuner: Single Image is Enough for Subject-Driven Generation
Miao Hua
Jiawei Liu
Fei Ding
Wei Liu
Jie Wu
Qian He
28
28
0
21 Dec 2023
Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models
Shweta Mahajan
Tanzila Rahman
Kwang Moo Yi
Leonid Sigal
DiffM
43
17
0
19 Dec 2023
CogCartoon: Towards Practical Story Visualization
Zhongyang Zhu
Jie Tang
DiffM
32
3
0
17 Dec 2023
A Survey of Generative AI for Intelligent Transportation Systems
Huan Yan
Yong Li
28
8
0
13 Dec 2023
ToViLaG: Your Visual-Language Generative Model is Also An Evildoer
Xinpeng Wang
Xiaoyuan Yi
Han Jiang
Shanlin Zhou
Zhihua Wei
Xing Xie
33
13
0
13 Dec 2023
The Lottery Ticket Hypothesis in Denoising: Towards Semantic-Driven Initialization
Jiafeng Mao
Xueting Wang
Kiyoharu Aizawa
DiffM
63
3
0
13 Dec 2023
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Lijun Yu
Kihyuk Sohn
Xiuye Gu
Meera Hahn
Fei-Fei Li
Irfan Essa
Lu Jiang
José Lezama
VGen
59
178
0
11 Dec 2023
ControlNet-XS: Designing an Efficient and Effective Architecture for Controlling Text-to-Image Diffusion Models
Denis Zavadski
Johann-Friedrich Feiden
Carsten Rother
DiffM
48
5
0
11 Dec 2023
Negative Pre-aware for Noisy Cross-modal Matching
Xu-Yao Zhang
Hao Li
Mang Ye
38
7
0
10 Dec 2023