CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers

28 April 2022
Ming Ding
Wendi Zheng
Wenyi Hong
Jie Tang
    VLM

Papers citing "CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers"

50 / 238 papers shown
ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints
Elad Richardson
Kfir Goldberg
Yuval Alaluf
Daniel Cohen-Or
DiffM
31
10
0
03 Aug 2023
TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition
Shilin Lu
Yanzhu Liu
A. Kong
43
51
0
24 Jul 2023
JourneyDB: A Benchmark for Generative Image Understanding
Keqiang Sun
Junting Pan
Yuying Ge
Hao Li
Haodong Duan
...
Yi Wang
Jifeng Dai
Yu Qiao
Limin Wang
Hongsheng Li
54
102
0
03 Jul 2023
DreamDiffusion: Generating High-Quality Images from Brain EEG Signals
Yun-Hao Bai
Xintao Wang
Yanpei Cao
Yixiao Ge
Chun Yuan
Ying Shan
DiffM
24
50
0
29 Jun 2023
DomainStudio: Fine-Tuning Diffusion Models for Domain-Driven Image Generation using Limited Data
Jin Zhu
Huimin Ma
Jiansheng Chen
Jian Yuan
DiffM
24
10
0
25 Jun 2023
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
Xiaoshi Wu
Yiming Hao
Keqiang Sun
Yixiong Chen
Feng Zhu
Rui Zhao
Hongsheng Li
46
252
0
15 Jun 2023
The Age of Synthetic Realities: Challenges and Opportunities
J. P. Cardenuto
Jing Yang
Rafael Padilha
Renjie Wan
Daniel Moreira
Haoliang Li
Shiqi Wang
Fernanda A. Andaló
Sébastien Marcel
Anderson de Rezende Rocha
DeLMO
42
29
0
09 Jun 2023
Grounded Text-to-Image Synthesis with Attention Refocusing
Quynh Phung
Songwei Ge
Jia-Bin Huang
DiffM
30
104
0
08 Jun 2023
Wuerstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models
Pablo Pernias
Dominic Rampas
Mats L. Richter
Christopher Pal
Marc Aubreville
DiffM
VLM
26
42
0
01 Jun 2023
RealignDiff: Boosting Text-to-Image Diffusion Model with Coarse-to-fine Semantic Re-alignment
Guian Fang
Zutao Jiang
Jianhua Han
Guangsong Lu
Hang Xu
Shengcai Liao
Xiaodan Liang
EGVM
29
1
0
31 May 2023
SAVE: Spectral-Shift-Aware Adaptation of Image Diffusion Models for Text-driven Video Editing
Nazmul Karim
Umar Khalid
M. Joneidi
Chen Chen
Nazanin Rahnavard
DiffM
VGen
19
5
0
30 May 2023
Controllable Text-to-Image Generation with GPT-4
Tianjun Zhang
Yi Zhang
Vibhav Vineet
Neel Joshi
Xin Wang
DiffM
23
42
0
29 May 2023
Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising
Fu Lee Wang
Wenshuo Chen
Guanglu Song
Han-Jia Ye
Yu Liu
Hongsheng Li
VGen
DiffM
48
88
0
29 May 2023
Vision + Language Applications: A Survey
Yutong Zhou
N. Shimada
VLM
30
6
0
24 May 2023
DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot Text-to-Video Generation
Susung Hong
Junyoung Seo
Heeseong Shin
Sung‐Jin Hong
Seung Wook Kim
DiffM
VGen
28
34
0
23 May 2023
Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach
Yufan Zhou
Ruiyi Zhang
Tongfei Sun
Jinhui Xu
DiffM
109
37
0
23 May 2023
ControlVideo: Training-free Controllable Text-to-Video Generation
Yabo Zhang
Yuxiang Wei
Dongsheng Jiang
Xiaopeng Zhang
W. Zuo
Qi Tian
VGen
DiffM
36
236
0
22 May 2023
Efficient Cross-Lingual Transfer for Chinese Stable Diffusion with Images as Pivots
Jinyi Hu
Xu Han
Xiaoyuan Yi
Yutong Chen
Wenhao Li
Zhiyuan Liu
Maosong Sun
DiffM
30
4
0
19 May 2023
Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
Wenjing Wang
Huan Yang
Zixi Tuo
Huiguo He
Junchen Zhu
Jianlong Fu
Jiaying Liu
DiffM
VGen
45
114
0
18 May 2023
X-IQE: eXplainable Image Quality Evaluation for Text-to-Image Generation with Visual Large Language Models
Yixiong Chen
Li Liu
C. Ding
34
21
0
18 May 2023
Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer
Nisha Huang
Yuxin Zhang
Weiming Dong
DiffM
VGen
29
16
0
09 May 2023
IconShop: Text-Guided Vector Icon Synthesis with Autoregressive Transformers
Rong Wu
Wanchao Su
Kede Ma
Jing Liao
35
34
0
27 Apr 2023
Text2Performer: Text-Driven Human Video Generation
Yuming Jiang
Shuai Yang
Tong Liang Koh
Wayne Wu
Chen Change Loy
Ziwei Liu
DiffM
VGen
45
48
0
17 Apr 2023
Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation
Jie An
Songyang Zhang
Harry Yang
Sonal Gupta
Jia-Bin Huang
Jiebo Luo
Xiaoyue Yin
DiffM
VGen
32
106
0
17 Apr 2023
Expressive Text-to-Image Generation with Rich Text
Songwei Ge
Taesung Park
Jun-Yan Zhu
Jia-Bin Huang
DiffM
79
79
0
13 Apr 2023
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
Jiazheng Xu
Xiao Liu
Yuchen Wu
Yuxuan Tong
Qinkai Li
Ming Ding
Jie Tang
Yuxiao Dong
46
313
0
12 Apr 2023
Gradient-Free Textual Inversion
Zhengcong Fei
Mingyuan Fan
Junshi Huang
DiffM
33
31
0
12 Apr 2023
Multi-scale Geometry-aware Transformer for 3D Point Cloud Classification
Xian Wei
Muyu Wang
S. J. Lin
Zhengyu Li
Jian Yang
Arafat Al-Jawari
Xuan Tang
3DPC
ViT
16
2
0
12 Apr 2023
HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
Eslam Mohamed Bakr
Pengzhan Sun
Xiaoqian Shen
Faizan Farooq Khan
Li Erran Li
Mohamed Elhoseiny
VLM
24
76
0
11 Apr 2023
InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning
Jing Shi
Wei Xiong
Zhe-nan Lin
H. J. Jung
DiffM
130
279
0
06 Apr 2023
Training-Free Layout Control with Cross-Attention Guidance
Minghao Chen
Iro Laina
Andrea Vedaldi
DiffM
135
222
0
06 Apr 2023
Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models
Xuhui Jia
Yang Zhao
Kelvin C. K. Chan
Yandong Li
Han-Ying Zhang
Boqing Gong
Tingbo Hou
Haoran Wang
Yu-Chuan Su
DiffM
19
100
0
05 Apr 2023
Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation
Mayu Otani
Riku Togashi
Yu Sawai
Ryosuke Ishigami
Yuta Nakashima
Esa Rahtu
J. Heikkilä
Shin'ichi Satoh
38
62
0
04 Apr 2023
Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos
Yue Ma
Yin-Yin He
Xiaodong Cun
Xintao Wang
Siran Chen
Ying Shan
Xiu Li
Qifeng Chen
DiffM
VGen
37
176
0
03 Apr 2023
Social Biases through the Text-to-Image Generation Lens
Ranjita Naik
Besmira Nushi
107
113
0
30 Mar 2023
Discriminative Class Tokens for Text-to-Image Diffusion Models
Idan Schwartz
Vésteinn Snaebjarnarson
Hila Chefer
Ryan Cotterell
Serge Belongie
Lior Wolf
Sagie Benaim
25
9
0
30 Mar 2023
MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion Path
Qian Wang
Biao Zhang
Michael Birsak
Peter Wonka
DiffM
30
17
0
29 Mar 2023
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance
Yiwei Ma
Xiaioqing Zhang
Xiaoshuai Sun
Jiayi Ji
Haowei Wang
Guannan Jiang
Weilin Zhuang
Rongrong Ji
23
39
0
28 Mar 2023
Seer: Language Instructed Video Prediction with Latent Diffusion Models
Xianfan Gu
Chuan Wen
Weirui Ye
Jiaming Song
Yang Gao
DiffM
VGen
21
40
0
27 Mar 2023
Freestyle Layout-to-Image Synthesis
Han Xue
Z. Huang
Qianru Sun
Li-Na Song
Wenjun Zhang
DiffM
17
62
0
25 Mar 2023
Ablating Concepts in Text-to-Image Diffusion Models
Nupur Kumari
Bin Zhang
Sheng-Yu Wang
Eli Shechtman
Richard Y. Zhang
Jun-Yan Zhu
VLM
21
184
0
23 Mar 2023
CoBIT: A Contrastive Bi-directional Image-Text Generation Model
Haoxuan You
Mandy Guo
Zhecan Wang
Kai-Wei Chang
Jason Baldridge
Jiahui Yu
DiffM
49
12
0
23 Mar 2023
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
Levon Khachatryan
A. Movsisyan
Vahram Tadevosyan
Roberto Henschel
Zhangyang Wang
Shant Navasardyan
Humphrey Shi
VGen
29
541
0
23 Mar 2023
NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation
Sheng-Siang Yin
Chenfei Wu
Huan Yang
Jianfeng Wang
Xiaodong Wang
...
Gong Ming
Lijuan Wang
Zicheng Liu
Houqiang Li
Nan Duan
VGen
20
125
0
22 Mar 2023
MAGVLT: Masked Generative Vision-and-Language Transformer
Sungwoong Kim
DaeJin Jo
Donghoon Lee
Jongmin Kim
VLM
44
11
0
21 Mar 2023
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?
Chaoning Zhang
Chenshuang Zhang
Sheng Zheng
Yu Qiao
Chenghao Li
...
Lik-Hang Lee
Yang Yang
Heng Tao Shen
In So Kweon
Choong Seon Hong
85
159
0
21 Mar 2023
IRGen: Generative Modeling for Image Retrieval
Yidan Zhang
Ting Zhang
Dong Chen
Yujing Wang
Qi Chen
...
Qi Zhang
Fan Yang
Mao Yang
Q. Liao
B. Guo
3DV
VLM
35
14
0
17 Mar 2023
StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model
Zipeng Xu
E. Sangineto
N. Sebe
DiffM
22
12
0
16 Mar 2023
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
Zhengxiong Luo
Dayou Chen
Yingya Zhang
Yan Huang
Liangsheng Wang
Yujun Shen
Deli Zhao
Jinren Zhou
Tien-Ping Tan
DiffM
VGen
132
215
0
15 Mar 2023
Architext: Language-Driven Generative Architecture Design
Theodoros Galanos
Antonios Liapis
Georgios N. Yannakakis
VLM
AI4CE
26
6
0
13 Mar 2023