CogView: Mastering Text-to-Image Generation via Transformers

26 May 2021
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
Da Yin
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
    ViT
    VLM

Papers citing "CogView: Mastering Text-to-Image Generation via Transformers"

50 / 540 papers shown
iDesigner: A High-Resolution and Complex-Prompt Following Text-to-Image Diffusion Model for Interior Design
Ruyi Gan
Xiaojun Wu
Junyu Lu
Yuanhe Tian
Di Zhang
...
Renliang Sun
Chang Liu
Jiaxing Zhang
Pingjian Zhang
Yan Song
108
4
0
07 Dec 2023
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
Zhouxia Wang
Ziyang Yuan
Xintao Wang
Tianshui Chen
Menghan Xia
Ping Luo
Ying Shan
DiffM
VGen
50
198
0
06 Dec 2023
TokenCompose: Text-to-Image Diffusion with Token-level Supervision
Zirui Wang
Zhizhou Sha
Zheng Ding
Yilin Wang
Zhuowen Tu
DiffM
44
20
0
06 Dec 2023
F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis
Jingkuan Song
Jianzhi Liu
Lianli Gao
Jingkuan Song
DiffM
VGen
25
4
0
06 Dec 2023
FERGI: Automatic Annotation of User Preferences for Text-to-Image Generation from Spontaneous Facial Expression Reaction
Shuangquan Feng
Junhua Ma
Virginia R. de Sa
EGVM
29
0
0
05 Dec 2023
WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation
Jiachen Lu
Ze Huang
Zeyu Yang
Jiahui Zhang
Li Zhang
VGen
22
40
0
05 Dec 2023
Transformer-Based Deep Learning Model for Bored Pile Load-Deformation Prediction in Bangkok Subsoil
S. Youwai
Chissanupong Thongnoo
AI4CE
14
0
0
05 Dec 2023
TPA3D: Triplane Attention for Fast Text-to-3D Generation
Hong-En Chen
Bin-Shih Wu
Sheng-Yu Huang
Yu-Chiang Frank Wang
22
2
0
05 Dec 2023
Stable Diffusion Exposed: Gender Bias from Prompt to Image
Yankun Wu
Yuta Nakashima
Noa Garcia
28
16
0
05 Dec 2023
DiffiT: Diffusion Vision Transformers for Image Generation
Ali Hatamizadeh
Jiaming Song
Guilin Liu
Jan Kautz
Arash Vahdat
39
67
0
04 Dec 2023
StoryGPT-V: Large Language Models as Consistent Story Visualizers
Xiaoqian Shen
Mohamed Elhoseiny
VLM
101
10
0
04 Dec 2023
Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D
Karran Pandey
Paul Guerrero
Matheus Gadelha
Yannick Hold-Geoffroy
Karan Singh
Niloy Mitra
DiffM
34
31
0
02 Dec 2023
VideoBooth: Diffusion-based Video Generation with Image Prompts
Yuming Jiang
Tianxing Wu
Shuai Yang
Chenyang Si
Dahua Lin
Yu Qiao
Chen Change Loy
Ziwei Liu
DiffM
VGen
40
65
0
01 Dec 2023
CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation
Zineng Tang
Ziyi Yang
Mahmoud Khademi
Yang Liu
Chenguang Zhu
Mohit Bansal
LRM
MLLM
AuLLM
56
45
0
30 Nov 2023
Transformer Based Model for Predicting Rapid Impact Compaction Outcomes: A Case Study of Utapao International Airport
S. Youwai
Sirasak Detcheewa
AI4CE
22
0
0
29 Nov 2023
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
Yutong Feng
Biao Gong
Di Chen
Yujun Shen
Yu Liu
Jingren Zhou
DiffM
36
43
0
28 Nov 2023
Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis
Xiaohui Chen
Yongfei Liu
Yingxiang Yang
Jianbo Yuan
Quanzeng You
Liping Liu
Hongxia Yang
DiffM
56
11
0
28 Nov 2023
PEA-Diffusion: Parameter-Efficient Adapter with Knowledge Distillation in non-English Text-to-Image Generation
Jiancang Ma
Chen Chen
Qingsong Xie
H. Lu
DiffM
VLM
35
3
0
28 Nov 2023
Tell2Design: A Dataset for Language-Guided Floor Plan Generation
Sicong Leng
Yangqiaoyu Zhou
Mohammed Haroon Dupty
W. Lee
Sam Joyce
Wei Lu
3DV
40
11
0
27 Nov 2023
Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation
Siteng Huang
Biao Gong
Yutong Feng
Xi Chen
Yu Fu
Yu Liu
Donglin Wang
DiffM
29
13
0
27 Nov 2023
Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation
Biao Gong
Siteng Huang
Yutong Feng
Shiwei Zhang
Yuyuan Li
Yu Liu
DiffM
33
11
0
27 Nov 2023
DreamCreature: Crafting Photorealistic Virtual Creatures from Imagination
KamWoh Ng
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
DiffM
20
6
0
27 Nov 2023
CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization
Ruoyu Zhao
Mingrui Zhu
Shiyin Dong
Nannan Wang
Xinbo Gao
DiffM
35
13
0
24 Nov 2023
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
Kai Yang
Jian Tao
Jiafei Lyu
Chunjiang Ge
Jiaxin Chen
Qimai Li
Weihan Shen
Xiaolong Zhu
Xiu Li
EGVM
23
89
0
22 Nov 2023
Steal My Artworks for Fine-tuning? A Watermarking Framework for Detecting Art Theft Mimicry in Text-to-Image Models
Ge Luo
Junqiang Huang
Manman Zhang
Zhenxing Qian
Sheng Li
Xinpeng Zhang
WIGM
25
9
0
22 Nov 2023
High-fidelity Person-centric Subject-to-Image Synthesis
Yibin Wang
Weizhong Zhang
Jianwei Zheng
Cheng Jin
VGen
31
21
0
17 Nov 2023
Intelligent Generation of Graphical Game Assets: A Conceptual Framework and Systematic Review of the State of the Art
Kaisei Fukaya
Damon Daylamani-Zad
Harry Agius
59
2
0
16 Nov 2023
Instant3D: Instant Text-to-3D Generation
Ming Li
Pan Zhou
Jia-Wei Liu
Jussi Keppo
Min Lin
Shuicheng Yan
Xiangyu Xu
41
30
0
14 Nov 2023
A Survey of AI Text-to-Image and AI Text-to-Video Generators
Aditi Singh
24
20
0
10 Nov 2023
Noise-Free Score Distillation
Oren Katzir
Or Patashnik
Daniel Cohen-Or
Dani Lischinski
DiffM
21
70
0
26 Oct 2023
Fuse Your Latents: Video Editing with Multi-source Latent Diffusion Models
Tianyi Lu
Xing Zhang
Jiaxi Gu
Hang Xu
Renjing Pei
Songcen Xu
Zuxuan Wu
DiffM
VGen
33
4
0
25 Oct 2023
Nightshade: Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models
Shawn Shan
Wenxin Ding
Josephine Passananti
Stanley Wu
Haitao Zheng
Ben Y. Zhao
SILM
DiffM
31
44
0
20 Oct 2023
VcT: Visual change Transformer for Remote Sensing Image Change Detection
Bo Jiang
Zitian Wang
Xixi Wang
Ziyan Zhang
Lan Chen
Tianlin Li
Bin Luo
ViT
29
39
0
17 Oct 2023
LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts
Hanan Gani
Shariq Farooq Bhat
Muzammal Naseer
Salman Khan
Peter Wonka
DiffM
44
38
0
16 Oct 2023
Improving Compositional Text-to-image Generation with Large Vision-Language Models
Song Wen
Guian Fang
Renrui Zhang
Peng Gao
Hao Dong
Dimitris N. Metaxas
25
17
0
10 Oct 2023
FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing
Yuren Cong
Mengmeng Xu
Christian Simon
Shoufa Chen
Jiawei Ren
Yanping Xie
Juan-Manuel Perez-Rua
Bodo Rosenhahn
Tao Xiang
Sen He
DiffM
VGen
35
74
0
09 Oct 2023
Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency
Tianhong Li
Sangnie Bhardwaj
Yonglong Tian
Han Zhang
Jarred Barber
Dina Katabi
Guillaume Lajoie
Huiwen Chang
Dilip Krishnan
VLM
49
4
0
05 Oct 2023
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
Anton Razzhigaev
Arseniy Shakhmatov
Anastasia Maltseva
V.Ya. Arkhipkin
Igor Pavlov
Ilya Ryabov
Angelina Kuts
Alexander Panchenko
Andrey Kuznetsov
Denis Dimitrov
51
78
0
05 Oct 2023
Making LLaMA SEE and Draw with SEED Tokenizer
Yuying Ge
Sijie Zhao
Ziyun Zeng
Yixiao Ge
Chen Li
Xintao Wang
Ying Shan
38
128
0
02 Oct 2023
Distilling Inductive Bias: Knowledge Distillation Beyond Model Compression
Gousia Habib
Tausifa Jan Saleem
Brejesh Lall
VLM
30
0
0
30 Sep 2023
AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ
Jonas Belouadi
Anne Lauscher
Steffen Eger
25
28
0
30 Sep 2023
Social Media Fashion Knowledge Extraction as Captioning
Yifei Yuan
Wenxuan Zhang
Yang Deng
Wai Lam
19
1
0
28 Sep 2023
Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack
Xiaoliang Dai
Ji Hou
Chih-Yao Ma
Sam S. Tsai
Jialiang Wang
...
Roshan Sumbaly
Vignesh Ramanathan
Zijian He
Peter Vajda
Devi Parikh
VLM
36
198
0
27 Sep 2023
Teaching Text-to-Image Models to Communicate in Dialog
Xiaowen Sun
Jiazhan Feng
Yuxuan Wang
Yuxuan Lai
Xingyu Shen
Dongyan Zhao
DiffM
32
1
0
27 Sep 2023
GLOBER: Coherent Non-autoregressive Video Generation via GLOBal Guided Video DecodER
Mingzhen Sun
Weining Wang
Zihan Qin
Jiahui Sun
Si-Qing Chen
Qingbin Liu
DiffM
37
3
0
23 Sep 2023
TextCLIP: Text-Guided Face Image Generation And Manipulation Without Adversarial Training
Xiaozhou You
Jian Zhang
CLIP
VLM
15
0
0
21 Sep 2023
DreamLLM: Synergistic Multimodal Comprehension and Creation
Runpei Dong
Chunrui Han
Yuang Peng
Zekun Qi
Zheng Ge
...
Hao-Ran Wei
Xiangwen Kong
Xiangyu Zhang
Kaisheng Ma
Li Yi
MLLM
47
176
0
20 Sep 2023
Discuss Before Moving: Visual Language Navigation via Multi-expert Discussions
Yuxing Long
Xiaoqi Li
Wenzhe Cai
Hao Dong
LLMAG
LM&Ro
32
45
0
20 Sep 2023
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
Juntao Li
Zecheng Tang
Yuyang Ding
Pinzheng Wang
Pei Guo
...
Wenliang Chen
Guohong Fu
Qiaoming Zhu
Guodong Zhou
Hao Fei
45
5
0
19 Sep 2023
Looking at words and points with attention: a benchmark for text-to-shape coherence
Andrea Amaduzzi
Giuseppe Lisanti
Samuele Salti
Luigi Di Stefano
18
2
0
14 Sep 2023