CogView: Mastering Text-to-Image Generation via Transformers


26 May 2021
Ming Ding
Zhuoyi Yang
Wenyi Hong
Wendi Zheng
Chang Zhou
Da Yin
Junyang Lin
Xu Zou
Zhou Shao
Hongxia Yang
Jie Tang
    ViT
    VLM

Papers citing "CogView: Mastering Text-to-Image Generation via Transformers"

50 / 542 papers shown
Title
Reward Incremental Learning in Text-to-Image Generation
Maorong Wang
Jiafeng Mao
Xueting Wang
Toshihiko Yamasaki
EGVM
103
0
0
26 Nov 2024
Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis
Boming Miao
C. Li
Xidong Wang
Andi Zhang
Rui Sun
Zizhe Wang
Yao Zhu
DiffM
81
0
0
25 Nov 2024
Image Generation Diversity Issues and How to Tame Them
Mischa Dombrowski
Weitong Zhang
Sarah Cechnicka
Hadrien Reynaud
Bernhard Kainz
77
0
0
25 Nov 2024
AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity
Jili Xia
Lihuo He
Fei Gao
Peng Sun
Leida Li
Xinbo Gao
EGVM
75
1
0
25 Nov 2024
In-Context Experience Replay Facilitates Safety Red-Teaming of Text-to-Image Diffusion Models
Zhi-Yi Chin
Kuan-Chen Mu
Mario Fritz
Pin-Yu Chen
DiffM
90
0
0
25 Nov 2024
PSA-VLM: Enhancing Vision-Language Model Safety through Progressive Concept-Bottleneck-Driven Alignment
Zhendong Liu
Yuanbi Nie
Yingshui Tan
Xiangyu Yue
Qiushi Cui
Chongjun Wang
Xiaoyong Zhu
Jian Xu
Bo Zheng
78
0
0
18 Nov 2024
Spider: Any-to-Many Multimodal LLM
Jinxiang Lai
Jie Zhang
Jun Liu
Jian Li
Xiaocheng Lu
Song Guo
MLLM
72
2
0
14 Nov 2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Yizeng Han
Jiayi Guo
Zhiyuan Liu
Yuan Yao
Gao Huang
65
4
0
11 Nov 2024
Scalable, Tokenization-Free Diffusion Model Architectures with Efficient Initial Convolution and Fixed-Size Reusable Structures for On-Device Image Generation
Sanchar Palit
Sathya Veera Reddy Dendi
Mallikarjuna Talluri
Raj Narayana Gadde
41
0
0
09 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
53
9
0
08 Nov 2024
ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models
Ashutosh Srivastava
Tarun Ram Menta
Abhinav Java
Avadhoot Jadhav
Silky Singh
Surgan Jandial
Balaji Krishnamurthy
DiffM
43
1
0
06 Nov 2024
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D
Wei Cheng
Juncheng Mu
Xianfang Zeng
Xin Chen
Anqi Pang
...
Zhibin Wang
Bin-Bin Fu
Gang Yu
Ziwei Liu
Liang Pan
50
9
0
04 Nov 2024
Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective
Shenghao Xie
Wenqiang Zu
Mingyang Zhao
Duo Su
Shilong Liu
Ruohua Shi
Guoqi Li
Shanghang Zhang
Lei Ma
LRM
51
3
0
29 Oct 2024
Semi-supervised Chinese Poem-to-Painting Generation via Cycle-consistent Adversarial Networks
Zhengyang Lu
Tianhao Guo
Feng Wang
GAN
31
1
0
25 Oct 2024
FairQueue: Rethinking Prompt Learning for Fair Text-to-Image Generation
Christopher T. H. Teo
Milad Abdollahzadeh
Xinda Ma
Ngai-man Cheung
DiffM
26
1
0
24 Oct 2024
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
Shuhao Gu
Jialing Zhang
Siyuan Zhou
Kevin Yu
Zhaohu Xing
...
Yufeng Cui
Xinlong Wang
Yaoqi Liu
Fangxiang Feng
Guang Liu
SyDa
VLM
MLLM
32
21
0
24 Oct 2024
Offline Evaluation of Set-Based Text-to-Image Generation
Negar Arabzadeh
Fernando Diaz
Junfeng He
EGVM
32
0
0
22 Oct 2024
Synergistic Dual Spatial-aware Generation of Image-to-Text and Text-to-Image
Yu Zhao
Hao Fei
Xiangtai Li
L. Qin
Jiayi Ji
Erik Cambria
Meishan Zhang
Hao Fei
Jianguo Wei
DiffM
31
1
0
20 Oct 2024
Seeing Through VisualBERT: A Causal Adventure on Memetic Landscapes
Dibyanayan Bandyopadhyay
Mohammed Hasanuzzaman
Asif Ekbal
AAML
39
0
0
17 Oct 2024
RescueADI: Adaptive Disaster Interpretation in Remote Sensing Images with Autonomous Agents
Zhuoran Liu
Danpei Zhao
Bo Yuan
30
1
0
17 Oct 2024
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models
Donghao Zhou
Jiancheng Huang
J. Bai
Jiaze Wang
Hao Chen
Guangyong Chen
Xiaowei Hu
Pheng Ann Heng
50
5
0
17 Oct 2024
An Online Learning Approach to Prompt-based Selection of Generative Models
Xiaoyan Hu
Ho-fung Leung
Farzan Farnia
40
2
0
17 Oct 2024
Generating Intermediate Representations for Compositional Text-To-Image Generation
Ran Galun
Sagie Benaim
25
0
0
13 Oct 2024
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow
Fu-Yun Wang
Ling Yang
Zhaoyang Huang
Mengdi Wang
Hongsheng Li
44
15
0
09 Oct 2024
HyperDet: Generalizable Detection of Synthesized Images by Generating and Merging A Mixture of Hyper LoRAs
Huangsen Cao
Yongwei Wang
Yinfeng Liu
Sixian Zheng
Kangtao Lv
Zhimeng Zhang
Bo Zhang
Xin Ding
Fei Wu
36
1
0
08 Oct 2024
ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way
Jiazi Bu
Pengyang Ling
Pan Zhang
Tong Wu
Xiaoyi Dong
Yuhang Zang
Yuhang Cao
Dahua Lin
Jiaqi Wang
DiffM
VGen
35
0
0
08 Oct 2024
LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
Doohyuk Jang
Sihwan Park
J. Yang
Yeonsung Jung
Jihun Yun
Souvik Kundu
Sung-Yub Kim
Eunho Yang
51
7
0
04 Oct 2024
CaLMFlow: Volterra Flow Matching using Causal Language Models
Shiyang Zhang
Daniel Levine
Ivan Vrkic
Marco Francesco Bressana
David Zhang
S. Rizvi
Yangtian Zhang
E. Zappala
David van Dijk
27
0
0
03 Oct 2024
ImageFolder: Autoregressive Image Generation with Folded Tokens
Xiang Li
Kai Qiu
Hao Chen
Jason Kuen
Jiuxiang Gu
Bhiksha Raj
Zhe-nan Lin
VLM
44
18
0
02 Oct 2024
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Yao Teng
Han Shi
Xian Liu
Xuefei Ning
Guohao Dai
Yu Wang
Zhenguo Li
Xihui Liu
58
10
0
02 Oct 2024
MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation
Wenchao Chen
Liqiang Niu
Ziyao Lu
Fandong Meng
Jie Zhou
Mamba
40
4
0
30 Sep 2024
Conditional Image Synthesis with Diffusion Models: A Survey
Zheyuan Zhan
Defang Chen
Jian-Ping Mei
Zhenghe Zhao
Jiawei Chen
Chun Chen
Siwei Lyu
Can Wang
VLM
53
5
0
28 Sep 2024
Emu3: Next-Token Prediction is All You Need
Xinlong Wang
Xiaosong Zhang
Zhengxiong Luo
Quan-Sen Sun
Yufeng Cui
...
Xi Yang
Jingjing Liu
Yonghua Lin
Tiejun Huang
Zhongyuan Wang
MLLM
47
166
0
27 Sep 2024
A Unified Hallucination Mitigation Framework for Large Vision-Language Models
Yue Chang
Liqiang Jing
Xiaopeng Zhang
Yue Zhang
VLM
MLLM
68
2
0
24 Sep 2024
Understanding Implosion in Text-to-Image Generative Models
Wenxin Ding
Cathy Y. Li
Shawn Shan
Ben Y. Zhao
Haitao Zheng
36
1
0
18 Sep 2024
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni
Yulin Wang
Renping Zhou
Rui Lu
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Yuan Yao
Gao Huang
47
7
0
31 Aug 2024
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
T. Dao
Thuan Hoang Nguyen
T. Le
D. Vu
Khoi Nguyen
Cuong Pham
Anh Tran
DiffM
49
13
0
26 Aug 2024
BRAT: Bonus oRthogonAl Token for Architecture Agnostic Textual Inversion
James Baker
42
1
0
08 Aug 2024
Survey: Transformer-based Models in Data Modality Conversion
Elyas Rashno
Amir Eskandari
Aman Anand
F. Zulkernine
MedIm
42
0
0
08 Aug 2024
D2Styler: Advancing Arbitrary Style Transfer with Discrete Diffusion Methods
Onkar Susladkar
Gayatri S Deshmukh
Sparsh Mittal
Parth Shastri
DiffM
46
3
0
07 Aug 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Ping Luo
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
82
48
0
05 Aug 2024
LEGO: Self-Supervised Representation Learning for Scene Text Images
Yujin Ren
Jiaxin Zhang
Lianwen Jin
SSL
43
0
0
04 Aug 2024
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
Lorenzo Baraldi
Federico Cocchi
Marcella Cornia
Lorenzo Baraldi
Alessandro Nicolosi
Rita Cucchiara
38
8
0
29 Jul 2024
Reproducibility Study of "ITI-GEN: Inclusive Text-to-Image Generation"
Daniel Gallo Fernández
Răzvan-Andrei Matisan
Alejandro Monroy Muñoz
Janusz Partyka
42
0
0
29 Jul 2024
LSReGen: Large-Scale Regional Generator via Backward Guidance Framework
Bowen Zhang
Cheng Yang
Xuanhui Liu
DiffM
42
0
0
21 Jul 2024
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A Survey
Chenyu Zhang
Mingwang Hu
Wenhui Li
Lanjun Wang
41
15
0
10 Jul 2024
Learning to Adapt Category Consistent Meta-Feature of CLIP for Few-Shot Classification
Jiaying Shi
Xuetong Xue
Shenghui Xu
VLM
45
0
0
08 Jul 2024
PartCraft: Crafting Creative Objects by Parts
Kam Woh Ng
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
45
6
0
05 Jul 2024
MobileFlow: A Multimodal LLM For Mobile GUI Agent
Songqin Nong
Jiali Zhu
Rui Wu
Jiongchao Jin
Shuo Shan
Xiutian Huang
Wenhao Xu
35
7
0
05 Jul 2024
Meta 3D Gen
Raphael Bensadoun
Tom Monnier
Yanir Kleiman
Filippos Kokkinos
Yawar Siddiqui
...
Antoine Toisoul
David Novotny
Oran Gafni
Natalia Neverova
Andrea Vedaldi
52
1
0
02 Jul 2024