2204.14217
Cited By
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
28 April 2022
Ming Ding
Wendi Zheng
Wenyi Hong
Jie Tang
VLM
Papers citing
"CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers"
50 / 238 papers shown
Title
Erased but Not Forgotten: How Backdoors Compromise Concept Erasure
Jonas Henry Grebe
Tobias Braun
Marcus Rohrbach
Anna Rohrbach
AAML
85
0
0
29 Apr 2025
Optimizing Multi-Round Enhanced Training in Diffusion Models for Improved Preference Understanding
Kun Li
J. Wang
Yangfan He
Xinyuan Song
Ruoyu Wang
...
Keqin Li
Sida Li
Miao Zhang
Tianyu Shi
Xueqian Wang
50
0
0
25 Apr 2025
PRISM: A Unified Framework for Photorealistic Reconstruction and Intrinsic Scene Modeling
Alara Dirik
Tuanfeng Y. Wang
Duygu Ceylan
Stefanos Zafeiriou
Anna Frühstück
DiffM
47
0
0
19 Apr 2025
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts
Leyang Li
Shilin Lu
Yan Ren
A. Kong
DiffM
46
1
0
17 Apr 2025
InstructEngine: Instruction-driven Text-to-Image Alignment
Xingyu Lu
Yihan Hu
Yang Zhang
Kaiyu Jiang
Changyi Liu
...
Bin Wen
C. Yuan
Fan Yang
Tingting Gao
Di Zhang
48
0
0
14 Apr 2025
Video-Bench: Human-Aligned Video Generation Benchmark
Hui Han
Siyuan Li
Jiaqi Chen
Yiwen Yuan
Yuling Wu
...
Y. Li
Jingyang Zhang
Chi Zhang
Li Li
Yongxin Ni
EGVM
VGen
73
0
0
07 Apr 2025
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Xuyang Guo
Zekai Huang
Jiayan Huo
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
ALM
VGen
96
2
0
05 Apr 2025
FakeScope: Large Multimodal Expert Model for Transparent AI-Generated Image Forensics
Yixuan Li
Yu Tian
Yipo Huang
Wei Lu
Shiqi Wang
Weisi Lin
Anderson de Rezende Rocha
62
0
0
31 Mar 2025
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness
Dian Zheng
Ziqi Huang
Hongbo Liu
Kai Zou
Yinan He
...
Yuyao Zhang
Jingwen He
Wei-Shi Zheng
Yu Qiao
Ziwei Liu
EGVM
VGen
48
5
0
27 Mar 2025
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Jaywon Koo
J. Hernandez
Moayed Haji-Ali
Ziyan Yang
Vicente Ordonez
EGVM
72
0
0
27 Mar 2025
AMD-Hummingbird: Towards an Efficient Text-to-Video Model
Takashi Isobe
He Cui
Dong Zhou
Mengmeng Ge
D. Li
E. Barsoum
VGen
59
0
0
24 Mar 2025
DynASyn: Multi-Subject Personalization Enabling Dynamic Action Synthesis
Yongjin Choi
Chanhun Park
Seung Jun Baek
DiffM
51
0
0
22 Mar 2025
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis
Jonas Belouadi
Eddy Ilg
M. Keuper
Hideki Tanaka
Masao Utiyama
Raj Dabre
Steffen Eger
Simone Paolo Ponzetto
50
0
0
14 Mar 2025
Piece it Together: Part-Based Concepting with IP-Priors
Elad Richardson
Kfir Goldberg
Yuval Alaluf
Daniel Cohen-Or
DiffM
66
0
0
13 Mar 2025
FaceID-6M: A Large-Scale, Open-Source FaceID Customization Dataset
Shuhe Wang
Xiaoya Li
Jiwei Li
G. Wang
Xiaofei Sun
...
Han Qiu
Mo Yu
Shengjie Shen
Tianwei Zhang
Eduard H. Hovy
VLM
63
0
0
10 Mar 2025
Frequency Autoregressive Image Generation with Continuous Tokens
Hu Yu
Hao Luo
Hangjie Yuan
Yu Rong
Feng Zhao
VGen
44
2
0
07 Mar 2025
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations
Ziyang Zhang
Yang Yu
Yucheng Chen
Xulei Yang
S. Yeo
MedIm
56
1
0
02 Mar 2025
Evaluating and Predicting Distorted Human Body Parts for Generated Images
Lu Ma
Kaibo Cao
Hao Liang
Jiaxin Lin
Z. Li
Yuhong Liu
Jihong Zhang
Wentao Zhang
Bin Cui
MedIm
44
0
0
02 Mar 2025
Turn That Frown Upside Down: FaceID Customization via Cross-Training Data
Shuhe Wang
Xiaoya Li
Xiaofei Sun
G. Wang
Tianwei Zhang
Jiwei Li
Eduard H. Hovy
38
0
0
28 Jan 2025
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
82
0
0
28 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
M. Zhang
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
48
9
0
08 Nov 2024
Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models
Weijian Luo
C. Zhang
Debing Zhang
Zhengyang Geng
28
3
0
28 Oct 2024
Diff-Instruct++: Training One-step Text-to-image Generator Model to Align with Human Preferences
Weijian Luo
EGVM
36
6
0
24 Oct 2024
Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances
Shilin Lu
Zihan Zhou
Jiayou Lu
Yuanzhi Zhu
A. Kong
WIGM
94
10
0
24 Oct 2024
TAGE: Trustworthy Attribute Group Editing for Stable Few-shot Image Generation
Ruicheng Zhang
Guoheng Huang
Yejing Huo
Xiaochen Yuan
Zhizhen Zhou
Xuhang Chen
Guo Zhong
28
0
0
23 Oct 2024
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models
Donghao Zhou
Jiancheng Huang
J. Bai
Jiaze Wang
Hao Chen
Guangyong Chen
Xiaowei Hu
Pheng Ann Heng
47
5
0
17 Oct 2024
Learning to Customize Text-to-Image Diffusion In Diverse Context
Taewook Kim
Wei Chen
Qiang Qiu
DiffM
38
2
0
14 Oct 2024
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai
Tian-Chun Ye
Wei Chow
Enxin Song
Qing-Guo Chen
Hefei Ling
Zhen Dong
Lei Zhu
63
13
0
10 Oct 2024
Convolutional neural networks applied to modification of images
Carlos I. Aguirre-Velez
Jose Antonio Arciniega-Nevarez
Eric Dolores-Cuenca
16
1
0
08 Oct 2024
CusConcept: Customized Visual Concept Decomposition with Diffusion Models
Zhi Xu
Shaozhe Hao
Kai Han
DiffM
30
4
0
01 Oct 2024
MonoFormer: One Transformer for Both Diffusion and Autoregression
Chuyang Zhao
Yuxing Song
Wenhao Wang
Haocheng Feng
Errui Ding
Yifan Sun
Xinyan Xiao
Jingdong Wang
DiffM
36
18
0
24 Sep 2024
LinFusion: 1 GPU, 1 Minute, 16K Image
Songhua Liu
Weihao Yu
Zhenxiong Tan
Xinchao Wang
48
13
0
03 Sep 2024
One-Shot Learning Meets Depth Diffusion in Multi-Object Videos
Anisha Jain
VGen
DiffM
MDE
29
1
0
29 Aug 2024
Iterative Object Count Optimization for Text-to-image Diffusion Models
Oz Zafar
Lior Wolf
Idan Schwartz
VLM
27
3
0
21 Aug 2024
Quality Assessment in the Era of Large Models: A Survey
Zicheng Zhang
Yingjie Zhou
Chunyi Li
Baixuan Zhao
Xiaohong Liu
Guangtao Zhai
42
10
0
17 Aug 2024
Fine-gained Zero-shot Video Sampling
Dengsheng Chen
Jie Hu
Javier Segovia-Aguas
Enhua Wu
VGen
DiffM
29
0
0
31 Jul 2024
Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model
Zhichao Zhang
Xinyue Li
Wei Sun
Jun Jia
Xiongkuo Min
...
Puyi Wang
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Guangtao Zhai
EGVM
50
5
0
31 Jul 2024
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
Vikash Sehwag
Xianghao Kong
Jingtao Li
Michael Spranger
Lingjuan Lyu
DiffM
47
9
0
22 Jul 2024
LTSim: Layout Transportation-based Similarity Measure for Evaluating Layout Generation
Mayu Otani
Naoto Inoue
Kotaro Kikuchi
Riku Togashi
3DV
39
4
0
17 Jul 2024
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A Survey
Chenyu Zhang
Mingwang Hu
Wenhui Li
Lanjun Wang
41
15
0
10 Jul 2024
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis
Wanggui He
Siming Fu
Mushui Liu
Xierui Wang
Wenyi Xiao
...
Zhelun Yu
Haoyuan Li
Ziwei Huang
Leilei Gan
Hao Jiang
DiffM
24
23
0
10 Jul 2024
PartCraft: Crafting Creative Objects by Parts
Kam Woh Ng
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
42
6
0
05 Jul 2024
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Shenghai Yuan
Jinfa Huang
Yongqi Xu
Yaoyang Liu
Shaofeng Zhang
Yujun Shi
Ruijie Zhu
Xinhua Cheng
Jiebo Luo
Li Yuan
EGVM
VGen
77
34
0
26 Jun 2024
Text-Animator: Controllable Visual Text Video Generation
Lin Liu
Quande Liu
Shengju Qian
Yuan Zhou
Wengang Zhou
Houqiang Li
Lingxi Xie
Qi Tian
VGen
33
1
0
25 Jun 2024
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
Yuang Peng
Yuxin Cui
Haomiao Tang
Zekun Qi
Runpei Dong
Jing Bai
Chunrui Han
Zheng Ge
Xiangyu Zhang
Shu-Tao Xia
EGVM
75
31
0
24 Jun 2024
Neural Residual Diffusion Models for Deep Scalable Vision Generation
Zhiyuan Ma
Liangliang Zhao
Biqing Qi
Bowen Zhou
DiffM
64
2
0
19 Jun 2024
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
Team GLM
Aohan Zeng
Bin Xu
Bowen Wang
...
Zhaoyu Wang
Zhen Yang
Zhengxiao Du
Zhenyu Hou
Zihan Wang
ALM
65
500
0
18 Jun 2024
Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models
Alireza Ganjdanesh
Reza Shirkavand
Shangqian Gao
Heng Huang
DiffM
VLM
56
4
0
17 Jun 2024
ControlVAR: Exploring Controllable Visual Autoregressive Modeling
Xiang Li
Kai Qiu
Hao Chen
Jason Kuen
Zhe-nan Lin
Rita Singh
Bhiksha Raj
DiffM
43
21
0
14 Jun 2024
Vivid-ZOO: Multi-View Video Generation with Diffusion Model
Bing Li
Cheng Zheng
Wenxuan Zhu
Jinjie Mai
Biao Zhang
Peter Wonka
Bernard Ghanem
45
16
0
12 Jun 2024