WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
arXiv: 2503.07265

10 March 2025
Yuwei Niu
Munan Ning
Mengren Zheng
Weiyang Jin
Bin Lin
Peng Jin
Jiaqi Liao
Chaoran Feng
Kunpeng Ning
Bin Zhu
Li Yuan
    EGVM

Papers citing "WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation"

50 / 59 papers shown
R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation
Kaijie Chen
Zihao Lin
Zhiyang Xu
Ying Shen
Yuguang Yao
Joy Rimchala
Jiaxin Zhang
Lifu Huang
EGVM
LRM
27
0
0
29 May 2025
OmniGenBench: A Benchmark for Omnipotent Multimodal Generation across 50+ Tasks
Jiayu Wang
Yang Jiao
Yue Yu
Tianwen Qian
Shaoxiang Chen
Jingjing Chen
Yu Jiang
MLLM
LM&MA
ELM
32
0
0
24 May 2025
ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback
Litao Guo
Xinli Xu
Luozhou Wang
Jiantao Lin
Jinsong Zhou
Zixin Zhang
Bolan Su
Ying-Cong Chen
LLMAG
LRM
36
0
0
23 May 2025
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
Jingjing Jiang
Chongjie Si
Jun Luo
Hanwang Zhang
Chao Ma
93
0
0
23 May 2025
KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models
Yongliang Wu
Zonghui Li
Xinting Hu
Xinyu Ye
Xianfang Zeng
Gang Yu
Wenbo Zhu
Bernt Schiele
Ming-Hsuan Yang
Xu Yang
VLM
48
0
0
22 May 2025
MMaDA: Multimodal Large Diffusion Language Models
Ling Yang
Ye Tian
Bowen Li
Xinchen Zhang
Ke Shen
Yunhai Tong
Mengdi Wang
VLM
LRM
82
2
0
21 May 2025
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
Yicheng Xiao
Lin Song
Yukang Chen
Yingmin Luo
Yuxin Chen
Yukang Gan
Wei Huang
Xiu Li
Xiaojuan Qi
Ying Shan
LRM
26
2
0
19 May 2025
SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
Yang Liu
Ming Ma
Xiaomin Yu
Pengxiang Ding
Han Zhao
Mingyang Sun
Siteng Huang
Donglin Wang
LRM
74
0
0
18 May 2025
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Zongxia Li
Xiyang Wu
Guangyao Shi
Yubin Qin
Hongyang Du
Tianyi Zhou
Dinesh Manocha
Jordan Lee Boyd-Graber
MLLM
74
0
0
02 May 2025
WorldGenBench: A World-Knowledge-Integrated Benchmark for Reasoning-Driven Text-to-Image Generation
D. Zhang
Che Jiang
Ruoshi Xu
Biaoxiang Chen
Zijian Jin
Yutian Lu
Jianguo Zhang
Liang Yong
Jiebo Luo
Shengda Luo
VLM
62
0
0
02 May 2025
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
D. Jiang
Ziyu Guo
Renrui Zhang
Zhuofan Zong
Hao Li
Le Zhuo
Shilin Yan
Pheng-Ann Heng
Haoyang Li
LRM
103
14
0
01 May 2025
Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability
Ning Li
Jingran Zhang
Justin Cui
MLLM
124
2
0
09 Apr 2025
Transfer between Modalities with MetaQueries
Xichen Pan
Satya Narayan Shukla
Aashu Singh
Zhuokai Zhao
Shlok Kumar Mishra
...
Jiuhai Chen
Kunpeng Li
F. Xu
Ji Hou
Saining Xie
DiffM
60
12
0
08 Apr 2025
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
Xiangyu Zhao
Peiyuan Zhang
Kexian Tang
Hao Li
Zicheng Zhang
...
Guangtao Zhai
Junchi Yan
Hua Yang
Xue Yang
Haodong Duan
VLM
LRM
100
5
0
03 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Li Yuan
MLLM
EGVM
110
16
0
03 Apr 2025
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Size Wu
Wentao Zhang
Lumin Xu
Sheng Jin
Zhonghua Wu
Qingyi Tao
Wentao Liu
Wei Li
Chen Change Loy
VGen
334
5
0
27 Mar 2025
LangBridge: Interpreting Image as a Combination of Language Embeddings
Jiaqi Liao
Yuwei Niu
Fanqing Meng
Hao Li
Changyao Tian
...
Dianqi Li
X. Zhu
Li Yuan
Jifeng Dai
Yu Cheng
MLLM
89
0
0
25 Mar 2025
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling
Xiaokang Chen
Zhiyu Wu
Xingchao Liu
Zizheng Pan
Wen Liu
Zhenda Xie
X. Yu
Chong Ruan
AI4TS
43
113
0
29 Jan 2025
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
166
10
0
19 Dec 2024
MetaMorph: Multimodal Understanding and Generation via Instruction Tuning
Shengbang Tong
David Fan
Jiachen Zhu
Yunyang Xiong
Xinlei Chen
Koustuv Sinha
Michael G. Rabbat
Yann LeCun
Saining Xie
Zhuang Liu
VLM
92
36
0
18 Dec 2024
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding
Hao Li
Changyao Tian
Jie Shao
X. Zhu
Zhaokai Wang
...
Wenhan Dou
Xiaogang Wang
Hongsheng Li
Lewei Lu
Jifeng Dai
MLLM
97
12
0
12 Dec 2024
Multimodal Latent Language Modeling with Next-Token Diffusion
Yutao Sun
Hangbo Bao
Wenhui Wang
Zhiliang Peng
Li Dong
Shaohan Huang
Jianyong Wang
Furu Wei
75
11
0
11 Dec 2024
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
J. N. Han
Jinlai Liu
Yi Jiang
Bin Yan
Yuqi Zhang
Zehuan Yuan
Bingyue Peng
Xiaobing Liu
80
43
0
05 Dec 2024
Open-Sora Plan: Open-Source Large Video Generation Model
Bin Lin
Yunyang Ge
Xinhua Cheng
Zongjian Li
Bin Zhu
...
Zhang Pan
Xing Zhou
Shaoling Dong
Yonghong Tian
Li-xin Yuan
VLM
VGen
139
68
0
28 Nov 2024
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads
Siqi Kou
Jiachun Jin
Chang Liu
Ye Ma
Jian Jia
Quan Chen
Peng Jiang
Zhijie Deng
DiffM
VGen
VLM
172
9
0
28 Nov 2024
Evaluating the Generation of Spatial Relations in Text and Image Generative Models
Shang Hong Sim
Clarence Lee
A. Tan
Cheston Tan
EGVM
41
3
0
12 Nov 2024
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
Lijie Fan
Tianhong Li
Siyang Qin
Yuanzhen Li
Chen Sun
Michael Rubinstein
Deqing Sun
Kaiming He
Yonglong Tian
VLM
DiffM
81
50
0
17 Oct 2024
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
Chengyue Wu
Xiaokang Chen
Z. F. Wu
Yiyang Ma
Xingchao Liu
...
Wen Liu
Zhenda Xie
Xingkai Yu
Chong Ruan
Ping Luo
AI4TS
93
89
0
17 Oct 2024
Emu3: Next-Token Prediction is All You Need
Xinlong Wang
Xiaosong Zhang
Zhengxiong Luo
Quan-Sen Sun
Yufeng Cui
...
Xi Yang
Jingjing Liu
Yonghua Lin
Tiejun Huang
Zhongyuan Wang
MLLM
63
183
0
27 Sep 2024
ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty
Xindi Wu
Dingli Yu
Yangsibo Huang
Olga Russakovsky
Sanjeev Arora
CoGe
EGVM
62
18
0
26 Aug 2024
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
Jinheng Xie
Weijia Mao
Zechen Bai
David Junhao Zhang
Weihao Wang
Kevin Qinghong Lin
Yuchao Gu
Zhijie Chen
Zhenheng Yang
Mike Zheng Shou
79
188
0
22 Aug 2024
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Chunting Zhou
Lili Yu
Arun Babu
Kushal Tirumala
Michihiro Yasunaga
Leonid Shamis
Jacob Kahn
Xuezhe Ma
Luke Zettlemoyer
Omer Levy
DiffM
70
165
0
20 Aug 2024
GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation
Baiqi Li
Zhiqiu Lin
Deepak Pathak
Jiayao Li
Yixin Fei
...
Tiffany Ling
Xide Xia
Pengchuan Zhang
Graham Neubig
Deva Ramanan
EGVM
61
31
0
19 Jun 2024
PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models
Fanqing Meng
Wenqi Shao
Lixin Luo
Yahong Wang
Yiran Chen
...
Yue Yang
Tianshuo Yang
Kaipeng Zhang
Yu Qiao
Ping Luo
EGVM
71
10
0
17 Jun 2024
Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense?
Xingyu Fu
Muyu He
Yujie Lu
William Yang Wang
Dan Roth
EGVM
LRM
39
19
0
11 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
83
253
0
10 Jun 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
107
290
0
16 May 2024
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation
Yuying Ge
Sijie Zhao
Jinguo Zhu
Yixiao Ge
Kun Yi
Lin Song
Chen Li
Xiaohan Ding
Ying Shan
VLM
79
120
0
22 Apr 2024
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Keyu Tian
Yi Jiang
Zehuan Yuan
Bingyue Peng
Liwei Wang
VGen
72
281
0
03 Apr 2024
Evaluating Text-to-Visual Generation with Image-to-Text Generation
Zhiqiu Lin
Deepak Pathak
Baiqi Li
Jiayao Li
Xide Xia
Graham Neubig
Pengchuan Zhang
Deva Ramanan
EGVM
65
143
0
01 Apr 2024
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
Xiwei Hu
Rui Wang
Yixiao Fang
Bin-Bin Fu
Pei Cheng
Gang Yu
VLM
69
74
0
08 Mar 2024
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Patrick Esser
Sumith Kulal
A. Blattmann
Rahim Entezari
Jonas Muller
...
Zion English
Kyle Lacey
Alex Goodwin
Yannik Marek
Robin Rombach
DiffM
201
1,187
0
05 Mar 2024
Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation
Daiqing Li
Aleks Kamko
Ehsan Akhgari
Ali Sabet
Linmiao Xu
Suhail Doshi
29
97
0
27 Feb 2024
GenEval: An Object-Focused Framework for Evaluating Text-to-Image Alignment
Dhruba Ghosh
Hanna Hajishirzi
Ludwig Schmidt
62
167
0
17 Oct 2023
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Dustin Podell
Zion English
Kyle Lacey
A. Blattmann
Tim Dockhorn
Jonas Muller
Joe Penna
Robin Rombach
155
2,242
0
04 Jul 2023
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
Xiaoshi Wu
Yiming Hao
Keqiang Sun
Yixiong Chen
Feng Zhu
Rui Zhao
Hongsheng Li
65
274
0
15 Jun 2023
KoLA: Carefully Benchmarking World Knowledge of Large Language Models
Jifan Yu
Xiaozhi Wang
Shangqing Tu
S. Cao
Daniel Zhang-Li
...
Lei Hou
Zhiyuan Liu
Bin Xu
Jie Tang
Juanzi Li
ELM
ALM
49
67
0
15 Jun 2023
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
Jiazheng Xu
Xiao Liu
Yuchen Wu
Yuxuan Tong
Qinkai Li
Ming Ding
Jie Tang
Yuxiao Dong
75
349
0
12 Apr 2023
Text-to-image Diffusion Models in Generative AI: A Survey
Chenshuang Zhang
Chaoning Zhang
Mengchun Zhang
In So Kweon
VLM
67
276
0
14 Mar 2023
When and why vision-language models behave like bags-of-words, and what to do about it?
Mert Yuksekgonul
Federico Bianchi
Pratyusha Kalluri
Dan Jurafsky
James Zou
VLM
CoGe
46
378
0
04 Oct 2022