2303.11897
Cited By
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
21 March 2023
Yushi Hu
Benlin Liu
Jungo Kasai
Yizhong Wang
Mari Ostendorf
Ranjay Krishna
Noah A. Smith
EGVM
Papers citing
"TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering"
50 / 174 papers shown
Title
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation
Vaidehi Patil
Yi-Lin Sung
Peter Hase
Jie Peng
Tianlong Chen
Mohit Bansal
AAML
MU
83
3
0
01 May 2025
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
60
0
0
01 May 2025
CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback
Chenhan Jiang
Yihan Zeng
Hang Xu
Dit-Yan Yeung
44
0
0
28 Apr 2025
Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation
Shivam Duggal
Yushi Hu
Oscar Michel
Aniruddha Kembhavi
William T. Freeman
Noah A. Smith
Ranjay Krishna
Antonio Torralba
Ali Farhadi
Wei-Chiu Ma
EGVM
ELM
77
0
0
25 Apr 2025
RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation
Aviv Slobodkin
Hagai Taitelbaum
Yonatan Bitton
Brian Gordon
Michal Sokolik
...
Almog Gueta
Royi Rassin
Itay Laish
Dani Lischinski
Idan Szpektor
EGVM
VGen
41
0
0
24 Apr 2025
Visual Prompting for One-shot Controllable Video Editing without Inversion
Zhengbo Zhang
Yuxi Zhou
Duo Peng
Joo-Hwee Lim
Zhigang Tu
De Wen Soh
Lin Geng Foo
DiffM
45
1
0
19 Apr 2025
ESPLoRA: Enhanced Spatial Precision with Low-Rank Adaption in Text-to-Image Diffusion Models for High-Definition Synthesis
Andrea Rigo
Luca Stornaiuolo
Mauro Martino
Bruno Lepri
N. Sebe
48
0
0
18 Apr 2025
Science-T2I: Addressing Scientific Illusions in Image Synthesis
Jialuo Li
Wenhao Chai
Xingyu Fu
Haiyang Xu
Saining Xie
MedIm
38
0
0
17 Apr 2025
Instruction-augmented Multimodal Alignment for Image-Text and Element Matching
Xinli Yue
Jianhui Sun
Junda Lu
Liangchao Yao
Fan Xia
Tianyi Wang
Fengyun Rao
Jing Lyu
Yuetang Deng
21
0
0
16 Apr 2025
CameraBench: Benchmarking Visual Reasoning in MLLMs via Photography
I-Sheng Fang
Jun-Cheng Chen
LRM
VLM
30
0
0
14 Apr 2025
Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes
Huijie Liu
Bingcan Wang
Jie Hu
Xiaoming Wei
Guoliang Kang
65
0
0
14 Apr 2025
TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs
Zijian Zhang
Xuhui Zheng
X. Wu
Chong Peng
Xuezhi Cao
32
0
0
10 Apr 2025
Video-Bench: Human-Aligned Video Generation Benchmark
Hui Han
Siyuan Li
Jiaqi Chen
Yiwen Yuan
Yuling Wu
...
Y. Li
J. Zhang
Chi Zhang
Li Li
Yongxin Ni
EGVM
VGen
73
0
0
07 Apr 2025
A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
Leander Girrbach
Stephan Alaniz
Genevieve Smith
Zeynep Akata
40
0
0
30 Mar 2025
Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
Woojung Han
Yeonkyung Lee
Chanyoung Kim
Kwanghyun Park
Seong Jae Hwang
DiffM
62
0
0
28 Mar 2025
ImageSet2Text: Describing Sets of Images through Text
Piera Riccio
F. Galati
Kajetan Schweighofer
Noa Garcia
Nuria Oliver
VLM
CoGe
74
0
0
25 Mar 2025
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
Kaisi Guan
Zhengfeng Lai
Y. Sun
Peng Zhang
Wei Liu
Kieran Liu
Meng Cao
Ruihua Song
VGen
56
0
0
21 Mar 2025
T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation
Seyed Mohsen Hosseini
Amir Mohammad Izadi
Ali Abdollahi
Armin Saghafian
M. Baghshah
EGVM
CoGe
78
0
0
14 Mar 2025
Exploring Bias in over 100 Text-to-Image Generative Models
J. Vice
Naveed Akhtar
Richard I. Hartley
Ajmal Saeed Mian
EGVM
67
3
0
11 Mar 2025
Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation
Amir Mohammad Izadi
Seyed Mohsen Hosseini
Soroush Vafaie Tabar
Ali Abdollahi
Armin Saghafian
M. Baghshah
EGVM
40
0
0
09 Mar 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
89
0
0
27 Feb 2025
TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding
Max W.F. Ku
Thomas Chong
Jonathan Leung
Krish Shah
Alvin Yu
Wenhu Chen
LRM
96
3
0
26 Feb 2025
M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment
Chuan Cui
Kejiang Chen
Zhihua Wei
Wen Shen
W. Zhang
Nenghai Yu
EGVM
67
0
0
24 Feb 2025
Multi-Agent Multimodal Models for Multicultural Text to Image Generation
Parth Bhalerao
Mounika Yalamarty
Brian Trinh
Oana Ignat
37
0
0
21 Feb 2025
T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation
Lijun Li
Zhelun Shi
Xuhao Hu
Bowen Dong
Yiran Qin
Xihui Liu
Lu Sheng
Jing Shao
112
1
0
21 Feb 2025
MoVer: Motion Verification for Motion Graphics Animations
Jiaju Ma
Maneesh Agrawala
VGen
51
0
0
20 Feb 2025
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
L. Yang
Xinchen Zhang
Ye Tian
Chenming Shang
Minghao Xu
Wentao Zhang
Bin Cui
96
1
0
17 Feb 2025
BackdoorDM: A Comprehensive Benchmark for Backdoor Learning in Diffusion Model
Weilin Lin
Nanjun Zhou
Y. Wang
Jianze Li
Hui Xiong
Li Liu
AAML
DiffM
169
0
0
17 Feb 2025
Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP
J. Park
Jungbeom Lee
Jongyoon Song
Sangwon Yu
Dahuin Jung
Sungroh Yoon
45
0
0
19 Jan 2025
Focus-N-Fix: Region-Aware Fine-Tuning for Text-to-Image Generation
Xiaoying Xing
Avinab Saha
Junfeng He
Susan Hao
Paul Vicol
...
Sahil Singla
Sarah Young
Yinxiao Li
Feng Yang
Deepak Ramachandran
DiffM
48
0
0
11 Jan 2025
A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls
Sheikh Shafayat
Dongkeun Yoon
Woori Jang
Jiwoo Choi
Alice H. Oh
Seohyon Jung
94
1
0
03 Jan 2025
D-Judge: How Far Are We? Evaluating the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance
Renyang Liu
Ziyu Lyu
Wei Zhou
See-Kiong Ng
EGVM
33
0
0
23 Dec 2024
What makes a good metric? Evaluating automatic metrics for text-to-image consistency
Candace Ross
Melissa Hall
Adriana Romero Soriano
Adina Williams
90
3
0
18 Dec 2024
Efficient Scaling of Diffusion Transformers for Text-to-Image Generation
Hao Li
Shamit Lal
Zhiheng Li
Yusheng Xie
Ying Wang
...
R. Manmatha
Z. Tu
Stefano Ermon
Stefano Soatto
A. Swaminathan
86
0
0
16 Dec 2024
IDEA-Bench: How Far are Generative Models from Professional Designing?
C. Liang
Lianghua Huang
Jingwu Fang
Huanzhang Dou
Wei Wang
Zhi-Fan Wu
Yupeng Shi
Junge Zhang
Xin Zhao
Yu Liu
3DV
77
1
0
16 Dec 2024
CAP: Evaluation of Persuasive and Creative Image Generation
Aysan Aghazadeh
Adriana Kovashka
EGVM
97
1
0
10 Dec 2024
Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph based Question-Answering Agent
Ziyuan Qin
D. Cheng
Haoyu Wang
Huahui Yi
Yuting Shao
Zhiyuan Fan
Kang Li
Qicheng Lao
EGVM
MLLM
164
0
0
07 Dec 2024
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts
Ziwei Huang
Wanggui He
Quanyu Long
Yandi Wang
Haoyuan Li
...
Fangxun Shu
Long Chen
Hao Jiang
Leilei Gan
Fei Wu
EGVM
187
3
0
05 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
S. Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
101
14
0
03 Dec 2024
Detailed Object Description with Controllable Dimensions
Xinran Wang
H. Zhang
Baoteng Li
Kongming Liang
Hao Sun
Zhongjiang He
Z. Ma
Jun Guo
81
0
0
28 Nov 2024
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
82
0
0
28 Nov 2024
HEIE: MLLM-Based Hierarchical Explainable AIGC Image Implausibility Evaluator
Fan Yang
Ru Zhen
J. T. Wang
Yanhao Zhang
Haoxiang Chen
Haonan Lu
Sicheng Zhao
Guiguang Ding
71
0
0
26 Nov 2024
Interactive Visual Assessment for Text-to-Image Generation Models
Xiaoyue Mi
Fan Tang
Juan Cao
Qiang Sheng
Ziyao Huang
Peng Li
Y. Liu
Tong-Yee Lee
EGVM
71
0
0
23 Nov 2024
Text Embedding is Not All You Need: Attention Control for Text-to-Image Semantic Alignment with Text Self-Attention Maps
Jeeyung Kim
Erfan Esmaeili
Qiang Qiu
DiffM
85
1
0
21 Nov 2024
On the Fairness, Diversity and Reliability of Text-to-Image Generative Models
J. Vice
Naveed Akhtar
Richard I. Hartley
Ajmal Saeed Mian
EGVM
71
0
0
21 Nov 2024
Natural Language Inference Improves Compositionality in Vision-Language Models
Paola Cascante-Bonilla
Yu Hou
Yang Trista Cao
Hal Daumé III
Rachel Rudinger
ReLM
CoGe
VLM
49
3
0
29 Oct 2024
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?
Han Bao
Yue Huang
Yanbo Wang
Jiayi Ye
Xiangqi Wang
Xiuying Chen
Mohamed Elhoseiny
X. Zhang
Xiangliang Zhang
47
7
0
28 Oct 2024
Attention Overlap Is Responsible for The Entity Missing Problem in Text-to-image Diffusion Models!
Arash Marioriyad
Mohammadali Banayeeanzade
Reza Abbasi
M. Rohban
M. Baghshah
DiffM
72
3
0
28 Oct 2024
C^2: Scalable Auto-Feedback for LLM-based Chart Generation
Woosung Koh
Jang Han Yoon
M. Lee
Youngjin Song
Jaegwan Cho
Jaehyun Kang
Taehyeon Kim
Se-Young Yun
Youngjae Yu
B. Lee
42
0
0
24 Oct 2024
Offline Evaluation of Set-Based Text-to-Image Generation
Negar Arabzadeh
Fernando Diaz
Junfeng He
EGVM
32
0
0
22 Oct 2024