arXiv: 2303.11897

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering

21 March 2023
Yushi Hu
Benlin Liu
Jungo Kasai
Yizhong Wang
Mari Ostendorf
Ranjay Krishna
Noah A. Smith
    EGVM

Papers citing "TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering"

50 / 174 papers shown
Title
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation
Vaidehi Patil
Yi-Lin Sung
Peter Hase
Jie Peng
Tianlong Chen
Mohit Bansal
AAML
MU
83
3
0
01 May 2025
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
60
0
0
01 May 2025
CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback
Chenhan Jiang
Yihan Zeng
Hang Xu
Dit-Yan Yeung
44
0
0
28 Apr 2025
Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation
Shivam Duggal
Yushi Hu
Oscar Michel
Aniruddha Kembhavi
William T. Freeman
Noah A. Smith
Ranjay Krishna
Antonio Torralba
Ali Farhadi
Wei-Chiu Ma
EGVM
ELM
77
0
0
25 Apr 2025
RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation
Aviv Slobodkin
Hagai Taitelbaum
Yonatan Bitton
Brian Gordon
Michal Sokolik
...
Almog Gueta
Royi Rassin
Itay Laish
Dani Lischinski
Idan Szpektor
EGVM
VGen
41
0
0
24 Apr 2025
Visual Prompting for One-shot Controllable Video Editing without Inversion
Zhengbo Zhang
Yuxi Zhou
Duo Peng
Joo-Hwee Lim
Zhigang Tu
De Wen Soh
Lin Geng Foo
DiffM
45
1
0
19 Apr 2025
ESPLoRA: Enhanced Spatial Precision with Low-Rank Adaption in Text-to-Image Diffusion Models for High-Definition Synthesis
Andrea Rigo
Luca Stornaiuolo
Mauro Martino
Bruno Lepri
N. Sebe
48
0
0
18 Apr 2025
Science-T2I: Addressing Scientific Illusions in Image Synthesis
Jialuo Li
Wenhao Chai
Xingyu Fu
Haiyang Xu
Saining Xie
MedIm
38
0
0
17 Apr 2025
Instruction-augmented Multimodal Alignment for Image-Text and Element Matching
Xinli Yue
Jianhui Sun
Junda Lu
Liangchao Yao
Fan Xia
Tianyi Wang
Fengyun Rao
Jing Lyu
Yuetang Deng
21
0
0
16 Apr 2025
CameraBench: Benchmarking Visual Reasoning in MLLMs via Photography
I-Sheng Fang
Jun-Cheng Chen
LRM
VLM
30
0
0
14 Apr 2025
Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes
Huijie Liu
Bingcan Wang
Jie Hu
Xiaoming Wei
Guoliang Kang
65
0
0
14 Apr 2025
TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs
Zijian Zhang
Xuhui Zheng
X. Wu
Chong Peng
Xuezhi Cao
32
0
0
10 Apr 2025
Video-Bench: Human-Aligned Video Generation Benchmark
Hui Han
Siyuan Li
Jiaqi Chen
Yiwen Yuan
Yuling Wu
...
Y. Li
J. Zhang
Chi Zhang
Li Li
Yongxin Ni
EGVM
VGen
73
0
0
07 Apr 2025
A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
Leander Girrbach
Stephan Alaniz
Genevieve Smith
Zeynep Akata
40
0
0
30 Mar 2025
Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
Woojung Han
Yeonkyung Lee
Chanyoung Kim
Kwanghyun Park
Seong Jae Hwang
DiffM
62
0
0
28 Mar 2025
ImageSet2Text: Describing Sets of Images through Text
Piera Riccio
F. Galati
Kajetan Schweighofer
Noa Garcia
Nuria Oliver
VLM
CoGe
74
0
0
25 Mar 2025
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
Kaisi Guan
Zhengfeng Lai
Y. Sun
Peng Zhang
Wei Liu
Kieran Liu
Meng Cao
Ruihua Song
VGen
56
0
0
21 Mar 2025
T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation
Seyed Mohsen Hosseini
Amir Mohammad Izadi
Ali Abdollahi
Armin Saghafian
M. Baghshah
EGVM
CoGe
78
0
0
14 Mar 2025
Exploring Bias in over 100 Text-to-Image Generative Models
J. Vice
Naveed Akhtar
Richard I. Hartley
Ajmal Saeed Mian
EGVM
67
3
0
11 Mar 2025
Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation
Amir Mohammad Izadi
Seyed Mohsen Hosseini
Soroush Vafaie Tabar
Ali Abdollahi
Armin Saghafian
M. Baghshah
EGVM
40
0
0
09 Mar 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
89
0
0
27 Feb 2025
TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding
Max W.F. Ku
Thomas Chong
Jonathan Leung
Krish Shah
Alvin Yu
Wenhu Chen
LRM
96
3
0
26 Feb 2025
M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment
Chuan Cui
Kejiang Chen
Zhihua Wei
Wen Shen
W. Zhang
Nenghai Yu
EGVM
67
0
0
24 Feb 2025
Multi-Agent Multimodal Models for Multicultural Text to Image Generation
Parth Bhalerao
Mounika Yalamarty
Brian Trinh
Oana Ignat
37
0
0
21 Feb 2025
T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation
Lijun Li
Zhelun Shi
Xuhao Hu
Bowen Dong
Yiran Qin
Xihui Liu
Lu Sheng
Jing Shao
112
1
0
21 Feb 2025
MoVer: Motion Verification for Motion Graphics Animations
Jiaju Ma
Maneesh Agrawala
VGen
51
0
0
20 Feb 2025
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
L. Yang
Xinchen Zhang
Ye Tian
Chenming Shang
Minghao Xu
Wentao Zhang
Bin Cui
96
1
0
17 Feb 2025
BackdoorDM: A Comprehensive Benchmark for Backdoor Learning in Diffusion Model
Weilin Lin
Nanjun Zhou
Y. Wang
Jianze Li
Hui Xiong
Li Liu
AAML
DiffM
169
0
0
17 Feb 2025
Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP
J. Park
Jungbeom Lee
Jongyoon Song
Sangwon Yu
Dahuin Jung
Sungroh Yoon
45
0
0
19 Jan 2025
Focus-N-Fix: Region-Aware Fine-Tuning for Text-to-Image Generation
Xiaoying Xing
Avinab Saha
Junfeng He
Susan Hao
Paul Vicol
...
Sahil Singla
Sarah Young
Yinxiao Li
Feng Yang
Deepak Ramachandran
DiffM
48
0
0
11 Jan 2025
A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls
Sheikh Shafayat
Dongkeun Yoon
Woori Jang
Jiwoo Choi
Alice H. Oh
Seohyon Jung
94
1
0
03 Jan 2025
D-Judge: How Far Are We? Evaluating the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance
Renyang Liu
Ziyu Lyu
Wei Zhou
See-Kiong Ng
EGVM
33
0
0
23 Dec 2024
What makes a good metric? Evaluating automatic metrics for text-to-image consistency
Candace Ross
Melissa Hall
Adriana Romero Soriano
Adina Williams
90
3
0
18 Dec 2024
Efficient Scaling of Diffusion Transformers for Text-to-Image Generation
Hao Li
Shamit Lal
Zhiheng Li
Yusheng Xie
Ying Wang
...
R. Manmatha
Z. Tu
Stefano Ermon
Stefano Soatto
A. Swaminathan
86
0
0
16 Dec 2024
IDEA-Bench: How Far are Generative Models from Professional Designing?
C. Liang
Lianghua Huang
Jingwu Fang
Huanzhang Dou
Wei Wang
Zhi-Fan Wu
Yupeng Shi
Junge Zhang
Xin Zhao
Yu Liu
3DV
77
1
0
16 Dec 2024
CAP: Evaluation of Persuasive and Creative Image Generation
Aysan Aghazadeh
Adriana Kovashka
EGVM
97
1
0
10 Dec 2024
Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph based Question-Answering Agent
Ziyuan Qin
D. Cheng
Haoyu Wang
Huahui Yi
Yuting Shao
Zhiyuan Fan
Kang Li
Qicheng Lao
EGVM
MLLM
164
0
0
07 Dec 2024
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts
Ziwei Huang
Wanggui He
Quanyu Long
Yandi Wang
Haoyuan Li
...
Fangxun Shu
Long Chen
Hao Jiang
Leilei Gan
Fei Wu
EGVM
187
3
0
05 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
S. Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
101
14
0
03 Dec 2024
Detailed Object Description with Controllable Dimensions
Xinran Wang
H. Zhang
Baoteng Li
Kongming Liang
Hao Sun
Zhongjiang He
Z. Ma
Jun Guo
81
0
0
28 Nov 2024
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu
Jieke Wang
Meng Tang
DiffM
82
0
0
28 Nov 2024
HEIE: MLLM-Based Hierarchical Explainable AIGC Image Implausibility Evaluator
Fan Yang
Ru Zhen
J. T. Wang
Yanhao Zhang
Haoxiang Chen
Haonan Lu
Sicheng Zhao
Guiguang Ding
71
0
0
26 Nov 2024
Interactive Visual Assessment for Text-to-Image Generation Models
Xiaoyue Mi
Fan Tang
Juan Cao
Qiang Sheng
Ziyao Huang
Peng Li
Y. Liu
Tong-Yee Lee
EGVM
71
0
0
23 Nov 2024
Text Embedding is Not All You Need: Attention Control for Text-to-Image Semantic Alignment with Text Self-Attention Maps
Jeeyung Kim
Erfan Esmaeili
Qiang Qiu
DiffM
85
1
0
21 Nov 2024
On the Fairness, Diversity and Reliability of Text-to-Image Generative Models
J. Vice
Naveed Akhtar
Richard I. Hartley
Ajmal Saeed Mian
EGVM
71
0
0
21 Nov 2024
Natural Language Inference Improves Compositionality in Vision-Language Models
Paola Cascante-Bonilla
Yu Hou
Yang Trista Cao
Hal Daumé III
Rachel Rudinger
ReLM
CoGe
VLM
49
3
0
29 Oct 2024
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?
Han Bao
Yue Huang
Yanbo Wang
Jiayi Ye
Xiangqi Wang
Xiuying Chen
Mohamed Elhoseiny
X. Zhang
Xiangliang Zhang
47
7
0
28 Oct 2024
Attention Overlap Is Responsible for The Entity Missing Problem in Text-to-image Diffusion Models!
Arash Marioriyad
Mohammadali Banayeeanzade
Reza Abbasi
M. Rohban
M. Baghshah
DiffM
72
3
0
28 Oct 2024
$C^2$: Scalable Auto-Feedback for LLM-based Chart Generation
Woosung Koh
Jang Han Yoon
M. Lee
Youngjin Song
Jaegwan Cho
Jaehyun Kang
Taehyeon Kim
Se-Young Yun
Youngjae Yu
B. Lee
42
0
0
24 Oct 2024
Offline Evaluation of Set-Based Text-to-Image Generation
Negar Arabzadeh
Fernando Diaz
Junfeng He
EGVM
32
0
0
22 Oct 2024