2504.06666
Cited By
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
9 April 2025
Ruotian Peng, Haiying He, Yake Wei, Yandong Wen, D. Hu
VLM
Papers citing
"Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception"
2 / 2 papers shown
Title
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang, Jiayan Teng, Wendi Zheng, Ming Ding, Shiyu Huang, ..., Weihan Wang, Yean Cheng, Xiaotao Gu, Yuxiao Dong, Jie Tang
DiffM
VGen
314
565
0
12 Aug 2024
Hallucination of Multimodal Large Language Models: A Survey
Zechen Bai, Pichao Wang, Tianjun Xiao, Tong He, Zongbo Han, Zheng Zhang, Mike Zheng Shou
VLM
LRM
261
197
0
29 Apr 2024