2010.03743
Cited By
Visual News: Benchmark and Challenges in News Image Captioning
8 October 2020
Fuxiao Liu
Yinghan Wang
Tianlu Wang
Vicente Ordonez
VLM
Papers citing
"Visual News: Benchmark and Challenges in News Image Captioning"
19 / 19 papers shown
Title
TextTIGER: Text-based Intelligent Generation with Entity Prompt Refinement for Text-to-Image Generation
Shintaro Ozaki
Kazuki Hayashi
Yusuke Sakai
Jingun Kwon
Hidetaka Kamigaito
Katsuhiko Hayashi
Manabu Okumura
Taro Watanabe
VLM
88
0
0
25 Apr 2025
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
Ziyan Jiang
Rui Meng
Xinyi Yang
Semih Yavuz
Yingbo Zhou
Wenhu Chen
MLLM
VLM
53
20
0
03 Jan 2025
GME: Improving Universal Multimodal Retrieval by Multimodal LLMs
Xin Zhang
Yanzhao Zhang
Wen Xie
Mingxin Li
Ziqi Dai
Dingkun Long
Pengjun Xie
Meishan Zhang
Wenjie Li
Hao Fei
118
8
0
22 Dec 2024
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs
Sheng-Chieh Lin
Chankyu Lee
M. Shoeybi
Jimmy J. Lin
Bryan Catanzaro
Ming-Yu Liu
73
12
0
04 Nov 2024
EntityCLIP: Entity-Centric Image-Text Matching via Multimodal Attentive Contrastive Learning
Yaxiong Wang
Yufei Wang
Lianwei Wu
Lechao Cheng
Zhun Zhong
Meng Wang
VLM
35
0
0
23 Oct 2024
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models
Shengkang Wang
Hongzhan Lin
Ziyang Luo
Zhen Ye
Guang Chen
Jing Ma
68
3
0
17 Jun 2024
MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
Xuannan Liu
Zekun Li
Peipei Li
Shuhan Xia
Xing Cui
Linzhi Huang
Huaibo Huang
Weihong Deng
Zhaofeng He
56
14
0
13 Jun 2024
Learning Domain-Invariant Features for Out-of-Context News Detection
Yimeng Gu
Mengqi Zhang
Ignacio Castro
Shu Wu
Gareth Tyson
50
2
0
11 Jun 2024
Exposing Text-Image Inconsistency Using Diffusion Models
Mingzhen Huang
Shan Jia
Zhou Zhou
Yan Ju
Jialing Cai
Siwei Lyu
46
7
0
28 Apr 2024
Detecting Multimedia Generated by Large AI Models: A Survey
Li Lin
Neeraj Gupta
Yue Zhang
Hainan Ren
Chun-Hao Liu
Feng Ding
Xin Wang
Xin Li
Luisa Verdoliva
Shu Hu
88
58
0
22 Jan 2024
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers
Cong Wei
Yang Chen
Haonan Chen
Hexiang Hu
Ge Zhang
Jie Fu
Alan Ritter
Wenhu Chen
47
53
0
28 Nov 2023
EDIS: Entity-Driven Image Search over Multimodal Web Content
Siqi Liu
Weixi Feng
Tsu-jui Fu
Wenhu Chen
Wei Wang
VLM
48
9
0
23 May 2023
Detecting and Grounding Multi-Modal Media Manipulation
Rui Shao
Tianxing Wu
Ziwei Liu
44
57
0
05 Apr 2023
Bike Frames: Understanding the Implicit Portrayal of Cyclists in the News
Xingmeng Zhao
Dan Schumacher
Sashank Nalluri
Xavier Walton
Suhana Shrestha
Anthony Rios
29
2
0
15 Jan 2023
Show, Interpret and Tell: Entity-aware Contextualised Image Captioning in Wikipedia
K. Nguyen
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
36
10
0
21 Sep 2022
WikiDiverse: A Multimodal Entity Linking Dataset with Diversified Contextual Topics and Entity Types
Xuwu Wang
Junfeng Tian
Min Gui
Zhixu Li
Rui-cang Wang
Ming Yan
Lihan Chen
Yanghua Xiao
VGen
24
48
0
13 Apr 2022
Open-Domain, Content-based, Multi-modal Fact-checking of Out-of-Context Images via Online Resources
Sahar Abdelnabi
Rakibul Hasan
Mario Fritz
26
74
0
30 Nov 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
67
254
0
14 Jul 2021
Neural Baby Talk
Jiasen Lu
Jianwei Yang
Dhruv Batra
Devi Parikh
VLM
200
434
0
27 Mar 2018