2112.10728
Cited By
MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding
20 December 2021
Revanth Reddy Gangi Reddy
Xilin Rui
Manling Li
Xudong Lin
Haoyang Wen
Jaemin Cho
Lifu Huang
Joey Tianyi Zhou
Avirup Sil
Shih-Fu Chang
A. Schwing
Heng Ji
Papers citing "MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding"
18 / 18 papers shown
Title
FCMR: Robust Evaluation of Financial Cross-Modal Multi-Hop Reasoning
Seunghee Kim
Changhyeon Kim
Taeuk Kim
LRM
89
1
0
20 Feb 2025
Manta: Enhancing Mamba for Few-Shot Action Recognition of Long Sub-Sequence
Wenbo Huang
Jinghui Zhang
Bernard Ghanem
Lei Zhang
Shuoyuan Wang
Fang Dong
Jiahui Jin
Takahiro Ogawa
Miki Haseyama
Mamba
93
1
0
10 Dec 2024
VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks
Shailaja Keyur Sampat
Mutsumi Nakamura
Shankar Kailas
Kartik Aggarwal
Mandy Zhou
Yezhou Yang
Chitta Baral
MLLM
CoGe
ReLM
VLM
LRM
37
0
0
17 Oct 2024
SOAP: Enhancing Spatio-Temporal Relation and Motion Information Capturing for Few-Shot Action Recognition
Wenbo Huang
Jinghui Zhang
Xuwei Qian
Zhen Wu
Meng Wang
Lei Zhang
31
1
0
23 Jul 2024
Advancing Chart Question Answering with Robust Chart Component Recognition
Hanwen Zheng
Sijia Wang
Chris Thomas
Lifu Huang
40
1
0
19 Jul 2024
Exploring Hybrid Question Answering via Program-based Prompting
Qi Shi
Han Cui
Haofeng Wang
Qingfu Zhu
Wanxiang Che
Ting Liu
35
4
0
16 Feb 2024
Video Summarization: Towards Entity-Aware Captions
Hammad A. Ayyubi
Tianqi Liu
Arsha Nagrani
Xudong Lin
Mingda Zhang
Anurag Arnab
Feng Han
Yukun Zhu
Jialu Liu
Shih-Fu Chang
39
0
0
01 Dec 2023
Zero-Shot Video Question Answering with Procedural Programs
Rohan Choudhury
Koichiro Niinuma
Kris M. Kitani
László A. Jeni
19
21
0
01 Dec 2023
MMHQA-ICL: Multimodal In-context Learning for Hybrid Question Answering over Text, Tables and Images
Weihao Liu
Fangyu Lei
Tongxu Luo
Jiahe Lei
Shizhu He
Jun Zhao
Kang Liu
LMTD
29
9
0
09 Sep 2023
MultiVENT: Multilingual Videos of Events with Aligned Natural Text
Kate Sanders
David Etter
Reno Kriz
Benjamin Van Durme
VGen
39
7
0
06 Jul 2023
Unified Language Representation for Question Answering over Text, Tables, and Images
Yu Bowen
Cheng Fu
Haiyang Yu
Fei Huang
Yongbin Li
LMTD
24
20
0
29 Jun 2023
A Benchmark Generative Probabilistic Model for Weak Supervised Learning
Georgios Th. Papadopoulos
Fran Silavong
Sean J. Moran
SyDa
21
0
0
31 Mar 2023
ViperGPT: Visual Inference via Python Execution for Reasoning
Dídac Surís
Sachit Menon
Carl Vondrick
MLLM
LRM
ReLM
45
431
0
14 Mar 2023
VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning
Kan Chen
Xiangqian Wu
CoGe
19
8
0
05 Mar 2023
Enhancing Multi-modal and Multi-hop Question Answering via Structured Knowledge and Unified Retrieval-Generation
Qian Yang
Qian Chen
Wen Wang
Baotian Hu
Min Zhang
37
24
0
16 Dec 2022
MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text
Wenhu Chen
Hexiang Hu
Xi Chen
Pat Verga
William W. Cohen
RALM
13
141
0
06 Oct 2022
Beyond Grounding: Extracting Fine-Grained Event Hierarchies Across Modalities
Hammad A. Ayyubi
Christopher Thomas
Lovish Chum
R. Lokesh
Long Chen
...
Xudong Lin
Xuande Feng
Jaywon Koo
Sounak Ray
Shih-Fu Chang
AI4TS
23
0
0
14 Jun 2022
MLQA: Evaluating Cross-lingual Extractive Question Answering
Patrick Lewis
Barlas Oğuz
Ruty Rinott
Sebastian Riedel
Holger Schwenk
ELM
246
492
0
16 Oct 2019