Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2210.02928
Cited By
MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text
6 October 2022
Wenhu Chen
Hexiang Hu
Xi Chen
Pat Verga
William W. Cohen
RALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text"
32 / 32 papers shown
Title
TinyAlign: Boosting Lightweight Vision-Language Models by Mitigating Modal Alignment Bottlenecks
Yuanze Hu
Zhaoxin Fan
Xinyu Wang
Gen Li
Ye Qiu
...
Wenjun Wu
Kejian Wu
Yifan Sun
Xiaotie Deng
Jin Song Dong
25
0
0
19 May 2025
VR-RAG: Open-vocabulary Species Recognition with RAG-Assisted Large Multi-Modal Models
Fahad Shahbaz Khan
Jun Chen
Youssef Mohamed
Chun-Mei Feng
Mohamed Elhoseiny
VLM
59
0
0
08 May 2025
HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights
Ozan Gokdemir
Carlo Siebenschuh
Alexander Brace
Azton Wells
Brian Hsu
...
A. Anandkumar
Ian Foster
R. Stevens
V. Vishwanath
A. Ramanathan
VLM
44
0
0
07 May 2025
UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities
Woongyeong Yeo
Kangsan Kim
Soyeong Jeong
Jinheon Baek
Sung Ju Hwang
54
1
0
29 Apr 2025
FinSage: A Multi-aspect RAG System for Financial Filings Question Answering
Xinyu Wang
Jijun Chi
Zhenghan Tai
Tung Sum Thomas Kwok
Muzhi Li
...
Suyuchen Wang
Yihong Wu
Jerry Huang
Jingrui Tian
Ling Zhou
81
0
0
20 Apr 2025
OnRL-RAG: Real-Time Personalized Mental Health Dialogue System
Ahsan Bilal
Beiyu Lin
OffRL
RALM
AI4MH
58
1
0
02 Apr 2025
A Survey on Knowledge-Oriented Retrieval-Augmented Generation
Mingyue Cheng
Yucong Luo
Jie Ouyang
Qiang Liu
Huijie Liu
...
Bohou Zhang
Jiawei Cao
Jie Ma
Daoyu Wang
Enhong Chen
3DV
85
3
0
11 Mar 2025
Poisoned-MRAG: Knowledge Poisoning Attacks to Multimodal Retrieval Augmented Generation
Yinuo Liu
Zenghui Yuan
Guiyao Tie
Jiawen Shi
Lichao Sun
Lichao Sun
Neil Zhenqiang Gong
61
1
0
08 Mar 2025
MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks
Hyeonjeong Ha
Qiusi Zhan
Jeonghwan Kim
Dimitrios Bralios
Saikrishna Sanniboina
Nanyun Peng
Kai-Wei Chang
Daniel Kang
Heng Ji
KELM
AAML
78
2
0
25 Feb 2025
OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering
Jiahao Nick Li
Zhuohao Jerry Zhang
Zhang
104
1
0
24 Feb 2025
Quantifying Memorization and Retriever Performance in Retrieval-Augmented Vision-Language Models
Peter Carragher
Abhinand Jha
R Raghav
Kathleen M. Carley
RALM
82
0
0
20 Feb 2025
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Mohammad Mahdi Abootorabi
Amirhosein Zobeiri
Mahdi Dehghani
Mohammadali Mohammadkhani
Bardia Mohammadi
Omid Ghahroodi
M. Baghshah
Ehsaneddin Asgari
RALM
117
5
0
12 Feb 2025
Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge
Daniel Tamayo
Aitor Gonzalez-Agirre
Javier Hernando
Marta Villegas
KELM
124
4
0
04 Feb 2025
ALoFTRAG: Automatic Local Fine Tuning for Retrieval Augmented Generation
Peter Devine
52
0
0
21 Jan 2025
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation
Manan Suri
Puneet Mathur
Franck Dernoncourt
Kanika Goswami
Ryan Rossi
Dinesh Manocha
112
3
0
14 Dec 2024
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding
Jaemin Cho
Debanjan Mahata
Ozan Irsoy
Yujie He
Joey Tianyi Zhou
VLM
60
12
0
07 Nov 2024
Self-adaptive Multimodal Retrieval-Augmented Generation
Wenjia Zhai
VLM
49
0
0
15 Oct 2024
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
Wenbo Hu
Jia-Chen Gu
Zi-Yi Dou
Mohsen Fayyaz
Pan Lu
Kai-Wei Chang
Nanyun Peng
VLM
69
5
0
10 Oct 2024
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents
Junpeng Yue
Xinru Xu
Börje F. Karlsson
Zongqing Lu
53
0
0
04 Oct 2024
Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation
Liwen Sun
James Zhao
Megan Han
Chenyan Xiong
MedIm
72
8
0
21 Jul 2024
Enhancing Commentary Strategies for Imperfect Information Card Games: A Study of Large Language Models in Guandan Commentary
Meiling Tao
Xuechen Liang
Ziyi Wang
Yiling Tao
Yiling Tao
Jianhui Wang
Sun Li Tianyu Shi
57
1
0
23 Jun 2024
Demystifying Chains, Trees, and Graphs of Thoughts
Maciej Besta
Florim Memedi
Zhenyu Zhang
Robert Gerstenberger
Guangyuan Piao
...
Aleš Kubíček
H. Niewiadomski
Aidan O'Mahony
Onur Mutlu
Torsten Hoefler
AI4CE
LRM
85
27
0
25 Jan 2024
Retrieval-augmented Multi-modal Chain-of-Thoughts Reasoning for Large Language Models
Bingshuai Liu
Chenyang Lyu
Zijun Min
Zhanyu Wang
Jinsong Su
Longyue Wang
LRM
44
7
0
04 Dec 2023
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers
Cong Wei
Yang Chen
Haonan Chen
Hexiang Hu
Ge Zhang
Jie Fu
Alan Ritter
Wenhu Chen
51
59
0
28 Nov 2023
MoqaGPT : Zero-Shot Multi-modal Open-domain Question Answering with Large Language Model
Le Zhang
Yihong Wu
Fengran Mo
Jian-Yun Nie
Aishwarya Agrawal
MLLM
RALM
51
6
0
20 Oct 2023
MultiVENT: Multilingual Videos of Events with Aligned Natural Text
Kate Sanders
David Etter
Reno Kriz
Benjamin Van Durme
VGen
68
7
0
06 Jul 2023
Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
Zhuolin Yang
Ming-Yu Liu
Zihan Liu
V. Korthikanti
Weili Nie
...
Yuke Zhu
Mohammad Shoeybi
Bryan Catanzaro
Chaowei Xiao
Anima Anandkumar
VLM
RALM
41
39
0
09 Feb 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Songlin Yang
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
50
36
0
12 Jan 2023
Enhancing Multi-modal and Multi-hop Question Answering via Structured Knowledge and Unified Retrieval-Generation
Qian Yang
Qian Chen
Wen Wang
Baotian Hu
Min Zhang
49
25
0
16 Dec 2022
Retrieval-Augmented Multimodal Language Modeling
Michihiro Yasunaga
Armen Aghajanyan
Weijia Shi
Rich James
J. Leskovec
Percy Liang
M. Lewis
Luke Zettlemoyer
Wen-tau Yih
RALM
27
96
0
22 Nov 2022
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
323
1,092
0
17 Feb 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
278
929
0
24 Sep 2019
1