MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text

6 October 2022

Papers citing "MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text"

Title
TinyAlign: Boosting Lightweight Vision-Language Models by Mitigating Modal Alignment Bottlenecks Yuanze Hu Zhaoxin Fan Xinyu Wang Gen Li Ye Qiu ... Wenjun Wu Kejian Wu Yifan Sun Xiaotie Deng Jin Song Dong 25 0 0 19 May 2025
VR-RAG: Open-vocabulary Species Recognition with RAG-Assisted Large Multi-Modal Models Fahad Shahbaz Khan Jun Chen Youssef Mohamed Chun-Mei Feng Mohamed Elhoseiny VLM 59 0 0 08 May 2025
HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights Ozan Gokdemir Carlo Siebenschuh Alexander Brace Azton Wells Brian Hsu ... A. Anandkumar Ian Foster R. Stevens V. Vishwanath A. Ramanathan VLM 44 0 0 07 May 2025
UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities Woongyeong Yeo Kangsan Kim Soyeong Jeong Jinheon Baek Sung Ju Hwang 54 1 0 29 Apr 2025
FinSage: A Multi-aspect RAG System for Financial Filings Question Answering Xinyu Wang Jijun Chi Zhenghan Tai Tung Sum Thomas Kwok Muzhi Li ... Suyuchen Wang Yihong Wu Jerry Huang Jingrui Tian Ling Zhou 81 0 0 20 Apr 2025
OnRL-RAG: Real-Time Personalized Mental Health Dialogue System Ahsan Bilal Beiyu Lin OffRL RALM AI4MH 58 1 0 02 Apr 2025
A Survey on Knowledge-Oriented Retrieval-Augmented Generation Mingyue Cheng Yucong Luo Jie Ouyang Qiang Liu Huijie Liu ... Bohou Zhang Jiawei Cao Jie Ma Daoyu Wang Enhong Chen 3DV 85 3 0 11 Mar 2025
Poisoned-MRAG: Knowledge Poisoning Attacks to Multimodal Retrieval Augmented Generation Yinuo Liu Zenghui Yuan Guiyao Tie Jiawen Shi Lichao Sun Neil Zhenqiang Gong 61 1 0 08 Mar 2025
MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks Hyeonjeong Ha Qiusi Zhan Jeonghwan Kim Dimitrios Bralios Saikrishna Sanniboina Nanyun Peng Kai-Wei Chang Daniel Kang Heng Ji KELM AAML 78 2 0 25 Feb 2025
OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering Jiahao Nick Li Zhuohao Jerry Zhang 104 1 0 24 Feb 2025
Quantifying Memorization and Retriever Performance in Retrieval-Augmented Vision-Language Models Peter Carragher Abhinand Jha R Raghav Kathleen M. Carley RALM 82 0 0 20 Feb 2025
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation Mohammad Mahdi Abootorabi Amirhosein Zobeiri Mahdi Dehghani Mohammadali Mohammadkhani Bardia Mohammadi Omid Ghahroodi M. Baghshah Ehsaneddin Asgari RALM 117 5 0 12 Feb 2025
Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge Daniel Tamayo Aitor Gonzalez-Agirre Javier Hernando Marta Villegas KELM 124 4 0 04 Feb 2025
ALoFTRAG: Automatic Local Fine Tuning for Retrieval Augmented Generation Peter Devine 52 0 0 21 Jan 2025
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation Manan Suri Puneet Mathur Franck Dernoncourt Kanika Goswami Ryan Rossi Dinesh Manocha 112 3 0 14 Dec 2024
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding Jaemin Cho Debanjan Mahata Ozan Irsoy Yujie He Joey Tianyi Zhou VLM 60 12 0 07 Nov 2024
Self-adaptive Multimodal Retrieval-Augmented Generation Wenjia Zhai VLM 49 0 0 15 Oct 2024
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models Wenbo Hu Jia-Chen Gu Zi-Yi Dou Mohsen Fayyaz Pan Lu Kai-Wei Chang Nanyun Peng VLM 69 5 0 10 Oct 2024
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents Junpeng Yue Xinru Xu Börje F. Karlsson Zongqing Lu 53 0 0 04 Oct 2024
Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation Liwen Sun James Zhao Megan Han Chenyan Xiong MedIm 72 8 0 21 Jul 2024
Enhancing Commentary Strategies for Imperfect Information Card Games: A Study of Large Language Models in Guandan Commentary Meiling Tao Xuechen Liang Ziyi Wang Yiling Tao Jianhui Wang Sun Li Tianyu Shi 57 1 0 23 Jun 2024
Demystifying Chains, Trees, and Graphs of Thoughts Maciej Besta Florim Memedi Zhenyu Zhang Robert Gerstenberger Guangyuan Piao ... Aleš Kubíček H. Niewiadomski Aidan O'Mahony Onur Mutlu Torsten Hoefler AI4CE LRM 85 27 0 25 Jan 2024
Retrieval-augmented Multi-modal Chain-of-Thoughts Reasoning for Large Language Models Bingshuai Liu Chenyang Lyu Zijun Min Zhanyu Wang Jinsong Su Longyue Wang LRM 44 7 0 04 Dec 2023
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers Cong Wei Yang Chen Haonan Chen Hexiang Hu Ge Zhang Jie Fu Alan Ritter Wenhu Chen 51 59 0 28 Nov 2023
MoqaGPT: Zero-Shot Multi-modal Open-domain Question Answering with Large Language Model Le Zhang Yihong Wu Fengran Mo Jian-Yun Nie Aishwarya Agrawal MLLM RALM 51 6 0 20 Oct 2023
MultiVENT: Multilingual Videos of Events with Aligned Natural Text Kate Sanders David Etter Reno Kriz Benjamin Van Durme VGen 68 7 0 06 Jul 2023
Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning Zhuolin Yang Ming-Yu Liu Zihan Liu V. Korthikanti Weili Nie ... Yuke Zhu Mohammad Shoeybi Bryan Catanzaro Chaowei Xiao Anima Anandkumar VLM RALM 41 39 0 09 Feb 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning Zhenfang Chen Qinhong Zhou Songlin Yang Yining Hong Hao Zhang Chuang Gan LRM VLM 50 36 0 12 Jan 2023
Enhancing Multi-modal and Multi-hop Question Answering via Structured Knowledge and Unified Retrieval-Generation Qian Yang Qian Chen Wen Wang Baotian Hu Min Zhang 49 25 0 16 Dec 2022
Retrieval-Augmented Multimodal Language Modeling Michihiro Yasunaga Armen Aghajanyan Weijia Shi Rich James J. Leskovec Percy Liang M. Lewis Luke Zettlemoyer Wen-tau Yih RALM 27 96 0 22 Nov 2022
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts Soravit Changpinyo P. Sharma Nan Ding Radu Soricut VLM 323 1,092 0 17 Feb 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA Luowei Zhou Hamid Palangi Lei Zhang Houdong Hu Jason J. Corso Jianfeng Gao MLLM VLM 278 929 0 24 Sep 2019