Title | Authors | Tags | Date
Can LLMs Be Trusted for Evaluating RAG Systems? A Survey of Methods and Datasets | Lorenz Brehme, Thomas Ströhle, Ruth Breu | - | 28 Apr 2025
Retrieval Augmented Generation Evaluation in the Era of Large Language Models: A Comprehensive Survey | Aoran Gan, Hao Yu, Kai Zhang, Qi Liu, Wenyu Yan, Zhenya Huang, Shiwei Tong, Guoping Hu | RALM, 3DV | 21 Apr 2025
Retrieval-Augmented Generation with Conflicting Evidence | Han Wang, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal | RALM | 17 Apr 2025
The Other Side of the Coin: Exploring Fairness in Retrieval-Augmented Generation | Z. Zhang, Ning Li, Qi Liu, Rui Li, W. Gao, Qingyang Mao, Zhenya Huang, Baosheng Yu, Dacheng Tao | RALM | 11 Apr 2025
RAGO: Systematic Performance Optimization for Retrieval-Augmented Generation Serving | Wenqi Jiang, Suvinay Subramanian, Cat Graves, Gustavo Alonso, Amir Yazdanbakhsh, Vidushi Dadu | - | 18 Mar 2025
Dialogue Benchmark Generation from Knowledge Graphs with Cost-Effective Retrieval-Augmented LLMs | Reham Omar, Omij Mangukiya, Essam Mansour | - | 20 Jan 2025
RAG-based Question Answering over Heterogeneous Data and Text | Philipp Christmann, G. Weikum | LMTD, RALM | 10 Dec 2024
Simple Is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation | Mufei Li, Siqi Miao, Pan Li | RALM | 28 Oct 2024
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems | Nandan Thakur, Suleman Kazi, Ge Luo, Jimmy J. Lin, Amin Ahmad | VLM, RALM | 17 Oct 2024
Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models | Fei Wang, Xingchen Wan, Ruoxi Sun, Jiefeng Chen, Sercan Ö. Arık | RALM | 09 Oct 2024
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents | Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, Yongfeng Zhang | AAML, LLMAG, ELM | 03 Oct 2024
MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants | Zeyu Zhang, Quanyu Dai, Luyu Chen, Zeren Jiang, Rui Li, Jieming Zhu, Xu Chen, Yi Xie, Zhenhua Dong, Ji-Rong Wen | LLMAG | 30 Sep 2024
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines | Dongzhi Jiang, Renrui Zhang, Ziyu Guo, Yanmin Wu, Jiayi Lei, ..., Guanglu Song, Peng Gao, Yu Liu, Chunyuan Li, Hongsheng Li | MLLM | 19 Sep 2024
Evaluation of RAG Metrics for Question Answering in the Telecom Domain | Sujoy Roychowdhury, Sumit Soman, H. G. Ranjani, Neeraj Gunda, Vansh Chhabra, Sai Krishna Bala | - | 15 Jul 2024
InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales | Zhepei Wei, Wei-Lin Chen, Yu Meng | RALM | 19 Jun 2024