Towards Lighter and Robust Evaluation for Retrieval Augmented Generation

20 March 2025

Alex-Razvan Ispas

Charles-Elie Simon

Papers citing "Towards Lighter and Robust Evaluation for Retrieval Augmented Generation"

GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework Hannah Sansford Nicholas Richardson Hermina Petric Maretic Juba Nait Saada 70 17 0 15 Jul 2024
Mixtral of Experts Albert Q. Jiang Alexandre Sablayrolles Antoine Roux A. Mensch Blanche Savary ... Théophile Gervet Thibaut Lavril Thomas Wang Timothée Lacroix William El Sayed MoE LLMAG 155 1,117 0 08 Jan 2024
FinanceBench: A New Benchmark for Financial Question Answering Pranab Islam Anand Kannappan Douwe Kiela Rebecca Qian Nino Scherrer Bertie Vidgen RALM 54 91 0 20 Nov 2023
Ragas: Automated Evaluation of Retrieval Augmented Generation ES Shahul Jithin James Luis Espinosa-Anke Steven Schockaert 123 195 0 26 Sep 2023
GPT-4 Technical Report OpenAI Josh Achiam Steven Adler Sandhini Agarwal Lama Ahmad ... Shengjia Zhao Tianhao Zheng Juntang Zhuang William Zhuk Barret Zoph LLMAG MLLM 1.5K 14,748 0 15 Mar 2023
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Jason W. Wei Xuezhi Wang Dale Schuurmans Maarten Bosma Brian Ichter F. Xia Ed H. Chi Quoc Le Denny Zhou LM&Ro LRM AI4CE ReLM 845 9,683 0 28 Jan 2022
TopiOCQA: Open-domain Conversational Question Answering with Topic Switching Vaibhav Adlakha Shehzaad Dhuliawala Kaheer Suleman H. D. Vries Siva Reddy BDL 88 91 0 02 Oct 2021
BERTScore: Evaluating Text Generation with BERT Tianyi Zhang Varsha Kishore Felix Wu Kilian Q. Weinberger Yoav Artzi 352 5,868 0 21 Apr 2019
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering Zhilin Yang Peng Qi Saizheng Zhang Yoshua Bengio William W. Cohen Ruslan Salakhutdinov Christopher D. Manning RALM 191 2,694 0 25 Sep 2018
SQuAD: 100,000+ Questions for Machine Comprehension of Text Pranav Rajpurkar Jian Zhang Konstantin Lopyrev Percy Liang RALM 316 8,169 0 16 Jun 2016