ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems

16 November 2023
Jon Saad-Falcon
Omar Khattab
Christopher Potts
Matei A. Zaharia
    RALM

Papers citing "ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems"

31 / 81 papers shown
Evaluation of RAG Metrics for Question Answering in the Telecom Domain
Sujoy Roychowdhury
Sumit Soman
H. G. Ranjani
Neeraj Gunda
Vansh Chhabra
Sai Krishna Bala
66
14
0
15 Jul 2024
Lynx: An Open Source Hallucination Evaluation Model
Selvan Sunitha Ravi
B. Mielczarek
Anand Kannappan
Douwe Kiela
Rebecca Qian
VLM
RALM
HILM
53
17
0
11 Jul 2024
Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)
K. Kenthapadi
M. Sameki
Ankur Taly
HILM
ELM
AILaw
39
12
0
10 Jul 2024
Towards Optimizing and Evaluating a Retrieval Augmented QA Chatbot using LLMs with Human in the Loop
Anum Afzal
Alexander Kowsik
Rajna Fani
Florian Matthes
52
6
0
08 Jul 2024
BERGEN: A Benchmarking Library for Retrieval-Augmented Generation
David Rau
Hervé Déjean
Nadezhda Chirkova
Thibault Formal
Shuai Wang
Vassilina Nikoulina
S. Clinchant
45
12
0
01 Jul 2024
When Search Engine Services meet Large Language Models: Visions and Challenges
Haoyi Xiong
Jiang Bian
Yuchen Li
Xuhong Li
Jundong Li
Shuaiqiang Wang
Dawei Yin
Sumi Helal
53
28
0
28 Jun 2024
Evaluating Quality of Answers for Retrieval-Augmented Generation: A Strong LLM Is All You Need
Yang Wang
Alberto Garcia Hernandez
Roman Kyslyi
Nicholas S. Kersting
36
3
0
26 Jun 2024
Evaluating RAG-Fusion with RAGElo: an Automated Elo-based Framework
Zackary Rackauckas
Arthur Camara
Jakub Zavrel
42
8
0
20 Jun 2024
Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation
Adam Fisch
Joshua Maynez
R. A. Hofer
Bhuwan Dhingra
Amir Globerson
William W. Cohen
44
8
0
06 Jun 2024
The Challenges of Evaluating LLM Applications: An Analysis of Automated, Human, and LLM-Based Approaches
Bhashithe Abeysinghe
Ruhan Circi
ELM
38
22
0
05 Jun 2024
Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost
Masha Belyi
Robert Friel
Shuai Shao
Atindriyo Sanyal
HILM
RALM
64
5
0
03 Jun 2024
CMDBench: A Benchmark for Coarse-to-fine Multimodal Data Discovery in Compound AI Systems
Yanlin Feng
Sajjadur Rahman
Aaron Feng
Vincent Chen
Eser Kandogan
48
4
0
02 Jun 2024
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
Gauthier Guinet
Behrooz Omidvar-Tehrani
Anoop Deoras
Laurent Callot
RALM
73
17
0
22 May 2024
Evaluation of Retrieval-Augmented Generation: A Survey
Hao Yu
Aoran Gan
Kai Zhang
Shiwei Tong
Qi Liu
Zhaofeng Liu
3DV
62
82
0
13 May 2024
Self-Improving Customer Review Response Generation Based on LLMs
Guy Azov
Tatiana Pelc
Adi Fledel Alon
Gila Kamhi
40
0
0
06 May 2024
On the Evaluation of Machine-Generated Reports
James Mayfield
Eugene Yang
Dawn J Lawrie
Sean MacAvaney
Paul McNamee
...
Orion Weller
Efsun Kayi
Kate Sanders
Marc Mason
Noah Hibbler
ALM
101
12
0
02 May 2024
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing
Yucheng Hu
Yuxing Lu
RALM
60
18
0
30 Apr 2024
GRAMMAR: Grounded and Modular Methodology for Assessment of Closed-Domain Retrieval-Augmented Language Model
Xinzhe Li
Ming Liu
Shang Gao
RALM
40
0
0
30 Apr 2024
InspectorRAGet: An Introspection Platform for RAG Evaluation
Kshitij P. Fadnis
Siva Sankalp Patel
O. Boni
Yannis Katsis
Sara Rosenthal
Benjamin Sznajder
Marina Danilevsky
40
2
0
26 Apr 2024
Evaluating Retrieval Quality in Retrieval-Augmented Generation
Alireza Salemi
Hamed Zamani
RALM
46
62
0
21 Apr 2024
Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation
Guanhua Chen
Wenhan Yu
Lei Sha
3DV
39
0
0
19 Apr 2024
A Survey on Retrieval-Augmented Text Generation for Large Language Models
Yizheng Huang
Jimmy X. Huang
3DV
RALM
66
46
0
17 Apr 2024
ClashEval: Quantifying the tug-of-war between an LLM's internal prior and external evidence
Kevin Wu
Eric Wu
James Zou
AAML
61
40
0
16 Apr 2024
AutoEval Done Right: Using Synthetic Data for Model Evaluation
Pierre Boyeau
Anastasios Nikolas Angelopoulos
N. Yosef
Jitendra Malik
Michael I. Jordan
SyDa
38
14
0
09 Mar 2024
Retrieval-Augmented Generation for AI-Generated Content: A Survey
Penghao Zhao
Hailin Zhang
Qinhan Yu
Zhengren Wang
Yunteng Geng
Fangcheng Fu
Ling Yang
Wentao Zhang
Jie Jiang
Bin Cui
3DV
121
228
0
29 Feb 2024
Prediction-Powered Ranking of Large Language Models
Ivi Chatzi
Eleni Straitouri
Suhas Thejaswi
Manuel Gomez Rodriguez
ALM
29
5
0
27 Feb 2024
RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering
Zihan Zhang
Meng Fang
Ling-Hao Chen
RALM
56
13
0
26 Feb 2024
Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering
Tobias Schimanski
Jingwei Ni
Mathias Kraus
Elliott Ash
Markus Leippold
29
4
0
13 Feb 2024
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models
Yuanjie Lyu
Zhiyu Li
Simin Niu
Feiyu Xiong
Bo Tang
Wenjin Wang
Hao Wu
Huan Liu
Tong Xu
Enhong Chen
RALM
44
32
0
30 Jan 2024
MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries
Yixuan Tang
Yi Yang
RALM
41
79
0
27 Jan 2024
Retrieval-Augmented Generation for Large Language Models: A Survey
Yunfan Gao
Yun Xiong
Xinyu Gao
Kangxiang Jia
Jinliu Pan
Yuxi Bi
Yi Dai
Jiawei Sun
Meng Wang
Haofen Wang
3DV
RALM
61
1,530
1
18 Dec 2023