arXiv: 2309.01431
Benchmarking Large Language Models in Retrieval-Augmented Generation


4 September 2023
Jiawei Chen
Hongyu Lin
Xianpei Han
Le Sun
    3DV
    RALM

Papers citing "Benchmarking Large Language Models in Retrieval-Augmented Generation"

45 / 45 papers shown
PoisonArena: Uncovering Competing Poisoning Attacks in Retrieval-Augmented Generation
Liuji Chen
Xiaofang Yang
Yuanzhuo Lu
Jinghao Zhang
Xin Sun
Qiang Liu
Shu Wu
Jing Dong
Liang Wang
AAML
2
0
0
18 May 2025
mmRAG: A Modular Benchmark for Retrieval-Augmented Generation over Text, Tables, and Knowledge Graphs
Chuan Xu
Qiaosheng Chen
Yutong Feng
Gong Cheng
RALM
3DV
VLM
36
0
0
16 May 2025
Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency
Adel Ammar
Anis Koubaa
Omer Nacar
W. Boulila
RALM
3DV
40
0
0
13 May 2025
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation
Jiashuo Sun
Xianrui Zhong
Sizhe Zhou
Jiawei Han
RALM
31
0
0
12 May 2025
Towards Requirements Engineering for RAG Systems
Tor Sporsem
Rasmus Ulfsnes
26
0
0
12 May 2025
Defending against Indirect Prompt Injection by Instruction Detection
Tongyu Wen
Chenglong Wang
Xiyuan Yang
Haoyu Tang
Yueqi Xie
Lingjuan Lyu
Zhicheng Dou
Fangzhao Wu
AAML
34
0
0
08 May 2025
Retrieval Augmented Generation Evaluation for Health Documents
Mario Ceresa
Lorenzo Bertolini
Valentin Comte
Nicholas Spadaro
Barbara Raffael
...
Sergio Consoli
Amalia Muñoz Piñeiro
Alex Patak
Maddalena Querci
Tobias Wiesenthal
RALM
3DV
39
0
1
07 May 2025
CDE-Mapper: Using Retrieval-Augmented Language Models for Linking Clinical Data Elements to Controlled Vocabularies
Komal Gilani
Marlo Verket
Christof Peters
Michel Dumontier
Hans-Peter Brunner-La Rocca
V. Urovi
40
0
0
07 May 2025
Traceback of Poisoning Attacks to Retrieval-Augmented Generation
Baolei Zhang
Haoran Xin
Minghong Fang
Zhuqing Liu
Biao Yi
Tong Li
Zheli Liu
SILM
AAML
66
0
0
30 Apr 2025
Can LLMs Be Trusted for Evaluating RAG Systems? A Survey of Methods and Datasets
Lorenz Brehme
Thomas Ströhle
Ruth Breu
65
0
0
28 Apr 2025
The Viability of Crowdsourcing for RAG Evaluation
Lukas Gienapp
Tim Hagen
Maik Fröbe
Matthias Hagen
Benno Stein
Martin Potthast
Harrisen Scells
23
0
0
22 Apr 2025
TPU-Gen: LLM-Driven Custom Tensor Processing Unit Generator
Deepak Vungarala
Mohammed E. Elbtity
Sumiya Syed
Sakila Alam
Kartik Pandit
Arnob Ghosh
Ramtin Zand
Shaahin Angizi
34
1
0
07 Mar 2025
Model-Based Offline Reinforcement Learning with Reliability-Guaranteed Sequence Modeling
Shenghong He
OffRL
183
0
0
10 Feb 2025
RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems
Robert Friel
Masha Belyi
Atindriyo Sanyal
82
19
0
17 Jan 2025
Large Language Models, Knowledge Graphs and Search Engines: A Crossroads for Answering Users' Questions
Aidan Hogan
Xin Luna Dong
Denny Vrandečić
Gerhard Weikum
52
1
0
12 Jan 2025
Rango: Adaptive Retrieval-Augmented Proving for Automated Software Verification
Kyle Thompson
Nuno Saavedra
Pedro Carrott
Kevin Fisher
Alex Sanchez-Stern
Yuriy Brun
J. Ferreira
Sorin Lerner
E. First
LRM
100
1
0
18 Dec 2024
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems
Nandan Thakur
Suleman Kazi
Ge Luo
Jimmy J. Lin
Amin Ahmad
VLM
RALM
28
7
0
17 Oct 2024
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
Xinze Li
Sen Mei
Zhenghao Liu
Yukun Yan
Shuo Wang
...
Huimin Chen
Ge Yu
Zhiyuan Liu
Maosong Sun
Chenyan Xiong
50
7
0
17 Oct 2024
G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks
Guibin Zhang
Xinfeng Li
Xiangguo Sun
Guancheng Wan
Miao Yu
Junfeng Fang
Kun Wang
Dawei Cheng
AAML
AI4CE
51
7
0
15 Oct 2024
TurboRAG: Accelerating Retrieval-Augmented Generation with Precomputed KV Caches for Chunked Text
Songshuo Lu
Hua Wang
Yutian Rong
Zhi Chen
Yaohua Tang
VLM
31
14
0
10 Oct 2024
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
Yifei Ming
Senthil Purushwalkam
Shrey Pandit
Zixuan Ke
Xuan-Phi Nguyen
Caiming Xiong
Chenyu You
HILM
112
16
0
30 Sep 2024
Embodied-RAG: General Non-parametric Embodied Memory for Retrieval and Generation
Quanting Xie
So Yeon Min
Tianyi Zhang
Kedi Xu
Aarav Bajaj
Ruslan Salakhutdinov
Matthew Johnson-Roberson
Yonatan Bisk
LM&Ro
55
7
0
26 Sep 2024
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines
Dongzhi Jiang
Renrui Zhang
Ziyu Guo
Yanmin Wu
Jiayi Lei
...
Guanglu Song
Peng Gao
Yu Liu
Chunyuan Li
Hongsheng Li
MLLM
32
16
0
19 Sep 2024
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
Jaehong Cho
Minsu Kim
Hyunmin Choi
Guseul Heo
Jongse Park
38
9
0
10 Aug 2024
High-Throughput Phenotyping of Clinical Text Using Large Language Models
D. B. Hier
S. I. Munzir
Anne Stahlfeld
Tayo Obafemi-Ajayi
M. Carrithers
LM&MA
53
1
0
02 Aug 2024
PersLLM: A Personified Training Approach for Large Language Models
Zheni Zeng
Jiayi Chen
Huimin Chen
Yukun Yan
Yuxuan Chen
Zhenghao Liu
Zhiyuan Liu
Maosong Sun
LLMAG
49
2
0
17 Jul 2024
Better RAG using Relevant Information Gain
Marc Pickett
Jeremy Hartman
Ayan Kumar Bhowmick
Raquib-ul Alam
Aditya Vempaty
RALM
40
3
0
16 Jul 2024
A Chatbot for Asylum-Seeking Migrants in Europe
Bettina Fazzinga
Elena Palmieri
Margherita Vestoso
Luca Bolognini
Andrea Galassi
F. Furfaro
Paolo Torroni
17
1
0
12 Jul 2024
Mobile Edge Intelligence for Large Language Models: A Contemporary Survey
Guanqiao Qu
Qiyuan Chen
Wei Wei
Zheng Lin
Xianhao Chen
Kaibin Huang
42
43
0
09 Jul 2024
A Tale of Trust and Accuracy: Base vs. Instruct LLMs in RAG Systems
Florin Cuconasu
Giovanni Trappolini
Nicola Tonellotto
Fabrizio Silvestri
53
2
0
21 Jun 2024
SEC-QA: A Systematic Evaluation Corpus for Financial QA
Viet Dac Lai
Michael Krumdick
Charles Lovering
Varshini Reddy
Craig W. Schmidt
Chris Tanner
56
3
0
20 Jun 2024
HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits
Tim Franzmeyer
Aleksandar Shtedritski
Samuel Albanie
Philip Torr
João F. Henriques
Jakob N. Foerster
32
1
0
05 Jun 2024
Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools
Varun Magesh
Faiz Surani
Matthew Dahl
Mirac Suzgun
Christopher D. Manning
Daniel E. Ho
HILM
ELM
AILaw
27
66
0
30 May 2024
Evaluating the External and Parametric Knowledge Fusion of Large Language Models
Hao Zhang
Yuyang Zhang
Xiaoguang Li
Wenxuan Shi
Haonan Xu
...
Yasheng Wang
Lifeng Shang
Qun Liu
Yong-jin Liu
Ruiming Tang
KELM
41
4
0
29 May 2024
Evaluation of Retrieval-Augmented Generation: A Survey
Hao Yu
Aoran Gan
Kai Zhang
Shiwei Tong
Qi Liu
Zhaofeng Liu
3DV
62
82
0
13 May 2024
ClashEval: Quantifying the tug-of-war between an LLM's internal prior and external evidence
Kevin Wu
Eric Wu
James Zou
AAML
61
40
0
16 Apr 2024
Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning
Teo Susnjak
Peter Hwang
N. Reyes
A. Barczak
Timothy R. McIntosh
Surangika Ranathunga
70
22
0
08 Apr 2024
Dialectical Alignment: Resolving the Tension of 3H and Security Threats of LLMs
Shu Yang
Jiayuan Su
Han Jiang
Mengdi Li
Keyuan Cheng
Muhammad Asif Ali
Lijie Hu
Di Wang
37
5
0
30 Mar 2024
Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines
Lijia Ma
Xingchen Xu
Yong-Ming Tan
32
7
0
29 Feb 2024
DevBots can co-design APIs
Vinicius Soares Silva Marques
21
0
0
10 Dec 2023
Piecing Together Clues: A Benchmark for Evaluating the Detective Skills of Large Language Models
Zhouhong Gu
Lin Zhang
Jiangjie Chen
Haoning Ye
Xiaoxuan Zhu
...
Jianchen Wang
Yikai Zhang
Wenhao Huang
Yanghua Xiao
Hongwei Feng
RALM
ELM
34
0
0
11 Jul 2023
Rethinking with Retrieval: Faithful Large Language Model Inference
Hangfeng He
Hongming Zhang
Dan Roth
KELM
LRM
141
158
0
31 Dec 2022
Compositional Semantic Parsing with Large Language Models
Andrew Drozdov
Nathanael Schärli
Ekin Akyürek
Nathan Scales
Xinying Song
Xinyun Chen
Olivier Bousquet
Denny Zhou
ReLM
LRM
200
92
0
29 Sep 2022
Factual Error Correction for Abstractive Summarization Models
Mengyao Cao
Yue Dong
Jiapeng Wu
Jackie C.K. Cheung
HILM
KELM
169
159
0
17 Oct 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018