2310.03214
FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation
5 October 2023
Tu Vu
Mohit Iyyer
Xuezhi Wang
Noah Constant
Jerry W. Wei
Jason W. Wei
Chris Tar
Yun-hsuan Sung
Denny Zhou
Quoc Le
Thang Luong
KELM
HILM
LRM
Papers citing
"FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation"
50 / 149 papers shown
Title
XRAG: Cross-lingual Retrieval-Augmented Generation
Wei Liu
Sony Trenous
Leonardo F. R. Ribeiro
Bill Byrne
Felix Hieber
RALM
29
0
0
15 May 2025
Query-driven Document-level Scientific Evidence Extraction from Biomedical Studies
Massimiliano Pronesti
Joao Bettencourt-Silva
Paul Flanagan
Alessandra Pascale
Oisin Redmond
Anya Belz
Yufang Hou
38
0
0
09 May 2025
LLM-Independent Adaptive RAG: Let the Question Speak for Itself
Maria Marina
Nikolay Ivanov
Sergey Pletenev
Mikhail Salnikov
Daria Galimzianova
Nikita Krayko
Vasily Konovalov
Alexander Panchenko
Viktor Moskvoretskii
RALM
44
0
0
07 May 2025
BrowseComp-ZH: Benchmarking Web Browsing Ability of Large Language Models in Chinese
Peilin Zhou
Bruce Leon
Xiang Ying
C. Zhang
Yifan Shao
...
Sixin Hong
J. Ren
Jian Chen
Chao-Hong Liu
Yining Hua
RALM
ELM
LRM
50
0
0
27 Apr 2025
HalluLens: LLM Hallucination Benchmark
Yejin Bang
Ziwei Ji
Alan Schelten
Anthony Hartshorn
Tara Fowler
Cheng Zhang
Nicola Cancedda
Pascale Fung
HILM
92
1
0
24 Apr 2025
MIRAGE: A Metric-Intensive Benchmark for Retrieval-Augmented Generation Evaluation
Chanhee Park
Hyeonseok Moon
Chanjun Park
Heuiseok Lim
RALM
65
0
0
23 Apr 2025
Synergizing RAG and Reasoning: A Systematic Review
Yunfan Gao
Yun Xiong
Yijie Zhong
Yuxi Bi
Ming Xue
Haoyu Wang
LRM
AI4CE
138
2
0
22 Apr 2025
Retrieval Augmented Generation Evaluation in the Era of Large Language Models: A Comprehensive Survey
Aoran Gan
Hao Yu
Kai Zhang
Qi Liu
Wenyu Yan
Zhenya Huang
Shiwei Tong
Guoping Hu
RALM
3DV
43
0
0
21 Apr 2025
ZeroSumEval: Scaling LLM Evaluation with Inter-Model Competition
Haidar Khan
H. A. Alyahya
Yazeed Alnumay
M Saiful Bari
B. Yener
ELM
LRM
57
0
0
17 Apr 2025
Memorization vs. Reasoning: Updating LLMs with New Knowledge
Aochong Oliver Li
Tanya Goyal
KELM
50
1
0
16 Apr 2025
ToolRL: Reward is All Tool Learning Needs
Cheng Qian
Emre Can Acikgoz
Qi He
Hongru Wang
Xiusi Chen
Dilek Hakkani-Tur
Gokhan Tur
Heng Ji
OffRL
LRM
38
6
0
16 Apr 2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li
Mohammadreza Armandpour
Iman Mirzadeh
Sachin Mehta
Vaishaal Shankar
...
Samy Bengio
Oncel Tuzel
Mehrdad Farajtabar
Hadi Pouransari
Fartash Faghri
CLL
KELM
61
0
0
02 Apr 2025
The Illusionist's Prompt: Exposing the Factual Vulnerabilities of Large Language Models with Linguistic Nuances
Yining Wang
Yibo Wang
Xi Li
Mi Zhang
Geng Hong
Min Yang
AAML
HILM
70
0
0
01 Apr 2025
Encrypted Prompt: Securing LLM Applications Against Unauthorized Actions
Shih-Han Chan
AAML
60
0
0
29 Mar 2025
OAEI-LLM-T: A TBox Benchmark Dataset for Understanding Large Language Model Hallucinations in Ontology Matching
Zhangcheng Qiang
Kerry Taylor
Weiqing Wang
Jing Jiang
54
0
0
25 Mar 2025
Battling Misinformation: An Empirical Study on Adversarial Factuality in Open-Source Large Language Models
Shahnewaz Karim Sakib
Anindya Bijoy Das
Shibbir Ahmed
AAML
58
1
0
12 Mar 2025
DAFE: LLM-Based Evaluation Through Dynamic Arbitration for Free-Form Question-Answering
Sher Badshah
Hassan Sajjad
68
1
0
11 Mar 2025
ZeroSumEval: An Extensible Framework For Scaling LLM Evaluation with Inter-Model Competition
H. A. Alyahya
Haidar Khan
Yazeed Alnumay
M Saiful Bari
B. Yener
LRM
67
1
0
10 Mar 2025
Understanding the Limits of Lifelong Knowledge Editing in LLMs
Lukas Thede
Karsten Roth
Matthias Bethge
Zeynep Akata
Tom Hartvigsen
KELM
CLL
75
2
0
07 Mar 2025
The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems
Richard Ren
Arunim Agarwal
Mantas Mazeika
Cristina Menghini
Robert Vacareanu
...
Matias Geralnik
Adam Khoja
Dean Lee
Summer Yue
Dan Hendrycks
HILM
ALM
90
0
0
05 Mar 2025
SAFE: A Sparse Autoencoder-Based Framework for Robust Query Enrichment and Hallucination Mitigation in LLMs
Samir Abdaljalil
Filippo Pallucchini
Andrea Seveso
Hasan Kurban
Fabio Mercorio
Erchin Serpedin
HILM
77
0
0
04 Mar 2025
FactReasoner: A Probabilistic Approach to Long-Form Factuality Assessment for Large Language Models
Radu Marinescu
D. Bhattacharjya
Junkyu Lee
T. Tchrakian
Javier Carnerero-Cano
Yufang Hou
Elizabeth M. Daly
Alessandra Pascale
HILM
LRM
61
0
0
25 Feb 2025
The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
Zhenheng Tang
Xiang Liu
Qian Wang
Peijie Dong
Bingsheng He
Xiaowen Chu
Bo Li
LRM
61
1
0
24 Feb 2025
Language Model Re-rankers are Steered by Lexical Similarities
Lovisa Hagström
Ercong Nie
Ruben Halifa
Helmut Schmid
Richard Johansson
Alexander Junge
48
0
0
24 Feb 2025
How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild
Saad Obaid ul Islam
Anne Lauscher
Goran Glavas
HILM
LRM
122
1
0
21 Feb 2025
Hallucination Detection in Large Language Models with Metamorphic Relations
Borui Yang
Md Afif Al Mamun
Jie M. Zhang
Gias Uddin
HILM
64
0
0
20 Feb 2025
SMART: Self-Aware Agent for Tool Overuse Mitigation
Cheng Qian
Emre Can Acikgoz
H. Wang
Xiusi Chen
Avirup Sil
Dilek Hakkani-Tur
Gokhan Tur
Heng Ji
LLMAG
KELM
LRM
71
4
0
17 Feb 2025
Valuable Hallucinations: Realizable Non-realistic Propositions
Qiucheng Chen
Bo Wang
LRM
59
0
0
16 Feb 2025
C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Generation
Guoxin Chen
Minpeng Liao
Peiying Yu
Dingmin Wang
Zile Qiao
Chao Yang
Xin Zhao
Kai Fan
66
1
0
10 Feb 2025
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation
Satyapriya Krishna
Kalpesh Krishna
Anhad Mohananey
Steven Schwarcz
Adam Stambler
Shyam Upadhyay
Manaal Faruqui
ReLM
3DV
LRM
RALM
47
14
0
28 Jan 2025
Iterative Tree Analysis for Medical Critics
Zenan Huang
Mingwei Li
Zheng Zhou
Youxin Jiang
151
0
0
18 Jan 2025
Large Language Models, Knowledge Graphs and Search Engines: A Crossroads for Answering Users' Questions
Aidan Hogan
Xin Luna Dong
Denny Vrandečić
Gerhard Weikum
52
1
0
12 Jan 2025
Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models
Yanwen Huang
Yong Zhang
Ning Cheng
Zhitao Li
Shaojun Wang
Jing Xiao
88
0
0
02 Jan 2025
ComparisonQA: Evaluating Factuality Robustness of LLMs Through Knowledge Frequency Control and Uncertainty
Qing Zong
Zhaoxiang Wang
Tianshi Zheng
Xiyu Ren
Yangqiu Song
64
1
0
31 Dec 2024
Real-time Fake News from Adversarial Feedback
Sanxing Chen
Yukun Huang
Bhuwan Dhingra
39
0
0
31 Dec 2024
MINTQA: A Multi-Hop Question Answering Benchmark for Evaluating LLMs on New and Tail Knowledge
Jie He
Nan Hu
Wanqiu Long
Jiaoyan Chen
Jeff Z. Pan
ELM
LRM
99
6
0
22 Dec 2024
A Reality Check on Context Utilisation for Retrieval-Augmented Generation
Lovisa Hagström
Sara Vera Marjanović
Haeun Yu
Arnav Arora
Christina Lioma
Maria Maistro
Pepa Atanasova
Isabelle Augenstein
82
0
0
22 Dec 2024
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
Zhuoran Jin
Hongbang Yuan
Tianyi Men
Pengfei Cao
Yubo Chen
Kang Liu
Jun Zhao
ALM
82
7
0
18 Dec 2024
DMQR-RAG: Diverse Multi-Query Rewriting for RAG
Zhicong Li
Jiahao Wang
Zhishu Jiang
Hangyu Mao
Zhongxia Chen
Jiazhen Du
Yuanxing Zhang
Fuzheng Zhang
Di Zhang
Yong Liu
211
3
0
20 Nov 2024
RAGulator: Lightweight Out-of-Context Detectors for Grounded Text Generation
Ian Poey
Jiajun Liu
Qishuai Zhong
Adrien Chenailler
63
0
0
06 Nov 2024
VERITAS: A Unified Approach to Reliability Evaluation
Rajkumar Ramamurthy
Meghana Arakkal Rajeev
Oliver Molenschot
James Zou
Nazneen Rajani
HILM
55
1
0
05 Nov 2024
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
Yangning Li
Hai-Tao Zheng
Xinyu Wang
Yong Jiang
Zhen Zhang
...
Hui Wang
Hai-Tao Zheng
Pengjun Xie
Philip S. Yu
Fei Huang
65
16
0
05 Nov 2024
Simple Is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation
Mufei Li
Siqi Miao
Pan Li
RALM
38
8
0
28 Oct 2024
Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models
Qitan Lv
Jie Wang
Hanzhu Chen
Bin Li
Yongdong Zhang
Feng Wu
HILM
31
3
0
19 Oct 2024
Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning
Pengfei He
Zitao Li
Yue Xing
Yaling Li
Jiliang Tang
Bolin Ding
LLMAG
LRM
35
1
0
18 Oct 2024
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems
Nandan Thakur
Suleman Kazi
Ge Luo
Jimmy J. Lin
Amin Ahmad
VLM
RALM
28
7
0
17 Oct 2024
From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization
Catarina G. Belem
Pouya Pezeshkpour
Hayate Iso
Seiji Maekawa
Nikita Bhutani
Estevam R. Hruschka
HILM
73
1
0
17 Oct 2024
Beyond Graphs: Can Large Language Models Comprehend Hypergraphs?
Yifan Feng
Chengwu Yang
Xingliang Hou
S. Du
Shihui Ying
Zongze Wu
Yue Gao
32
3
0
14 Oct 2024
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
S. Yu
C. Tang
Bokai Xu
Junbo Cui
Junhao Ran
...
Zhenghao Liu
Shuo Wang
Xu Han
Zhiyuan Liu
Maosong Sun
VLM
39
23
0
14 Oct 2024
GameTraversalBenchmark: Evaluating Planning Abilities Of Large Language Models Through Traversing 2D Game Maps
Muhammad Umair Nasir
Steven D. James
Julian Togelius
ELM
LRM
34
2
0
10 Oct 2024