ResearchTrend.AI
Large Language Models Struggle to Learn Long-Tail Knowledge

15 November 2022
Nikhil Kandpal
H. Deng
Adam Roberts
Eric Wallace
Colin Raffel
RALM, KELM

Papers citing "Large Language Models Struggle to Learn Long-Tail Knowledge"

50 / 260 papers shown
From Data to Knowledge: Evaluating How Efficiently Language Models Learn Facts
Daniel Christoph
Max Ploner
Patrick Haller
Alan Akbik
KELM
15
0
0
20 Jun 2025
Inter-Passage Verification for Multi-evidence Multi-answer QA
Bingsen Chen
Shengjie Wang
Xi Ye
Chen Zhao
RALM
35
0
0
31 May 2025
OntoRAG: Enhancing Question-Answering through Automated Ontology Derivation from Unstructured Knowledge Bases
Yash Tiwari
Owais Ahmad Lone
Mayukha Pal
36
0
0
31 May 2025
From Parameters to Prompts: Understanding and Mitigating the Factuality Gap between Fine-Tuned LLMs
Xuan Gong
Hanbo Huang
Shiyu Liang
39
0
0
29 May 2025
The Coverage Principle: A Framework for Understanding Compositional Generalization
Hoyeon Chang
Jinho Park
Hanseul Cho
Sohee Yang
Miyoung Ko
Hyeonbin Hwang
Seungpil Won
Dohaeng Lee
Youbin Ahn
Minjoon Seo
61
0
0
26 May 2025
GenKI: Enhancing Open-Domain Question Answering with Knowledge Integration and Controllable Generation in Large Language Models
Tingjia Shen
Hao Wang
Chuan Qin
Ruijun Sun
Yang Song
Defu Lian
Hengshu Zhu
Enhong Chen
57
0
0
26 May 2025
MA-RAG: Multi-Agent Retrieval-Augmented Generation via Collaborative Chain-of-Thought Reasoning
Thang Nguyen
Peter Chin
Yu-Wing Tai
LRM
80
1
0
26 May 2025
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Minki Kang
Jongwon Jeong
Seanie Lee
Jaewoong Cho
Sung Ju Hwang
LRM
271
2
0
23 May 2025
Data Mixing Can Induce Phase Transitions in Knowledge Acquisition
Xinran Gu
Kaifeng Lyu
Jiazheng Li
Jingzhao Zhang
83
0
0
23 May 2025
Diagnosing our datasets: How does my language model learn clinical information?
Furong Jia
David Sontag
Monica Agrawal
LM&MA
212
1
0
21 May 2025
Enhancing LLMs via High-Knowledge Data Selection
Feiyu Duan
Xuemiao Zhang
Sirui Wang
Haoran Que
Yuqi Liu
Wenge Rong
Xunliang Cai
237
0
0
20 May 2025
GAP: Graph-Assisted Prompts for Dialogue-based Medication Recommendation
Jialun Zhong
Yanzeng Li
Sen Hu
Yang Zhang
Teng Xu
Lei Zou
LM&MA
95
0
0
19 May 2025
Emergent Specialization: Rare Token Neurons in Language Models
Jing Liu
Haozheng Wang
Yueheng Li
MILM, LRM
70
0
0
19 May 2025
From n-gram to Attention: How Model Architectures Learn and Propagate Bias in Language Modeling
Mohsinul Kabir
Tasfia Tahsin
Sophia Ananiadou
KELM, AI4CE
59
0
0
18 May 2025
Learning Auxiliary Tasks Improves Reference-Free Hallucination Detection in Open-Domain Long-Form Generation
Chengwei Qin
Wenxuan Zhou
Karthik Abinav Sankararaman
Nanshu Wang
Tengyu Xu
...
Aditya Tayade
Sinong Wang
Shafiq Joty
Han Fang
Hao Ma
HILM, LRM
103
0
0
18 May 2025
CL-RAG: Bridging the Gap in Retrieval-Augmented Generation with Curriculum Learning
S. Wang
Li Zhang
Zheren Fu
Zhendong Mao
51
0
0
15 May 2025
IterKey: Iterative Keyword Generation with LLMs for Enhanced Retrieval Augmented Generation
Kazuki Hayashi
Hidetaka Kamigaito
Shinya Kouda
Taro Watanabe
RALM
106
1
0
13 May 2025
Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs
Chetan Pathade
AAML, SILM
223
2
0
07 May 2025
Enhancing LLMs' Clinical Reasoning with Real-World Data from a Nationwide Sepsis Registry
J. Kim
Chaeeun Shim
Sungjin Park
Su Yeon Lee
Gee Young Suh
...
Yong Soo Kim
Hee-Joon Bae
Sung Yoon Lim
Han-Gil Jeong
Edward Choi
LRM
111
1
0
05 May 2025
CHORUS: Zero-shot Hierarchical Retrieval and Orchestration for Generating Linear Programming Code
Tasnim Ahmed
Salimur Choudhury
56
0
0
02 May 2025
EnronQA: Towards Personalized RAG over Private Documents
Michael J. Ryan
Danmei Xu
Chris Nivera
Daniel Campos
SILM
136
2
0
01 May 2025
Enhancing Privacy-Utility Trade-offs to Mitigate Memorization in Diffusion Models
Chong Chen
Daochang Liu
M. Shah
Chang Xu
121
1
0
25 Apr 2025
HalluLens: LLM Hallucination Benchmark
Yejin Bang
Ziwei Ji
Alan Schelten
Anthony Hartshorn
Tara Fowler
Cheng Zhang
Nicola Cancedda
Pascale Fung
HILM
132
5
0
24 Apr 2025
FinDER: Financial Dataset for Question Answering and Evaluating Retrieval-Augmented Generation
Chanyeol Choi
Jihoon Kwon
Jaeseon Ha
Hojun Choi
Chaewoon Kim
Yongjae Lee
Jy-yong Sohn
Alejandro Lopez-Lira
RALM
199
1
0
22 Apr 2025
CoLoTa: A Dataset for Entity-based Commonsense Reasoning over Long-Tail Knowledge
Armin Toroghi
Willis Guo
Scott Sanner
RALM, LRM
70
0
0
20 Apr 2025
Hypothetical Documents or Knowledge Leakage? Rethinking LLM-based Query Expansion
Yejun Yoon
Jaeyoon Jung
Seunghyun Yoon
Kunwoo Park
63
0
0
19 Apr 2025
FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents
Nandan Thakur
Jimmy J. Lin
Sam Havens
Michael Carbin
Omar Khattab
Andrew Drozdov
124
5
0
17 Apr 2025
The Other Side of the Coin: Exploring Fairness in Retrieval-Augmented Generation
Zhenru Zhang
Ning Li
Qi Liu
Rui Li
W. Gao
Qingyang Mao
Zhenya Huang
Baosheng Yu
Dacheng Tao
RALM
104
0
0
11 Apr 2025
Efficient Tuning of Large Language Models for Knowledge-Grounded Dialogue Generation
Bo Zhang
Hui Ma
Dailin Li
Jian Ding
Jian Wang
Bo Xu
Hongfei Lin
KELM
99
0
0
10 Apr 2025
A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility
Andreas Hochlehnert
Hardik Bhatnagar
Vishaal Udandarao
Samuel Albanie
Ameya Prabhu
Matthias Bethge
ReLM, ALM, LRM
237
26
0
09 Apr 2025
Retrieval Augmented Generation with Collaborative Filtering for Personalized Text Generation
Teng Shi
Jun Xu
Xiao Zhang
Xiaoxue Zang
Kai Zheng
Yang Song
Han Li
RALM, 3DV
88
0
0
08 Apr 2025
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models
Minki Kang
Jongwon Jeong
Jaewoong Cho
ALM, LRM
116
4
0
07 Apr 2025
How Deep Do Large Language Models Internalize Scientific Literature and Citation Practices?
Andres Algaba
Vincent Holst
Floriano Tori
Melika Mobini
Brecht Verbeken
Sylvia Wenmackers
Vincent Ginis
123
1
0
03 Apr 2025
KSHSeek: Data-Driven Approaches to Mitigating and Detecting Knowledge-Shortcut Hallucinations in Generative Models
Zhenting Wang
Zhongxin Liu
Ying Li
Hongyu Sun
Meng Xu
Yuqing Zhang
HILM
96
0
0
25 Mar 2025
Fact-checking AI-generated news reports: Can LLMs catch their own lies?
Jiayi Yao
Haibo Sun
Nianwen Xue
HILM
78
0
0
24 Mar 2025
Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization
Zefeng Zhang
Hengzhu Tang
Shuaiyi Nie
Zhenyu Zhang
Yiming Ren
Zhenyang Li
Dawei Yin
Duohe Ma
Tingwen Liu
117
1
0
23 Mar 2025
From Idea to Implementation: Evaluating the Influence of Large Language Models in Software Development -- An Opinion Paper
Sargam Yadav
Asifa Mehmood Qureshi
Abhishek Kaushik
Shubham Sharma
Roisin Loughran
...
Nikhil Singh
Padraic O'Hara
Pranay Jaiswal
Roshan Chandru
David Lillis
157
1
0
10 Mar 2025
Robust Data Watermarking in Language Models by Injecting Fictitious Knowledge
Xinyue Cui
Johnny Tian-Zheng Wei
Swabha Swayamdipta
Robin Jia
WaLM
146
2
0
06 Mar 2025
Forecasting Rare Language Model Behaviors
Erik Jones
Meg Tong
Jesse Mu
Mohammed Mahfoud
Jan Leike
Roger C. Grosse
Jared Kaplan
William Fithian
Ethan Perez
Mrinank Sharma
99
1
0
24 Feb 2025
Deep Minimax Classifiers for Imbalanced Datasets with a Small Number of Minority Samples
Hansung Choi
Daewon Seo
75
0
0
24 Feb 2025
Swallowing the Poison Pills: Insights from Vulnerability Disparity Among LLMs
Peng Yifeng
Wu Zhizheng
Chen Chen
AAML
76
0
0
23 Feb 2025
Interrogating LLM design under a fair learning doctrine
Johnny Tian-Zheng Wei
Maggie Wang
Ameya Godbole
Jonathan H. Choi
Robin Jia
123
0
0
22 Feb 2025
KVLink: Accelerating Large Language Models via Efficient KV Cache Reuse
Jingbo Yang
Bairu Hou
Wei Wei
Yujia Bao
Shiyu Chang
VLM
188
3
0
21 Feb 2025
Is This Collection Worth My LLM's Time? Automatically Measuring Information Potential in Text Corpora
Tristan Karch
Luca Engel
Philippe Schwaller
Frédéric Kaplan
160
0
0
19 Feb 2025
IAO Prompting: Making Knowledge Flow Explicit in LLMs through Structured Reasoning Templates
Aissatou Diallo
Antonis Bikakis
Luke Dickens
Anthony Hunter
Rob Miller
LRM
104
0
0
05 Feb 2025
LLM-powered Multi-agent Framework for Goal-oriented Learning in Intelligent Tutoring System
Tianfu Wang
Yi Zhan
Jianxun Lian
Zhengyu Hu
N. Yuan
Qi Zhang
Xing Xie
Hui Xiong
82
3
0
28 Jan 2025
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
Alexis Huet
Zied Ben-Houidi
Dario Rossi
LLMAG
83
2
0
21 Jan 2025
Large Language Models, Knowledge Graphs and Search Engines: A Crossroads for Answering Users' Questions
Aidan Hogan
Xin Luna Dong
Denny Vrandečić
Gerhard Weikum
121
5
0
12 Jan 2025
QuIM-RAG: Advancing Retrieval-Augmented Generation with Inverted Question Matching for Enhanced QA Performance
Binita Saha
Utsha Saha
Muhammad Zubair Malik
RALM, 3DV
92
6
0
06 Jan 2025
Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks
Shengbin Yue
Siyuan Wang
Wei Chen
Xuanjing Huang
Zhongyu Wei
LLMAG
163
11
0
03 Jan 2025