1705.03551
Cited By
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
9 May 2017
Mandar Joshi
Eunsol Choi
Daniel S. Weld
Luke Zettlemoyer
RALM
Papers citing
"TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension"
50 / 578 papers shown
Title
MergeBench: A Benchmark for Merging Domain-Specialized LLMs
Yifei He
Siqi Zeng
Yuzheng Hu
Rui Yang
Tong Zhang
Han Zhao
MoMe
ALM
24
0
0
16 May 2025
mmRAG: A Modular Benchmark for Retrieval-Augmented Generation over Text, Tables, and Knowledge Graphs
Chuan Xu
Qiaosheng Chen
Yutong Feng
Gong Cheng
RALM
3DV
VLM
36
0
0
16 May 2025
Semantic Caching of Contextual Summaries for Efficient Question-Answering with Language Models
Camille Couturier
Spyros Mastorakis
Haiying Shen
Saravan Rajmohan
Victor Rühle
KELM
17
0
0
16 May 2025
CL-RAG: Bridging the Gap in Retrieval-Augmented Generation with Curriculum Learning
S. Wang
L. Zhang
Zheren Fu
Zhendong Mao
27
0
0
15 May 2025
Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging
Hongjin Qian
Zhengyang Liang
RALM
LRM
38
0
0
14 May 2025
Why Uncertainty Estimation Methods Fall Short in RAG: An Axiomatic Analysis
Heydar Soudani
Evangelos Kanoulas
Faegheh Hasibi
36
0
0
12 May 2025
AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection
Kai Hua
Steven Wu
Ge Zhang
Ke Shen
LRM
28
0
0
12 May 2025
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation
Jiashuo Sun
Xianrui Zhong
Sizhe Zhou
Jiawei Han
RALM
31
0
0
12 May 2025
The Distracting Effect: Understanding Irrelevant Passages in RAG
Chen Amiraz
Florin Cuconasu
Simone Filice
Zohar Karnin
34
0
0
11 May 2025
Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation
Stefan Vasilev
Christian Herold
Baohao Liao
Seyyed Hadi Hashemi
Shahram Khadivi
Christof Monz
MU
197
0
0
09 May 2025
Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM
Zehao Fan
Garrett Gagnon
Zhenyu Liu
Liu Liu
29
0
0
09 May 2025
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
Hao Sun
Zile Qiao
Jiayan Guo
Xuanbo Fan
Yingyan Hou
Yong Jiang
Pengjun Xie
Yan Zhang
Fei Huang
Jingren Zhou
OffRL
64
2
0
07 May 2025
LLM-Independent Adaptive RAG: Let the Question Speak for Itself
Maria Marina
Nikolay Ivanov
Sergey Pletenev
Mikhail Salnikov
Daria Galimzianova
Nikita Krayko
Vasily Konovalov
Alexander Panchenko
Viktor Moskvoretskii
RALM
44
0
0
07 May 2025
A Reasoning-Focused Legal Retrieval Benchmark
Lucia Zheng
Neel Guha
Javokhir Arifov
Sarah Zhang
Michal Skreta
Christopher D. Manning
Peter Henderson
Daniel E. Ho
AILaw
RALM
ELM
102
3
0
06 May 2025
Rewriting Pre-Training Data Boosts LLM Performance in Math and Code
Kazuki Fujii
Yukito Tajima
Sakae Mizuki
Hinari Shimada
Taihei Shiotani
...
Kakeru Hattori
Youmi Ma
Hiroya Takamura
Rio Yokota
Naoaki Okazaki
SyDa
54
0
0
05 May 2025
Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering
Jihao Zhao
Chunlai Zhou
Biao Qin
55
0
0
05 May 2025
Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions
Yiming Du
Wenyu Huang
Danna Zheng
Zhaowei Wang
Sébastien Montella
Mirella Lapata
Kam-Fai Wong
Jeff Z. Pan
KELM
MU
86
2
0
01 May 2025
OET: Optimization-based prompt injection Evaluation Toolkit
Jinsheng Pan
Xiaogeng Liu
Chaowei Xiao
AAML
71
0
0
01 May 2025
EnronQA: Towards Personalized RAG over Private Documents
Michael J. Ryan
Danmei Xu
Chris Nivera
Daniel Campos
SILM
69
0
0
01 May 2025
Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding
Trilok Padhi
R. Kaur
Adam D. Cobb
Manoj Acharya
Anirban Roy
Colin Samplawski
Brian Matejek
Alexander M. Berenbeim
Nathaniel D. Bastian
Susmit Jha
28
0
0
30 Apr 2025
Computational Reasoning of Large Language Models
Haitao Wu
Zongbo Han
Joey Tianyi Zhou
Huaxi Huang
Changqing Zhang
ELM
LRM
62
0
0
29 Apr 2025
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures
Miguel Nogales
Matteo Gambella
Manuel Roveri
56
0
0
29 Apr 2025
Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction
Yuxiao Chen
Haoran Li
Yuan Sui
Yi Liu
Yufei He
Yangqiu Song
Bryan Hooi
AAML
SILM
63
0
0
29 Apr 2025
UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities
Woongyeong Yeo
Kangsan Kim
Soyeong Jeong
Jinheon Baek
Sung Ju Hwang
54
1
0
29 Apr 2025
BrowseComp-ZH: Benchmarking Web Browsing Ability of Large Language Models in Chinese
Peilin Zhou
Bruce Leon
Xiang Ying
C. Zhang
Yifan Shao
...
Sixin Hong
J. Ren
Jian Chen
Chao-Hong Liu
Yining Hua
RALM
ELM
LRM
50
0
0
27 Apr 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xuzhao Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
91
2
0
26 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
94
0
0
25 Apr 2025
HalluLens: LLM Hallucination Benchmark
Yejin Bang
Ziwei Ji
Alan Schelten
Anthony Hartshorn
Tara Fowler
Cheng Zhang
Nicola Cancedda
Pascale Fung
HILM
92
1
0
24 Apr 2025
QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining
Fengze Liu
Weidong Zhou
Binbin Liu
Zhimiao Yu
Yifan Zhang
...
Yifeng Yu
Bingni Zhang
Xiaohuan Zhou
Taifeng Wang
Yong Cao
66
1
0
23 Apr 2025
aiXamine: Simplified LLM Safety and Security
Fatih Deniz
Dorde Popovic
Yazan Boshmaf
Euisuh Jeong
M. Ahmad
Sanjay Chawla
Issa M. Khalil
ELM
80
0
0
21 Apr 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Z. Liu
Shenglong Ye
...
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
Wei Wang
MLLM
VLM
70
15
1
14 Apr 2025
DioR: Adaptive Cognitive Detection and Contextual Retrieval Optimization for Dynamic Retrieval-Augmented Generation
Hanghui Guo
Jia Zhu
Shimin Di
Weijie Shi
Zhangze Chen
Jiajie Xu
35
0
0
14 Apr 2025
Alleviating the Fear of Losing Alignment in LLM Fine-tuning
Kang Yang
Guanhong Tao
X. Chen
Jun Xu
36
0
0
13 Apr 2025
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
Qingyang Zhang
Haitao Wu
Changqing Zhang
Peilin Zhao
Yatao Bian
ReLM
LRM
87
3
0
08 Apr 2025
Sigma: A dataset for text-to-code semantic parsing with statistical analysis
Saleh Almohaimeed
Shenyang Liu
May Alsofyani
Saad Almohaimeed
Liqiang Wang
39
0
0
05 Apr 2025
VocalNet: Speech LLM with Multi-Token Prediction for Faster and High-Quality Generation
Yuhao Wang
Heyang Liu
Ziyang Cheng
Ronghua Wu
Qunshan Gu
Yanfeng Wang
Yu Wang
187
0
0
05 Apr 2025
Safe Screening Rules for Group OWL Models
Runxue Bao
Quanchao Lu
Yanfu Zhang
41
0
0
04 Apr 2025
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
Ziwei Ji
L. Yu
Yeskendir Koishekenov
Yejin Bang
Anthony Hartshorn
Alan Schelten
Cheng Zhang
Pascale Fung
Nicola Cancedda
53
1
0
18 Mar 2025
GPU-Accelerated Motion Planning of an Underactuated Forestry Crane in Cluttered Environments
M. Vu
Gerald Ebmer
Alexander Watcher
Marc-Philip Ecker
Giang Nguyen
Tobias Glueck
77
2
0
18 Mar 2025
RAGO: Systematic Performance Optimization for Retrieval-Augmented Generation Serving
Wenqi Jiang
Suvinay Subramanian
Cat Graves
Gustavo Alonso
Amir Yazdanbakhsh
Vidushi Dadu
49
6
0
18 Mar 2025
SuperBPE: Space Travel for Language Models
Alisa Liu
J. Hayase
Valentin Hofmann
Sewoong Oh
Noah A. Smith
Yejin Choi
51
3
0
17 Mar 2025
Key, Value, Compress: A Systematic Exploration of KV Cache Compression Techniques
Neusha Javidnia
B. Rouhani
F. Koushanfar
184
0
0
14 Mar 2025
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Bowen Jin
Hansi Zeng
Zhenrui Yue
Jinsung Yoon
Sercan Ö. Arik
Dong Wang
Hamed Zamani
J. Han
RALM
ReLM
KELM
OffRL
AI4TS
LRM
84
29
0
12 Mar 2025
SePer: Measure Retrieval Utility Through The Lens Of Semantic Perplexity Reduction
Lu Dai
Yijie Xu
Jinhui Ye
Hao Liu
Hui Xiong
3DV
RALM
86
2
0
03 Mar 2025
Self-Memory Alignment: Mitigating Factual Hallucinations with Generalized Improvement
Siyuan Zhang
Y. Zhang
Yinpeng Dong
Hang Su
HILM
KELM
230
0
0
26 Feb 2025
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Taishi Nakamura
Takuya Akiba
Kazuki Fujii
Yusuke Oda
Rio Yokota
Jun Suzuki
MoMe
MoE
94
1
0
26 Feb 2025
Monte Carlo Temperature: a robust sampling strategy for LLM's uncertainty quantification methods
Nicola Cecere
Andrea Bacciu
Ignacio Fernández Tobías
Amin Mantrach
66
1
0
25 Feb 2025
Correlating and Predicting Human Evaluations of Language Models from Natural Language Processing Benchmarks
Rylan Schaeffer
Punit Singh Koura
Binh Tang
R. Subramanian
Aaditya K. Singh
...
Vedanuj Goswami
Sergey Edunov
Dieuwke Hupkes
Sanmi Koyejo
Sharan Narang
ALM
71
0
0
24 Feb 2025
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home
Viktor Moskvoretskii
M. Lysyuk
Mikhail Salnikov
Nikolay Ivanov
Sergey Pletenev
Daria Galimzianova
Nikita Krayko
Vasily Konovalov
Irina Nikishina
Alexander Panchenko
RALM
76
4
0
24 Feb 2025
Wrong Answers Can Also Be Useful: PlausibleQA -- A Large-Scale QA Dataset with Answer Plausibility Scores
Jamshid Mozafari
Abdelrahman Abdallah
Bhawna Piryani
Adam Jatowt
47
0
0
22 Feb 2025