1705.03551
Cited By
v1
v2 (latest)
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
9 May 2017
Mandar Joshi
Eunsol Choi
Daniel S. Weld
Luke Zettlemoyer
RALM
Papers citing
"TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension"
50 / 1,823 papers shown
Title
Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding
Trilok Padhi
R. Kaur
Adam D. Cobb
Manoj Acharya
Anirban Roy
Colin Samplawski
Brian Matejek
Alexander M. Berenbeim
Nathaniel D. Bastian
Susmit Jha
80
0
0
30 Apr 2025
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures
Miguel Nogales
Matteo Gambella
Manuel Roveri
107
0
0
29 Apr 2025
Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction
Yuxiao Chen
Haoran Li
Yuan Sui
Yi Liu
Yufei He
Yangqiu Song
Bryan Hooi
AAML
SILM
161
1
0
29 Apr 2025
Computational Reasoning of Large Language Models
Haitao Wu
Zongbo Han
Joey Tianyi Zhou
Huaxi Huang
Changqing Zhang
ELM
LRM
108
0
0
29 Apr 2025
UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities
Woongyeong Yeo
Kangsan Kim
Soyeong Jeong
Jinheon Baek
Sung Ju Hwang
153
1
0
29 Apr 2025
BrowseComp-ZH: Benchmarking Web Browsing Ability of Large Language Models in Chinese
Peilin Zhou
Bruce Leon
Xiang Ying
Chen Zhang
Yifan Shao
...
Sixin Hong
J. Ren
Jian Chen
Chao-Hong Liu
Yining Hua
RALM
ELM
LRM
138
5
0
27 Apr 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xuzhao Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
253
7
0
26 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
174
0
0
25 Apr 2025
HalluLens: LLM Hallucination Benchmark
Yejin Bang
Ziwei Ji
Alan Schelten
Anthony Hartshorn
Tara Fowler
Cheng Zhang
Nicola Cancedda
Pascale Fung
HILM
132
5
0
24 Apr 2025
QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining
Fengze Liu
Weidong Zhou
Binbin Liu
Zhimiao Yu
Yifan Zhang
...
Yifeng Yu
Bingni Zhang
Xiaohuan Zhou
Taifeng Wang
Yong Cao
134
1
0
23 Apr 2025
Credible plan-driven RAG method for Multi-hop Question Answering
Ningning Zhang
Chi Zhang
Zhizhong Tan
Xingxing Yang
Weiping Deng
Wenyong Wang
LRM
86
1
0
23 Apr 2025
MIRAGE: A Metric-Intensive Benchmark for Retrieval-Augmented Generation Evaluation
Chanhee Park
Hyeonseok Moon
Chanjun Park
Heuiseok Lim
RALM
109
1
0
23 Apr 2025
Compass-V2 Technical Report
Sophia Maria
MoE
LRM
120
0
0
22 Apr 2025
aiXamine: Simplified LLM Safety and Security
Fatih Deniz
Dorde Popovic
Yazan Boshmaf
Euisuh Jeong
M. Ahmad
Sanjay Chawla
Issa M. Khalil
ELM
343
0
0
21 Apr 2025
DualBreach: Efficient Dual-Jailbreaking via Target-Driven Initialization and Multi-Target Optimization
Xinzhe Huang
Kedong Xiu
T. Zheng
Churui Zeng
Wangze Ni
Zhan Qin
K. Ren
Chong Chen
AAML
55
0
0
21 Apr 2025
AlignRAG: Leveraging Critique Learning for Evidence-Sensitive Retrieval-Augmented Reasoning
Jiaqi Wei
Hao Zhou
Xiang Zhang
Di Zhang
Zijie Qiu
Wei Wei
Jinzhe Li
Wanli Ouyang
Siqi Sun
90
0
0
21 Apr 2025
Hypothetical Documents or Knowledge Leakage? Rethinking LLM-based Query Expansion
Yejun Yoon
Jaeyoon Jung
Seunghyun Yoon
Kunwoo Park
63
0
0
19 Apr 2025
Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results
Andrea Santilli
Adam Goliński
Michael Kirchhof
Federico Danieli
Arno Blaas
Miao Xiong
Luca Zappella
Sinead Williamson
71
3
0
18 Apr 2025
STAMP Your Content: Proving Dataset Membership via Watermarked Rephrasings
Saksham Rastogi
Pratyush Maini
Danish Pruthi
178
0
0
18 Apr 2025
FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents
Nandan Thakur
Jimmy J. Lin
Sam Havens
Michael Carbin
Omar Khattab
Andrew Drozdov
130
5
0
17 Apr 2025
ACoRN: Noise-Robust Abstractive Compression in Retrieval-Augmented Language Models
Singon Kim
Gunho Jung
Seong-Whan Lee
RALM
87
0
0
17 Apr 2025
Information Gain-Guided Causal Intervention for Autonomous Debiasing Large Language Models
Zhouhao Sun
Xiao Ding
Li Du
Yunpeng Xu
Yixuan Ma
Yang Zhao
Bing Qin
Ting Liu
93
0
0
17 Apr 2025
Shared Disk KV Cache Management for Efficient Multi-Instance Inference in RAG-Powered LLMs
Hyungwoo Lee
Kihyun Kim
Jinwoo Kim
Jungmin So
Myung-Hoon Cha
H. Kim
James J. Kim
Youngjae Kim
79
0
0
16 Apr 2025
BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents
Jason W. Wei
Zhiqing Sun
Spencer Papay
S. McKinney
Jeffrey Han
Isa Fulford
Hyung Won Chung
Alex Tachard Passos
W. Fedus
Amelia Glaese
72
19
0
16 Apr 2025
Can Pre-training Indicators Reliably Predict Fine-tuning Outcomes of LLMs?
Hansi Zeng
Kai Hui
Honglei Zhuang
Zhen Qin
Zhenrui Yue
Hamed Zamani
Dana Alon
63
0
0
16 Apr 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Ziwei Liu
Shenglong Ye
...
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
Wei Wang
MLLM
VLM
230
132
1
14 Apr 2025
DioR: Adaptive Cognitive Detection and Contextual Retrieval Optimization for Dynamic Retrieval-Augmented Generation
Hanghui Guo
Jia Zhu
Shimin Di
Weijie Shi
Zhangze Chen
Jiajie Xu
120
0
0
14 Apr 2025
Can the capability of Large Language Models be described by human ability? A Meta Study
Mingrui Zan
Yunquan Zhang
Boyang Zhang
Fangming Liu
Daning Cheng
ELM
LM&MA
84
1
0
13 Apr 2025
HalluShift: Measuring Distribution Shifts towards Hallucination Detection in LLMs
Sharanya Dasgupta
Sujoy Nath
Arkaprabha Basu
Pourya Shamsolmoali
Swagatam Das
HILM
95
1
0
13 Apr 2025
Alleviating the Fear of Losing Alignment in LLM Fine-tuning
Kang Yang
Guanhong Tao
X. Chen
Jun Xu
83
1
0
13 Apr 2025
HeteRAG: A Heterogeneous Retrieval-augmented Generation Framework with Decoupled Knowledge Representations
Peiru Yang
Xintian Li
Zhiyang Hu
Jiapeng Wang
Jinhua Yin
...
Lizhi He
Shuai Yang
Shangguang Wang
Yongfeng Huang
Tao Qi
88
1
0
12 Apr 2025
Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models
Haotian Ye
Himanshu Jain
Chong You
A. Suresh
Haowei Lin
James Zou
Felix X. Yu
67
0
0
12 Apr 2025
On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora
Kai-Wei Chang
Chung-Ming Chien
Yifan Peng
Haibin Wu
Yossi Adi
Emmanuel Dupoux
Hung-yi Lee
Karen Livescu
Shinji Watanabe
161
14
0
11 Apr 2025
TALE: A Tool-Augmented Framework for Reference-Free Evaluation of Large Language Models
Sher Badshah
Ali Emami
Hassan Sajjad
LLMAG
ELM
107
0
0
10 Apr 2025
Robust Hallucination Detection in LLMs via Adaptive Token Selection
Mengjia Niu
Hamed Haddadi
Guansong Pang
HILM
117
0
0
10 Apr 2025
Efficient Tuning of Large Language Models for Knowledge-Grounded Dialogue Generation
Bo Zhang
Hui Ma
Dailin Li
Jian Ding
Jian Wang
Bo Xu
Hongfei Lin
KELM
101
0
0
10 Apr 2025
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
Qingyang Zhang
Haitao Wu
Changqing Zhang
Peilin Zhao
Yatao Bian
ReLM
LRM
188
19
0
08 Apr 2025
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation
Biao Zhang
Fedor Moiseev
Joshua Ainslie
Paul Suganthan
Min Ma
Surya Bhupatiraju
Fede Lebron
Orhan Firat
Armand Joulin
Zhe Dong
AI4CE
57
0
0
08 Apr 2025
Knowledge-Instruct: Effective Continual Pre-training from Limited Data using Instructions
O. Ovadia
Meni Brief
Rachel Lemberg
Eitam Sheetrit
CLL
KELM
76
1
0
08 Apr 2025
Graph-based Approaches and Functionalities in Retrieval-Augmented Generation: A Comprehensive Survey
Zulun Zhu
Tiancheng Huang
Kai Wang
Junda Ye
Xiao Chen
Siqiang Luo
3DV
146
0
0
08 Apr 2025
PipeDec: Low-Latency Pipeline-based Inference with Dynamic Speculative Decoding towards Large-scale Models
Haofei Yin
Mengbai Xiao
Rouzhou Lu
Xiao Zhang
Dongxiao Yu
Guanghui Zhang
AI4CE
79
0
0
05 Apr 2025
Rethinking Reflection in Pre-Training
Essential AI
Darsh J Shah
Peter Rushton
Somanshu Singla
Mohit Parmar
...
Philip Monk
Platon Mazarakis
Ritvik Kapila
Saurabh Srivastava
Tim Romanski
ReLM
LRM
171
14
0
05 Apr 2025
Sigma: A dataset for text-to-code semantic parsing with statistical analysis
Saleh Almohaimeed
Shenyang Liu
May Alsofyani
Saad Almohaimeed
Liqiang Wang
117
0
0
05 Apr 2025
QE-RAG: A Robust Retrieval-Augmented Generation Benchmark for Query Entry Errors
Kepu Zhang
Zhongxiang Sun
Weijie Yu
Xiaoxue Zang
Kai Zheng
Yang Song
Han Li
Jun Xu
3DV
72
1
0
05 Apr 2025
VocalNet: Speech LLM with Multi-Token Prediction for Faster and High-Quality Generation
Yuhao Wang
Heyang Liu
Ziyang Cheng
Ronghua Wu
Qunshan Gu
Yanfeng Wang
Yu Wang
458
3
0
05 Apr 2025
Safe Screening Rules for Group OWL Models
Runxue Bao
Quanchao Lu
Yanfu Zhang
110
0
0
04 Apr 2025
Hallucination Detection on a Budget: Efficient Bayesian Estimation of Semantic Entropy
K. Ciosek
Nicolò Felicioni
Sina Ghiassian
115
0
0
04 Apr 2025
MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators
Beichen Huang
Yueming Yuan
Zelei Shao
Minjia Zhang
MQ
MoE
152
0
0
03 Apr 2025
HyperRAG: Enhancing Quality-Efficiency Tradeoffs in Retrieval-Augmented Generation with Reranker KV-Cache Reuse
Yuwei An
Yihua Cheng
Seo Jin Park
Junchen Jiang
96
1
0
03 Apr 2025
Adapting Large Language Models for Multi-Domain Retrieval-Augmented-Generation
Alexandre Misrahi
Nadezhda Chirkova
Maxime Louis
Vassilina Nikoulina
RALM
137
0
0
03 Apr 2025