arXiv:2308.14508
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
28 August 2023
Yushi Bai
Xin Lv
Jiajie Zhang
Hong Lyu
Jiankai Tang
Zhidian Huang
Zhengxiao Du
Xiao Liu
Aohan Zeng
Lei Hou
Yuxiao Dong
Jie Tang
Juanzi Li
LLMAG
RALM
Papers citing
"LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding"
50 / 87 papers shown
Lookahead Q-Cache: Achieving More Consistent KV Cache Eviction via Pseudo Query
Yixuan Wang
Shiyu Ji
Yijun Liu
Yuzhuang Xu
Yang Xu
Qingfu Zhu
Wanxiang Che
13
0
0
24 May 2025
Lost in the Haystack: Smaller Needles are More Difficult for LLMs to Find
Owen Bianchi
Mathew J. Koretsky
Maya Willey
Chelsea X. Alvarado
Tanay Nayak
Adi Asija
Nicole Kuznetsov
M. Nalls
F. Faghri
Daniel Khashabi
22
0
0
23 May 2025
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Fanqi Wan
Weizhou Shen
Shengyi Liao
Yingcheng Shi
Chenliang Li
Ziyi Yang
Ji Zhang
Fei Huang
Jingren Zhou
Ming Yan
OffRL
LLMAG
ReLM
LRM
25
0
0
23 May 2025
SELF: Self-Extend the Context Length With Logistic Growth Function
Phat Thanh Dang
Saahil Thoppay
Wang Yang
Qifan Wang
Vipin Chaudhary
Xiaotian Han
45
0
0
22 May 2025
LongMagpie: A Self-synthesis Method for Generating Large-scale Long-context Instructions
Chaochen Gao
Xing Wu
Zijia Lin
Debing Zhang
Songlin Hu
SyDa
54
0
0
22 May 2025
MacRAG: Compress, Slice, and Scale-up for Multi-Scale Adaptive Context RAG
Woosang Lim
Zekun Li
Gyuwan Kim
Sungyoung Ji
HyeonJung Kim
Kyuri Choi
Jin Hyuk Lim
K. Park
William Yang Wang
49
0
0
10 May 2025
FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension
Jushi Kai
Boyi Zeng
Yansen Wang
Haoli Bai
Ziwei He
Bo Jiang
Zhouhan Lin
54
0
0
01 May 2025
Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions
Yiming Du
Wenyu Huang
Danna Zheng
Zhaowei Wang
Sébastien Montella
Mirella Lapata
Kam-Fai Wong
Jeff Z. Pan
KELM
MU
137
3
0
01 May 2025
KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments
Junyoung Park
Dalton Jones
Matthew J Morse
Raghavv Goel
Mingu Lee
Chris Lott
44
0
0
21 Apr 2025
Sequential-NIAH: A Needle-In-A-Haystack Benchmark for Extracting Sequential Needles from Long Contexts
Yifei Yu
Qian Zhang
Lingfeng Qiao
Di Yin
Fang Li
Jie Wang
Zheyu Chen
Suncong Zheng
Xiaolong Liang
Xingwu Sun
59
0
0
07 Apr 2025
Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design
Mohan Zhang
Pingzhi Li
Jie Peng
Mufan Qiu
Tianlong Chen
MoE
103
0
0
02 Apr 2025
Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence
Yijiong Yu
LRM
AIMat
99
1
0
26 Mar 2025
WindowKV: Task-Adaptive Group-Wise KV Cache Window Selection for Efficient LLM Inference
Youhui Zuo
Sibo Wei
C. Zhang
Zhuorui Liu
Wenpeng Lu
Dawei Song
VLM
85
0
0
23 Mar 2025
TROVE: A Challenge for Fine-Grained Text Provenance via Source Sentence Tracing and Relationship Classification
Junnan Zhu
Min Xiao
Yining Wang
Feifei Zhai
Yu Zhou
Chengqing Zong
91
0
0
19 Mar 2025
GPU-Accelerated Motion Planning of an Underactuated Forestry Crane in Cluttered Environments
M. Vu
Gerald Ebmer
Alexander Watcher
Marc-Philip Ecker
Giang Nguyen
Tobias Glueck
88
3
0
18 Mar 2025
CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning
Hao Cui
Zahra Shamsi
Gowoon Cheon
Xuejian Ma
Shutong Li
...
Eun-Ah Kim
M. Brenner
Viren Jain
Sameera Ponda
Subhashini Venugopalan
ELM
LRM
72
2
0
14 Mar 2025
Key, Value, Compress: A Systematic Exploration of KV Cache Compression Techniques
Neusha Javidnia
B. Rouhani
F. Koushanfar
381
0
0
14 Mar 2025
MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System
Jihao Zhao
Zhiyuan Ji
Zhaoxin Fan
Hanyu Wang
Pengnian Qi
Simin Niu
Feiyu Xiong
Zhiyu Li
105
0
0
12 Mar 2025
Predicting Team Performance from Communications in Simulated Search-and-Rescue
Ali Jalal-Kamali
Nikolos Gurney
David Pynadath
AI4TS
127
14
0
05 Mar 2025
Neural Attention Search
Difan Deng
Marius Lindauer
108
0
0
21 Feb 2025
Mitigating Lost-in-Retrieval Problems in Retrieval Augmented Multi-Hop Question Answering
Rongzhi Zhu
Xiangyu Liu
Zequn Sun
Yiwei Wang
Wei Hu
RALM
KELM
LRM
132
2
0
20 Feb 2025
FairKV: Balancing Per-Head KV Cache for Fast Multi-GPU Inference
Bingzhe Zhao
Ke Cheng
Aomufei Yuan
Yuxuan Tian
Ruiguang Zhong
Chengchen Hu
Tong Yang
Lian Yu
64
0
0
19 Feb 2025
MoM: Linear Sequence Modeling with Mixture-of-Memories
Jusen Du
Weigao Sun
Disen Lan
Jiaxi Hu
Yu Cheng
KELM
98
3
0
19 Feb 2025
PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
Jiaqi Zhao
Miao Zhang
Ming Wang
Yuzhang Shang
Kaihao Zhang
Weili Guan
Yaowei Wang
Min Zhang
MQ
80
0
0
18 Feb 2025
LongFaith: Enhancing Long-Context Reasoning in LLMs with Faithful Synthetic Data
Cehao Yang
Xueyuan Lin
Chengjin Xu
Xuhui Jiang
Shengjie Ma
Aofan Liu
Hui Xiong
Jian Guo
LRM
21
2
0
18 Feb 2025
Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models
Haoyang Li
Xuejia Chen
Zhanchao Xu
Darian Li
Nicole Hu
...
Yongbin Li
Luyu Qiu
C. Zhang
Qing Li
Lei Chen
ELM
LRM
71
1
0
16 Feb 2025
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU
Heejun Lee
G. Park
Jaduk Suh
Sung Ju Hwang
110
4
0
13 Feb 2025
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs
Sumin An
Junyoung Sung
Wonpyo Park
Chanjun Park
Paul Hongsuck Seo
141
0
0
10 Feb 2025
Can LLMs Maintain Fundamental Abilities under KV Cache Compression?
Xiang Liu
Zhenheng Tang
Hong Chen
Peijie Dong
Zeyu Li
Xiuze Zhou
Bo Li
Xuming Hu
Xiaowen Chu
331
5
0
04 Feb 2025
Twilight: Adaptive Attention Sparsity with Hierarchical Top-p Pruning
C. Lin
Jiaming Tang
Shuo Yang
Hanshuo Wang
Tian Tang
Boyu Tian
Ion Stoica
Enze Xie
Mingyu Gao
106
2
0
04 Feb 2025
RankFlow: A Multi-Role Collaborative Reranking Workflow Utilizing Large Language Models
Can Jin
Hongwu Peng
Anxiang Zhang
Nuo Chen
Jiahui Zhao
...
Keqin Li
Shuya Feng
Kai Zhong
Caiwen Ding
Dimitris N. Metaxas
141
2
0
02 Feb 2025
Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
Yuan Feng
Junlin Lv
Yukun Cao
Xike Xie
S. K. Zhou
VLM
72
33
0
28 Jan 2025
LongReason: A Synthetic Long-Context Reasoning Benchmark via Context Expansion
Zhan Ling
Kang Liu
Kai Yan
Yue Yang
Weijian Lin
Ting-Han Fan
Lingfeng Shen
Zhengyin Du
Jiecao Chen
ReLM
ELM
LRM
64
4
0
25 Jan 2025
NExtLong: Toward Effective Long-Context Training without Long Documents
Chaochen Gao
Xing Wu
Zijia Lin
Debing Zhang
Songlin Hu
SyDa
97
2
0
22 Jan 2025
Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference
Weizhi Fei
Xueyan Niu
Guoqing Xie
Yingqing Liu
Bo Bai
Wei Han
68
1
0
22 Jan 2025
Is Long Context All You Need? Leveraging LLM's Extended Context for NL2SQL
Yeounoh Chung
Gaurav Tarlok Kakkar
Yu Gan
Brenton Milne
Fatma Ozcan
RALM
80
6
0
21 Jan 2025
From Reading to Compressing: Exploring the Multi-document Reader for Prompt Compression
Eunseong Choi
Sunkyung Lee
Minjin Choi
June Park
Jongwuk Lee
109
1
0
03 Jan 2025
LoL-PIM: Long-Context LLM Decoding with Scalable DRAM-PIM System
Hyucksung Kwon
Kyungmo Koo
Janghyeon Kim
W. Lee
Minjae Lee
...
Yongkee Kwon
Ilkon Kim
Euicheol Lim
John Kim
Jungwook Choi
91
4
0
28 Dec 2024
AntiLeakBench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge
Xiaobao Wu
Liangming Pan
Yuxi Xie
Ruiwen Zhou
Shuai Zhao
Yubo Ma
Mingzhe Du
Rui Mao
Anh Tuan Luu
William Yang Wang
171
12
0
18 Dec 2024
Expansion Span: Combining Fading Memory and Retrieval in Hybrid State Space Models
Elvis Nunez
Luca Zancato
Benjamin Bowman
Aditya Golatkar
Wei Xia
Stefano Soatto
128
3
0
17 Dec 2024
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation
Manan Suri
Puneet Mathur
Franck Dernoncourt
Kanika Goswami
Ryan Rossi
Dinesh Manocha
114
4
0
14 Dec 2024
Unifying KV Cache Compression for Large Language Models with LeanKV
Yanqi Zhang
Yuwei Hu
Runyuan Zhao
John C. S. Lui
Haibo Chen
MQ
187
6
0
04 Dec 2024
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?
Jonathan Roberts
Kai Han
Samuel Albanie
LLMAG
344
0
0
07 Nov 2024
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
Wei Wu
Zhuoshi Pan
Chao Wang
L. Chen
Y. Bai
Kun Fu
Zehua Wang
Hui Xiong
LLMAG
99
7
0
05 Nov 2024
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang
Yifei Wang
Zhaoyang Liu
Chenheng Zhang
Stefanie Jegelka
Jinyang Gao
Bolin Ding
Yisen Wang
81
10
0
31 Oct 2024
Guide-LLM: An Embodied LLM Agent and Text-Based Topological Map for Robotic Guidance of People with Visual Impairments
Sangmim Song
S. Kodagoda
A. Gunatilake
Marc G. Carmichael
Karthick Thiyagarajan
Jodi Martin
LM&Ro
79
1
0
28 Oct 2024
ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
Taewhoo Lee
Chanwoong Yoon
Kyochul Jang
Donghyeon Lee
Minju Song
Hyunjae Kim
Jaewoo Kang
ELM
55
1
0
22 Oct 2024
Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs
Runchu Tian
Yanghao Li
Yuepeng Fu
Siyang Deng
Qinyu Luo
...
Zhong Zhang
Yesai Wu
Yankai Lin
Huadong Wang
Xiaojiang Liu
63
1
0
18 Oct 2024
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
Yizhao Gao
Zhichen Zeng
Dayou Du
Shijie Cao
Hayden Kwok-Hay So
...
Junjie Lai
Mao Yang
Ting Cao
Fan Yang
M. Yang
80
20
0
17 Oct 2024
An Evolved Universal Transformer Memory
Edoardo Cetin
Qi Sun
Tianyu Zhao
Yujin Tang
346
0
0
17 Oct 2024