$\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens


21 February 2024
Xinrong Zhang
Yingfa Chen
Shengding Hu
Zihang Xu
Junhao Chen
Moo Khai Hao
Xu Han
Zhen Leng Thai
Shuo Wang
Zhiyuan Liu
Maosong Sun
    RALM
    LRM

Papers citing "$\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens"

50 / 112 papers shown
Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction
Jeffrey Willette
Heejun Lee
Sung Ju Hwang
17
0
0
16 May 2025
CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability
Han Peng
Jinhao Jiang
Zican Dong
Wayne Xin Zhao
Lei Fang
RALM
42
0
0
15 May 2025
LongCodeBench: Evaluating Coding LLMs at 1M Context Windows
Stefano Rando
Luca Romani
Alessio Sampieri
Yuta Kyuragi
Luca Franco
Fabio Galasso
Tatsunori Hashimoto
John Yang
LLMAG
44
0
0
12 May 2025
Divide, Optimize, Merge: Fine-Grained LLM Agent Optimization at Scale
Jiale Liu
Yifan Zeng
Shaokun Zhang
Chi Zhang
Malte Højmark-Bertelsen
Marie Normann Gadeberg
H. Wang
Qingyun Wu
41
0
0
06 May 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xuzhao Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
96
2
0
26 Apr 2025
LiveLongBench: Tackling Long-Context Understanding for Spoken Texts from Live Streams
Yongxuan Wu
Runyu Chen
Peiyu Liu
Hongjin Qian
RALM
39
1
0
24 Apr 2025
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale
Bowen Jiang
Zhuoqun Hao
Y. Cho
B. Li
Yuan Yuan
Sihao Chen
Lyle Ungar
Camillo J Taylor
Dan Roth
44
0
0
19 Apr 2025
Long-context Non-factoid Question Answering in Indic Languages
Ritwik Mishra
R. Shah
Ponnurangam Kumaraguru
33
0
0
18 Apr 2025
Aspect-Based Summarization with Self-Aspect Retrieval Enhanced Generation
Yichao Feng
Shuai Zhao
Heng Chang
Luwei Xiao
Xiaobao Wu
Anh Tuan Luu
RALM
34
0
0
17 Apr 2025
Can LLMs reason over extended multilingual contexts? Towards long-context evaluation beyond retrieval and haystacks
Amey Hengle
Prasoon Bajpai
Soham Dan
Tanmoy Chakraborty
LRM
33
0
0
17 Apr 2025
Scaling Instruction-Tuned LLMs to Million-Token Contexts via Hierarchical Synthetic Data Generation
Linda He
Jue Wang
Maurice Weber
Shang Zhu
Ben Athiwaratkun
Ce Zhang
SyDa
LRM
50
1
0
17 Apr 2025
AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference
Yangshen Deng
Zhengxin You
Long Xiang
Qilong Li
Peiqi Yuan
...
Man Lung Yiu
Huan Li
Qiaomu Shen
Rui Mao
Bo Tang
42
0
0
14 Apr 2025
NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding
Aniket Pal
Sanket Biswas
Alloy Das
Ayush Lodh
Priyanka Banerjee
Soumitri Chattopadhyay
Dimosthenis Karatzas
Josep Lladós
C. V. Jawahar
VLM
32
0
0
12 Apr 2025
Harnessing the Unseen: The Hidden Influence of Intrinsic Knowledge in Long-Context Language Models
Yu Fu
Haz Sameen Shahgir
Hui Liu
Xianfeng Tang
Qi He
Yue Dong
KELM
57
0
0
11 Apr 2025
From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models
C. Xu
Ming-Yu Liu
P. Xu
Z. Liu
Wei Ping
M. Shoeybi
Bo Li
Bryan Catanzaro
27
1
0
08 Apr 2025
Sequential-NIAH: A Needle-In-A-Haystack Benchmark for Extracting Sequential Needles from Long Contexts
Yifei Yu
Qian Zhang
Lingfeng Qiao
Di Yin
Fang Li
Jie Wang
Z. Chen
Suncong Zheng
Xiaolong Liang
Xingchen Sun
44
0
0
07 Apr 2025
Reasoning on Multiple Needles In A Haystack
Yidong Wang
LRM
31
0
0
05 Apr 2025
The Use of Gaze-Derived Confidence of Inferred Operator Intent in Adjusting Safety-Conscious Haptic Assistance
Jeremy D. Webb
Michael Bowman
Songpo Li
Xiaoli Zhang
36
0
0
04 Apr 2025
Model Hemorrhage and the Robustness Limits of Large Language Models
Ziyang Ma
Zehan Li
Lefei Zhang
Gui-Song Xia
Bo Du
Liangpei Zhang
Dacheng Tao
59
0
0
31 Mar 2025
If an LLM Were a Character, Would It Know Its Own Story? Evaluating Lifelong Learning in LLMs
Siqi Fan
Xiusheng Huang
Yiqun Yao
Xuezhi Fang
Kang Liu
Peng Han
Shuo Shang
Aixin Sun
Yequan Wang
LLMAG
45
0
0
30 Mar 2025
PromptDistill: Query-based Selective Token Retention in Intermediate Layers for Efficient Large Language Model Inference
Weisheng Jin
Maojia Song
Tej Deep Pala
Yew Ken Chia
Amir Zadeh
Chuan Li
Soujanya Poria
VLM
57
0
0
30 Mar 2025
A Survey on Transformer Context Extension: Approaches and Evaluation
Yijun Liu
Jinzheng Yu
Yang Xu
Zhongyang Li
Qingfu Zhu
LLMAG
83
0
0
17 Mar 2025
Attention Reveals More Than Tokens: Training-Free Long-Context Reasoning with Attention-guided Retrieval
Yuwei Zhang
Jayanth Srinivasa
Gaowen Liu
Jingbo Shang
LRM
LLMAG
RALM
95
1
0
12 Mar 2025
Lost-in-the-Middle in Long-Text Generation: Synthetic Dataset, Evaluation Framework, and Mitigation
Junhao Zhang
Richong Zhang
Fanshuang Kong
Ziyang Miao
Yanhan Ye
Yaowei Zheng
SyDa
46
0
0
10 Mar 2025
Predicting Team Performance from Communications in Simulated Search-and-Rescue
Ali Jalal-Kamali
Nikolos Gurney
David Pynadath
AI4TS
116
0
0
05 Mar 2025
U-NIAH: Unified RAG and LLM Evaluation for Long Context Needle-In-A-Haystack
Yunfan Gao
Yun Xiong
Wenlong Wu
Zijing Huang
Bohan Li
Haoyu Wang
60
3
0
01 Mar 2025
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
Xunhao Lai
Jianqiao Lu
Yao Luo
Yiyuan Ma
Xun Zhou
71
5
0
28 Feb 2025
Long-Context Inference with Retrieval-Augmented Speculative Decoding
Guanzheng Chen
Qilong Feng
Jinjie Ni
Xin Li
Michael Shieh
RALM
55
2
0
27 Feb 2025
LongEval: A Comprehensive Analysis of Long-Text Generation Through a Plan-based Paradigm
Siwei Wu
Yong Li
Xingwei Qu
Rishi Ravikumar
Yunshui Li
Tyler Loakman
Shanghaoran Quan
Xiaoyong Wei
R. Batista-Navarro
Hongpeng Zhou
160
3
0
26 Feb 2025
MEBench: Benchmarking Large Language Models for Cross-Document Multi-Entity Question Answering
Teng Lin
RALM
68
2
0
26 Feb 2025
DocPuzzle: A Process-Aware Benchmark for Evaluating Realistic Long-Context Reasoning Capabilities
Tianyi Zhuang
Chuqiao Kuang
Xiaoguang Li
Yihua Teng
Jihao Wu
Yufei Wang
Lifeng Shang
RALM
ELM
LRM
72
0
0
25 Feb 2025
LongSafety: Evaluating Long-Context Safety of Large Language Models
Yida Lu
Jiale Cheng
Zhexin Zhang
Shiyao Cui
C. Wang
Xiaotao Gu
Yuxiao Dong
J. Tang
Hongning Wang
Minlie Huang
ALM
ELM
LRM
50
0
0
24 Feb 2025
WildLong: Synthesizing Realistic Long-Context Instruction Data at Scale
Jiaxi Li
Xingxing Zhang
Xun Wang
Xiaolong Huang
Li Dong
Liang Wang
Si-Qing Chen
Wei Lu
Furu Wei
SyDa
215
0
0
23 Feb 2025
CLIPPER: Compression enables long-context synthetic data generation
Chau Minh Pham
Yapei Chang
Mohit Iyyer
SyDa
85
1
0
21 Feb 2025
Self-Taught Agentic Long Context Understanding
Yufan Zhuang
Xiaodong Yu
Jialian Wu
Xingchen Sun
Zihan Wang
Jiang Liu
Yusheng Su
Jingbo Shang
Zicheng Liu
Emad Barsoum
LRM
36
0
0
21 Feb 2025
Activation-aware Probe-Query: Effective Key-Value Retrieval for Long-Context LLMs Inference
Q. Xiao
Jiachuan Wang
Haoyang Li
Cheng Deng
Xiangbo Shu
Shuangyin Li
Yongqi Zhang
Jun Wang
Lei Chen
LLMSV
54
1
0
20 Feb 2025
APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs
Yuxiang Huang
Mingye Li
Xu Han
Chaojun Xiao
Weilin Zhao
Sun Ao
Hao Zhou
Jie Zhou
Zhiyuan Liu
Maosong Sun
44
0
0
17 Feb 2025
The Rotary Position Embedding May Cause Dimension Inefficiency in Attention Heads for Long-Distance Retrieval
Ting-Rui Chiang
Dani Yogatama
41
0
0
16 Feb 2025
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs
Sumin An
Junyoung Sung
Wonpyo Park
Chanjun Park
Paul Hongsuck Seo
100
0
0
10 Feb 2025
GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?
Yang Zhou
Hongyi Liu
Zhuoming Chen
Yuandong Tian
Beidi Chen
LRM
69
8
0
07 Feb 2025
QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache
Rishabh Tiwari
Haocheng Xi
Aditya Tomar
Coleman Hooper
Sehoon Kim
Maxwell Horton
Mahyar Najibi
Michael W. Mahoney
Kemal Kurniawan
Amir Gholami
MQ
66
1
0
05 Feb 2025
Can LLMs Maintain Fundamental Abilities under KV Cache Compression?
Xiang Liu
Zhenheng Tang
Hong Chen
Peijie Dong
Zeyu Li
Xiuze Zhou
Bo Li
Xuming Hu
Xiaowen Chu
221
3
0
04 Feb 2025
LongReason: A Synthetic Long-Context Reasoning Benchmark via Context Expansion
Zhan Ling
Kang Liu
Kai Yan
Yuqing Yang
Weijian Lin
Ting-Han Fan
Lingfeng Shen
Zhengyin Du
Jiecao Chen
ReLM
ELM
LRM
52
3
0
25 Jan 2025
NExtLong: Toward Effective Long-Context Training without Long Documents
Chaochen Gao
Xing Wu
Zijia Lin
Debing Zhang
Songlin Hu
SyDa
68
1
0
22 Jan 2025
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
Alexis Huet
Zied Ben-Houidi
Dario Rossi
LLMAG
59
0
0
21 Jan 2025
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models
Thibaut Thonet
Jos Rozen
Laurent Besacier
RALM
142
2
0
20 Jan 2025
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Di Liu
Meng Chen
Baotong Lu
Huiqiang Jiang
Zhenhua Han
...
Kaipeng Zhang
Chong Chen
Fan Yang
Yuqing Yang
Lili Qiu
60
30
0
03 Jan 2025
MLVU: Benchmarking Multi-task Long Video Understanding
Yueze Wang
Yan Shu
Bo Zhao
Boya Wu
Junjie Zhou
...
Xi Yang
Y. Xiong
Bo Zhang
Tiejun Huang
Zheng Liu
VLM
58
11
0
03 Jan 2025
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression
Chenlong Deng
Zhisong Zhang
Kelong Mao
Shuaiyi Li
Xinting Huang
Dong Yu
Zhicheng Dou
40
1
0
23 Dec 2024
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models
Haoran Lian
Junmin Chen
Wei Huang
Yizhe Xiong
Wenping Hu
...
Hui Chen
Jianwei Niu
Zijia Lin
Fuzheng Zhang
Di Zhang
83
0
0
10 Dec 2024