Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2307.03170
Cited By
Focused Transformer: Contrastive Training for Context Scaling
6 July 2023
Szymon Tworkowski
Konrad Staniszewski
Mikolaj Pacek
Yuhuai Wu
Henryk Michalewski
Piotr Milo's
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Focused Transformer: Contrastive Training for Context Scaling"
18 / 18 papers shown
Title
CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning
Hao Cui
Zahra Shamsi
Gowoon Cheon
Xuejian Ma
Shutong Li
...
Eun-Ah Kim
M. Brenner
Viren Jain
Sameera Ponda
Subhashini Venugopalan
ELM
LRM
57
0
0
14 Mar 2025
Vision-centric Token Compression in Large Language Model
Ling Xing
Alex Jinpeng Wang
Rui Yan
J. Tang
Jinhui Tang
VLM
60
0
0
02 Feb 2025
From Isolated Conversations to Hierarchical Schemas: Dynamic Tree Memory Representation for LLMs
Alireza Rezazadeh
Zichao Li
Wei Wei
Yujia Bao
37
4
0
17 Oct 2024
LongGenBench: Long-context Generation Benchmark
Xiang Liu
Peijie Dong
Xuming Hu
Xiaowen Chu
RALM
45
8
0
05 Oct 2024
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Peng-Tao Xu
Wei Ping
Xianchao Wu
Zihan Liu
M. Shoeybi
Mohammad Shoeybi
Bryan Catanzaro
RALM
52
14
0
19 Jul 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
74
56
0
11 Jun 2024
Mitigate Position Bias in Large Language Models via Scaling a Single Dimension
Yijiong Yu
Huiqiang Jiang
Xufang Luo
Qianhui Wu
Chin-Yew Lin
Dongsheng Li
Yuqing Yang
Yongfeng Huang
L. Qiu
44
9
0
04 Jun 2024
In-Context Learning with Long-Context Models: An In-Depth Exploration
Amanda Bertsch
Maor Ivgi
Uri Alon
Jonathan Berant
Matthew R. Gormley
Matthew R. Gormley
Graham Neubig
ReLM
AIMat
91
64
0
30 Apr 2024
Analysing The Impact of Sequence Composition on Language Model Pre-Training
Yu Zhao
Yuanbin Qu
Konrad Staniszewski
Szymon Tworkowski
Wei Liu
Piotr Milo's
Yuxiang Wu
Pasquale Minervini
34
14
0
21 Feb 2024
User-LLM: Efficient LLM Contextualization with User Embeddings
Lin Ning
Luyang Liu
Jiaxing Wu
Neo Wu
D. Berlowitz
Sushant Prakash
Bradley Green
S. O’Banion
Jun Xie
55
33
0
21 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomáš Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
131
369
0
09 Feb 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
33
12
0
26 Jan 2024
TableLlama: Towards Open Large Generalist Models for Tables
Tianshu Zhang
Xiang Yue
Yifei Li
Huan Sun
LMTD
ALM
20
81
0
15 Nov 2023
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Yukang Chen
Shengju Qian
Haotian Tang
Xin Lai
Zhijian Liu
Song Han
Jiaya Jia
42
152
0
21 Sep 2023
ExpeL: LLM Agents Are Experiential Learners
Andrew Zhao
Daniel Huang
Quentin Xu
Matthieu Lin
Yao Liu
Gao Huang
LLMAG
22
193
0
20 Aug 2023
Magnushammer: A Transformer-Based Approach to Premise Selection
Maciej Mikuła
Szymon Tworkowski
Szymon Antoniak
Bartosz Piotrowski
Albert Qiaochu Jiang
Jinyi Zhou
Christian Szegedy
Lukasz Kuciñski
Piotr Milo's
Yuhuai Wu
44
42
0
08 Mar 2023
Training Language Models with Memory Augmentation
Zexuan Zhong
Tao Lei
Danqi Chen
RALM
234
128
0
25 May 2022
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
280
2,015
0
28 Jul 2020
1