Training LLMs over Neurally Compressed Text (arXiv:2404.03626)

4 April 2024
Brian Lester, Jaehoon Lee, A. Alemi, Jeffrey Pennington, Adam Roberts, Jascha Narain Sohl-Dickstein, Noah Constant

Papers citing "Training LLMs over Neurally Compressed Text"

13 papers shown:

LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs
Sumin An, Junyoung Sung, Wonpyo Park, Chanjun Park, Paul Hongsuck Seo
10 Feb 2025

L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression
J. Zhang, Zhengxue Cheng, Yan Zhao, Shihao Wang, Dajiang Zhou, Guo Lu, Li-Na Song
21 Dec 2024

When Worse is Better: Navigating the compression-generation tradeoff in visual tokenization
Vivek Ramanujan, Kushal Tirumala, Armen Aghajanyan, Luke Zettlemoyer, Ali Farhadi
20 Dec 2024

Compression via Pre-trained Transformers: A Study on Byte-Level Multimodal Data
David Heurtel-Depeiges, Anian Ruoss, Joel Veness, Tim Genewein
07 Oct 2024

Compressed-Language Models for Understanding Compressed File Formats: a JPEG Exploration
Juan C. Pérez, Alejandro Pardo, Mattia Soldan, Hani Itani, Juan Carlos León Alcázar, Guohao Li
27 May 2024

SpaceByte: Towards Deleting Tokenization from Large Language Modeling
Kevin Slagle
22 Apr 2024

Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance
Omer Goldman, Avi Caciularu, Matan Eyal, Kris Cao, Idan Szpektor, Reut Tsarfaty
10 Mar 2024

Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
Luke Vilnis, Yury Zemlyanskiy, Patrick C. Murray, Alexandre Passos, Sumit Sanghai
18 Oct 2022

Sequence Length is a Domain: Length-based Overfitting in Transformer Models
Dusan Varis, Ondrej Bojar
15 Sep 2021

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, M. Lewis
27 Aug 2021

The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
31 Dec 2020

Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
28 Jul 2020

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020