Understanding and Mitigating Tokenization Bias in Language Models (arXiv:2406.16829)

24 June 2024
Buu Phan
Marton Havasi
Matthew Muckley
Karen Ullrich

Papers citing "Understanding and Mitigating Tokenization Bias in Language Models"

5 / 5 papers shown
SuperBPE: Space Travel for Language Models
Alisa Liu
J. Hayase
Valentin Hofmann
Sewoong Oh
Noah A. Smith
Yejin Choi
51
3
0
17 Mar 2025
Toward a Theory of Tokenization in LLMs
Nived Rajaraman
Jiantao Jiao
Kannan Ramchandran
LLMAG
29
5
0
12 Apr 2024
Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance
Omer Goldman
Avi Caciularu
Matan Eyal
Kris Cao
Idan Szpektor
Reut Tsarfaty
51
22
0
10 Mar 2024
Tokenization Is More Than Compression
Craig W. Schmidt
Varshini Reddy
Haoran Zhang
Alec Alameddine
Omri Uzan
Yuval Pinter
Chris Tanner
61
28
0
28 Feb 2024
Getting the most out of your tokenizer for pre-training and domain adaptation
Gautier Dagan
Gabriele Synnaeve
Baptiste Rozière
34
20
0
01 Feb 2024