BPE Gets Picky: Efficient Vocabulary Refinement During Tokenizer Training

6 September 2024
Pavel Chizhov
Catherine Arnett
Elizaveta Korotkova
Ivan P. Yamshchikov

Papers citing "BPE Gets Picky: Efficient Vocabulary Refinement During Tokenizer Training"

3 / 3 papers shown
Toward a Theory of Tokenization in LLMs
Nived Rajaraman
Jiantao Jiao
Kannan Ramchandran
LLMAG
24
19
0
12 Apr 2024
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
135
358
0
01 Feb 2024
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016