ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.15020
  4. Cited By
An Efficient Multilingual Language Model Compression through Vocabulary
  Trimming

An Efficient Multilingual Language Model Compression through Vocabulary Trimming

24 May 2023
Asahi Ushio
Yi Zhou
Jose Camacho-Collados
ArXivPDFHTML

Papers citing "An Efficient Multilingual Language Model Compression through Vocabulary Trimming"

14 / 14 papers shown
Title
A comparison of data filtering techniques for English-Polish LLM-based machine translation in the biomedical domain
A comparison of data filtering techniques for English-Polish LLM-based machine translation in the biomedical domain
Jorge del Pozo Lérida
Kamil Kojs
János Máté
Mikołaj Antoni Barański
Christian Hardmeier
45
0
0
27 Jan 2025
DEPT: Decoupled Embeddings for Pre-training Language Models
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
197
0
0
07 Oct 2024
BPE Gets Picky: Efficient Vocabulary Refinement During Tokenizer
  Training
BPE Gets Picky: Efficient Vocabulary Refinement During Tokenizer Training
Pavel Chizhov
Catherine Arnett
Elizaveta Korotkova
Ivan P. Yamshchikov
48
2
0
06 Sep 2024
An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient
  Language Model Inference
An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Language Model Inference
Atsuki Yamaguchi
Aline Villavicencio
Nikolaos Aletras
27
7
0
16 Feb 2024
Mitigating Language-Dependent Ethnic Bias in BERT
Mitigating Language-Dependent Ethnic Bias in BERT
Jaimeen Ahn
Alice Oh
142
92
0
13 Sep 2021
Larger-Scale Transformers for Multilingual Masked Language Modeling
Larger-Scale Transformers for Multilingual Masked Language Modeling
Naman Goyal
Jingfei Du
Myle Ott
Giridhar Anantharaman
Alexis Conneau
95
98
0
02 May 2021
XLM-T: Multilingual Language Models in Twitter for Sentiment Analysis
  and Beyond
XLM-T: Multilingual Language Models in Twitter for Sentiment Analysis and Beyond
Francesco Barbieri
Luis Espinosa Anke
Jose Camacho-Collados
90
213
0
25 Apr 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based
  Bias in NLP
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
265
373
0
28 Feb 2021
Debiasing Pre-trained Contextualised Embeddings
Debiasing Pre-trained Contextualised Embeddings
Masahiro Kaneko
Danushka Bollegala
218
137
0
23 Jan 2021
When Being Unseen from mBERT is just the Beginning: Handling New
  Languages With Multilingual Language Models
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
Benjamin Muller
Antonis Anastasopoulos
Benoît Sagot
Djamé Seddah
LRM
136
165
0
24 Oct 2020
Improving Multilingual Models with Language-Clustered Vocabularies
Improving Multilingual Models with Language-Clustered Vocabularies
Hyung Won Chung
Dan Garrette
Kiat Chuan Tan
Jason Riesa
VLM
77
65
0
24 Oct 2020
SberQuAD -- Russian Reading Comprehension Dataset: Description and
  Analysis
SberQuAD -- Russian Reading Comprehension Dataset: Description and Analysis
Pavel Efimov
Andrey Chertok
Leonid Boytsov
Pavel Braslavski
60
59
0
20 Dec 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018
Word Translation Without Parallel Data
Word Translation Without Parallel Data
Alexis Conneau
Guillaume Lample
MarcÁurelio Ranzato
Ludovic Denoyer
Hervé Jégou
189
1,639
0
11 Oct 2017
1