Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1808.06226
Cited By
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
19 August 2018
Taku Kudo
John Richardson
Re-assign community
ArXiv (abs)
PDF
HTML
Github (10925★)
Papers citing
"SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing"
50 / 1,950 papers shown
Title
ChemTEB: Chemical Text Embedding Benchmark, an Overview of Embedding Models Performance & Efficiency on a Specific Domain
Ali Shiraee Kasmaee
Mohammad Khodadad
Mohammad Arshi Saloot
Nick Sherck
Stephen Dokas
H. Mahyar
Soheila Samiee
ELM
636
2
0
30 Nov 2024
Linguistic Laws Meet Protein Sequences: A Comparative Analysis of Subword Tokenization Methods
Burak Suyunu
Enes Taylan
Arzucan Özgür
99
3
0
26 Nov 2024
Why do language models perform worse for morphologically complex languages?
Catherine Arnett
Benjamin Bergen
114
12
0
21 Nov 2024
The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims
Shu Zhou
Xin Wang
Zhengda Zhou
Haohan Yi
Xuhui Zheng
Hao Wan
117
1
0
21 Nov 2024
Watermark under Fire: A Robustness Evaluation of LLM Watermarking
Jiacheng Liang
Zian Wang
Lauren Hong
Shouling Ji
Ting Wang
AAML
211
0
0
20 Nov 2024
Multidimensional Byte Pair Encoding: Shortened Sequences for Improved Visual Data Generation
Tim Elsner
Paula Usinger
Julius Nehring-Wirxel
Gregor Kobsik
Victor Czech
Yanjiang He
I. Lim
Leif Kobbelt
91
1
0
15 Nov 2024
Xmodel-1.5: An 1B-scale Multilingual LLM
Wang Qun
Liu Yang
Lin Qingquan
Jiang Ling
LRM
71
0
0
15 Nov 2024
A Practical Guide to Fine-tuning Language Models with Limited Data
Márton Szép
Daniel Rueckert
Rüdiger von Eisenhart-Rothe
Florian Hinterwimmer
SyDa
ALM
130
2
0
14 Nov 2024
Mamba-based Decoder-Only Approach with Bidirectional Speech Modeling for Speech Recognition
Yoshiki Masuyama
Koichi Miyazaki
Masato Murata
Mamba
76
0
0
11 Nov 2024
When are 1.58 bits enough? A Bottom-up Exploration of BitNet Quantization
Jacob Nielsen
Lukas Galke
Peter Schneider-Kamp
MQ
98
1
0
08 Nov 2024
Fine-Grained Reward Optimization for Machine Translation using Error Severity Mappings
Miguel Moura Ramos
Tomás Almeida
Daniel Vareta
Filipe Azevedo
Sweta Agrawal
Patrick Fernandes
André F. T. Martins
123
4
0
08 Nov 2024
Deploying Multi-task Online Server with Large Language Model
Yincen Qu
Chao Ma
Xiangying Dai
Hui Zhou
Yiting Wu
Hengyue Liu
58
0
0
06 Nov 2024
Classification Done Right for Vision-Language Pre-Training
Zilong Huang
Qinghao Ye
Bingyi Kang
Jiashi Feng
Haoqi Fan
CLIP
VLM
120
4
0
05 Nov 2024
Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs
A. Haliassos
Rodrigo Mira
Honglie Chen
Zoe Landgraf
Stavros Petridis
Maja Pantic
SSL
86
7
0
04 Nov 2024
MoCE: Adaptive Mixture of Contextualization Experts for Byte-based Neural Machine Translation
Langlin Huang
Mengyu Bu
Yang Feng
101
0
0
03 Nov 2024
SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation
Dennis Fucci
Marco Gaido
Beatrice Savoldi
Matteo Negri
Mauro Cettolo
L. Bentivogli
270
3
0
03 Nov 2024
Optimizing Contextual Speech Recognition Using Vector Quantization for Efficient Retrieval
Nikolaos Flemotomos
Roger Hsiao
P. Swietojanski
Takaaki Hori
Dogan Can
Xiaodan Zhuang
126
1
0
01 Nov 2024
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini
Shikhar Murty
Christopher D. Manning
Christopher Potts
Róbert Csordás
94
4
0
28 Oct 2024
From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages
Artur Kiulian
Anton Polishko
M. Khandoga
Yevhen Kostiuk
Guillermo Gabrielli
...
Hrishikesh Garud
Wendy Wing Yee Mak
Dmytro Chaplynskyi
Selma Belhadj Amor
Grigol Peradze
82
0
0
24 Oct 2024
A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs
A. S. Rawat
Veeranjaneyulu Sadhanala
Afshin Rostamizadeh
Ayan Chakrabarti
Wittawat Jitkrittum
...
Rakesh Shivanna
Sashank J. Reddi
A. Menon
Rohan Anil
Sanjiv Kumar
146
3
0
24 Oct 2024
Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation
Krzysztof Ociepa
Łukasz Flis
Krzysztof Wróbel
Adrian Gwoździej
Remigiusz Kinas
74
4
0
24 Oct 2024
Scalable Influence and Fact Tracing for Large Language Model Pretraining
Tyler A. Chang
Dheeraj Rajagopal
Tolga Bolukbasi
Lucas Dixon
Ian Tenney
TDI
94
5
0
22 Oct 2024
PLDR-LLM: Large Language Model from Power Law Decoder Representations
Burc Gokden
59
1
0
22 Oct 2024
Methods of improving LLM training stability
Oleg Rybakov
Mike Chrzanowski
Peter Dykas
Jinze Xue
Ben Lanir
80
1
0
22 Oct 2024
Action abstractions for amortized sampling
Oussama Boussif
Léna Néhale Ezzine
J. Viviano
Michał Koziarski
Moksh Jain
Nikolay Malkin
Emmanuel Bengio
Rim Assouel
Yoshua Bengio
96
0
0
19 Oct 2024
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
Lijie Fan
Tianhong Li
Siyang Qin
Yuanzhen Li
Chen Sun
Michael Rubinstein
Deqing Sun
Kaiming He
Yonglong Tian
VLM
DiffM
131
57
0
17 Oct 2024
MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations
Liang Xu
Shaoyang Hua
Zili Lin
Yifan Liu
Feipeng Ma
Yichao Yan
Xin Jin
Xiaokang Yang
Wenjun Zeng
VGen
107
4
0
17 Oct 2024
Nominal Class Assignment in Swahili: A Computational Account
Giada Palmieri
Konstantinos Kogkalidis
21
0
0
16 Oct 2024
Interpreting token compositionality in LLMs: A robustness analysis
Nura Aljaafari
Danilo S. Carvalho
André Freitas
142
3
0
16 Oct 2024
Tokenization and Morphology in Multilingual Language Models: A Comparative Analysis of mT5 and ByT5
Thao Anh Dang
Limor Raviv
Lukas Galke
52
1
0
15 Oct 2024
LargePiG: Your Large Language Model is Secretly a Pointer Generator
Zhongxiang Sun
Zihua Si
Xiaoxue Zang
Kai Zheng
Yang Song
Xiao Zhang
Jun Xu
HILM
RALM
82
0
0
15 Oct 2024
Transfer Learning with Foundational Models for Time Series Forecasting using Low-Rank Adaptations
M. Germán-Morales
A. J. Rivera-Rivas
M. J. del Jesus Díaz
C. J. Carmona
AI4TS
AI4CE
279
0
0
15 Oct 2024
ChakmaNMT: A Low-resource Machine Translation On Chakma Language
Aunabil Chakma
Aditya Chakma
Soham Khisa
Chumui Tripura
Masum Hasan
Rifat Shahriyar
36
1
0
14 Oct 2024
Predicting from Strings: Language Model Embeddings for Bayesian Optimization
Tung Nguyen
Qiuyi Zhang
Bangding Yang
Chansoo Lee
J. Bornschein
Yingjie Miao
Sagi Perel
Yutian Chen
Xingyou Song
BDL
99
4
0
14 Oct 2024
Text Classification using Graph Convolutional Networks: A Comprehensive Survey
Syed Mustafa Haider Rizvi
Ramsha Imran
Arif Mahmood
GNN
OOD
FaML
51
2
0
12 Oct 2024
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
HyoJung Han
Akiko Eriguchi
Haoran Xu
Hieu T. Hoang
Marine Carpuat
Huda Khayrallah
VLM
89
3
0
12 Oct 2024
OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling
Linhui Xiao
Xiaoshan Yang
Fang Peng
Yaowei Wang
Changsheng Xu
ObjD
122
7
0
10 Oct 2024
Self-Attention Mechanism in Multimodal Context for Banking Transaction Flow
Cyrile Delestre
Yoann Sola
34
0
0
10 Oct 2024
Transducer Consistency Regularization for Speech to Text Applications
Cindy Tseng
Yun Tang
Vijendra Raj Apsingekar
69
0
0
09 Oct 2024
Generative Model for Less-Resourced Language with 1 billion parameters
Domen Vreš
Martin Božič
Aljaž Potočnik
Tomaž Martinčič
Marko Robnik-Šikonja
49
1
0
09 Oct 2024
Inference over Unseen Entities, Relations and Literals on Knowledge Graphs
Caglar Demir
N'Dah Jean Kouagou
Arnab Sharma
Axel-Cyrille Ngonga Ngomo
53
0
0
09 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
608
1
0
07 Oct 2024
Language Model-Driven Data Pruning Enables Efficient Active Learning
Abdul Hameed Azeemi
I. Qazi
Agha Ali Raza
VLM
90
1
0
05 Oct 2024
Adaptive BPE Tokenization for Enhanced Vocabulary Adaptation in Finetuning Pretrained Language Models
Gunjan Balde
Soumyadeep Roy
Mainack Mondal
Niloy Ganguly
48
1
0
04 Oct 2024
Cross-lingual Transfer for Automatic Question Generation by Learning Interrogative Structures in Target Languages
Seonjeong Hwang
Yunsu Kim
Gary Geunbae Lee
68
0
0
04 Oct 2024
MELODI: Exploring Memory Compression for Long Contexts
Yinpeng Chen
DeLesley Hutchins
Aren Jansen
Andrey Zhmoginov
David Racz
Jesper Andersen
70
2
0
04 Oct 2024
No Need to Talk: Asynchronous Mixture of Language Models
Anastasiia Filippova
Angelos Katharopoulos
David Grangier
Ronan Collobert
MoE
103
0
0
04 Oct 2024
Morphological evaluation of subwords vocabulary used by BETO language model
Óscar García-Sierra
Ana Fernández-Pampillón Cesteros
Miguel Ortega-Martín
61
0
0
03 Oct 2024
Selective Attention Improves Transformer
Yaniv Leviathan
Matan Kalman
Yossi Matias
119
12
0
03 Oct 2024
HAINAN: Fast and Accurate Transducer for Hybrid-Autoregressive ASR
Hainan Xu
Travis M. Bartley
Vladimir Bataev
Boris Ginsburg
426
0
0
03 Oct 2024
Previous
1
2
3
4
5
6
...
37
38
39
Next