2105.13626
ByT5: Towards a token-free future with pre-trained byte-to-byte models
28 May 2021
Linting Xue
Aditya Barua
Noah Constant
Rami Al-Rfou
Sharan Narang
Mihir Kale
Adam Roberts
Colin Raffel
Papers citing
"ByT5: Towards a token-free future with pre-trained byte-to-byte models"
50 / 111 papers shown
Title
An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei
Yun Cheng Wang
Bin Wang
C.-C. Jay Kuo
25
42
0
10 Mar 2023
Elementwise Language Representation
Du-Yeong Kim
Jeeeun Kim
33
0
0
27 Feb 2023
RETVec: Resilient and Efficient Text Vectorizer
Elie Bursztein
Marina Zhang
Owen Vallis
Xinyu Jia
Alexey Kurakin
VLM
32
4
0
18 Feb 2023
Distillation of encoder-decoder transformers for sequence labelling
M. Farina
D. Pappadopulo
Anant Gupta
Leslie Huang
Ozan Irsoy
Thamar Solorio
VLM
103
3
0
10 Feb 2023
Truveta Mapper: A Zero-shot Ontology Alignment Framework
Mariyam Amir
Murchana Baruah
Mahsa Eslamialishah
Sina Ehsani
Alireza Bahramali
Sadra Naddaf-sh
Saman Zarandioon
30
7
0
24 Jan 2023
Character-Aware Models Improve Visual Text Rendering
Rosanne Liu
Daniel H Garrette
Chitwan Saharia
William Chan
Adam Roberts
Sharan Narang
Irina Blok
R. Mical
Mohammad Norouzi
Noah Constant
VLM
31
71
0
20 Dec 2022
ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models
Jonas Belouadi
Steffen Eger
54
24
0
20 Dec 2022
Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training
Jing-ling Huang
Zhengxuan Wu
Kyle Mahowald
Christopher Potts
24
13
0
19 Dec 2022
DAMP: Doubly Aligned Multilingual Parser for Task-Oriented Dialogue
William B. Held
Christopher Hidey
Fei Liu
Eric Zhu
Rahul Goel
Diyi Yang
Rushin Shah
34
0
0
15 Dec 2022
Advancing Multilingual Pre-training: TRIP Triangular Document-level Pre-training for Multilingual Language Models
Hongyuan Lu
Haoyang Huang
Shuming Ma
Dongdong Zhang
W. Lam
Furu Wei
27
4
0
15 Dec 2022
Subword-Delimited Downsampling for Better Character-Level Translation
Lukas Edman
Antonio Toral
Gertjan van Noord
22
6
0
02 Dec 2022
Word-Level Representation From Bytes For Language Modeling
Chul Lee
Qipeng Guo
Xipeng Qiu
15
1
0
23 Nov 2022
Efficient Transformers with Dynamic Token Pooling
Piotr Nawrot
J. Chorowski
Adrian Łańcucki
E. Ponti
20
42
0
17 Nov 2022
A Benchmark and Dataset for Post-OCR text correction in Sanskrit
Ayush Maheshwari
Nikhil Singh
Amrith Krishna
Ganesh Ramakrishnan
28
12
0
15 Nov 2022
Local Structure Matters Most in Most Languages
Louis Clouâtre
Prasanna Parthasarathi
Amal Zouaq
Sarath Chandar
36
1
0
09 Nov 2022
Bridging Speech and Textual Pre-trained Models with Unsupervised ASR
Jiatong Shi
Chan-Jan Hsu
Ho-Lam Chung
Dongji Gao
Leibny Paola García-Perera
Shinji Watanabe
Ann Lee
Hung-yi Lee
32
12
0
06 Nov 2022
T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
Chan-Jan Hsu
Ho-Lam Chung
Hung-yi Lee
Yu Tsao
21
6
0
01 Nov 2022
Graphemic Normalization of the Perso-Arabic Script
R. Doctor
Alexander Gutkin
Cibu Johny
Brian Roark
R. Sproat
41
4
0
21 Oct 2022
SLING: Sino Linguistic Evaluation of Large Language Models
Yixiao Song
Kalpesh Krishna
R. Bhatt
Mohit Iyyer
24
8
0
21 Oct 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
67
2,989
0
20 Oct 2022
Incorporating Context into Subword Vocabularies
Shaked Yehezkel
Yuval Pinter
47
8
0
13 Oct 2022
One does not fit all! On the Complementarity of Vision Encoders for Vision and Language Tasks
Gregor Geigle
Chen Cecilia Liu
Jonas Pfeiffer
Iryna Gurevych
VLM
28
1
0
12 Oct 2022
Non-Axiomatic Term Logic: A Computational Theory of Cognitive Symbolic Reasoning
Kotaro Funakoshi
NAI
21
1
0
12 Oct 2022
Look Ma, Only 400 Samples! Revisiting the Effectiveness of Automatic N-Gram Rule Generation for Spelling Normalization in Filipino
Lorenzo Jaime Yu Flores
Dragomir Radev
20
0
0
06 Oct 2022
MonoByte: A Pool of Monolingual Byte-level Language Models
Hugo Queiroz Abonizio
Leandro Rodrigues de Souza
R. Lotufo
Rodrigo Nogueira
40
1
0
22 Sep 2022
Layer or Representation Space: What makes BERT-based Evaluation Metrics Robust?
Doan Nam Long Vu
N. Moosavi
Steffen Eger
26
9
0
06 Sep 2022
CLOWER: A Pre-trained Language Model with Contrastive Learning over Word and Character Representations
Borun Chen
Hongyin Tang
Jiahao Bu
Kai Zhang
Jingang Wang
Qifan Wang
Haitao Zheng
Wei Wu
Liqian Yu
VLM
27
1
0
23 Aug 2022
Language Modelling with Pixels
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
VLM
38
46
0
14 Jul 2022
Lifting the Curse of Multilinguality by Pre-training Modular Transformers
Jonas Pfeiffer
Naman Goyal
Xi Lin
Xian Li
James Cross
Sebastian Riedel
Mikel Artetxe
LRM
40
139
0
12 May 2022
UL2: Unifying Language Learning Paradigms
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
Xavier Garcia
Jason W. Wei
...
Tal Schuster
H. Zheng
Denny Zhou
N. Houlsby
Donald Metzler
AI4CE
57
296
0
10 May 2022
How Robust is Neural Machine Translation to Language Imbalance in Multilingual Tokenizer Training?
Shiyue Zhang
Vishrav Chaudhary
Naman Goyal
James Cross
Guillaume Wenzek
Joey Tianyi Zhou
Francisco Guzman
36
16
0
29 Apr 2022
Impact of Tokenization on Language Models: An Analysis for Turkish
Cagri Toraman
E. Yilmaz
Furkan Şahinuç
Oguzhan Ozcelik
38
74
0
19 Apr 2022
A Hierarchical N-Gram Framework for Zero-Shot Link Prediction
Mingchen Li
J. Chen
Samuel Mensah
Nikolaos Aletras
Xiulong Yang
Yang Ye
15
13
0
16 Apr 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
96
801
0
14 Apr 2022
ByT5 model for massively multilingual grapheme-to-phoneme conversion
Jian Zhu
Cong Zhang
David Jurgens
16
36
0
06 Apr 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
Alham Fikri Aji
Genta Indra Winata
Fajri Koto
Samuel Cahyawijaya
Ade Romadhony
...
David Moeljadi
Radityo Eko Prasojo
Timothy Baldwin
Jey Han Lau
Sebastian Ruder
40
99
0
24 Mar 2022
IT5: Text-to-text Pretraining for Italian Language Understanding and Generation
Gabriele Sarti
Malvina Nissim
AILaw
15
42
0
07 Mar 2022
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers
Alyssa Lees
Vinh Q. Tran
Yi Tay
Jeffrey Scott Sorensen
Jai Gupta
Donald Metzler
Lucy Vasserman
25
173
0
22 Feb 2022
Correcting diacritics and typos with a ByT5 transformer model
Lukas Stankevicius
M. Lukoševičius
J. Kapočiūtė-Dzikienė
Monika Briediene
Tomas Krilavičius
11
20
0
31 Jan 2022
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
Sabrina J. Mielke
Zaid Alyafeai
Elizabeth Salesky
Colin Raffel
Manan Dey
...
Arun Raja
Chenglei Si
Wilson Y. Lee
Benoît Sagot
Samson Tan
32
141
0
20 Dec 2021
Large Dual Encoders Are Generalizable Retrievers
Jianmo Ni
Chen Qu
Jing Lu
Zhuyun Dai
Gustavo Hernández Ábrego
...
Vincent Zhao
Yi Luan
Keith B. Hall
Ming-Wei Chang
Yinfei Yang
DML
33
431
0
15 Dec 2021
ÚFAL at MultiLexNorm 2021: Improving Multilingual Lexical Normalization by Fine-tuning ByT5
David Samuel
Milan Straka
10
15
0
28 Oct 2021
Deciphering the Language of Nature: A transformer-based language model for deleterious mutations in proteins
Theodore Jiang
Li Fang
Kai Wang
MedIm
33
17
0
27 Oct 2021
The Efficiency Misnomer
Mostafa Dehghani
Anurag Arnab
Lucas Beyer
Ashish Vaswani
Yi Tay
34
99
0
25 Oct 2021
Why don't people use character-level machine translation?
Jindrich Libovický
Helmut Schmid
Alexander Fraser
65
28
0
15 Oct 2021
Few-shot Controllable Style Transfer for Low-Resource Multilingual Settings
Kalpesh Krishna
Deepak Nathani
Xavier Garcia
Bidisha Samanta
Partha P. Talukdar
40
24
0
14 Oct 2021
A Proposed Conceptual Framework for a Representational Approach to Information Retrieval
Jimmy J. Lin
21
51
0
04 Oct 2021
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
Nguyen Luong Tran
Duong Minh Le
Dat Quoc Nguyen
19
52
0
20 Sep 2021
Single-Read Reconstruction for DNA Data Storage Using Transformers
Yotam Nahum
Eyar Ben-Tolila
Leon Anavy
66
5
0
12 Sep 2021
mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset
L. Bonifacio
Vitor Jeronymo
Hugo Queiroz Abonizio
Israel Campiotti
Marzieh Fadaee
R. Lotufo
Rodrigo Nogueira
40
108
0
31 Aug 2021