1512.00103
Cited By
Multilingual Language Processing From Bytes
1 December 2015
D. Gillick
Clifford Brunk
Oriol Vinyals
A. Subramanya
Papers citing
"Multilingual Language Processing From Bytes"
50 / 95 papers shown
MorphBPE: A Morpho-Aware Tokenizer Bridging Linguistic Complexity for Efficient LLM Training Across Morphologies
Ehsaneddin Asgari
Yassine El Kheir
Mohammad Ali Sadraei Javaheri
68
0
0
02 Feb 2025
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini
Shikhar Murty
Christopher D. Manning
Christopher Potts
Róbert Csordás
40
2
0
28 Oct 2024
Label Alignment and Reassignment with Generalist Large Language Model for Enhanced Cross-Domain Named Entity Recognition
Ke Bao
Chonghuan Yang
34
0
0
24 Jul 2024
Optimizing Byte-level Representation for End-to-end ASR
Roger Hsiao
Liuhui Deng
Erik McDermott
R. Travadi
Xiaodan Zhuang
21
0
0
14 Jun 2024
Multilingual Pixel Representations for Translation and Effective Cross-lingual Transfer
Elizabeth Salesky
Neha Verma
Philipp Koehn
Matt Post
26
14
0
23 May 2023
Language-universal phonetic encoder for low-resource speech recognition
Siyuan Feng
Ming Tu
Rui Xia
Chuanzeng Huang
Yuxuan Wang
39
2
0
19 May 2023
What is the best recipe for character-level encoder-only modelling?
Kris Cao
36
2
0
09 May 2023
TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding Tag/Word Relations and More Fine-Grained Tags
Jiang-Dong Liu
Donghong Ji
Jingye Li
Dongdong Xie
Chong Teng
Liang Zhao
Fei Li
27
15
0
01 Nov 2022
A multi-level interpretable sleep stage scoring system by infusing experts' knowledge into a deep network architecture
H. Niknazar
S. Mednick
16
4
0
11 Jul 2022
Bilingual End-to-End ASR with Byte-Level Subwords
Liuhui Deng
Roger Hsiao
Arnab Ghoshal
18
4
0
01 May 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
Alham Fikri Aji
Genta Indra Winata
Fajri Koto
Samuel Cahyawijaya
Ade Romadhony
...
David Moeljadi
Radityo Eko Prasojo
Timothy Baldwin
Jey Han Lau
Sebastian Ruder
40
100
0
24 Mar 2022
Unified Named Entity Recognition as Word-Word Relation Classification
Jingye Li
Hao Fei
Jiang-Dong Liu
Shengqiong Wu
Meishan Zhang
Chong Teng
Donghong Ji
Fei Li
42
243
0
19 Dec 2021
ML Based Lineage in Databases
Michael Leybovich
O. Shmueli
AI4TS
22
2
0
13 Sep 2021
Active Learning for Massively Parallel Translation of Constrained Text into Low Resource Languages
Zhong Zhou
A. Waibel
12
5
0
16 Aug 2021
byteSteady: Fast Classification Using Byte-Level n-Gram Embeddings
Xiang Zhang
Alexandre Drouin
Raymond Li
14
1
0
24 Jun 2021
Dutch Named Entity Recognition and De-identification Methods for the Human Resource Domain
C. V. Toledo
F. V. Dijk
Marco Spruit
11
3
0
04 Jun 2021
A Unified Generative Framework for Various NER Subtasks
Hang Yan
Tao Gui
Junqi Dai
Qipeng Guo
Zheng-Wei Zhang
Xipeng Qiu
34
288
0
02 Jun 2021
ByT5: Towards a token-free future with pre-trained byte-to-byte models
Linting Xue
Aditya Barua
Noah Constant
Rami Al-Rfou
Sharan Narang
Mihir Kale
Adam Roberts
Colin Raffel
38
464
0
28 May 2021
Towards A Multi-agent System for Online Hate Speech Detection
Gaurav Sahu
R. Cohen
Olga Vechtomova
16
9
0
03 May 2021
Family of Origin and Family of Choice: Massively Parallel Lexiconized Iterative Pretraining for Severely Low Resource Machine Translation
Zhong Zhou
Alexander Waibel
19
4
0
12 Apr 2021
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
J. Clark
Dan Garrette
Iulia Turc
John Wieting
36
210
0
11 Mar 2021
Recent Trends in Named Entity Recognition (NER)
Aryan Roy
28
37
0
25 Jan 2021
Training Multilingual Pre-trained Language Model with Byte-level Subwords
Junqiu Wei
Qun Liu
Yinpeng Guo
Xin Jiang
25
19
0
23 Jan 2021
Global Attention for Name Tagging
Boliang Zhang
Spencer Whitehead
Lifu Huang
Heng Ji
50
17
0
19 Oct 2020
Knowledge Efficient Deep Learning for Natural Language Processing
Hai Wang
12
2
0
28 Aug 2020
Composer Style Classification of Piano Sheet Music Images Using Language Model Pretraining
T. Tsai
Kevin Ji
VLM
22
17
0
29 Jul 2020
Sources of Transfer in Multilingual Named Entity Recognition
David Mueller
Nicholas Andrews
Mark Dredze
20
20
0
02 May 2020
Bootstrapping NLU Models with Multi-task Learning
Shubham Kapoor
C. Tirkaz
9
3
0
15 Nov 2019
Using Interlinear Glosses as Pivot in Low-Resource Multilingual Machine Translation
Zhong Zhou
Lori S. Levin
David R. Mortensen
A. Waibel
26
10
0
07 Nov 2019
Hierarchical Contextualized Representation for Named Entity Recognition
Ying Luo
Fengshun Xiao
Zhao Hai
27
129
0
06 Nov 2019
A Survey on Recent Advances in Named Entity Recognition from Deep Learning models
Vikas Yadav
Steven Bethard
3DV
13
588
0
25 Oct 2019
Improving Pre-Trained Multilingual Models with Vocabulary Expansion
Hai Wang
Dian Yu
Kai Sun
Jianshu Chen
Dong Yu
30
40
0
26 Sep 2019
Neural Correction Model for Open-Domain Named Entity Recognition
Mengdi Zhu
Zheye Deng
Wenhan Xiong
Mo Yu
Ming Zhang
William Yang Wang
29
6
0
13 Sep 2019
Neural Machine Translation with Byte-Level Subwords
Changhan Wang
Kyunghyun Cho
Jiatao Gu
18
173
0
07 Sep 2019
A Morpho-Syntactically Informed LSTM-CRF Model for Named Entity Recognition
L. Simeonova
K. Simov
P. Osenova
Preslav Nakov
23
8
0
27 Aug 2019
Neural Architectures for Nested NER through Linearization
Jana Straková
Milan Straka
Jan Hajic
8
246
0
19 Aug 2019
Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
N. Arivazhagan
Ankur Bapna
Orhan Firat
Dmitry Lepikhin
Melvin Johnson
...
George F. Foster
Colin Cherry
Wolfgang Macherey
Z. Chen
Yonghui Wu
28
423
0
11 Jul 2019
Tabula nearly rasa: Probing the Linguistic Knowledge of Character-Level Neural Language Models Trained on Unsegmented Text
Michael Hahn
Marco Baroni
LMTD
22
15
0
17 Jun 2019
Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition
Shengfei Lyu
Linghao Sun
Huixiong Yi
Yong-jin Liu
Huanhuan Chen
Steven C. H. Hoi
21
0
0
04 Jun 2019
Sentiment Tagging with Partial Labels using Modular Architectures
Xiao Zhang
Dan Goldwasser
12
9
0
03 Jun 2019
Effective Context and Fragment Feature Usage for Named Entity Recognition
Nargiza Nosirova
Mingbin Xu
Hui Jiang
16
0
0
05 Apr 2019
Measuring scheduling efficiency of RNNs for NLP applications
Urmish Thakker
Ganesh S. Dasika
Jesse G. Beu
Matthew Mattina
19
13
0
05 Apr 2019
A Multi-task Learning Approach for Named Entity Recognition using Local Detection
Nargiza Nosirova
Mingbin Xu
Hui Jiang
21
2
0
05 Apr 2019
COMIC: Towards A Compact Image Captioning Model with Attention
J. Tan
Chee Seng Chan
Joon Huang Chuah
VLM
20
40
0
04 Mar 2019
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes
Bo-wen Li
Yu Zhang
Tara N. Sainath
Yonghui Wu
William Chan
AuLLM
11
129
0
22 Nov 2018
Neural CRF transducers for sequence labeling
Kai-Mo Hu
Zhijian Ou
Min Hu
Junlan Feng
14
5
0
04 Nov 2018
Chargrid: Towards Understanding 2D Documents
Anoop R. Katti
C. Reisswig
Cordula Guder
Sebastian Brarda
S. Bickel
Johannes Höhne
Jean Baptiste Faddoul
26
191
0
24 Sep 2018
A Byte-sized Approach to Named Entity Recognition
Emily Sheng
Premkumar Natarajan
17
0
0
22 Sep 2018
Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training
Peng Xu
Andrea Madotto
Chien-Sheng Wu
Ji Ho Park
Pascale Fung
24
68
0
12 Sep 2018
Paraphrases as Foreign Languages in Multilingual Neural Machine Translation
Zhong Zhou
Matthias Sperber
A. Waibel
LRM
22
19
0
25 Aug 2018