Hierarchical Character-Word Models for Language Identification

Abstract

Social media messages' brevity and unconventional spelling pose a challenge to language identification. We introduce a hierarchical model that learns character and contextualized word-level representations for language identification. Our method performs well against strong baselines and can also reveal code-switching.
