2103.11811
Cited By
MasakhaNER: Named Entity Recognition for African Languages
22 March 2021
David Ifeoluwa Adelani
Jade Z. Abbott
Graham Neubig
Daniel D'souza
Julia Kreutzer
Constantine Lignos
Chester Palen-Michel
Happy Buzaaba
Shruti Rijhwani
Sebastian Ruder
Stephen D. Mayhew
Israel Abebe Azime
Shamsuddeen Hassan Muhammad
Chris C. Emezue
J. Nakatumba‐Nabende
Perez Ogayo
Anuoluwapo Aremu
Catherine Gitau
Derguene Mbaye
Jesujoba Oluwadara Alabi
Seid Muhie Yimam
T. Gwadabe
Ignatius M Ezeani
Andre Niyongabo Rubungo
Jonathan Mukiibi
V. Otiende
Iroro Orife
Davis David
Samba Ngom
Tosin P. Adewumi
Paul Rayson
Mofetoluwa Adeyemi
Gerald Muriuki
E. Anebi
Chiamaka Chukwuneke
N. Odu
Eric Peter Wairagala
S. Oyerinde
Clemencia Siro
Tobius Saul Bateesa
Temilola Oloyede
Yvonne Wambui
Victor Akinode
Deborah Nabagereka
Maurice Katusiime
Ayodele Awokoya
Mouhamadane Mboup
Dibora Gebreyohannes
Henok Tilaye
Kelechi Nwaike
Degaga Wolde
A. Faye
Blessing K. Sibanda
Orevaoghene Ahia
Bonaventure F. P. Dossou
Kelechi Ogueji
T. Diop
A. Diallo
Adewale Akinfaderin
T. Marengereke
Salomey Osei
Papers citing
"MasakhaNER: Named Entity Recognition for African Languages"
48 / 48 papers shown
Title
LAG-MMLU: Benchmarking Frontier LLM Understanding in Latvian and Giriama
Naome A. Etori
Kevin Lu
Randu Karisa
Arturs Kanepajs
LRM
ELM
202
0
0
14 Mar 2025
NaijaNLP: A Survey of Nigerian Low-Resource Languages
Isa Inuwa-Dutse
44
0
0
27 Feb 2025
Beyond Release: Access Considerations for Generative AI Systems
Irene Solaiman
Rishi Bommasani
Dan Hendrycks
Ariel Herbert-Voss
Yacine Jernite
Aviya Skowron
Andrew Trask
74
1
0
23 Feb 2025
RideKE: Leveraging Low-Resource, User-Generated Twitter Content for Sentiment and Emotion Detection in Kenyan Code-Switched Dataset
Naome A. Etori
Maria Gini
81
2
0
10 Feb 2025
A Multi-way Parallel Named Entity Annotated Corpus for English, Tamil and Sinhala
Surangika Ranathunga
Asanka Ranasinghe
Janaka Shamal
Ayodya Dandeniya
Rashmi Galappaththi
Malithi Samaraweera
76
0
0
03 Dec 2024
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models
Shaoxiong Ji
Zihao Li
Indraneil Paul
Jaakko Paavola
Peiqin Lin
...
Dayyán O'Brien
Hengyu Luo
Hinrich Schütze
Jörg Tiedemann
Barry Haddow
CLL
43
3
0
26 Sep 2024
A Perspective on Literary Metaphor in the Context of Generative AI
Imke van Heerden
Anil Bas
25
1
0
02 Sep 2024
SSP: Self-Supervised Prompting for Cross-Lingual Transfer to Low-Resource Languages using Large Language Models
Vipul Rathore
Aniruddha Deb
Ankish Chandresh
Parag Singla
Mausam
LRM
52
0
0
27 Jun 2024
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
Holy Lovenia
Rahmad Mahendra
Salsabil Maulana Akbar
Lester James V. Miranda
Jennifer Santoso
...
Genta Indra Winata
Ruochen Zhang
Fajri Koto
Zheng-Xin Yong
Samuel Cahyawijaya
95
9
0
14 Jun 2024
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models
David Ifeoluwa Adelani
Jessica Ojo
Israel Abebe Azime
Jian Yun Zhuang
Jesujoba Oluwadara Alabi
...
Salomey Osei
Sokhar Samb
Tadesse Kebede Guge
Pontus Stenetorp
ELM
65
7
0
05 Jun 2024
ParaNames 1.0: Creating an Entity Name Corpus for 400+ Languages using Wikidata
Jonne Saleva
Constantine Lignos
CVBM
34
2
0
15 May 2024
Unknown Script: Impact of Script on Cross-Lingual Transfer
Wondimagegnhue Tufa
Ilia Markov
Piek Vossen
45
0
0
29 Apr 2024
ANGOFA: Leveraging OFA Embedding Initialization and Synthetic Data for Angolan Language Model
Osvaldo Luamba Quinjica
David Ifeoluwa Adelani
46
0
0
03 Apr 2024
When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages
Tyler A. Chang
Catherine Arnett
Zhuowen Tu
Benjamin Bergen
LRM
43
7
0
15 Nov 2023
Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems
Songbo Hu
Han Zhou
Mete Hergul
Milan Gritta
Guchun Zhang
Ignacio Iacobacci
Ivan Vulić
Anna Korhonen
36
10
0
26 Jul 2023
Improving Language Plasticity via Pretraining with Active Forgetting
Yihong Chen
Kelly Marchisio
Roberta Raileanu
David Ifeoluwa Adelani
Pontus Stenetorp
Sebastian Riedel
Mikel Artetxe
KELM
AI4CE
CLL
30
24
0
03 Jul 2023
BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer
Akari Asai
Sneha Kudugunta
Xinyan Velocity Yu
Terra Blevins
Hila Gonen
Machel Reid
Yulia Tsvetkov
Sebastian Ruder
Hannaneh Hajishirzi
44
54
0
24 May 2023
LLM-powered Data Augmentation for Enhanced Cross-lingual Performance
Chenxi Whitehouse
Monojit Choudhury
Alham Fikri Aji
SyDa
LRM
32
68
0
23 May 2023
MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African Languages
Cheikh M. Bamba Dione
David Ifeoluwa Adelani
Peter Nabende
Jesujoba Oluwadara Alabi
Thapelo Sindane
...
Seydou T. Traoré
C. Uchechukwu
Aliyu Yusuf
M. Abdullahi
Dietrich Klakow
27
13
0
23 May 2023
PrOnto: Language Model Evaluations for 859 Languages
Luke Gessler
21
1
0
22 May 2023
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages
Sebastian Ruder
J. Clark
Alexander Gutkin
Mihir Kale
Min Ma
...
Dan Garrette
R. Ingle
Melvin Johnson
Dmitry Panteleev
Partha P. Talukdar
ELM
26
38
0
19 May 2023
How Good are Commercial Large Language Models on African Languages?
Jessica Ojo
Kelechi Ogueji
26
5
0
11 May 2023
Low-Resourced Machine Translation for Senegalese Wolof Language
Derguene Mbaye
Moussa Diallo
T. Diop
27
4
0
01 May 2023
MasakhaNEWS: News Topic Classification for African languages
David Ifeoluwa Adelani
Marek Masiak
Israel Abebe Azime
Jesujoba Oluwadara Alabi
A. Tonja
...
Moges Ahmed Mehamed
Evrard Ngabire
Jules Jules
Ivan Ssenkungu
Pontus Stenetorp
28
24
0
19 Apr 2023
SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval)
Shamsuddeen Hassan Muhammad
Idris Abdulmumin
Seid Muhie Yimam
David Ifeoluwa Adelani
I. Ahmad
N. Ousidhoum
A. Ayele
Saif M. Mohammad
Meriem Beloucif
Sebastian Ruder
38
67
0
13 Apr 2023
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages
Chris C. Emezue
Sanchit Gandhi
Lewis Tunstall
Abubakar Abid
Josh Meyer
...
Douwe Kiela
Yacine Jernite
Julien Chaumond
Merve Noyan
Omar Sanseviero
33
2
0
22 Mar 2023
MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue
Nikita Moghe
E. Razumovskaia
Liane Guillou
Ivan Vulić
Anna Korhonen
Alexandra Birch
40
13
0
20 Dec 2022
NusaCrowd: Open Source Initiative for Indonesian NLP Resources
Samuel Cahyawijaya
Holy Lovenia
Alham Fikri Aji
Genta Indra Winata
Bryan Wilie
...
Timothy Baldwin
Sebastian Ruder
Herry Sujaini
S. Sakti
Ayu Purwarianti
39
48
0
19 Dec 2022
Beyond Counting Datasets: A Survey of Multilingual Dataset Construction and Necessary Resources
Xinyan Velocity Yu
Akari Asai
Trina Chatterjee
Junjie Hu
Eunsol Choi
29
21
0
28 Nov 2022
High-Resource Methodological Bias in Low-Resource Investigations
Maartje ter Hoeve
David Grangier
Natalie Schluter
33
2
0
14 Nov 2022
Intriguing Properties of Compression on Multilingual Models
Kelechi Ogueji
Orevaoghene Ahia
Gbemileke Onilude
Sebastian Gehrmann
Sara Hooker
Julia Kreutzer
21
12
0
04 Nov 2022
TaTa: A Multilingual Table-to-Text Dataset for African Languages
Sebastian Gehrmann
Sebastian Ruder
Vitaly Nikolaev
Jan A. Botha
Michael Chavinda
Ankur P. Parikh
Clara E. Rivera
LMTD
27
10
0
31 Oct 2022
Separating Grains from the Chaff: Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages
Idris Abdulmumin
Michael Beukman
Jesujoba Oluwadara Alabi
Chris C. Emezue
Everlyn Asiko
...
Shamsuddeen Hassan Muhammad
Mofetoluwa Adeyemi
Oreen Yousuf
Sahib Singh
T. Gwadabe
34
7
0
19 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
127
94
0
06 Oct 2022
Parameter-Efficient Finetuning for Robust Continual Multilingual Learning
Kartikeya Badola
Shachi Dave
Partha P. Talukdar
CLL
KELM
39
7
0
14 Sep 2022
ANEC: An Amharic Named Entity Corpus and Transformer Based Recognizer
Ebrahim Chekol Jibril
A. C. Tantuğ
35
7
0
02 Jul 2022
Ancestor-to-Creole Transfer is Not a Walk in the Park
Heather Lent
Emanuele Bugliarello
Anders Søgaard
11
8
0
09 Jun 2022
Resolving the Human Subjects Status of Machine Learning's Crowdworkers
Divyansh Kaushik
Zachary Chase Lipton
A. London
25
2
0
08 Jun 2022
KenSwQuAD -- A Question Answering Dataset for Swahili Low Resource Language
B. Wanjawa
Lilian D. A. Wanzare
F. Indede
Owen McOnyango
Lawrence Muchemi
Edward Ombui
39
19
0
04 May 2022
YOSM: A New Yoruba Sentiment Corpus for Movie Reviews
Iyanuoluwa Shode
David Ifeoluwa Adelani
Anna Feldman
36
16
0
20 Apr 2022
MMTAfrica: Multilingual Machine Translation for African Languages
Chris C. Emezue
Bonaventure F. P. Dossou
27
24
0
08 Apr 2022
Challenges and Strategies in Cross-Cultural NLP
Daniel Hershcovich
Stella Frank
Heather Lent
Miryam de Lhoneux
Mostafa Abdou
...
Ruixiang Cui
Constanza Fierro
Katerina Margatina
Phillip Rust
Anders Søgaard
43
164
0
18 Mar 2022
Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation
Xinyi Wang
Sebastian Ruder
Graham Neubig
39
60
0
17 Mar 2022
NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis
Shamsuddeen Hassan Muhammad
David Ifeoluwa Adelani
Sebastian Ruder
I. Ahmad
Idris Abdulmumin
...
Chris C. Emezue
Saheed Abdul
Anuoluwapo Aremu
Alipio Jeorge
P. Brazdil
45
96
0
20 Jan 2022
Dataset Geography: Mapping Language Data to Language Users
Fahim Faisal
Yinkai Wang
Antonios Anastasopoulos
69
23
0
07 Dec 2021
On Language Models for Creoles
Heather Lent
Emanuele Bugliarello
Miryam de Lhoneux
Chen Qiu
Anders Søgaard
39
20
0
13 Sep 2021
SeqScore: Addressing Barriers to Reproducible Named Entity Recognition Evaluation
Chester Palen-Michel
Nolan Holley
Constantine Lignos
27
12
0
29 Jul 2021
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
J. Clark
Dan Garrette
Iulia Turc
John Wieting
36
210
0
11 Mar 2021