2505.21315
Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead
27 May 2025
Jesujoba Oluwadara Alabi
Michael A. Hedderich
David Ifeoluwa Adelani
Dietrich Klakow
Papers citing
"Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead"
50 / 122 papers shown
Title
JWSign: A Highly Multilingual Corpus of Bible Translations for more Diversity in Sign Language Processing
Shester Gueuwou
Sophie Siake
Colin Leong
Mathias Müller
SLR
82
13
0
16 Nov 2023
AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages
Jiayi Wang
David Ifeoluwa Adelani
Sweta Agrawal
Marek Masiak
Ricardo Rei
...
V. Otiende
C. Mbonu
Sakayo Toadoum Sari
Yao Lu
Pontus Stenetorp
48
10
0
16 Nov 2023
Fumbling in Babel: An Investigation into ChatGPT's Language Identification Ability
Wei-Rui Chen
Ife Adebara
Khai Duy Doan
Qisheng Liao
Muhammad Abdul-Mageed
54
7
0
16 Nov 2023
OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining
Yihong Liu
Peiqin Lin
Mingyang Wang
Hinrich Schütze
57
27
0
15 Nov 2023
MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks
Sanchit Ahuja
Divyanshu Aggarwal
Varun Gumma
Ishaan Watts
Ashutosh Sathe
...
Rishav Hada
Prachi Jain
Maxamed Axmed
Kalika Bali
Sunayana Sitaram
ELM
87
44
0
13 Nov 2023
ZGUL: Zero-shot Generalization to Unseen Languages using Multi-source Ensembling of Language Adapters
Vipul Rathore
Rajdeep Dhingra
Parag Singla
Mausam
52
8
0
25 Oct 2023
Data Augmentation Techniques for Machine Translation of Code-Switched Texts: A Comparative Study
Injy Hamed
Nizar Habash
Ngoc Thang Vu
60
3
0
23 Oct 2023
Sentiment Analysis Across Multiple African Languages: A Current Benchmark
Saurav K. Aryal
H. Prioleau
Surakshya Aryal
63
5
0
21 Oct 2023
PuoBERTa: Training and evaluation of a curated language model for Setswana
Vukosi Marivate
Moseli Mots'Oehli
Valencia Wagner
Richard Lastrucci
Isheanesu Dzingirai
44
10
0
13 Oct 2023
AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR
Tobi Olatunji
Tejumade Afonja
Aditya Yadavalli
Chris C. Emezue
Sahib Singh
...
Joanne I. Osuchukwu
Salomey Osei
A. Tonja
Naome A. Etori
Clinton Mbataku
59
19
0
30 Sep 2023
GlotScript: A Resource and Tool for Low Resource Writing System Identification
Amir Hossein Kargaran
François Yvon
Hinrich Schütze
25
11
0
23 Sep 2023
SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects
David Ifeoluwa Adelani
Hannah Liu
Xiaoyu Shen
Nikita Vassilyev
Jesujoba Oluwadara Alabi
Yanke Mao
Haonan Gao
Annie En-Shiun Lee
ELM
73
76
0
14 Sep 2023
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Sneha Kudugunta
Isaac Caswell
Biao Zhang
Xavier Garcia
Christopher A. Choquette-Choo
...
Derrick Xin
Aditya Kusupati
Romi Stella
Ankur Bapna
Orhan Firat
91
133
0
09 Sep 2023
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Lucas Bandarkar
Davis Liang
Benjamin Muller
Mikel Artetxe
Satya Narayan Shukla
Don Husa
Naman Goyal
Abhinandan Krishnan
Luke Zettlemoyer
Madian Khabsa
61
152
0
31 Aug 2023
NaijaRC: A Multi-choice Reading Comprehension Dataset for Nigerian Languages
Anuoluwapo Aremu
Jesujoba Oluwadara Alabi
Daud Abolade
Nkechinyere F. Aguobi
Shamsuddeen Hassan Muhammad
David Ifeoluwa Adelani
72
4
0
18 Aug 2023
ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus
Tolulope Ogunremi
Kólá Túbosún
Aremu Anuoluwapo
Iroro Orife
David Ifeoluwa Adelani
76
7
0
29 Jul 2023
Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages
Claytone Sikasote
Kalinda Siaminwe
Stanly Mwape
Bangiwe Zulu
Mofya Phiri
Martin Phiri
David Zulu
Mayumbo Nyirenda
Antonios Anastasopoulos
49
8
0
07 Jun 2023
Evaluating Emotion Arcs Across Languages: Bridging the Global Divide in Sentiment Analysis
Daniela Teodorescu
Saif M. Mohammad
42
11
0
03 Jun 2023
AfriNames: Most ASR models "butcher" African Names
Tobi Olatunji
Tejumade Afonja
Bonaventure F. P. Dossou
A. Tonja
Chris C. Emezue
Amina Mardiyyah Rufai
Sahib Singh
45
6
0
01 Jun 2023
BIG-C: a Multimodal Multi-Purpose Dataset for Bemba
Claytone Sikasote
Eunice Mukonde
Md Mahfuz Ibn Alam
Antonios Anastasopoulos
45
8
0
26 May 2023
Free Lunch: Robust Cross-Lingual Transfer via Model Checkpoint Averaging
Fabian David Schmidt
Ivan Vulić
Goran Glavaš
55
9
0
26 May 2023
BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer
Akari Asai
Sneha Kudugunta
Xinyan Velocity Yu
Terra Blevins
Hila Gonen
Machel Reid
Yulia Tsvetkov
Sebastian Ruder
Hannaneh Hajishirzi
88
61
0
24 May 2023
mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations
Jonas Pfeiffer
Francesco Piccinno
Massimo Nicosia
Xinyi Wang
Machel Reid
Sebastian Ruder
VLM
LRM
63
29
0
23 May 2023
An Open Dataset and Model for Language Identification
Laurie Burchell
Alexandra Birch
Nikolay Bogoychev
Kenneth Heafield
36
34
0
23 May 2023
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages
Sebastian Ruder
J. Clark
Alexander Gutkin
Mihir Kale
Min Ma
...
Dan Garrette
R. Ingle
Melvin Johnson
Dmitry Panteleev
Partha P. Talukdar
ELM
66
40
0
19 May 2023
MD3: The Multi-Dialect Dataset of Dialogues
Jacob Eisenstein
Vinodkumar Prabhakaran
Clara E. Rivera
Dorottya Demszky
D. Sharma
50
10
0
19 May 2023
mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences
David C. Uthus
Santiago Ontañón
Joshua Ainslie
Mandy Guo
VLM
32
11
0
18 May 2023
MphayaNER: Named Entity Recognition for Tshivenda
R. Mbuvha
David Ifeoluwa Adelani
Tendani Mutavhatsindi
Tshimangadzo Rakhuhu
A. Mauda
Tshifhiwa Joshua Maumela
Andisani Masindi
Seani Rananga
Vukosi Marivate
T. Marwala
106
1
0
08 Apr 2023
Natural Language Processing in Ethiopian Languages: Current State, Challenges, and Opportunities
A. Tonja
Tadesse Destaw Belay
Israel Abebe Azime
Abinew Ali Ayele
Moges Ahmed Mehamed
Olga Kolesnikova
Seid Muhie Yimam
32
9
0
25 Mar 2023
MEGA: Multilingual Evaluation of Generative AI
Kabir Ahuja
Harshita Diddee
Rishav Hada
Millicent Ochieng
Krithika Ramesh
...
T. Ganu
Sameer Segal
Maxamed Axmed
Kalika Bali
Sunayana Sitaram
LM&MA
LRM
ELM
76
283
0
22 Mar 2023
Optical Character Recognition and Transcription of Berber Signs from Images in a Low-Resource Language Amazigh
Levi Corallo
A. Varde
24
2
0
21 Mar 2023
Improving Accented Speech Recognition with Multi-Domain Training
Lucas Maison
Yannick Esteve
57
9
0
14 Mar 2023
SERENGETI: Massively Multilingual Language Models for Africa
Ife Adebara
AbdelRahim Elmadany
Muhammad Abdul-Mageed
Alcides Alcoba Inciarte
42
33
0
21 Dec 2022
TaTa: A Multilingual Table-to-Text Dataset for African Languages
Sebastian Gehrmann
Sebastian Ruder
Vitaly Nikolaev
Jan A. Botha
Michael Chavinda
Ankur P. Parikh
Clara E. Rivera
LMTD
71
11
0
31 Oct 2022
COMET-QE and Active Learning for Low-Resource Machine Translation
E. Chimoto
Bruce A. Bassett
46
10
0
27 Oct 2022
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks
Colin Leong
Joshua Nemecek
Jacob Mansdorfer
Anna Filighera
A. Owodunni
Daniel Whitenack
VLM
AI4CE
121
27
0
26 Oct 2022
AfroLID: A Neural Language Identification Tool for African Languages
Ife Adebara
AbdelRahim Elmadany
Muhammad Abdul-Mageed
Alcides Alcoba Inciarte
56
32
0
21 Oct 2022
BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus
Josh Meyer
David Ifeoluwa Adelani
Edresson Casanova
A. Oktem
Daniel Whitenack
Julian Weber
...
Victor Akinode
Bernard Opoku
S. Olanrewaju
Jesujoba Oluwadara Alabi
Shamsuddeen Hassan Muhammad
34
23
0
07 Jul 2022
Task-Adaptive Pre-Training for Boosting Learning With Noisy Labels: A Study on Text Classification for African Languages
D. Zhu
Michael A. Hedderich
Fangzhou Zhai
David Ifeoluwa Adelani
Dietrich Klakow
NoLa
44
1
0
03 Jun 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
124
314
0
25 May 2022
Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation
Idris Abdulmumin
S. Dash
Musa Abdullahi Dawud
Shantipriya Parida
Shamsuddeen Hassan Muhammad
Ibrahim Said Ahmad
Subhadarshi Panda
Ondrej Bojar
B. Galadanci
Bello Shehu Bello
31
18
0
02 May 2022
NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis
Shamsuddeen Hassan Muhammad
David Ifeoluwa Adelani
Sebastian Ruder
Ibrahim Said Ahmad
Idris Abdulmumin
...
Chris C. Emezue
Saheed Abdul
Anuoluwapo Aremu
Alipio Jeorge
P. Brazdil
64
100
0
20 Jan 2022
Lesan -- Machine Translation for Low Resource Languages
Asmelash Teka Hadgu
Abel Aregawi
Adam Beaudoin
27
10
0
15 Dec 2021
Learning Nigerian accent embeddings from speech: preliminary results based on SautiDB-Naija corpus
Tejumade Afonja
Oladimeji Mudele
Iroro Orife
Kenechi Dukor
Lawrence Francis
Duru Goodness
Oluwafemi Azeez
Ademola Malomo
Clinton Mbataku
34
4
0
12 Dec 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
86
699
0
17 Nov 2021
DziriBERT: a Pre-trained Language Model for the Algerian Dialect
Amine Abdaoui
Mohamed Berrimi
Mourad Oussalah
A. Moussaoui
47
45
0
25 Sep 2021
AfroMT: Pretraining Strategies and Reproducible Benchmarks for Translation of 8 African Languages
Machel Reid
Junjie Hu
Graham Neubig
Y. Matsuo
112
33
0
10 Sep 2021
Survey of Low-Resource Machine Translation
Barry Haddow
Rachel Bawden
Antonio Valerio Miceli Barone
Jindřich Helcl
Alexandra Birch
AIMat
63
160
0
01 Sep 2021
Facebook AI WMT21 News Translation Task Submission
C. Tran
Shruti Bhosale
James Cross
Philipp Koehn
Sergey Edunov
Angela Fan
VLM
167
82
0
06 Aug 2021
Manually Annotated Spelling Error Corpus for Amharic
A. Gezmu
Tirufat Tesifaye Lema
B. Seyoum
A. Nürnberger
26
5
0
25 Jun 2021