ResearchTrend.AI

arXiv: 2505.21315
Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead

27 May 2025
Jesujoba Oluwadara Alabi
Michael A. Hedderich
David Ifeoluwa Adelani
Dietrich Klakow

Papers citing "Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead"

50 / 122 papers shown
BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages
Junho Myung
Nayeon Lee
Yi Zhou
Jiho Jin
Rifki Afina Putri
...
Seid Muhie Yimam
Mohammad Taher Pilehvar
N. Ousidhoum
Jose Camacho-Collados
Alice Oh
135
50
0
17 Jan 2025
YAD: Leveraging T5 for Improved Automatic Diacritization of Yorùbá Text
Akindele Michael Olawole
Jesujoba Oluwadara Alabi
Aderonke Busayo Sakpere
David Ifeoluwa Adelani
53
1
0
31 Dec 2024
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages
Amir Hossein Kargaran
François Yvon
Hinrich Schütze
VLM
78
7
0
31 Oct 2024
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
Genta Indra Winata
Frederikus Hudi
Patrick Amadeus Irawan
David Anugraha
Rifki Afina Putri
...
Alham Fikri Aji
Taro Watanabe
Derry Wijaya
Alice Oh
Chong-Wah Ngo
CoGe
142
12
0
16 Oct 2024
State of NLP in Kenya: A Survey
Cynthia Jayne Amol
Everlyn Asiko Chimoto
Rose Delilah Gesicho
Antony M. Gitau
Naome A. Etori
...
Catherine Gitau
Antony Ndolo
Lilian D. A. Wanzare
Albert Njoroge Kahira
Ronald Tombe
77
2
0
13 Oct 2024
CulturalBench: A Robust, Diverse, and Challenging Cultural Benchmark by Human-AI CulturalTeaming
Yu Ying Chiu
Liwei Jiang
Bill Yuchen Lin
Chan Young Park
Shuyue Stella Li
...
Mehar Bhatia
Maria Antoniak
Yulia Tsvetkov
Vered Shwartz
Yejin Choi
ELM
ALM
83
0
0
03 Oct 2024
AfriHuBERT: A self-supervised speech representation model for African languages
Jesujoba Oluwadara Alabi
Xuechen Liu
Dietrich Klakow
Junichi Yamagishi
VLM
48
3
0
30 Sep 2024
SpeechTaxi: On Multilingual Semantic Speech Classification
Lennart Keller
Goran Glavaš
100
2
0
10 Sep 2024
InkubaLM: A small language model for low-resource African languages
A. Tonja
Bonaventure F. P. Dossou
Jessica Ojo
Jenalea Rajab
Fadel Thior
...
Anuoluwapo Aremu
Pelonomi Moiloa
Jade Z. Abbott
Vukosi Marivate
Benjamin Rosman
73
11
0
30 Aug 2024
FLEURS-R: A Restored Multilingual Speech Corpus for Generation Tasks
Min Ma
Yuma Koizumi
Shigeki Karita
Heiga Zen
Jason Riesa
Haruko Ishikawa
M. Bacchiani
VLM
70
5
0
12 Aug 2024
EgyBERT: A Large Language Model Pretrained on Egyptian Dialect Corpora
Faisal Qarah
54
5
0
07 Aug 2024
In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation
Joel Witzke
Benoît Sagot
Rachel Bawden
69
10
0
01 Aug 2024
Gemma 2: Improving Open Language Models at a Practical Size
Gemma Team
Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
...
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
VLM
MoE
OSLM
100
841
0
31 Jul 2024
Benchmarking Vision Language Models for Cultural Understanding
Shravan Nayak
Kanishk Jain
Rabiul Awal
Siva Reddy
Sjoerd van Steenkiste
Lisa Anne Hendricks
Karolina Stańczak
Aishwarya Agrawal
VLM
CoGe
105
36
0
15 Jul 2024
Toucan: Many-to-Many Translation for 150 African Language Pairs
AbdelRahim Elmadany
Ife Adebara
Muhammad Abdul-Mageed
58
3
0
05 Jul 2024
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
73
16
0
30 Jun 2024
Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects
Orevaoghene Ahia
Anuoluwapo Aremu
Diana Abagyan
Hila Gonen
David Ifeoluwa Adelani
Daud Abolade
Noah A. Smith
Yulia Tsvetkov
104
8
0
27 Jun 2024
Implicit Discourse Relation Classification For Nigerian Pidgin
Muhammed Saeed
Peter Bourgonje
Vera Demberg
36
2
0
26 Jun 2024
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Guilherme Penedo
Hynek Kydlíček
Loubna Ben Allal
Anton Lozhkov
Margaret Mitchell
Colin Raffel
Leandro von Werra
Thomas Wolf
97
241
0
25 Jun 2024
From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP
Marius Mosbach
Vagrant Gautam
Tomás Vergara-Browne
Dietrich Klakow
Mor Geva
AI4CE
65
10
0
18 Jun 2024
1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis
Sewade Ogun
A. Owodunni
Tobi Olatunji
Eniola Alese
Babatunde Oladimeji
Tejumade Afonja
Kayode Olaleye
Naome A. Etori
Tosin Adewumi
60
6
0
17 Jun 2024
MINERS: Multilingual Language Models as Semantic Retrievers
Genta Indra Winata
Ruochen Zhang
David Ifeoluwa Adelani
RALM
96
6
0
11 Jun 2024
mHuBERT-147: A Compact Multilingual HuBERT Model
Marcely Zanon Boito
Vivek Iyer
Nikolaos Lagos
Laurent Besacier
Ioan Calapodescu
VLM
104
18
0
10 Jun 2024
LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages
Andrew M. Bean
Simi Hellsten
Harry Mayne
Jabez Magomere
Ethan A. Chi
Ryan A. Chi
Scott A. Hale
Hannah Rose Kirk
ELM
LRM
55
11
0
10 Jun 2024
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
David Romero
Chenyang Lyu
Haryo Akbarianto Wibowo
Teresa Lynn
Injy Hamed
...
Oana Ignat
Joan Nwatu
Rada Mihalcea
Thamar Solorio
Alham Fikri Aji
74
38
0
10 Jun 2024
YODAS: Youtube-Oriented Dataset for Audio and Speech
Xinjian Li
Shinnosuke Takamichi
Takaaki Saeki
William Chen
Sayaka Shiota
Shinji Watanabe
75
23
0
02 Jun 2024
Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning
E. Chimoto
Jay Gala
Orevaoghene Ahia
Julia Kreutzer
Bruce A. Bassett
Sara Hooker
VLM
68
6
0
29 May 2024
Aya 23: Open Weight Releases to Further Multilingual Progress
Viraat Aryabumi
John Dang
Dwarak Talupuru
Saurabh Dash
David Cairuz
...
Aidan Gomez
Phil Blunsom
Marzieh Fadaee
Ahmet Üstün
Sara Hooker
OSLM
69
85
0
23 May 2024
Kreyòl-MT: Building MT for Latin American, Caribbean and Colonial African Creole Languages
Nathaniel R. Robinson
Raj Dabre
Ammon Shurtz
Rasul Dent
Onenamiyi Onesi
...
Matthew Dean Stutzman
Bismarck Odoom
Sanjeev Khudanpur
Stephen D. Richardson
Kenton Murray
MoE
82
8
0
08 May 2024
The IgboAPI Dataset: Empowering Igbo Language Technologies through Multi-dialectal Enrichment
Chris C. Emezue
Ifeoma Okoh
C. Mbonu
Chiamaka Chukwuneke
Daisy Lal
...
Bright Ogbonna
Chukwuebuka U. Oraegbunam
Esther Chidinma Awo-Ndubuisi
Akudo Amarachukwu Osuagwu
Obioha Nmezi
50
1
0
02 May 2024
Ethical Reasoning and Moral Value Alignment of LLMs Depend on the Language we Prompt them in
Utkarsh Agarwal
Kumar Tanmay
Aditi Khandelwal
Monojit Choudhury
LRM
53
16
0
29 Apr 2024
EkoHate: Abusive Language and Hate Speech Detection for Code-switched Political Discussions on Nigerian Twitter
Comfort Eseohen Ilevbare
Jesujoba Oluwadara Alabi
David Ifeoluwa Adelani
Firdous Damilola Bakare
O. B. Abiola
O. Adeyemo
49
8
0
28 Apr 2024
Do "English" Named Entity Recognizers Work Well on Global Englishes?
Alexander Shan
John Bauer
Riley Carlson
Christopher D. Manning
61
3
0
20 Apr 2024
ANGOFA: Leveraging OFA Embedding Initialization and Synthetic Data for Angolan Language Model
Osvaldo Luamba Quinjica
David Ifeoluwa Adelani
55
1
0
03 Apr 2024
Africa-Centric Self-Supervised Pre-Training for Multilingual Speech Representation in a Sub-Saharan Context
Antoine Caubrière
Elodie Gauthier
45
3
0
02 Apr 2024
Kallaama: A Transcribed Speech Dataset about Agriculture in the Three Most Widely Spoken Languages in Senegal
Elodie Gauthier
A. Ndiaye
Abdoulaye Guissé
21
3
0
02 Apr 2024
AAdaM at SemEval-2024 Task 1: Augmentation and Adaptation for Multilingual Semantic Textual Relatedness
Miaoran Zhang
Mingyang Wang
Jesujoba Oluwadara Alabi
Dietrich Klakow
VLM
66
6
0
01 Apr 2024
SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian Languages
N. Ousidhoum
Shamsuddeen Hassan Muhammad
Mohamed Abdalla
Idris Abdulmumin
Ibrahim Said Ahmad
...
Thamar Solorio
Nirmal Surange
Krishnapriya Vishnubhotla
Seid Muhie Yimam
Saif M. Mohammad
72
13
0
27 Mar 2024
ZAEBUC-Spoken: A Multilingual Multidialectal Arabic-English Speech Corpus
Injy Hamed
Fadhl Eryani
David Palfreyman
Nizar Habash
68
4
0
27 Mar 2024
Introducing Syllable Tokenization for Low-resource Languages: A Case Study with Swahili
Jesse Atuhurra
Hiroyuki Shindo
Hidetaka Kamigaito
Taro Watanabe
23
3
0
26 Mar 2024
EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation
A. Tonja
Israel Abebe Azime
Tadesse Destaw Belay
M. Yigezu
Moges Ahmed Mehamed
...
Olga Kolesnikova
Philipp Slusallek
Dietrich Klakow
Shengwu Xiong
Seid Muhie Yimam
81
8
0
20 Mar 2024
DIALECTBENCH: A NLP Benchmark for Dialects, Varieties, and Closely-Related Languages
Fahim Faisal
Orevaoghene Ahia
Aarohi Srivastava
Kabir Ahuja
David Chiang
Yulia Tsvetkov
Antonios Anastasopoulos
66
30
0
16 Mar 2024
The Hidden Space of Transformer Language Adapters
Jesujoba Oluwadara Alabi
Marius Mosbach
Matan Eyal
Dietrich Klakow
Mor Geva
70
10
1
20 Feb 2024
The Impact of Demonstrations on Multilingual In-Context Learning: A Multidimensional Analysis
Miaoran Zhang
Vagrant Gautam
Mingyang Wang
Jesujoba Oluwadara Alabi
Xiaoyu Shen
Dietrich Klakow
Marius Mosbach
72
11
0
20 Feb 2024
Walia-LLM: Enhancing Amharic-LLaMA by Integrating Task-Specific and Generative Datasets
Israel Abebe Azime
A. Tonja
Tadesse Destaw Belay
Mitiku Yohannes Fuge
A. Wassie
Eyasu Shiferaw Jada
Yonas Chanie
W. Sewunetie
Seid Muhie Yimam
27
3
0
12 Feb 2024
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Ahmet Üstün
Viraat Aryabumi
Zheng-Xin Yong
Wei-Yin Ko
Daniel D'souza
...
Shayne Longpre
Niklas Muennighoff
Marzieh Fadaee
Julia Kreutzer
Sara Hooker
ALM
ELM
SyDa
LRM
63
221
0
12 Feb 2024
Do Moral Judgment and Reasoning Capability of LLMs Change with Language? A Study using the Multilingual Defining Issues Test
Aditi Khandelwal
Utkarsh Agarwal
Kumar Tanmay
Monojit Choudhury
ELM
LRM
48
7
0
03 Feb 2024
Deep Learning Based Amharic Chatbot for FAQs in Universities
Goitom Ybrah Hailu
Shishay Welay
18
4
0
26 Jan 2024
Multilingual acoustic word embeddings for zero-resource languages
C. Jacobs
16
1
0
19 Jan 2024
Cheetah: Natural Language Generation for 517 African Languages
Ife Adebara
AbdelRahim Elmadany
Muhammad Abdul-Mageed
51
6
0
02 Jan 2024