ResearchTrend.AI
NileChat: Towards Linguistically Diverse and Culturally Aware LLMs for Local Communities


23 May 2025
Abdellah El Mekki
Houdaifa Atou
Omer Nacar
Shady Shehata
Muhammad Abdul-Mageed

Papers citing "NileChat: Towards Linguistically Diverse and Culturally Aware LLMs for Local Communities"

31 / 31 papers shown
Multilingual != Multicultural: Evaluating Gaps Between Multilingual Capabilities and Cultural Alignment in LLMs
Jonathan Rystrøm
Hannah Rose Kirk
Scott A. Hale
80
6
0
23 Feb 2025
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Loubna Ben Allal
Anton Lozhkov
Elie Bakouch
Gabriel Martín Blázquez
Guilherme Penedo
...
Cyril Zakka
Mathieu Morlon
Colin Raffel
Leandro von Werra
Thomas Wolf
MoE
91
40
0
04 Feb 2025
Arabic Stable LM: Adapting Stable LM 2 1.6B to Arabic
Zaid Alyafeai
Michael Pieler
H. Teufel
J. Tow
Marco Bellagente
...
Nikhil Pinnaparaju
Reshinth Adithyan
Paulo Rocha
Maksym Zhuravinskyi
Carlos Riquelme
LM&MA
89
1
0
05 Dec 2024
Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier
John Dang
Shivalika Singh
Daniel D'souza
Arash Ahmadian
Alejandro Salamanca
...
Nick Frosst
Marzieh Fadaee
Beyza Ermis
Ahmet Üstün
Sara Hooker
ELM
OSLM
MoE
126
41
0
05 Dec 2024
Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect
Guokan Shang
Hadi Abdine
Yousef Khoubrane
Amr Mohamed
Yassine Abbahaddou
...
Xuguang Ren
Eric Moulines
Preslav Nakov
Michalis Vazirgiannis
Eric Xing
59
6
0
26 Sep 2024
Gemma 2: Improving Open Language Models at a Practical Size
Gemma Team
Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
...
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
VLM
MoE
OSLM
109
856
0
31 Jul 2024
Adapting Multilingual LLMs to Low-Resource Languages with Knowledge Graphs via Adapters
Daniil Gurgurov
Mareike Hartmann
Simon Ostermann
61
8
0
01 Jul 2024
Scaling Synthetic Data Creation with 1,000,000,000 Personas
Tao Ge
Xin Chan
Dian Yu
Haitao Mi
Dong Yu
SyDa
144
142
0
28 Jun 2024
Are Generative Language Models Multicultural? A Study on Hausa Culture and Emotions using ChatGPT
Ibrahim Said Ahmad
Shiran Dudy
R. Ramachandranpillai
Kenneth Church
69
6
0
27 Jun 2024
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Guilherme Penedo
Hynek Kydlíček
Loubna Ben Allal
Anton Lozhkov
Margaret Mitchell
Colin Raffel
Leandro von Werra
Thomas Wolf
100
243
0
25 Jun 2024
WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
93
17
0
24 Jun 2024
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Zorik Gekhman
G. Yona
Roee Aharoni
Matan Eyal
Amir Feder
Roi Reichart
Jonathan Herzig
95
130
0
09 May 2024
Understanding the Capabilities and Limitations of Large Language Models for Cultural Commonsense
Siqi Shen
Lajanugen Logeswaran
Moontae Lee
Honglak Lee
Soujanya Poria
Rada Mihalcea
AI4MH
LRM
ELM
73
31
0
07 May 2024
Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities
Kazuki Fujii
Taishi Nakamura
Mengsay Loem
Hiroki Iida
Masanari Ohi
Kakeru Hattori
Hirai Shota
Sakae Mizuki
Rio Yokota
Naoaki Okazaki
CLL
80
71
0
27 Apr 2024
How Bad is Training on Synthetic Data? A Statistical Analysis of Language Model Collapse
M. Seddik
Suei-Wen Chen
Soufiane Hayou
Pierre Youssef
Merouane Debbah
78
36
0
07 Apr 2024
KorNAT: LLM Alignment Benchmark for Korean Social Values and Common Knowledge
Jiyoung Lee
Minwoo Kim
Seungho Kim
Junghwan Kim
Seunghyun Won
Hwaran Lee
Edward Choi
ALM
76
15
0
21 Feb 2024
Airavata: Introducing Hindi Instruction-tuned LLM
Jay Gala
Thanmay Jayakumar
Jaavid Aktar Husain
M. AswanthKumar
Mohammed Safi Ur Rahman Khan
...
Ratish Puduppully
Mitesh M. Khapra
Raj Dabre
Rudra Murthy
Anoop Kunchukuttan
67
27
0
26 Jan 2024
EtiCor: Corpus for Analyzing LLMs for Etiquettes
Ashutosh Dwivedi
Pradhyumna Lavania
Ashutosh Modi
52
22
0
29 Oct 2023
ALDi: Quantifying the Arabic Level of Dialectness of Text
Amr Keleg
Sharon Goldwater
Walid Magdy
45
15
0
20 Oct 2023
AceGPT, Localizing Large Language Models in Arabic
Huang Huang
Fei Yu
Jianqing Zhu
Xuening Sun
Hao Cheng
...
Lian Zhang
Ruoyu Sun
Xiang Wan
Haizhou Li
Jinchao Xu
61
57
0
21 Sep 2023
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Lucas Bandarkar
Davis Liang
Benjamin Muller
Mikel Artetxe
Satya Narayan Shukla
Don Husa
Naman Goyal
Abhinandan Krishnan
Luke Zettlemoyer
Madian Khabsa
72
153
0
31 Aug 2023
An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning
Yun Luo
Zhen Yang
Fandong Meng
Yafu Li
Jie Zhou
Yue Zhang
CLL
KELM
154
305
0
17 Aug 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
320
4,298
0
09 Jun 2023
Dolphin: A Challenging and Diverse Benchmark for Arabic NLG
El Moatez Billah Nagoudi
AbdelRahim Elmadany
Ahmed Oumar El-Shangiti
Muhammad Abdul-Mageed
LM&MA
49
19
0
24 May 2023
ORCA: A Challenging Benchmark for Arabic Language Understanding
AbdelRahim Elmadany
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
ELM
51
44
0
21 Dec 2022
JASMINE: Arabic GPT Models for Few-Shot Learning
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
AbdelRahim Elmadany
Alcides Alcoba Inciarte
Md. Tawkat Islam Khondaker
52
8
0
21 Dec 2022
No Language Left Behind: Scaling Human-Centered Machine Translation
NLLB Team
Marta R. Costa-jussá
James Cross
Onur Çelebi
Maha Elbayad
...
Alexandre Mourachko
C. Ropers
Safiyyah Saleem
Holger Schwenk
Jeff Wang
MoE
215
1,258
0
11 Jul 2022
The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation
Naman Goyal
Cynthia Gao
Vishrav Chaudhary
Peng-Jen Chen
Guillaume Wenzek
Da Ju
Sanjana Krishnan
Marc'Aurelio Ranzato
Francisco Guzman
Angela Fan
88
583
0
06 Jun 2021
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
D. Song
Jacob Steinhardt
ELM
RALM
166
4,413
0
07 Sep 2020
The State and Fate of Linguistic Diversity and Inclusion in the NLP World
Pratik M. Joshi
Sebastin Santy
A. Budhiraja
Kalika Bali
Monojit Choudhury
LMTD
107
847
0
20 Apr 2020
HellaSwag: Can a Machine Really Finish Your Sentence?
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
163
2,464
0
19 May 2019