2502.16534
Multilingual != Multicultural: Evaluating Gaps Between Multilingual Capabilities and Cultural Alignment in LLMs
23 February 2025
Jonathan Rystrøm
Hannah Rose Kirk
Scott A. Hale
Papers citing
"Multilingual != Multicultural: Evaluating Gaps Between Multilingual Capabilities and Cultural Alignment in LLMs"
17 / 17 papers shown
NileChat: Towards Linguistically Diverse and Culturally Aware LLMs for Local Communities
Abdellah El Mekki
Houdaifa Atou
Omer Nacar
Shady Shehata
Muhammad Abdul-Mageed
41
0
0
23 May 2025
Language Specific Knowledge: Do Models Know Better in X than in English?
Ishika Agarwal
Nimet Beyza Bozdag
Dilek Hakkani-Tur
56
0
0
21 May 2025
Crosslingual Reasoning through Test-Time Scaling
Zheng-Xin Yong
Muhammad Farid Adilazuarda
Jonibek Mansurov
Ruochen Zhang
Niklas Muennighoff
Carsten Eickhoff
Genta Indra Winata
Julia Kreutzer
Stephen H. Bach
Alham Fikri Aji
LRM
ELM
368
9
0
08 May 2025
The Hidden Space of Safety: Understanding Preference-Tuned LLMs in Multilingual context
Nikhil Verma
Manasa Bharadwaj
61
2
0
03 Apr 2025
IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance
Paul Röttger
Musashi Hinck
Valentin Hofmann
Kobi Hackenburg
Valentina Pyatkin
Faeze Brahman
Dirk Hovy
115
5
0
12 Feb 2025
Investigating Cultural Alignment of Large Language Models
Badr AlKhamissi
Muhammad N. ElNokrashy
Mai AlKhamissi
Mona T. Diab
99
56
0
20 Feb 2024
Towards Understanding Sycophancy in Language Models
Mrinank Sharma
Meg Tong
Tomasz Korbak
David Duvenaud
Amanda Askell
...
Oliver Rausch
Nicholas Schiefer
Da Yan
Miranda Zhang
Ethan Perez
284
226
0
20 Oct 2023
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values
Hannah Rose Kirk
Andrew M. Bean
Bertie Vidgen
Paul Röttger
Scott A. Hale
ALM
62
48
0
11 Oct 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
314
4,288
0
09 Jun 2023
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models
Maribeth Rauh
John F. J. Mellor
J. Uesato
Po-Sen Huang
Johannes Welbl
...
Amelia Glaese
G. Irving
Iason Gabriel
William S. Isaac
Lisa Anne Hendricks
108
51
0
16 Jun 2022
The Ghost in the Machine has an American accent: value conflict in GPT-3
Rebecca Lynn Johnson
Giada Pistilli
Natalia Menéndez-González
Leslye Denisse Dias Duran
Enrico Panai
Julija Kalpokienė
D. Bertulfo
82
88
0
15 Mar 2022
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
227
4,392
0
27 Oct 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
205
5,513
0
07 Jul 2021
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
Julia Kreutzer
Isaac Caswell
Lisa Wang
Ahsan Wahab
D. Esch
...
Duygu Ataman
Orevaoghene Ahia
Oghenefego Ahia
Sweta Agrawal
Mofetoluwa Adeyemi
53
277
0
22 Mar 2021
MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering
Shayne Longpre
Yi Lu
Joachim Daiber
ELM
HILM
72
156
0
30 Jul 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
543
4,797
0
23 Jan 2020
SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar
Jian Zhang
Konstantin Lopyrev
Percy Liang
RALM
252
8,124
0
16 Jun 2016