Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2404.00929
Cited By
A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias
1 April 2024
Yuemei Xu
Ling Hu
Jiayi Zhao
Zihan Qiu
Yuqi Ye
Hanwen Gu
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias"
50 / 75 papers shown
Title
Catch Me if You Search: When Contextual Web Search Results Affect the Detection of Hallucinations
Mahjabin Nahar
Eun-Ju Lee
Jin Won Park
Dongwon Lee
HILM
117
0
0
01 Apr 2025
CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering
Yumeng Wang
Zhiyuan Fan
Q. Wang
May Fung
Heng Ji
132
4
0
30 Jan 2025
Training Bilingual LMs with Data Constraints in the Targeted Language
Skyler Seto
Maartje ter Hoeve
He Bai
Natalie Schluter
David Grangier
133
0
0
20 Nov 2024
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
139
7
0
03 Oct 2024
Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance
Somnath Banerjee
Avik Halder
Rajarshi Mandal
Sayan Layek
Ian Soboroff
Rima Hazra
Animesh Mukherjee
105
1
0
17 Jun 2024
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Guosheng Dong
Zhiying Wu
ELM
LRM
153
743
0
19 Sep 2023
Queer People are People First: Deconstructing Sexual Identity Stereotypes in Large Language Models
Harnoor Dhingra
Preetiha Jayashanker
Sayali S. Moghe
Emma Strubell
55
13
0
30 Jun 2023
Having Beer after Prayer? Measuring Cultural Bias in Large Language Models
Tarek Naous
Michael Joseph Ryan
Alan Ritter
Wei Xu
62
92
0
23 May 2023
Comparing Biases and the Impact of Multilingual Training across Multiple Languages
Sharon Levy
Neha Ann John
Ling Liu
Yogarshi Vyas
Jie Ma
Yoshinari Fujinuma
Miguel Ballesteros
Vittorio Castelli
Dan Roth
53
28
0
18 May 2023
Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models
Emilio Ferrara
SILM
88
254
0
07 Apr 2023
Overwriting Pretrained Bias with Finetuning Data
Angelina Wang
Olga Russakovsky
43
31
0
10 Mar 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
374
2,377
0
09 Nov 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
169
3,116
0
20 Oct 2022
BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation
Tianxiang Sun
Junliang He
Xipeng Qiu
Xuanjing Huang
69
47
0
14 Oct 2022
Are Pretrained Multilingual Models Equally Fair Across Languages?
Laura Cabello Piqueras
Anders Søgaard
37
9
0
11 Oct 2022
IsoVec: Controlling the Relative Isomorphism of Word Embedding Spaces
Kelly Marchisio
Neha Verma
Kevin Duh
Philipp Koehn
45
7
0
11 Oct 2022
Language Models are Multilingual Chain-of-Thought Reasoners
Freda Shi
Mirac Suzgun
Markus Freitag
Xuezhi Wang
Suraj Srivats
...
Yi Tay
Sebastian Ruder
Denny Zhou
Dipanjan Das
Jason W. Wei
ReLM
LRM
220
363
0
06 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
344
1,090
0
05 Oct 2022
Counterfactually Augmented Data and Unintended Bias: The Case of Sexism and Hate Speech Detection
Indira Sen
Mattia Samory
Claudia Wagner
Isabelle Augenstein
56
17
0
09 May 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
441
6,222
0
05 Apr 2022
Combining Static and Contextualised Multilingual Embeddings
Katharina Hämmerl
Jindrich Libovický
Alexander Fraser
53
11
0
17 Mar 2022
Improving Word Translation via Two-Stage Contrastive Learning
Yaoyiran Li
Fangyu Liu
Nigel Collier
Anna Korhonen
Ivan Vulić
57
29
0
15 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
806
12,893
0
04 Mar 2022
NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis
Shamsuddeen Hassan Muhammad
David Ifeoluwa Adelani
Sebastian Ruder
Ibrahim Said Ahmad
Idris Abdulmumin
...
Chris C. Emezue
Saheed Abdul
Anuoluwapo Aremu
Alipio Jeorge
P. Brazdil
66
100
0
20 Jan 2022
LaMDA: Language Models for Dialog Applications
R. Thoppilan
Daniel De Freitas
Jamie Hall
Noam M. Shazeer
Apoorv Kulshreshtha
...
Blaise Aguera-Arcas
Claire Cui
M. Croak
Ed H. Chi
Quoc Le
ALM
126
1,593
0
20 Jan 2022
Few-shot Learning with Multilingual Language Models
Xi Lin
Todor Mihaylov
Mikel Artetxe
Tianlu Wang
Shuohui Chen
...
Luke Zettlemoyer
Zornitsa Kozareva
Mona T. Diab
Ves Stoyanov
Xian Li
BDL
ELM
LRM
97
305
0
20 Dec 2021
A General Language Assistant as a Laboratory for Alignment
Amanda Askell
Yuntao Bai
Anna Chen
Dawn Drain
Deep Ganguli
...
Tom B. Brown
Jack Clark
Sam McCandlish
C. Olah
Jared Kaplan
ALM
116
777
0
01 Dec 2021
An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models
Nicholas Meade
Elinor Poole-Dayan
Siva Reddy
61
127
0
16 Oct 2021
Mitigating Language-Dependent Ethnic Bias in BERT
Jaimeen Ahn
Alice Oh
172
99
0
13 Sep 2021
A Simple and Effective Method To Eliminate the Self Language Bias in Multilingual Representations
Ziyi Yang
Yinfei Yang
Daniel Cer
Eric F. Darve
46
24
0
10 Sep 2021
Debiasing Multilingual Word Embeddings: A Case Study of Three Indian Languages
Srijan Bansal
Vishal Garimella
Ayush Suhane
Animesh Mukherjee
42
9
0
21 Jul 2021
An Investigation of the (In)effectiveness of Counterfactually Augmented Data
Nitish Joshi
He He
OODD
42
47
0
01 Jul 2021
A Primer on Pretrained Multilingual Language Models
Sumanth Doddapaneni
Gowtham Ramesh
Mitesh M. Khapra
Anoop Kunchukuttan
Pratyush Kumar
LRM
74
76
0
01 Jul 2021
BARTScore: Evaluating Generated Text as Text Generation
Weizhe Yuan
Graham Neubig
Pengfei Liu
95
841
0
22 Jun 2021
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
378
10,273
0
17 Jun 2021
PanGu-
α
α
α
: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei Zeng
Xiaozhe Ren
Teng Su
Hui Wang
Yi-Lun Liao
...
Gaojun Fan
Yaowei Wang
Xuefeng Jin
Qun Liu
Yonghong Tian
ALM
MoE
AI4CE
69
213
0
26 Apr 2021
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
Julia Kreutzer
Isaac Caswell
Lisa Wang
Ahsan Wahab
D. Esch
...
Duygu Ataman
Orevaoghene Ahia
Oghenefego Ahia
Sweta Agrawal
Mofetoluwa Adeyemi
53
277
0
22 Mar 2021
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Zhengxiao Du
Yujie Qian
Xiao Liu
Ming Ding
J. Qiu
Zhilin Yang
Jie Tang
BDL
AI4CE
116
1,543
0
18 Mar 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
302
384
0
28 Feb 2021
Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models
Nora Kassner
Philipp Dufter
Hinrich Schütze
69
141
0
01 Feb 2021
XED: A Multilingual Dataset for Sentiment Analysis and Emotion Detection
Emily Öhman
Marc Pàmies
Kaisla Kajava
Jörg Tiedemann
28
63
0
03 Nov 2020
A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios
Michael A. Hedderich
Lukas Lange
Heike Adel
Jannik Strötgen
Dietrich Klakow
289
299
0
23 Oct 2020
Multi-Adversarial Learning for Cross-Lingual Word Embeddings
Haozhou Wang
James Henderson
Paola Merlo
GAN
72
8
0
16 Oct 2020
Measuring and Reducing Gendered Correlations in Pre-trained Models
Kellie Webster
Xuezhi Wang
Ian Tenney
Alex Beutel
Emily Pitler
Ellie Pavlick
Jilin Chen
Ed Chi
Slav Petrov
FaML
72
258
0
12 Oct 2020
Probing Pretrained Language Models for Lexical Semantics
Ivan Vulić
Edoardo Ponti
Robert Litschko
Goran Glavaš
Anna Korhonen
KELM
68
245
0
12 Oct 2020
The Multilingual Amazon Reviews Corpus
Phillip Keung
Y. Lu
György Szarvas
Noah A. Smith
60
201
0
06 Oct 2020
Towards Debiasing Sentence Representations
Paul Pu Liang
Irene Li
Emily Zheng
Y. Lim
Ruslan Salakhutdinov
Louis-Philippe Morency
70
238
0
16 Jul 2020
Are All Languages Created Equal in Multilingual BERT?
Shijie Wu
Mark Dredze
66
323
0
18 May 2020
Social Biases in NLP Models as Barriers for Persons with Disabilities
Ben Hutchinson
Vinodkumar Prabhakaran
Emily L. Denton
Kellie Webster
Yu Zhong
Stephen Denuyl
57
313
0
02 May 2020
LNMap: Departures from Isomorphic Assumption in Bilingual Lexicon Induction Through Non-Linear Mapping in Latent Space
Tasnim Mohiuddin
M Saiful Bari
Shafiq Joty
50
50
0
28 Apr 2020
1
2
Next