arXiv:2212.09660
The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges
19 December 2022
Genta Indra Winata
Alham Fikri Aji
Zheng-Xin Yong
Thamar Solorio
Papers citing
"The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges"
50 / 141 papers shown
Title
Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models
Ercong Nie
Helmut Schmid
Hinrich Schütze
21
0
0
22 May 2025
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences
Genta Indra Winata
David Anugraha
Lucky Susanto
Garry Kuwanto
Derry Wijaya
64
9
0
03 Oct 2024
Understanding and Mitigating Language Confusion in LLMs
Kelly Marchisio
Wei-Yin Ko
Alexandre Berard
Théo Dehaze
Sebastian Ruder
72
27
0
28 Jun 2024
IndoRobusta: Towards Robustness Against Diverse Code-Mixed Indonesian Local Languages
Muhammad Farid Adilazuarda
Samuel Cahyawijaya
Genta Indra Winata
Pascale Fung
Ayu Purwarianti
58
12
0
21 Nov 2023
Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages
Zheng-Xin Yong
Ruochen Zhang
Jessica Zosa Forde
Skyler Wang
Arjun Subramonian
...
Yinghua Tan
Long Phan
Rowena Garcia
Thamar Solorio
Alham Fikri Aji
LRM
65
49
0
23 Mar 2023
A Survey of Code-switching: Linguistic and Social Perspectives for Language Technologies
A. Seza Doğruöz
Sunayana Sitaram
Barbara E. Bullock
Almeida Jacqueline Toribio
88
76
0
05 Jan 2023
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
97
3,515
0
06 Dec 2022
Benchmarking Evaluation Metrics for Code-Switching Automatic Speech Recognition
Injy Hamed
A. Hussein
Oumnia Chellah
Shammur A. Chowdhury
Hamdy Mubarak
Sunayana Sitaram
Nizar Habash
Ahmed M. Ali
48
6
0
22 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
247
2,348
0
09 Nov 2022
MultiCoNER: A Large-scale Multilingual dataset for Complex Named Entity Recognition
S. Malmasi
Anjie Fang
B. Fetahu
Sudipta Kar
Oleg Rokhlenko
53
69
0
30 Aug 2022
Language-specific Characteristic Assistance for Code-switching Speech Recognition
Tongtong Song
Qiang Xu
Meng Ge
Longbiao Wang
Hao Shi
Yongjie Lv
Yuqin Lin
Jianwu Dang
18
26
0
29 Jun 2022
TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline
Chengfei Li
Shuhao Deng
Yaoping Wang
Guangjing Wang
Y. Gong
Changbin Chen
Jinfeng Bai
53
16
0
27 Jun 2022
CMNEROne at SemEval-2022 Task 11: Code-Mixed Named Entity Recognition by leveraging multilingual data
Suman Dowlagar
R. Mamidi
31
6
0
15 Jun 2022
Borrowing or Codeswitching? Annotating for Finer-Grained Distinctions in Language Mixing
Elena Álvarez Mellado
Constantine Lignos
34
5
0
10 Jun 2022
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Aarohi Srivastava
Abhinav Rastogi
Abhishek Rao
Abu Awal Md Shoeb
Abubakar Abid
...
Zhuoye Zhao
Zijian Wang
Zijie J. Wang
Zirui Wang
Ziyi Wu
ELM
51
1,726
0
09 Jun 2022
LAE: Language-Aware Encoder for Monolingual and Multilingual ASR
Jinchuan Tian
Jianwei Yu
Chunlei Zhang
Chao Weng
Yuexian Zou
Dong Yu
AuLLM
27
25
0
05 Jun 2022
Zero-shot Code-Mixed Offensive Span Identification through Rationale Extraction
Manikandan Ravikiran
Bharathi Raja Chakravarthi
47
3
0
12 May 2022
Findings of the Shared Task on Offensive Span Identification from Code-Mixed Tamil-English Comments
Manikandan Ravikiran
Bharathi Raja Chakravarthi
Anand Kumar Madasamy
Sangeetha Sivanesan
R. Rajalakshmi
Sajeetha Thavareesan
Rahul Ponnusamy
Shankar Mahadevan
18
52
0
12 May 2022
UM6P-CS at SemEval-2022 Task 11: Enhancing Multilingual and Code-Mixed Complex Named Entity Recognition via Pseudo Labels using Multilingual Transformer
Abdellah El Mekki
Abdelkader El Mahdaouy
Mohammed Akallouch
Ismail Berrada
A. Khoumsi
49
2
0
28 Apr 2022
L3Cube-HingCorpus and HingBERT: A Code Mixed Hindi-English Dataset and BERT Language Models
Ravindra Nayak
Raviraj Joshi
30
41
0
18 Apr 2022
Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts
Saadullah Amin
N. Goldstein
M. Wixted
Alejandro García-Rudolph
Catalina Martínez-Costa
G. Neumann
40
5
0
10 Apr 2022
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
Alham Fikri Aji
Genta Indra Winata
Fajri Koto
Samuel Cahyawijaya
Ade Romadhony
...
David Moeljadi
Radityo Eko Prasojo
Timothy Baldwin
Jey Han Lau
Sebastian Ruder
51
102
0
24 Mar 2022
Speaker Information Can Guide Models to Better Inductive Biases: A Case Study On Predicting Code-Switching
Alissa Ostapenko
S. Wintner
Melinda Fricke
Yulia Tsvetkov
54
5
0
16 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
605
12,525
0
04 Mar 2022
Generative Adversarial Networks
Gilad Cohen
Raja Giryes
GAN
76
30,021
0
01 Mar 2022
mSLAM: Massively multilingual joint pre-training for speech and text
Ankur Bapna
Colin Cherry
Yu Zhang
Ye Jia
Melvin Johnson
Yong Cheng
Simran Khanuja
Jason Riesa
Alexis Conneau
VLM
37
111
0
03 Feb 2022
Reducing language context confusion for end-to-end code-switching automatic speech recognition
Shuai Zhang
Jiangyan Yi
Zhengkun Tian
J. Tao
Y. Yeung
Liqun Deng
34
11
0
28 Jan 2022
KazakhTTS2: Extending the Open-Source Kazakh TTS Corpus With More Data, Speakers, and Topics
Saida Mussakhojayeva
Yerbolat Khassanov
H. A. Varol
37
13
0
15 Jan 2022
Improving Code-switching Language Modeling with Artificially Generated Texts using Cycle-consistent Adversarial Networks
Chia-Yu Li
Ngoc Thang Vu
30
12
0
12 Dec 2021
ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation
Holy Lovenia
Samuel Cahyawijaya
Genta Indra Winata
Peng Xu
Xu Yan
...
Elham J. Barezi
Qifeng Chen
Xiaojuan Ma
Bertram E. Shi
Pascale Fung
53
32
0
12 Dec 2021
Switch Point biased Self-Training: Re-purposing Pretrained Models for Code-Switching
P. Chopra
Sai Krishna Rallabandi
A. Black
Khyathi Chandu
40
6
0
01 Nov 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training
Ankur Bapna
Yu-An Chung
Na Wu
Anmol Gulati
Ye Jia
J. Clark
Melvin Johnson
Jason Riesa
Alexis Conneau
Yu Zhang
VLM
72
94
0
20 Oct 2021
Language Models are Few-shot Multilingual Learners
Genta Indra Winata
Andrea Madotto
Zhaojiang Lin
Rosanne Liu
J. Yosinski
Pascale Fung
ELM
LRM
53
133
0
16 Sep 2021
Towards Developing a Multilingual and Code-Mixed Visual Question Answering System by Knowledge Distillation
H. Khan
D. Gupta
Asif Ekbal
32
14
0
10 Sep 2021
Quality Evaluation of the Low-Resource Synthetically Generated Code-Mixed Hinglish Text
Vivek Srivastava
M. Singh
33
12
0
04 Aug 2021
MIPE: A Metric Independent Pipeline for Effective Code-Mixed NLG Evaluation
Ayush Garg
S. S. Kagi
Vivek Srivastava
M. Singh
29
9
0
24 Jul 2021
The Effectiveness of Intermediate-Task Training for Code-Switched Natural Language Understanding
Archiki Prasad
Mohammad Ali Rehan
Shreyasi Pathak
Preethi Jyothi
34
9
0
21 Jul 2021
From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text
Ishan Tarunesh
Syamantak Kumar
Preethi Jyothi
52
45
0
14 Jul 2021
HinGE: A Dataset for Generation and Evaluation of Code-Mixed Hinglish Text
Vivek Srivastava
M. Singh
32
45
0
08 Jul 2021
Arabic Code-Switching Speech Recognition using Monolingual Data
Ahmed M. Ali
Shammur A. Chowdhury
A. Hussein
Yasser Hifny
30
23
0
04 Jul 2021
Challenges and Limitations with the Metrics Measuring the Complexity of Code-Mixed Text
Vivek Srivastava
M. Singh
46
21
0
18 Jun 2021
CodemixedNLP: An Extensible and Open NLP Toolkit for Code-Mixing
Sai Muralidhar Jayanthi
Kavya Nerella
Khyathi Chandu
A. Black
MoE
42
8
0
10 Jun 2021
Dual Script E2E framework for Multilingual and Code-Switching ASR
Mari Ganesh Kumar
Jom Kuriakose
Anand Thyagachandran
A. Arunkumar
Ashish Seth
L. D. Prasad
Saish Jaiswal
Anusha Prakash
H. Murthy
48
10
0
02 Jun 2021
Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR
Shammur A. Chowdhury
A. Hussein
Ahmed Abdelali
Ahmed M. Ali
27
34
0
31 May 2021
Investigating Code-Mixed Modern Standard Arabic-Egyptian to English Machine Translation
El Moatez Billah Nagoudi
AbdelRahim Elmadany
Muhammad Abdul-Mageed
MoE
33
11
0
28 May 2021
Exploring Text-to-Text Transformers for English to Hinglish Machine Translation with Synthetic Code-Mixing
Ganesh Jawahar
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
L. Lakshmanan
57
30
0
18 May 2021
Can You Traducir This? Machine Translation for Code-Switched Input
Jitao Xu
François Yvon
35
30
0
11 May 2021
XTREME-R: Towards More Challenging and Nuanced Multilingual Evaluation
Sebastian Ruder
Noah Constant
Jan A. Botha
Aditya Siddhant
Orhan Firat
...
Pengfei Liu
Junjie Hu
Dan Garrette
Graham Neubig
Melvin Johnson
ELM
AAML
LRM
34
185
0
15 Apr 2021
Multilingual and code-switching ASR challenges for low resource Indian languages
Anuj Diwan
Rakesh Vaideeswaran
Sanket Shah
Ankita Singh
Srinivasa Raghavan
...
Jai Nanavati
Raoul Nanavati
Karthik Sankaranarayanan
Tejaswi Seeram
Basil Abraham
32
83
0
01 Apr 2021
Unsupervised Self-Training for Sentiment Analysis of Code-Switched Data
Akshat Gupta
Sargam Menghani
Sai Krishna Rallabandi
A. Black
SSL
18
14
0
27 Mar 2021