2007.01176
Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset
2 July 2020
Brian Roark
Lawrence Wolf-Sonkin
Christo Kirov
Sabrina J. Mielke
Cibu Johny
Isin Demirsahin
Keith B. Hall
Papers citing "Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset"
14 / 14 papers shown
Title
Improving Informally Romanized Language Identification
Adrian Benton
Alexander Gutkin
Christo Kirov
Brian Roark
55
0
0
30 Apr 2025
Low-Resource Transliteration for Roman-Urdu and Urdu Using Transformer-Based Models
Umer Butt
Stalin Varanasi
Günter Neumann
65
0
0
27 Mar 2025
IndoNLP 2025: Shared Task on Real-Time Reverse Transliteration for Romanized Indo-Aryan languages
Deshan Sumanathilaka
Isuri Anuradha
Ruvan Weerasinghe
Nicholas Micallef
Julian Hough
47
0
0
10 Jan 2025
User-Aware Multilingual Abusive Content Detection in Social Media
Mohammad Zia Ur Rehman
Somya Mehta
Kuldeep Singh
Kunal Kaushik
Nagendra Kumar
31
14
0
26 Oct 2024
Too Late to Train, Too Early To Use? A Study on Necessity and Viability of Low-Resource Bengali LLMs
Tamzeed Mahfuz
Satak Kumar Dey
Ruwad Naswan
Hasnaen Adil
Khondker Salman Sayeed
Haz Sameen Shahgir
44
0
0
29 Jun 2024
BenLLMEval: A Comprehensive Evaluation into the Potentials and Pitfalls of Large Language Models on Bengali NLP
M. Kabir
Mohammed Saidul Islam
Md Tahmid Rahman Laskar
Mir Tafseer Nayeem
M Saiful Bari
Enamul Hoque
LM&MA
24
15
0
22 Sep 2023
Lenient Evaluation of Japanese Speech Recognition: Modeling Naturally Occurring Spelling Inconsistency
Shigeki Karita
R. Sproat
Haruko Ishikawa
35
4
0
07 Jun 2023
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages
Sebastian Ruder
J. Clark
Alexander Gutkin
Mihir Kale
Min Ma
...
Dan Garrette
R. Ingle
Melvin Johnson
Dmitry Panteleev
Partha P. Talukdar
ELM
28
38
0
19 May 2023
Gui at MixMT 2022 : English-Hinglish: An MT approach for translation of code mixed data
Akshat Gahoi
Jayant Duneja
Anshul Padhi
Shivam Mangale
Saransh Rajput
Tanvi Kamble
D. Sharma
Vasudeva Varma
35
3
0
21 Oct 2022
Towards Offensive Language Identification for Tamil Code-Mixed YouTube Comments and Posts
Charangan Vasantharajan
Uthayasanker Thayasivam
26
38
0
24 Aug 2021
Do Images really do the Talking? Analysing the significance of Images in Tamil Troll meme classification
Siddhanth U Hegde
Adeep Hande
R. Priyadharshini
Sajeetha Thavareesan
Ratnasingam Sakuntharaj
S. Thangasamy
B. Bharathi
Bharathi Raja Chakravarthi
42
7
0
09 Aug 2021
MuRIL: Multilingual Representations for Indian Languages
Simran Khanuja
Diksha Bansal
Sarvesh Mehtani
Savya Khosla
Atreyee Dey
...
Shachi Dave
Shruti Gupta
Subhash Chandra Bose Gali
Vishnu Subramanian
Partha P. Talukdar
49
281
0
19 Mar 2021
Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots
Samson Tan
Chenyu You
AAML
39
35
0
17 Mar 2021
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
220
7,930
0
17 Aug 2015