Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages

20 December 2022

Sumanth Doddapaneni

Mitesh M. Khapra

Anoop Kunchukuttan

Papers citing "Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages"

Title | Authors | Tags and counts | Date
Tokenization Matters: Improving Zero-Shot NER for Indic Languages | Priyaranjan Pattnayak, Hitesh Laxmichand Patel, Amit Agarwal | 30 0 0 | 23 Apr 2025
Overcoming Vocabulary Constraints with Pixel-level Fallback | Jonas F. Lotz, Hendra Setiawan, Stephan Peitz, Yova Kementchedjhieva | 43 0 0 | 02 Apr 2025
TriNER: A Series of Named Entity Recognition Models For Hindi, Bengali & Marathi | Mohammed Amaan Dhamaskar, Rasika Ransing | 57 0 0 | 06 Feb 2025
Long Range Named Entity Recognition for Marathi Documents | Pranita Deshmukh, Nikita Kulkarni, Sanhita Kulkarni, Kareena Manghani, Geetanjali Kale, Raviraj Joshi | 11 0 0 | 11 Oct 2024
IndicSentEval: How Effectively do Multilingual Transformer Models encode Linguistic Properties for Indic Languages? | Akhilesh Aravapalli, Mounika Marreddy, S. Oota, R. Mamidi, Manish Gupta | 39 0 0 | 03 Oct 2024
A Three-Pronged Approach to Cross-Lingual Adaptation with Multilingual LLMs | Vaibhav Singh, Amrith Krishna, Karthika NJ, Ganesh Ramakrishnan | 29 4 0 | 25 Jun 2024
Fine-tuning Pre-trained Named Entity Recognition Models For Indian Languages | Sankalp Bahad, Pruthwik Mishra, Karunesh Arora, R. Balabantaray, D. Sharma, Parameswari Krishnamurthy | 21 2 0 | 08 May 2024
IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages | Harman Singh, Nitish Gupta, Shikhar Bharadwaj, Dinesh Tewari, Partha P. Talukdar | ELM 32 22 0 | 25 Apr 2024
A Material Lens on Coloniality in NLP | William B. Held, Camille Harris, Michael Best, Diyi Yang | 25 11 0 | 14 Nov 2023
SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects | David Ifeoluwa Adelani, Hannah Liu, Xiaoyu Shen, Nikita Vassilyev, Jesujoba Oluwadara Alabi, Yanke Mao, Haonan Gao, Annie En-Shiun Lee | ELM 38 60 0 | 14 Sep 2023
BHASA: A Holistic Southeast Asian Linguistic and Cultural Evaluation Suite for Large Language Models | Wei Qi Leong, Jian Gang Ngui, Yosephine Susanto, Hamsawardhini Rengarajan, Kengatharaiyer Sarveswaran, William-Chandra Tjhi | 26 9 0 | 12 Sep 2023
A Comprehensive Analysis of Adapter Efficiency | Nandini Mundra, Sumanth Doddapaneni, Raj Dabre, Anoop Kunchukuttan, Ratish Puduppully, Mitesh M. Khapra | 23 10 0 | 12 May 2023
Romanization-based Large-scale Adaptation of Multilingual Language Models | Sukannya Purkayastha, Sebastian Ruder, Jonas Pfeiffer, Iryna Gurevych, Ivan Vulić | 28 13 0 | 18 Apr 2023
Towards Leaving No Indic Language Behind: Building Monolingual Corpora, Benchmark and Models for Indic Languages | Sumanth Doddapaneni, Rahul Aralikatte, Gowtham Ramesh, Shreyansh Goyal, Mitesh M. Khapra, Anoop Kunchukuttan, Pratyush Kumar | ELM 44 81 0 | 11 Dec 2022
Aksharantar: Open Indic-language Transliteration datasets and models for the Next Billion Users | Yash Madhani, Sushane Parthan, Priyanka A. Bedekar, N. Gokul, Ruchi Khapra, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra | 23 23 0 | 06 May 2022
Word Alignment by Fine-tuning Embeddings on Parallel Corpora | Zi-Yi Dou, Graham Neubig | 96 257 0 | 20 Jan 2021