ResearchTrend.AI
NusaCrowd: Open Source Initiative for Indonesian NLP Resources

19 December 2022
Samuel Cahyawijaya
Holy Lovenia
Alham Fikri Aji
Genta Indra Winata
Bryan Wilie
Rahmad Mahendra
C. Wibisono
Ade Romadhony
Karissa Vincentio
Fajri Koto
Jennifer Santoso
David Moeljadi
Cahya Wirawan
Frederikus Hudi
Ivan Halim Parmonangan
Ika Alfina
Muhammad Satrio Wicaksono
Ilham Firdausi Putra
Samsul Rahmadani
Yulianti Oenang
Ali Akbar Septiandri
James Jaya
Kaustubh D. Dhole
Arie A. Suryani
Rifki Afina Putri
Dan Su
K. Stevens
Made Nindyatama Nityasya
Muhammad Farid Adilazuarda
Ryan Ignatius
Ryandito Diandaru
Tiezheng Yu
Vito Ghifari
Wenliang Dai
Yan Xu
Dyah Damapuspita
C. Tho
I. M. K. Karo
Tirana Noor Fatyanosa
Ziwei Ji
Pascale Fung
Graham Neubig
Timothy Baldwin
Sebastian Ruder
Herry Sujaini
S. Sakti
Ayu Purwarianti

Papers citing "NusaCrowd: Open Source Initiative for Indonesian NLP Resources"

36 / 36 papers shown
Assessing Thai Dialect Performance in LLMs with Automatic Benchmarks and Human Evaluation
Peerat Limkonchotiwat
Kanruethai Masuk
Surapon Nonesung
Chalermpun Mai-On
Sarana Nutanong
Wuttikorn Ponwitayarat
Potsawee Manakul
21
0
0
08 Apr 2025
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
Angelika Romanou
Negar Foroutan
Anna Sotnikova
Zeming Chen
Sree Harsha Nelaturu
...
Mike Zhang
Imanol Schlag
Marzieh Fadaee
Sara Hooker
Antoine Bosselut
ELM
113
6
0
29 Nov 2024
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia
Fajri Koto
ELM
47
2
0
13 Sep 2024
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets
Jiatong Shi
Shih-Heng Wang
William Chen
Martijn Bartelds
Vanya Bannihatti Kumar
...
Xuankai Chang
Dan Jurafsky
Karen Livescu
Hung-yi Lee
Shinji Watanabe
AuLLM
77
5
0
12 Jun 2024
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
David Romero
Chenyang Lyu
Haryo Akbarianto Wibowo
Teresa Lynn
Injy Hamed
...
Oana Ignat
Joan Nwatu
Rada Mihalcea
Thamar Solorio
Alham Fikri Aji
48
25
0
10 Jun 2024
Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages
Samuel Cahyawijaya
Holy Lovenia
Fajri Koto
Rifki Afina Putri
Emmanuel Dave
...
Bryan Wilie
Genta Indra Winata
Alham Fikri Aji
Ayu Purwarianti
Pascale Fung
55
15
0
09 Apr 2024
Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers
Libo Qin
Qiguang Chen
Yuhang Zhou
Zhi Chen
Hai-Tao Zheng
Lizi Liao
Min Li
Wanxiang Che
Philip S. Yu
LRM
55
36
0
07 Apr 2024
LLMs Are Few-Shot In-Context Low-Resource Language Learners
Samuel Cahyawijaya
Holy Lovenia
Pascale Fung
46
35
0
25 Mar 2024
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Ahmet Üstün
Viraat Aryabumi
Zheng-Xin Yong
Wei-Yin Ko
Daniel D'souza
...
Shayne Longpre
Niklas Muennighoff
Marzieh Fadaee
Julia Kreutzer
Sara Hooker
ALM
ELM
SyDa
LRM
35
194
0
12 Feb 2024
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
Shivalika Singh
Freddie Vargus
Daniel D'souza
Börje F. Karlsson
Abinaya Mahendiran
...
Max Bartolo
Julia Kreutzer
Ahmet Üstün
Marzieh Fadaee
Sara Hooker
119
117
0
09 Feb 2024
LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Jun Zhao
Zhihao Zhang
Luhui Gao
Qi Zhang
Tao Gui
Xuanjing Huang
ELM
35
65
0
02 Jan 2024
Replicable Benchmarking of Neural Machine Translation (NMT) on Low-Resource Local Languages in Indonesia
Lucky Susanto
Ryandito Diandaru
Adila Alfa Krisnadhi
Ayu Purwarianti
Derry Wijaya
22
2
0
02 Nov 2023
IndoToD: A Multi-Domain Indonesian Benchmark For End-to-End Task-Oriented Dialogue Systems
Muhammad Dehan Al Kautsar
Rahmah Khoirussyifa' Nurdini
Samuel Cahyawijaya
Genta Indra Winata
Ayu Purwarianti
21
0
0
02 Nov 2023
Utilizing Weak Supervision To Generate Indonesian Conservation Dataset
Mega Fransiska
Diah Pitaloka
Saripudin
Satrio Putra
Lintang Sutawika
27
0
0
17 Oct 2023
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
Jiatong Shi
William Chen
Dan Berrebbi
Hsiu-Hsuan Wang
Wei-Ping Huang
...
Yuxun Tang
Shang-Wen Li
Abdelrahman Mohamed
Hung-yi Lee
Shinji Watanabe
LRM
ELM
39
15
0
09 Oct 2023
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages
Samuel Cahyawijaya
Holy Lovenia
Fajri Koto
Dea Adhista
Emmanuel Dave
...
Genta Indra Winata
David Moeljadi
Alham Fikri Aji
Ayu Purwarianti
Pascale Fung
46
7
0
19 Sep 2023
Enriching the NArabizi Treebank: A Multifaceted Approach to Supporting an Under-Resourced Language
Arij Riabi
Menel Mahamdi
Djamé Seddah
34
5
0
26 Jun 2023
Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition
Samuel Cahyawijaya
Holy Lovenia
Willy Chung
Rita Frieske
Zihan Liu
Pascale Fung
39
1
0
26 Jun 2023
Lost in Translation: Large Language Models in Non-English Content Analysis
Gabriel Nicholas
Aliya Bhatia
ELM
15
35
0
12 Jun 2023
GlobalBench: A Benchmark for Global Progress in Natural Language Processing
Yueqi Song
Catherine Cui
Simran Khanuja
Pengfei Liu
Fahim Faisal
...
Alham Fikri Aji
Samuel Cahyawijaya
Yulia Tsvetkov
Antonios Anastasopoulos
Graham Neubig
19
7
0
24 May 2023
Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages
Zheng-Xin Yong
Ruochen Zhang
Jessica Zosa Forde
Skyler Wang
Arjun Subramonian
...
Yinghua Tan
Long Phan
Rowena Garcia
Thamar Solorio
Alham Fikri Aji
LRM
57
46
0
23 Mar 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
Yejin Bang
Samuel Cahyawijaya
Nayeon Lee
Wenliang Dai
Dan Su
...
Tiezheng Yu
Willy Chung
Quyet V. Do
Yan Xu
Pascale Fung
ReLM
LRM
27
1,335
0
08 Feb 2023
Language Models are Multilingual Chain-of-Thought Reasoners
Freda Shi
Mirac Suzgun
Markus Freitag
Xuezhi Wang
Suraj Srivats
...
Yi Tay
Sebastian Ruder
Denny Zhou
Dipanjan Das
Jason W. Wei
ReLM
LRM
172
327
0
06 Oct 2022
Towards Answering Open-ended Ethical Quandary Questions
Yejin Bang
Nayeon Lee
Tiezheng Yu
Leila Khalatbari
Yan Xu
...
Romain Barraud
Elham J. Barezi
Andrea Madotto
Hayden Kee
Pascale Fung
ELM
32
6
0
12 May 2022
Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation
Olga Majewska
E. Razumovskaia
E. Ponti
Ivan Vulić
Anna Korhonen
32
28
0
31 Jan 2022
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole
Varun Gangal
Sebastian Gehrmann
Aadesh Gupta
Zhenhao Li
...
Tianbao Xie
Usama Yaseen
Michael A. Yee
Jing Zhang
Yue Zhang
174
86
0
06 Dec 2021
Masader: Metadata Sourcing for Arabic Text and Speech Data Resources
Zaid Alyafeai
Maraim Masoud
Mustafa Ghaleb
Maged S. Al-Shaibani
44
25
0
13 Oct 2021
Visually Grounded Reasoning across Languages and Cultures
Fangyu Liu
Emanuele Bugliarello
E. Ponti
Siva Reddy
Nigel Collier
Desmond Elliott
VLM
LRM
109
168
0
28 Sep 2021
IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization
Fajri Koto
Jey Han Lau
Timothy Baldwin
VLM
55
82
0
10 Sep 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
293
1,084
0
17 Feb 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin P. Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei-ping Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
254
285
0
02 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
276
1,996
0
31 Dec 2020
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
142
221
0
31 Dec 2020
PhoBERT: Pre-trained language models for Vietnamese
Dat Quoc Nguyen
A. Nguyen
174
341
0
02 Mar 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
233
576
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018