MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages

18 April 2022
Jack G. M. FitzGerald
C. Hench
Charith Peris
Scott Mackie
Kay Rottmann
A. Sánchez
Aaron Nash
Liam Urbach
Vishesh Kakarala
Richa Singh
Swetha Ranganath
Laurie Crist
Misha Britan
Wouter Leeuwis
Gokhan Tur
Premkumar Natarajan
arXiv: 2204.08582

Papers citing "MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages"

27 / 27 papers shown
Survey of Abstract Meaning Representation: Then, Now, Future
Behrooz Mansouri
3DV
233
0
0
06 May 2025
Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
Tiansheng Wen
Yifei Wang
Zequn Zeng
Zhong Peng
Yudi Su
Xinyang Liu
Bo Chen
Hongwei Liu
Stefanie Jegelka
Chenyu You
CLL
76
3
0
03 Mar 2025
ARISE: Iterative Rule Induction and Synthetic Data Generation for Text Classification
Y. Meena
Vaibhav Singh
Ayush Maheshwari
Amrith Krishna
Ganesh Ramakrishnan
AI4TS
178
0
0
09 Feb 2025
Automatic Labelling with Open-source LLMs using Dynamic Label Schema Integration
Thomas Walshe
S. Moon
Chunyang Xiao
Yawwani Gunawardana
Fran Silavong
50
2
0
21 Jan 2025
Text Clustering as Classification with LLMs
Chen Huang
Guoxiu He
44
2
0
03 Jan 2025
Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond
Beomseok Lee
Ioan Calapodescu
Marco Gaido
Matteo Negri
Laurent Besacier
AuLLM
39
4
0
07 Aug 2024
The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding
Kenneth C. Enevoldsen
Márton Kardos
Niklas Muennighoff
Kristoffer Laigaard Nielbo
42
9
0
04 Jun 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
M. Shoeybi
Bryan Catanzaro
Ming-Yu Liu
RALM
68
152
0
27 May 2024
k* Distribution: Evaluating the Latent Space of Deep Neural Networks using Local Neighborhood Analysis
Shashank Kotyan
Tatsuya Ueda
Danilo Vasconcellos Vargas
34
1
0
07 Dec 2023
Primacy Effect of ChatGPT
Yiwei Wang
Yujun Cai
Muhao Chen
Keli Zhang
Bryan Hooi
ALM
AI4MH
LRM
38
15
0
20 Oct 2023
Language Models are Universal Embedders
Xin Zhang
Zehan Li
Yanzhao Zhang
Dingkun Long
Pengjun Xie
Meishan Zhang
Min Zhang
KELM
ELM
58
6
0
12 Oct 2023
CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss
R. S. Srinivasa
Jaejin Cho
Chouchang Yang
Yashas Malur Saidutta
Ching Hua Lee
Yilin Shen
Hongxia Jin
VLM
38
8
0
26 Sep 2023
Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems
Songbo Hu
Han Zhou
Mete Hergul
Milan Gritta
Guchun Zhang
Ignacio Iacobacci
Ivan Vulić
Anna Korhonen
41
10
0
26 Jul 2023
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models
Peiqin Lin
Chengzhi Hu
Zheyu Zhang
André F. T. Martins
Hinrich Schütze
35
1
0
23 May 2023
Zero-Shot End-to-End Spoken Language Understanding via Cross-Modal Selective Self-Training
Jianfeng He
Julian Salazar
Kaisheng Yao
Haoqi Li
Jason (Jinglun) Cai
VLM
17
7
0
22 May 2023
Generalized Multiple Intent Conditioned Slot Filling
Harshil Shah
Arthur Wilcke
Marius Cobzarenco
Cristian C Cobzarenco
Edward Challis
David Barber
18
0
0
18 May 2023
Measuring and Mitigating Local Instability in Deep Neural Networks
Arghya Datta
Subhrangshu Nandi
Jingcheng Xu
Greg Ver Steeg
He Xie
Anoop Kumar
Aram Galstyan
30
3
0
18 May 2023
The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation
Mutian He
Philip N. Garner
46
4
0
16 May 2023
DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains
Yanis Labrak
Adrien Bazoge
Richard Dufour
Mickael Rouvier
Emmanuel Morin
B. Daille
P. Gourraud
LM&MA
25
54
0
03 Apr 2023
RETVec: Resilient and Efficient Text Vectorizer
Elie Bursztein
Marina Zhang
Owen Vallis
Xinyu Jia
Alexey Kurakin
VLM
32
4
0
18 Feb 2023
MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue
Nikita Moghe
E. Razumovskaia
Liane Guillou
Ivan Vulić
Anna Korhonen
Alexandra Birch
45
13
0
20 Dec 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
148
2,319
0
09 Nov 2022
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks
Colin Leong
Joshua Nemecek
Jacob Mansdorfer
Anna Filighera
A. Owodunni
Daniel Whitenack
VLM
AI4CE
51
24
0
26 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
129
95
0
06 Oct 2022
LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging
Andrew Rosenbaum
Saleh Soltan
Wael Hamza
Yannick Versley
M. Boese
26
43
0
20 Sep 2022
Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation
Olga Majewska
E. Razumovskaia
Edoardo Ponti
Ivan Vulić
Anna Korhonen
39
28
0
31 Jan 2022
Crossing the Conversational Chasm: A Primer on Natural Language Processing for Multilingual Task-Oriented Dialogue Systems
E. Razumovskaia
Goran Glavaš
Olga Majewska
Edoardo Ponti
Anna Korhonen
Ivan Vulić
36
32
0
17 Apr 2021