Unsupervised Cross-lingual Representation Learning at Scale

5 November 2019
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov

Papers citing "Unsupervised Cross-lingual Representation Learning at Scale"

50 / 1,190 papers shown
Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation
Xiang Geng
Zhejian Lai
Jiajun Chen
Hao Yang
Shujian Huang
62
0
0
27 Feb 2025
NaijaNLP: A Survey of Nigerian Low-Resource Languages
Isa Inuwa-Dutse
44
0
0
27 Feb 2025
LiGT: Layout-infused Generative Transformer for Visual Question Answering on Vietnamese Receipts
Thanh-Phong Le
Trung Le Chi Phan
Nghia Hieu Nguyen
Kiet Van Nguyen
ViT
49
0
0
26 Feb 2025
Language Models' Factuality Depends on the Language of Inquiry
Tushar Aggarwal
Kumar Tanmay
Ayush Agrawal
Kumar Ayush
Hamid Palangi
Paul Pu Liang
HILM
KELM
71
1
0
25 Feb 2025
Entity Framing and Role Portrayal in the News
Tarek Mahmoud
Zhuohan Xie
Dimitar Dimitrov
Nikolaos Nikolaidis
Purificação Silvano
...
Elisa Sartori
Nicolas Stefanovitch
Giovanni Da San Martino
Jakub Piskorski
Preslav Nakov
47
0
0
21 Feb 2025
SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation
Song Duong
Florian Le Bronnec
Alexandre Allauzen
Vincent Guigue
Alberto Lumbreras
Laure Soulier
Patrick Gallinari
HILM
50
0
0
20 Feb 2025
Do we still need Human Annotators? Prompting Large Language Models for Aspect Sentiment Quad Prediction
Nils Constantin Hellwig
Jakob Fehle
Udo Kruschwitz
Christian Wolff
AI4MH
46
0
0
18 Feb 2025
DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection
Yingli Shen
Wen Lai
Shuo Wang
Xueren Zhang
Kangyang Luo
Alexander Fraser
Maosong Sun
49
1
0
17 Feb 2025
URIEL+: Enhancing Linguistic Inclusion and Usability in a Typological and Multilingual Knowledge Base
Aditya Khan
Mason Shipton
David Anugraha
Kaiyao Duan
Phuong H. Hoang
Eric Khiu
A. Seza Doğruöz
En-Shiun Annie Lee
VLM
54
3
0
17 Feb 2025
A Large-Scale Benchmark for Vietnamese Sentence Paraphrases
Sang Quang Nguyen
Kiet Van Nguyen
62
0
0
11 Feb 2025
RideKE: Leveraging Low-Resource, User-Generated Twitter Content for Sentiment and Emotion Detection in Kenyan Code-Switched Dataset
Naome A. Etori
Maria Gini
81
2
0
10 Feb 2025
SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation
Saurabh Kumar Pandey
S. Vashistha
Debrup Das
Somak Aditya
Monojit Choudhury
AAML
74
0
0
10 Feb 2025
ARISE: Iterative Rule Induction and Synthetic Data Generation for Text Classification
Y. Meena
Vaibhav Singh
Ayush Maheshwari
Amrith Krishna
Ganesh Ramakrishnan
AI4TS
163
0
0
09 Feb 2025
On Memory Construction and Retrieval for Personalized Conversational Agents
Zhuoshi Pan
Qianhui Wu
Huiqiang Jiang
Xufang Luo
Hao Cheng
...
Yuqing Yang
Chin-Yew Lin
H. Vicky Zhao
Lili Qiu
Jianfeng Gao
RALM
61
3
0
08 Feb 2025
Beyond English: Evaluating Automated Measurement of Moral Foundations in Non-English Discourse with a Chinese Case Study
Calvin Cheng
Scott A. Hale
221
0
0
04 Feb 2025
Multilingual Attribute Extraction from News Web Pages
Pavel Bedrin
Maksim Varlamov
Alexander Yatskov
59
1
0
04 Feb 2025
Multilingual State Space Models for Structured Question Answering in Indic Languages
A. Vats
Rahul Raja
Mrinal Mathur
Vinija Jain
Aman Chadha
72
1
0
01 Feb 2025
Revisiting Projection-based Data Transfer for Cross-Lingual Named Entity Recognition in Low-Resource Languages
Andrei Politov
Oleh Shkalikov
René Jäkel
Michael Färber
61
0
0
30 Jan 2025
mHumanEval -- A Multilingual Benchmark to Evaluate Large Language Models for Code Generation
Nishat Raihan
Antonios Anastasopoulos
Marcos Zampieri
ELM
45
6
0
28 Jan 2025
When LLM Meets DRL: Advancing Jailbreaking Efficiency via DRL-guided Search
Xuan Chen
Yuzhou Nie
Wenbo Guo
Xiangyu Zhang
115
10
0
28 Jan 2025
MEL: Legal Spanish Language Model
David Betancur Sánchez
Nuria Aldama García
Á. Jiménez
Marta Guerrero Nieto
Patricia Marsà Morales
Nicolás Serrano Salas
Carlos García Hernán
Pablo Haya Coll
Elena Montiel Ponsoda
Pablo Calleja Ibáñez
53
0
0
28 Jan 2025
Hands-On Tutorial: Labeling with LLM and Human-in-the-Loop
Ekaterina Artemova
Akim Tsvigun
Dominik Schlechtweg
Natalia Fedorova
Konstantin Chernyshev
Sergei Tilga
Boris Obmoroshev
SyDa
VLM
187
0
0
28 Jan 2025
IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding
Sankalp KJ
Ashutosh Kumar
Laxmaan Balaji
Nikunj Kotecha
Vinija Jain
Aman Chadha
S. Bhaduri
ELM
209
1
0
27 Jan 2025
MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation
Fu Rong
Meng Lan
Qian Zhang
Lefei Zhang
VOS
VGen
73
1
0
23 Jan 2025
A Comprehensive Social Bias Audit of Contrastive Vision Language Models
Zahraa Al Sahili
Ioannis Patras
Matthew Purver
VLM
82
1
0
22 Jan 2025
Comparative Approaches to Sentiment Analysis Using Datasets in Major European and Arabic Languages
Mikhail Krasitskii
Olga Kolesnikova
Liliana Chanona Hernandez
Grigori Sidorov
Alexander Gelbukh
71
2
0
21 Jan 2025
FuocChuVIP123 at CoMeDi Shared Task: Disagreement Ranking with XLM-Roberta Sentence Embeddings and Deep Neural Regression
Phuoc Duong Huy Chu
34
0
0
21 Jan 2025
Can MLLMs Generalize to Multi-Party dialog? Exploring Multilingual Response Generation in Complex Scenarios
Zhongtian Hu
Yiwen Cui
Ronghan Li
Meng Zhao
Lifang Wang
41
0
0
20 Jan 2025
News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation
Andreea Iana
Fabian David Schmidt
Goran Glavas
Heiko Paulheim
71
3
0
20 Jan 2025
From Scarcity to Capability: Empowering Fake News Detection in Low-Resource Languages with LLMs
Hrithik Majumdar Shibu
Shrestha Datta
Md. Sumon Miah
Nasrullah Sami
Mahruba Sharmin Chowdhury
Md. Saiful Islam
69
0
0
17 Jan 2025
Harnessing Large Language Models for Disaster Management: A Survey
Zhenyu Lei
Yushun Dong
Weiyu Li
Rong Ding
Qi Wang
Jundong Li
AI4CE
56
3
0
12 Jan 2025
Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages
Jannik Brinkmann
Chris Wendler
Christian Bartelt
Aaron Mueller
54
11
0
10 Jan 2025
BERTopic for Topic Modeling of Hindi Short Texts: A Comparative Study
Atharva Mutsaddi
Anvi Jamkhande
Aryan Thakre
Yashodhara Haribhakta
28
0
0
08 Jan 2025
BabyLMs for isiXhosa: Data-Efficient Language Modelling in a Low-Resource Context
Alexis Matzopoulos
Charl Hendriks
Hishaam Mahomed
Francois Meyer
30
0
0
08 Jan 2025
IntegrityAI at GenAI Detection Task 2: Detecting Machine-Generated Academic Essays in English and Arabic Using ELECTRA and Stylometry
Mohammad AL-Smadi
38
0
0
07 Jan 2025
From Reading to Compressing: Exploring the Multi-document Reader for Prompt Compression
Eunseong Choi
Sunkyung Lee
Minjin Choi
June Park
Jongwuk Lee
67
1
0
03 Jan 2025
LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models
Hieu Man
Nghia Trung Ngo
Viet Dac Lai
Ryan Rossi
Franck Dernoncourt
T. Nguyen
226
0
0
01 Jan 2025
Chain-of-Translation Prompting (CoTR): A Novel Prompting Technique for Low Resource Languages
Tejas Deshpande
Nidhi Kowtal
Raviraj Joshi
LRM
55
1
0
31 Dec 2024
Extending LLMs to New Languages: A Case Study of Llama and Persian Adaptation
Samin Mahdizadeh Sani
Pouya Sadeghi
Thuy-Trang Vu
Yadollah Yaghoobzadeh
Gholamreza Haffari
78
2
0
17 Dec 2024
RCLMuFN: Relational Context Learning and Multiplex Fusion Network for Multimodal Sarcasm Detection
Tongguan Wang
Junkai Li
Guixin Su
Yongcheng Zhang
Dongyu Su
Yuxue Hu
Ying Sha
111
2
0
17 Dec 2024
RoundTripOCR: A Data Generation Technique for Enhancing Post-OCR Error Correction in Low-Resource Devanagari Languages
Harshvivek Kashid
Pushpak Bhattacharyya
89
1
0
14 Dec 2024
jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images
Andreas Koukounas
Georgios Mastrapas
Bo Wang
Mohammad Kalim Akram
Sedigheh Eslami
Michael Gunther
Isabelle Mohr
Saba Sturua
Scott Martens
Nan Wang
VLM
118
7
0
11 Dec 2024
A Multi-way Parallel Named Entity Annotated Corpus for English, Tamil and Sinhala
Surangika Ranathunga
Asanka Ranasinghe
Janaka Shamal
Ayodya Dandeniya
Rashmi Galappaththi
Malithi Samaraweera
83
0
0
03 Dec 2024
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
Zhaofeng Wu
Xinyan Velocity Yu
Dani Yogatama
Jiasen Lu
Yoon Kim
AIFin
54
13
0
07 Nov 2024
Prompting with Phonemes: Enhancing LLMs' Multilinguality for Non-Latin Script Languages
Hoang Nguyen
Khyati Mahajan
Vikas Yadav
Philip S. Yu
Masoud Hashemi
Rishabh Maheshwary
54
0
0
04 Nov 2024
DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios
Junchao Wu
Runzhe Zhan
Derek F. Wong
Shu Yang
Xinyi Yang
Yulin Yuan
Lidia S. Chao
DeLMO
58
2
0
31 Oct 2024
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages
Amir Hossein Kargaran
François Yvon
Hinrich Schutze
VLM
46
5
0
31 Oct 2024
User-Aware Multilingual Abusive Content Detection in Social Media
Mohammad Zia Ur Rehman
Somya Mehta
Kuldeep Singh
Kunal Kaushik
Nagendra Kumar
23
14
0
26 Oct 2024
Scaling up Masked Diffusion Models on Text
Shen Nie
Fengqi Zhu
Chao Du
Tianyu Pang
Qian Liu
Guangtao Zeng
Min Lin
Chongxuan Li
AI4CE
63
14
0
24 Oct 2024
Monolingual and Multilingual Misinformation Detection for Low-Resource Languages: A Comprehensive Survey
Xinyu Wang
Wenbo Zhang
Sarah Rajtmajer
37
1
0
24 Oct 2024