1802.06893
Cited By
Learning Word Vectors for 157 Languages
19 February 2018
Edouard Grave
Piotr Bojanowski
Prakhar Gupta
Armand Joulin
Tomáš Mikolov
SSL
FaML
Papers citing
"Learning Word Vectors for 157 Languages"
50 / 390 papers shown
Title
Toward Data-centric Directed Graph Learning: An Entropy-driven Approach
Xunkai Li
Zhengyu Wu
Kaichi Yu
Hongchao Qin
Guang Zeng
Rong-Hua Li
Guoren Wang
35
0
0
02 May 2025
Enhancing NER Performance in Low-Resource Pakistani Languages using Cross-Lingual Data Augmentation
Toqeer Ehsan
Thamar Solorio
158
0
0
07 Apr 2025
ARLED: Leveraging LED-based ARMAN Model for Abstractive Summarization of Persian Long Documents
Samira Zangooei
Amirhossein Darmani
Hossein Farahmand Nezhad
Laya Mahmoudi
47
0
0
13 Mar 2025
KréyoLID From Language Identification Towards Language Mining
Rasul Dent
Pedro Ortiz Suarez
Thibault Clérice
Benoît Sagot
51
0
0
09 Mar 2025
HeTGB: A Comprehensive Benchmark for Heterophilic Text-Attributed Graphs
Shujie Li
Yuxia Wu
Chuan Shi
Yuan Fang
44
0
0
05 Mar 2025
Figurative Archive: an open dataset and web-based application for the study of metaphor
Maddalena Bressler
Veronica Mangiaterra
Paolo Canal
Federico Frau
Fabrizio Luciani
...
Chiara Battaglini
Chiara Pompei
Fortunata Romeo
L. Bischetti
V. Bambini
33
0
0
01 Mar 2025
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
Alon Albalak
Duy Phung
Nathan Lile
Rafael Rafailov
Kanishk Gandhi
...
Anikait Singh
Chase Blagden
Violet Xiang
Dakota Mahan
Nick Haber
OffRL
LRM
53
6
0
24 Feb 2025
Multi-label Scandinavian Language Identification (SLIDE)
Mariia Fedorova
Jonas Sebulon Frydenberg
Victoria Handford
Victoria Ovedie Chruickshank Langø
Solveig Helene Willoch
Marthe Løken Midtgaard
Yves Scherrer
Petter Mæhlum
David Samuel
54
0
0
10 Feb 2025
Beyond English: Evaluating Automated Measurement of Moral Foundations in Non-English Discourse with a Chinese Case Study
Calvin Cheng
Scott A. Hale
179
0
0
04 Feb 2025
Toward Effective Digraph Representation Learning: A Magnetic Adaptive Propagation based Approach
Xunkai Li
Daohan Su
Zhengyu Wu
Guang Zeng
Hongchao Qin
Rong-Hua Li
Guoren Wang
AI4CE
33
0
0
21 Jan 2025
Human-like conceptual representations emerge from language prediction
Ningyu Xu
Qi Zhang
Chao Du
Qiang Luo
Xipeng Qiu
Xuanjing Huang
Menghan Zhang
70
0
0
21 Jan 2025
Enriching Social Science Research via Survey Item Linking
Tornike Tsereteli
Daniel Ruffinelli
Simone Paolo Ponzetto
LRM
75
0
0
20 Dec 2024
ClustEm4Ano: Clustering Text Embeddings of Nominal Textual Attributes for Microdata Anonymization
Robert Aufschläger
Sebastian Wilhelm
Michael Heigl
Martin Schramm
66
0
0
17 Dec 2024
Bilingual BSARD: Extending Statutory Article Retrieval to Dutch
Ehsan Lotfi
Nikolay Banar
Nerses Yuzbashyan
Walter Daelemans
AILaw
71
0
0
10 Dec 2024
Pay Attention to the Robustness of Chinese Minority Language Models! Syllable-level Textual Adversarial Attack on Tibetan Script
Xi Cao
Dolma Dawa
Nuo Qun
Trashi Nyima
AAML
97
3
0
03 Dec 2024
Words and Action: Modeling Linguistic Leadership in #BlackLivesMatter Communities
Dani Roytburg
Deborah Olorunisola
Sandeep Soni
Lauren F. Klein
64
0
0
03 Dec 2024
HJ-Ky-0.1: an Evaluation Dataset for Kyrgyz Word Embeddings
Anton M. Alekseev
Gulnara Kabaeva
21
0
0
16 Nov 2024
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages
Amir Hossein Kargaran
François Yvon
Hinrich Schütze
VLM
36
5
0
31 Oct 2024
Building Dialogue Understanding Models for Low-resource Language Indonesian from Scratch
Donglin Di
Weinan Zhang
Yue Zhang
Fanglin Wang
23
1
0
24 Oct 2024
Experiences from Creating a Benchmark for Sentiment Classification for Varieties of English
Dipankar Srirag
Jordan Painter
Aditya Joshi
Diptesh Kanojia
25
0
0
15 Oct 2024
Effective Self-Mining of In-Context Examples for Unsupervised Machine Translation with LLMs
Abdellah El Mekki
Muhammad Abdul-Mageed
LRM
36
0
0
14 Oct 2024
DiRW: Path-Aware Digraph Learning for Heterophily
Daohan Su
Xunkai Li
Zhenjun Li
Yinping Liao
Rong-Hua Li
Guoren Wang
35
1
0
14 Oct 2024
Data Processing for the OpenGPT-X Model Family
Nicolò Brandizzi
Hammam Abdelwahab
Anirban Bhowmick
Lennard Helmer
Benny Jörg Stein
...
Georg Rehm
Dennis Wegener
Nicolas Flores-Herr
Joachim Kohler
Johannes Leveling
VLM
81
2
0
11 Oct 2024
SkillMatch: Evaluating Self-supervised Learning of Skill Relatedness
Jens-Joris Decorte
Jeroen Van Hautte
Thomas Demeester
Chris Develder
21
0
0
07 Oct 2024
Is deeper always better? Replacing linear mappings with deep learning networks in the Discriminative Lexicon Model
Maria Heitmeier
Valeria Schmidt
Hendrik P. A. Lensch
R. Baayen
44
1
0
05 Oct 2024
Beyond Film Subtitles: Is YouTube the Best Approximation of Spoken Vocabulary?
Adam Nohejl
Frederikus Hudi
Eunike Andriani Kardinata
Shintaro Ozaki
Maria Angelica Riera Machin
Hongyu Sun
Justin Vasselli
Taro Watanabe
39
2
0
04 Oct 2024
Customized Information and Domain-centric Knowledge Graph Construction with Large Language Models
Frank Wawrzik
Matthias Plaue
Savan Vekariya
Christoph Grimm
19
0
0
30 Sep 2024
From LIMA to DeepLIMA: following a new path of interoperability
Victor Bocharov
Romaric Besançon
Gaël de Chalendar
Olivier Ferret
N. Semmar
23
1
0
10 Sep 2024
OpenFGL: A Comprehensive Benchmark for Federated Graph Learning
Xunkai Li
Yichen Zhu
Boyang Pang
Guochen Yan
Yeyu Yan
Zening Li
Zhengyu Wu
Wentao Zhang
Rong-Hua Li
Guoren Wang
FedML
33
1
0
29 Aug 2024
An Evaluation of Sindhi Word Embedding in Semantic Analogies and Downstream Tasks
Wazir Ali
Saifullah Tumrani
Jay Kumar
Tariq Rahim Soomro
21
0
0
28 Aug 2024
Latin Treebanks in Review: An Evaluation of Morphological Tagging Across Time
Marisa Hudspeth
Brendan O’Connor
Laure Thompson
23
1
0
13 Aug 2024
Representation Bias of Adolescents in AI: A Bilingual, Bicultural Study
Robert Wolfe
Aayushi Dangol
Bill Howe
Alexis Hiniker
18
3
0
04 Aug 2024
ALLaM: Large Language Models for Arabic and English
M Saiful Bari
Yazeed Alnumay
Norah A. Alzahrani
Nouf M. Alotaibi
H. A. Alyahya
...
Jeril Kuriakose
Abdalghani Abujabal
Nora Al-Twairesh
Areeb Alowisheq
Haidar Khan
42
11
0
22 Jul 2024
A Review of the Challenges with Massive Web-mined Corpora Used in Large Language Models Pre-Training
Michał Perełkiewicz
Rafał Poświata
45
1
0
10 Jul 2024
Revisiting the Message Passing in Heterophilous Graph Neural Networks
Zhuonan Zheng
Yuan-Qi Bei
Sheng Zhou
Yao Ma
Ming Gu
Hongjia Xu
Chengyu Lai
Jiawei Chen
Jiajun Bu
73
0
0
28 May 2024
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
...
Ping-Chia Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang
58
36
0
26 May 2024
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach
Huy V. Vo
Vasil Khalidov
Timothée Darcet
Théo Moutakanni
Nikita Smetanin
...
Maxime Oquab
Armand Joulin
Hervé Jégou
Patrick Labatut
Piotr Bojanowski
SSL
59
18
0
24 May 2024
GotFunding: A grant recommendation system based on scientific articles
Tong Zeng
Daniel Ernesto Acuna
AI4TS
16
4
0
21 May 2024
A Comprehensive Analysis of Static Word Embeddings for Turkish
Karahan Sarıtaş
Cahid Arda Öz
Tunga Güngör
23
3
0
13 May 2024
NLU-STR at SemEval-2024 Task 1: Generative-based Augmentation and Encoder-based Scoring for Semantic Textual Relatedness
Sanad Malaysha
Mustafa Jarrar
Mohammed Khalilia
37
4
0
01 May 2024
Reliability Estimation of News Media Sources: Birds of a Feather Flock Together
Sergio Burdisso
Dairazalia Sanchez-Cortes
Esaú Villatoro-Tello
P. Motlícek
40
5
0
15 Apr 2024
The Shape of Word Embeddings: Recognizing Language Phylogenies through Topological Data Analysis
Ondřej Draganov
Steven Skiena
19
0
0
30 Mar 2024
Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages
Rik van Noord
Taja Kuzman
Peter Rupnik
Nikola Ljubesic
Miquel Espla-Gomis
Gema Ramírez-Sánchez
Antonio Toral
ALM
34
2
0
13 Mar 2024
On the Tip of the Tongue: Analyzing Conceptual Representation in Large Language Models with Reverse-Dictionary Probe
Ningyu Xu
Qi Zhang
Menghan Zhang
Peng Qian
Xuanjing Huang
LRM
67
3
0
22 Feb 2024
WERank: Towards Rank Degradation Prevention for Self-Supervised Learning Using Weight Regularization
Ali Saheb Pasand
Reza Moravej
Mahdi Biparva
Ali Ghodsi
42
2
0
14 Feb 2024
Cross-lingual Transfer Learning for Javanese Dependency Parsing
Fadli Aulawi Al Ghiffari
Ika Alfina
Kurniawati Azizah
13
3
0
22 Jan 2024
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Ruizhe Li
Chao Zhang
Pin-Yu Chen
Ensiong Chng
27
20
0
19 Jan 2024
Antonym vs Synonym Distinction using InterlaCed Encoder NETworks (ICE-NET)
Muhammad Asif Ali
Yan Hu
Jianbin Qin
Di Wang
17
1
0
18 Jan 2024
Cross-lingual Offensive Language Detection: A Systematic Review of Datasets, Transfer Approaches and Challenges
Aiqi Jiang
A. Zubiaga
AAML
31
3
0
17 Jan 2024
Explain Thyself Bully: Sentiment Aided Cyberbullying Detection with Explanation
Krishanu Maity
Prince Jha
Raghav Jain
S. Saha
P. Bhattacharyya
15
1
0
17 Jan 2024