ResearchTrend.AI
Enriching Word Vectors with Subword Information

15 July 2016
Piotr Bojanowski
Edouard Grave
Armand Joulin
Tomas Mikolov
    NAI, SSL, VLM

Papers citing "Enriching Word Vectors with Subword Information"

50 / 2,679 papers shown
Title
Essential-Web v1.0: 24T tokens of organized web data
Essential AI
Andrew Hojel
Michael Pust
Tim Romanski
Yash Vanjani
...
Platon Mazarakis
Saad Jamal
Saurabh Srivastava
Somanshu Singla
Ashish Vaswani
36
0
0
17 Jun 2025
Edeflip: Supervised Word Translation between English and Yoruba
Ikeoluwa Abioye
Jiani Ge
5
0
0
16 Jun 2025
Static Word Embeddings for Sentence Semantic Representation
Takashi Wada
Yuki Hirakawa
Ryotaro Shimizu
Takahiro Kawashima
Yuki Saito
95
0
0
05 Jun 2025
Simulating LLM-to-LLM Tutoring for Multilingual Math Feedback
Junior Cedric Tonga
KV Aditya Srivatsa
Kaushal Kumar Maurya
Fajri Koto
Ekaterina Kochmar
LRM
111
0
0
05 Jun 2025
Multi-domain anomaly detection in a 5G network
Thomas Hoger
Philippe Owezarski
17
0
0
04 Jun 2025
TokAlign: Efficient Vocabulary Adaptation via Token Alignment
Chong Li
Jiajun Zhang
Chengqing Zong
VLM
65
0
0
04 Jun 2025
Culture Matters in Toxic Language Detection in Persian
Zahra Bokaei
Walid Magdy
Bonnie Webber
27
0
0
03 Jun 2025
Dictionaries to the Rescue: Cross-Lingual Vocabulary Transfer for Low-Resource Languages Using Bilingual Dictionaries
Haruki Sakajo
Yusuke Ide
Justin Vasselli
Yusuke Sakai
Yingtao Tian
Hidetaka Kamigaito
Taro Watanabe
56
0
0
02 Jun 2025
Leveraging Natural Language Processing to Unravel the Mystery of Life: A Review of NLP Approaches in Genomics, Transcriptomics, and Proteomics
Ella Rannon
David Burstein
AI4TS
33
0
0
02 Jun 2025
Novel Benchmark for NER in the Wastewater and Stormwater Domain
Franco Alberto Cardillo
Franca Debole
Francesca Frontini
Mitra Aelami
Nanée Chahinian
Serge Conrad
58
0
0
02 Jun 2025
Memory-Efficient FastText: A Comprehensive Approach Using Double-Array Trie Structures and Mark-Compact Memory Management
Yimin Du
VLM
63
0
0
02 Jun 2025
What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training
Marianne de Heer Kloots
Hosein Mohebbi
Charlotte Pouw
Gaofei Shen
Willem H. Zuidema
Martijn Bentum
SSL
73
0
0
01 Jun 2025
Synthetic Document Question Answering in Hungarian
Jonathan Li
Zoltan Csaki
Nidhi Hiremath
Etash Guha
Fenglu Hong
Edward Ma
Urmish Thakker
49
0
0
29 May 2025
Cross-Domain Bilingual Lexicon Induction via Pretrained Language Models
Qiuyu Ding
Zhiqiang Cao
Hailong Cao
Tiejun Zhao
66
0
0
29 May 2025
FoodTaxo: Generating Food Taxonomies with Large Language Models
Pascal Wullschleger
Majid Zarharan
Donnacha Daly
Marc Pouly
Jennifer Foster
19
0
0
26 May 2025
The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project
Angelina A. Aquino
Lester James V. Miranda
Elsie Marie T. Or
84
0
0
26 May 2025
BroadGen: A Framework for Generating Effective and Efficient Advertiser Broad Match Keyphrase Recommendations
Ashirbad Mishra
Jinyu Zhao
Soumik Dey
Hansi Wu
Binbin Li
Kamesh Madduri
56
0
0
25 May 2025
The Pilot Corpus of the English Semantic Sketches
Maria Petrova
Maria Ponomareva
Alexandra Ivoylova
143
0
0
23 May 2025
SemSketches-2021: experimenting with the machine processing of the pilot semantic sketches corpus
Maria Ponomareva
Maria Petrova
Julia Detkova
Oleg Serikov
Maria Yarova
199
0
0
23 May 2025
Omni TM-AE: A Scalable and Interpretable Embedding Model Using the Full Tsetlin Machine State Space
Ahmed K. Kadhim
Lei Jiao
Rishad Shafik
Ole-Christoffer Granmo
129
0
0
22 May 2025
Memorization or Reasoning? Exploring the Idiom Understanding of LLMs
Jisu Kim
Youngwoo Shin
Uiji Hwang
Jihun Choi
Richeng Xuan
Taeuk Kim
LRM
87
0
0
22 May 2025
Guarded Query Routing for Large Language Models
Richard Šléher
William Brach
Tibor Sloboda
Kristián Košťál
Lukas Galke
RALM
81
0
0
20 May 2025
A Case Study of Cross-Lingual Zero-Shot Generalization for Classical Languages in LLMs
V.S.D.S.Mahesh Akavarapu
Hrishikesh Terdalkar
Pramit Bhattacharyya
Shubhangi Agarwal
Vishakha Deulgaonkar
Pralay Manna
Chaitali Dangarikar
Arnab Bhattacharya
91
0
0
19 May 2025
Historical and psycholinguistic perspectives on morphological productivity: A sketch of an integrative approach
Harald Baayen
Kristian Berg
Maziyah Mohamed
25
0
0
17 May 2025
LDIR: Low-Dimensional Dense and Interpretable Text Embeddings with Relative Representations
Yile Wang
Zhanyu Shen
Hui Huang
180
0
0
15 May 2025
M3G: Multi-Granular Gesture Generator for Audio-Driven Full-Body Human Motion Synthesis
Zhizhuo Yin
Yuk Hang Tsui
Pan Hui
SLR, VGen
66
0
0
13 May 2025
Hakim: Farsi Text Embedding Model
Mehran Sarmadi
Morteza Alikhani
Erfan Zinvandi
Zahra Pourbahman
VLM
210
0
0
13 May 2025
A Comparative Analysis of Static Word Embeddings for Hungarian
Máté Gedeon
79
0
0
12 May 2025
Using Information Theory to Characterize Prosodic Typology: The Case of Tone, Pitch-Accent and Stress-Accent
E. Wilcox
Cui Ding
Giovanni Acampa
Tiago Pimentel
Alex Warstadt
Tamar I. Regev
79
1
0
12 May 2025
An Exploratory Analysis on the Explanatory Potential of Embedding-Based Measures of Semantic Transparency for Malay Word Recognition
M. Maziyah Mohamed
R. H. Baayen
27
1
0
09 May 2025
Exploring the Feasibility of Multilingual Grammatical Error Correction with a Single LLM up to 9B parameters: A Comparative Study of 17 Models
Dawid Wiśniewski
Antoni Solarski
Artur Nowakowski
LRM
101
0
0
09 May 2025
Differentiating Emigration from Return Migration of Scholars Using Name-Based Nationality Detection Models
Faeze Ghorbanpour
Thiago Zordan Malaguth
Aliakbar Akbaritabar
53
0
0
09 May 2025
Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors
Nicy Scaria
Silvester John Joseph Kennedy
Diksha Seth
Ananya Thakur
Deepak N. Subramani
AI4Ed
115
0
0
02 May 2025
The Influence of Text Variation on User Engagement in Cross-Platform Content Sharing
Yibo Hu
Yiqiao Jin
Meng Ye
Ajay Divakaran
Srijan Kumar
65
0
0
26 Apr 2025
Low-Resource Neural Machine Translation Using Recurrent Neural Networks and Transfer Learning: A Case Study on English-to-Igbo
Ocheme Anthony Ekle
Biswarup Das
64
0
0
24 Apr 2025
Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation
Thomas F Burns
Letitia Parcalabescu
Stephan Wäldchen
Michael Barlow
Gregor Ziegltrum
Volker Stampa
Bastian Harren
Björn Deiseroth
SyDa
138
0
0
24 Apr 2025
MPAD: A New Dimension-Reduction Method for Preserving Nearest Neighbors in High-Dimensional Vector Search
Jiuzhou Fu
Dongfang Zhao
46
0
0
23 Apr 2025
On Self-improving Token Embeddings
Mario M. Kubek
Shiraj Pokharel
Thomas Böhme
Emma L. McDaniel
Herwig Unger
Armin R. Mikler
AI4TS
39
0
0
21 Apr 2025
myNER: Contextualized Burmese Named Entity Recognition with Bidirectional LSTM and fastText Embeddings via Joint Training with POS Tagging
Kaung Lwin Thant
Kwankamol Nongpong
Ye Kyaw Thu
Thura Aung
Khaing Hsu Wai
Thazin Myint Oo
47
0
0
05 Apr 2025
MegaMath: Pushing the Limits of Open Math Corpora
Fan Zhou
Zengzhi Wang
Nikhil Ranjan
Zhoujun Cheng
Liping Tang
Guowei He
Zhengzhong Liu
Eric P. Xing
LRM
139
3
0
03 Apr 2025
Subasa -- Adapting Language Models for Low-resourced Offensive Language Detection in Sinhala
Shanilka Haturusinghe
Tharindu Cyril Weerasooriya
Marcos Zampieri
Christopher Homan
S. Liyanage
73
0
0
02 Apr 2025
Semantic Adapter for Universal Text Embeddings: Diagnosing and Mitigating Negation Blindness to Enhance Universality
Hongliu Cao
123
1
0
01 Apr 2025
Investigating the Capabilities and Limitations of Machine Learning for Identifying Bias in English Language Data with Information and Heritage Professionals
Lucy Havens
Benjamin Bach
Melissa Mhairi Terras
Beatrice Alex
121
0
0
01 Apr 2025
Advancing Sentiment Analysis in Tamil-English Code-Mixed Texts: Challenges and Transformer-Based Solutions
Mikhail Krasitskii
Olga Kolesnikova
Liliana Chanona Hernandez
Grigori Sidorov
Alexander Gelbukh
98
1
0
30 Mar 2025
The realization of tones in spontaneous spoken Taiwan Mandarin: a corpus-based survey and theory-driven computational modeling
Yuxin Lu
Yu-Ying Chuang
R. Baayen
128
0
0
29 Mar 2025
ParsiPy: NLP Toolkit for Historical Persian Texts in Python
Farhan Farsi
Parnian Fazel
Sepand Haghighi
Sadra Sabouri
Farzaneh Goshtasb
Nadia Hajipour
Ehsaneddin Asgari
Hossein Sameti
68
0
0
22 Mar 2025
A Data-driven Investigation of Euphemistic Language: Comparing the usage of "slave" and "servant" in 19th century US newspapers
Jaihyun Park
Ryan Cordell
64
0
0
19 Mar 2025
Language Independent Named Entity Recognition via Orthogonal Transformation of Word Vectors
Omar E. Rakha
Hazem M. Abbas
78
0
0
18 Mar 2025
MAG: Multi-Modal Aligned Autoregressive Co-Speech Gesture Generation without Vector Quantization
Binjie Liu
Lina Liu
Sanyi Zhang
Songen Gu
Yihao Zhi
Tianyi Zhu
Lei Yang
Long Ye
SLR
111
0
0
18 Mar 2025
Sentiment Analysis in SemEval: A Review of Sentiment Identification Approaches
Bousselham EL HADDAOUI
R. Chiheb
R. Faizi
A. E. Afia
121
0
0
13 Mar 2025