
Enriching Word Vectors with Subword Information

15 July 2016

Papers citing "Enriching Word Vectors with Subword Information"

50 / 2,679 papers shown

Title
Essential-Web v1.0: 24T tokens of organized web data Essential AI Andrew Hojel Michael Pust Tim Romanski Yash Vanjani ... Platon Mazarakis Saad Jamal Saurabh Srivastava Somanshu Singla Ashish Vaswani 36 0 0 17 Jun 2025
Edeflip: Supervised Word Translation between English and Yoruba Ikeoluwa Abioye Jiani Ge 5 0 0 16 Jun 2025
Static Word Embeddings for Sentence Semantic Representation Takashi Wada Yuki Hirakawa Ryotaro Shimizu Takahiro Kawashima Yuki Saito 95 0 0 05 Jun 2025
Simulating LLM-to-LLM Tutoring for Multilingual Math Feedback Junior Cedric Tonga KV Aditya Srivatsa Kaushal Kumar Maurya Fajri Koto Ekaterina Kochmar LRM 111 0 0 05 Jun 2025
Multi-domain anomaly detection in a 5G network Thomas Hoger Philippe Owezarski 17 0 0 04 Jun 2025
TokAlign: Efficient Vocabulary Adaptation via Token Alignment Chong Li Jiajun Zhang Chengqing Zong VLM 65 0 0 04 Jun 2025
Culture Matters in Toxic Language Detection in Persian Zahra Bokaei Walid Magdy Bonnie Webber 27 0 0 03 Jun 2025
Dictionaries to the Rescue: Cross-Lingual Vocabulary Transfer for Low-Resource Languages Using Bilingual Dictionaries Haruki Sakajo Yusuke Ide Justin Vasselli Yusuke Sakai Yingtao Tian Hidetaka Kamigaito Taro Watanabe 56 0 0 02 Jun 2025
Leveraging Natural Language Processing to Unravel the Mystery of Life: A Review of NLP Approaches in Genomics, Transcriptomics, and Proteomics Ella Rannon David Burstein AI4TS 33 0 0 02 Jun 2025
Novel Benchmark for NER in the Wastewater and Stormwater Domain Franco Alberto Cardillo Franca Debole Francesca Frontini Mitra Aelami Nanée Chahinian Serge Conrad 58 0 0 02 Jun 2025
Memory-Efficient FastText: A Comprehensive Approach Using Double-Array Trie Structures and Mark-Compact Memory Management Yimin Du VLM 63 0 0 02 Jun 2025
What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training Marianne de Heer Kloots Hosein Mohebbi Charlotte Pouw Gaofei Shen Willem H. Zuidema Martijn Bentum SSL 73 0 0 01 Jun 2025
Synthetic Document Question Answering in Hungarian Jonathan Li Zoltan Csaki Nidhi Hiremath Etash Guha Fenglu Hong Edward Ma Urmish Thakker 49 0 0 29 May 2025
Cross-Domain Bilingual Lexicon Induction via Pretrained Language Models Qiuyu Ding Zhiqiang Cao Hailong Cao Tiejun Zhao 66 0 0 29 May 2025
FoodTaxo: Generating Food Taxonomies with Large Language Models Pascal Wullschleger Majid Zarharan Donnacha Daly Marc Pouly Jennifer Foster 19 0 0 26 May 2025
The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project Angelina A. Aquino Lester James V. Miranda Elsie Marie T. Or 84 0 0 26 May 2025
BroadGen: A Framework for Generating Effective and Efficient Advertiser Broad Match Keyphrase Recommendations Ashirbad Mishra Jinyu Zhao Soumik Dey Hansi Wu Binbin Li Kamesh Madduri 56 0 0 25 May 2025
The Pilot Corpus of the English Semantic Sketches Maria Petrova Maria Ponomareva Alexandra Ivoylova 143 0 0 23 May 2025
SemSketches-2021: experimenting with the machine processing of the pilot semantic sketches corpus Maria Ponomareva Maria Petrova Julia Detkova Oleg Serikov Maria Yarova 199 0 0 23 May 2025
Omni TM-AE: A Scalable and Interpretable Embedding Model Using the Full Tsetlin Machine State Space Ahmed K. Kadhim Lei Jiao Rishad Shafik Ole-Christoffer Granmo 129 0 0 22 May 2025
Memorization or Reasoning? Exploring the Idiom Understanding of LLMs Jisu Kim Youngwoo Shin Uiji Hwang Jihun Choi Richeng Xuan Taeuk Kim LRM 87 0 0 22 May 2025
Guarded Query Routing for Large Language Models Richard Šléher William Brach Tibor Sloboda Kristián Košťál Lukas Galke RALM 81 0 0 20 May 2025
A Case Study of Cross-Lingual Zero-Shot Generalization for Classical Languages in LLMs V.S.D.S.Mahesh Akavarapu Hrishikesh Terdalkar Pramit Bhattacharyya Shubhangi Agarwal Vishakha Deulgaonkar Pralay Manna Chaitali Dangarikar Arnab Bhattacharya 91 0 0 19 May 2025
Historical and psycholinguistic perspectives on morphological productivity: A sketch of an integrative approach Harald Baayen Kristian Berg Maziyah Mohamed 25 0 0 17 May 2025
LDIR: Low-Dimensional Dense and Interpretable Text Embeddings with Relative Representations Yile Wang Zhanyu Shen Hui Huang 180 0 0 15 May 2025
M3G: Multi-Granular Gesture Generator for Audio-Driven Full-Body Human Motion Synthesis Zhizhuo Yin Yuk Hang Tsui Pan Hui SLR VGen 66 0 0 13 May 2025
Hakim: Farsi Text Embedding Model Mehran Sarmadi Morteza Alikhani Erfan Zinvandi Zahra Pourbahman VLM 210 0 0 13 May 2025
A Comparative Analysis of Static Word Embeddings for Hungarian Máté Gedeon 79 0 0 12 May 2025
Using Information Theory to Characterize Prosodic Typology: The Case of Tone, Pitch-Accent and Stress-Accent E. Wilcox Cui Ding Giovanni Acampa Tiago Pimentel Alex Warstadt Tamar I. Regev 79 1 0 12 May 2025
An Exploratory Analysis on the Explanatory Potential of Embedding-Based Measures of Semantic Transparency for Malay Word Recognition M. Maziyah Mohamed R. H. Baayen 27 1 0 09 May 2025
Exploring the Feasibility of Multilingual Grammatical Error Correction with a Single LLM up to 9B parameters: A Comparative Study of 17 Models Dawid Wiśniewski Antoni Solarski Artur Nowakowski LRM 101 0 0 09 May 2025
Differentiating Emigration from Return Migration of Scholars Using Name-Based Nationality Detection Models Faeze Ghorbanpour Thiago Zordan Malaguth Aliakbar Akbaritabar 53 0 0 09 May 2025
Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors Nicy Scaria Silvester John Joseph Kennedy Diksha Seth Ananya Thakur Deepak N. Subramani AI4Ed 115 0 0 02 May 2025
The Influence of Text Variation on User Engagement in Cross-Platform Content Sharing Yibo Hu Yiqiao Jin Meng Ye Ajay Divakaran Srijan Kumar 65 0 0 26 Apr 2025
Low-Resource Neural Machine Translation Using Recurrent Neural Networks and Transfer Learning: A Case Study on English-to-Igbo Ocheme Anthony Ekle Biswarup Das 64 0 0 24 Apr 2025
Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation Thomas F Burns Letitia Parcalabescu Stephan Wäldchen Michael Barlow Gregor Ziegltrum Volker Stampa Bastian Harren Björn Deiseroth SyDa 138 0 0 24 Apr 2025
MPAD: A New Dimension-Reduction Method for Preserving Nearest Neighbors in High-Dimensional Vector Search Jiuzhou Fu Dongfang Zhao 46 0 0 23 Apr 2025
On Self-improving Token Embeddings Mario M. Kubek Shiraj Pokharel Thomas Böhme Emma L. McDaniel Herwig Unger Armin R. Mikler AI4TS 39 0 0 21 Apr 2025
myNER: Contextualized Burmese Named Entity Recognition with Bidirectional LSTM and fastText Embeddings via Joint Training with POS Tagging Kaung Lwin Thant Kwankamol Nongpong Ye Kyaw Thu Thura Aung Khaing Hsu Wai Thazin Myint Oo 47 0 0 05 Apr 2025
MegaMath: Pushing the Limits of Open Math Corpora Fan Zhou Zengzhi Wang Nikhil Ranjan Zhoujun Cheng Liping Tang Guowei He Zhengzhong Liu Eric P. Xing LRM 139 3 0 03 Apr 2025
Subasa -- Adapting Language Models for Low-resourced Offensive Language Detection in Sinhala Shanilka Haturusinghe Tharindu Cyril Weerasooriya Marcos Zampieri Christopher Homan S. Liyanage 73 0 0 02 Apr 2025
Semantic Adapter for Universal Text Embeddings: Diagnosing and Mitigating Negation Blindness to Enhance Universality Hongliu Cao 123 1 0 01 Apr 2025
Investigating the Capabilities and Limitations of Machine Learning for Identifying Bias in English Language Data with Information and Heritage Professionals Lucy Havens Benjamin Bach Melissa Mhairi Terras Beatrice Alex 121 0 0 01 Apr 2025
Advancing Sentiment Analysis in Tamil-English Code-Mixed Texts: Challenges and Transformer-Based Solutions Mikhail Krasitskii Olga Kolesnikova Liliana Chanona Hernandez Grigori Sidorov Alexander Gelbukh 98 1 0 30 Mar 2025
The realization of tones in spontaneous spoken Taiwan Mandarin: a corpus-based survey and theory-driven computational modeling Yuxin Lu Yu-Ying Chuang R. Baayen 128 0 0 29 Mar 2025
ParsiPy: NLP Toolkit for Historical Persian Texts in Python Farhan Farsi Parnian Fazel Sepand Haghighi Sadra Sabouri Farzaneh Goshtasb Nadia Hajipour Ehsaneddin Asgari Hossein Sameti 68 0 0 22 Mar 2025
A Data-driven Investigation of Euphemistic Language: Comparing the usage of "slave" and "servant" in 19th century US newspapers Jaihyun Park Ryan Cordell 64 0 0 19 Mar 2025
Language Independent Named Entity Recognition via Orthogonal Transformation of Word Vectors Omar E. Rakha Hazem M. Abbas 78 0 0 18 Mar 2025
MAG: Multi-Modal Aligned Autoregressive Co-Speech Gesture Generation without Vector Quantization Binjie Liu Lina Liu Sanyi Zhang Songen Gu Yihao Zhi Tianyi Zhu Lei Yang Long Ye SLR 111 0 0 18 Mar 2025
Sentiment Analysis in SemEval: A Review of Sentiment Identification Approaches Bousselham EL HADDAOUI R. Chiheb R. Faizi A. E. Afia 121 0 0 13 Mar 2025