BERTje: A Dutch BERT Model

19 December 2019

Wietse de Vries

Andreas van Cranenburgh

Papers citing "BERTje: A Dutch BERT Model"

50 / 128 papers shown

Title
ARLED: Leveraging LED-based ARMAN Model for Abstractive Summarization of Persian Long Documents Samira Zangooei Amirhossein Darmani Hossein Farahmand Nezhad Laya Mahmoudi 50 0 0 13 Mar 2025
I see what you mean: Co-Speech Gestures for Reference Resolution in Multimodal Dialogue E. Ghaleb Bulat Khaertdinov Aslı Özyürek Raquel Fernández 41 0 0 27 Feb 2025
Detecting Linguistic Bias in Government Documents Using Large language Models Milena de Swart Floris den Hengst Jieying Chen 66 0 0 20 Feb 2025
Can bidirectional encoder become the ultimate winner for downstream applications of foundation models? Lewen Yang Xuanyu Zhou Juao Fan Xinyi Xie Shengxin Zhu AI4CE 64 0 0 27 Nov 2024
Training Bilingual LMs with Data Constraints in the Targeted Language Skyler Seto Maartje ter Hoeve He Bai Natalie Schluter David Grangier 86 0 0 20 Nov 2024
Towards Tailored Recovery of Lexical Diversity in Literary Machine Translation Esther Ploeger Huiyuan Lai Rik van Noord Antonio Toral 29 1 0 30 Aug 2024
Quantifying the Effectiveness of Student Organization Activities using Natural Language Processing Lyberius Ennio F. Taruc A. R. L. Cruz 18 0 0 16 Aug 2024
LegalTurk Optimized BERT for Multi-Label Text Classification and NER Farnaz Zeidi Mehmet Fatih Amasyali Çiğdem Erol VLM 30 1 0 30 Jun 2024
Classification of Geological Borehole Descriptions Using a Domain Adapted Large Language Model Hossein Ghorbanfekr P. Kerstens K. Dirix 28 0 0 24 Jun 2024
Exploration of Attention Mechanism-Enhanced Deep Learning Models in the Mining of Medical Textual Data Lingxi Xiao Muqing Li Yinqiu Feng Meiqi Wang Ziyi Zhu Zexi Chen 34 16 0 23 May 2024
Language Models on a Diet: Cost-Efficient Development of Encoders for Closely-Related Languages via Additional Pretraining Nikola Ljubesic Vít Suchomel Peter Rupnik Taja Kuzman Rik van Noord CLL 35 5 0 08 Apr 2024
A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias Yuemei Xu Ling Hu Jiayi Zhao Zihan Qiu Yuqi Ye Hanwen Gu LRM 27 36 0 01 Apr 2024
Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages Rik van Noord Taja Kuzman Peter Rupnik Nikola Ljubesic Miquel Espla-Gomis Gema Ramírez-Sánchez Antonio Toral ALM 40 2 0 13 Mar 2024
Aspect-Based Sentiment Analysis for Open-Ended HR Survey Responses Lois Rink Job Meijdam David Graus 21 1 0 07 Feb 2024
Describing Images Fast and Slow: Quantifying and Predicting the Variation in Human Signals during Visuo-Linguistic Processes Ece Takmaz Sandro Pezzelle Raquel Fernández 24 1 0 02 Feb 2024
Do large language models solve verbal analogies like children do? Claire E. Stevenson Mathilde ter Veen Rochelle Choenni Han L. J. van der Maas Ekaterina Shutova LRM 18 8 0 31 Oct 2023
Transformer-based Entity Legal Form Classification Alexander Arimond Mauro Molteni Dominik Jany Zornitsa Manolova Damian Borth Andreas G. F. Hoepner MedIm AILaw 19 1 0 19 Oct 2023
Filling in the Gaps: Efficient Event Coreference Resolution using Graph Autoencoder Networks Loic De Langhe Orphée De Clercq Véronique Hoste 36 1 0 18 Oct 2023
Tik-to-Tok: Translating Language Models One Token at a Time: An Embedding Initialization Strategy for Efficient Language Adaptation François Remy Pieter Delobelle Bettina Berendt Kris Demuynck Thomas Demeester 29 3 0 05 Oct 2023
Analysing Cross-Lingual Transfer in Low-Resourced African Named Entity Recognition Michael Beukman Manuel A. Fokam 21 2 0 11 Sep 2023
Spanish Pre-trained BERT Model and Evaluation Data J. Cañete Gabriel Chaperon Rodrigo Fuentes Jou-Hui Ho Hojin Kang Jorge Pérez 30 657 0 06 Aug 2023
Combating the Curse of Multilinguality in Cross-Lingual WSD by Aligning Sparse Contextualized Word Representations Gábor Berend 38 7 0 25 Jul 2023
Cross-Lingual Knowledge Distillation for Answer Sentence Selection in Low-Resource Languages Shivanshu Gupta Yoshitomo Matsubara Ankita N. Chadha Alessandro Moschitti 22 2 0 25 May 2023
Structural Ambiguity and its Disambiguation in Language Model Based Parsers: the Case of Dutch Clause Relativization G. Wijnholds M. Moortgat 24 3 0 24 May 2023
Language-Agnostic Bias Detection in Language Models with Bias Probing Abdullatif Köksal Omer F. Yalcin Ahmet Akbiyik M. Kilavuz Anna Korhonen Hinrich Schütze 41 1 0 22 May 2023
DUMB: A Benchmark for Smart Evaluation of Dutch Models Wietse de Vries Martijn B. Wieling Malvina Nissim ELM ALM MoE 34 6 0 22 May 2023
Advancing Neural Encoding of Portuguese with Transformer Albertina PT-* João Rodrigues Luís Gomes Joao Silva António Branco Rodrigo Santos Henrique Lopes Cardoso T. Osório 27 43 0 11 May 2023
Exploring Linguistic Properties of Monolingual BERTs with Typological Classification among Languages Elena Sofia Ruzzetti Federico Ranaldi F. Logozzo Michele Mastromattei Leonardo Ranaldi Fabio Massimo Zanzotto 24 8 0 03 May 2023
Does Manipulating Tokenization Aid Cross-Lingual Transfer? A Study on POS Tagging for Non-Standardized Languages Verena Blaschke Hinrich Schütze Barbara Plank 39 14 0 20 Apr 2023
BERTino: an Italian DistilBERT model Matteo Muffo E. Bertino VLM 18 14 0 31 Mar 2023
Learning for Amalgamation: A Multi-Source Transfer Learning Framework For Sentiment Classification Cuong V. Nguyen Khiem H. Le Anh Tran Quang Pham Binh T. Nguyen 15 14 0 16 Mar 2023
LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain Joel Niklaus Veton Matoshi Pooja Rani Andrea Galassi Matthias Sturmer Ilias Chalkidis ELM AILaw 19 55 0 30 Jan 2023
Can Peanuts Fall in Love with Distributional Semantics? J. Michaelov S. Coulson Benjamin Bergen MILM 26 8 0 20 Jan 2023
FullStop: Punctuation and Segmentation Prediction for Dutch with Transformers Vincent Vandeghinste Oliver Guhr 10 6 0 09 Jan 2023
Mini-Model Adaptation: Efficiently Extending Pretrained Models to New Languages via Aligned Shallow Training Kelly Marchisio Patrick Lewis Yihong Chen Mikel Artetxe 35 16 0 20 Dec 2022
Lessons learned from the evaluation of Spanish Language Models Rodrigo Agerri Eneko Agirre ELM 30 15 0 16 Dec 2022
Beyond Discrete Genres: Mapping News Items onto a Multidimensional Framework of Genre Cues Zilin Lin Kasper Welbers Susan A. M. Vermeer D. Trilling 6 3 0 08 Dec 2022
Bidirectional Representations for Low Resource Spoken Language Understanding Quentin Meeus Marie-Francine Moens Hugo Van hamme 19 2 0 24 Nov 2022
Multitask Learning for Low Resource Spoken Language Understanding Quentin Meeus Marie-Francine Moens Hugo Van hamme 24 4 0 24 Nov 2022
RobBERT-2022: Updating a Dutch Language Model to Account for Evolving Language Use Pieter Delobelle Thomas Winters Bettina Berendt 24 6 0 15 Nov 2022
Local Structure Matters Most in Most Languages Louis Clouâtre Prasanna Parthasarathi Amal Zouaq Sarath Chandar 39 1 0 09 Nov 2022
Detecting Languages Unintelligible to Multilingual Models through Local Structure Probes Louis Clouâtre Prasanna Parthasarathi Amal Zouaq Sarath Chandar 33 3 0 09 Nov 2022
Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models Aaron Mueller Yudi Xia Tal Linzen MILM 36 9 0 25 Oct 2022
Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model Chris van der Lee Thiago Castro Ferreira Chris Emmery Travis J. Wiltshire Emiel Krahmer 27 2 0 14 Jul 2022
Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and BERT Models for Maltese Kurt Micallef Albert Gatt Marc Tanti Lonneke van der Plas Claudia Borg 33 28 0 21 May 2022
Making sense of violence risk predictions using clinical notes P. Mosteiro Emil Rijcken Kalliopi Zervanou U. Kaymak Floortje E. Scheepers Marco Spruit 14 6 0 29 Apr 2022
Machine Learning for Violence Risk Assessment Using Dutch Clinical Notes P. Mosteiro Emil Rijcken Kalliopi Zervanou U. Kaymak Floortje E. Scheepers Marco Spruit 33 13 0 28 Apr 2022
RobBERTje: a Distilled Dutch BERT Model Pieter Delobelle Thomas Winters Bettina Berendt 30 14 0 28 Apr 2022
Tweets2Stance: Users stance detection exploiting Zero-Shot Learning Algorithms on Tweets Margherita Gambini T. Fagni C. Senette Maurizio Tesconi 15 3 0 22 Apr 2022
ALBETO and DistilBETO: Lightweight Spanish Language Models J. Canete S. Donoso Felipe Bravo-Marquez Andrés Carvallo Vladimir Araujo 48 20 0 19 Apr 2022