CamemBERT: a Tasty French Language Model

10 November 2019

Louis Martin

Eric Villemonte de la Clergerie

Djamé Seddah

Benoît Sagot


Papers citing "CamemBERT: a Tasty French Language Model"


A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny Karahan Sarıtaş Çağatay Yıldız 34 0 0 12 May 2025
Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases Michael Y. Hu Jackson Petty Chuan Shi William Merrill Tal Linzen AI4CE 66 1 0 26 Feb 2025
Extraction multi-étiquettes de relations en utilisant des couches de Transformer Ngoc Luyen Le Gildas Tagny Ngompé 65 0 0 24 Feb 2025
Man Made Language Models? Evaluating LLMs' Perpetuation of Masculine Generics Bias Enzo Doyen Amalia Todirascu 42 0 0 14 Feb 2025
A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks Elie Antoine Frédéric Béchet Géraldine Damnati Philippe Langlais 56 1 0 29 Jan 2025
Deep Learning and Natural Language Processing in the Field of Construction Rémy Kessler Nicolas Béchet 51 0 0 14 Jan 2025
IntegrityAI at GenAI Detection Task 2: Detecting Machine-Generated Academic Essays in English and Arabic Using ELECTRA and Stylometry Mohammad AL-Smadi 38 0 0 07 Jan 2025
An Incremental Clustering Baseline for Event Detection on Twitter Marjolaine Ray Qi Wang Frédérique Mélanie-Bécquet Thierry Poibeau Béatrice Mazoyer CLL 82 0 0 16 Dec 2024
Bilingual BSARD: Extending Statutory Article Retrieval to Dutch Ehsan Lotfi Nikolay Banar Nerses Yuzbashyan Walter Daelemans AILaw 74 0 0 10 Dec 2024
Can bidirectional encoder become the ultimate winner for downstream applications of foundation models? Lewen Yang Xuanyu Zhou Juao Fan Xinyi Xie Shengxin Zhu AI4CE 64 0 0 27 Nov 2024
Training Bilingual LMs with Data Constraints in the Targeted Language Skyler Seto Maartje ter Hoeve He Bai Natalie Schluter David Grangier 86 0 0 20 Nov 2024
Fine-grained Attention I/O Complexity: Comprehensive Analysis for Backward Passes Xiaoyu Li Yingyu Liang Zhenmei Shi Zhao Song Yufa Zhou 54 16 0 12 Oct 2024
Manual Verbalizer Enrichment for Few-Shot Text Classification Quang Anh Nguyen Nadi Tomeh M. Lebbah Thierry Charnois Hanene Azzag Santiago Cordoba Muñoz VLM 35 0 0 08 Oct 2024
Explanation sensitivity to the randomness of large language models: the case of journalistic text classification Jérémie Bogaert Marie-Catherine de Marneffe Antonin Descampe Louis Escouflaire Cedrick Fairon François-Xavier Standaert 24 1 0 07 Oct 2024
An evaluation of LLM code generation capabilities through graded exercises Álvaro Barbero Jiménez ELM 36 1 0 06 Oct 2024
Entity Insertion in Multilingual Linked Corpora: The Case of Wikipedia Tomás Feith Akhil Arora Martin Gerlach Debjit Paul Robert West KELM 33 0 0 05 Oct 2024
Generating bilingual example sentences with large language models as lexicography assistants Raphael Merx Ekaterina Vylomova Kemal Kurniawan 31 2 0 04 Oct 2024
Increasing faithfulness in human-human dialog summarization with Spoken Language Understanding tasks Eunice Akani Benoît Favre Frederic Bechet Romain Gemignani 26 0 0 16 Sep 2024
Know When to Fuse: Investigating Non-English Hybrid Retrieval in the Legal Domain Antoine Louis Gijs van Dijck Gerasimos Spanakis 36 0 0 02 Sep 2024
Exploring Multiple Strategies to Improve Multilingual Coreference Resolution in CorefUD Ondřej Pražák Miloslav Konopík 39 4 0 29 Aug 2024
A Survey of Large Language Models for European Languages Wazir Ali S. Pyysalo 47 2 0 27 Aug 2024
Domain-specific long text classification from sparse relevant information Célia D'Cruz J. Bereder Frédéric Precioso Michel Riveill 39 0 0 23 Aug 2024
Combining Objective and Subjective Perspectives for Political News Understanding Evan Dufraisse Adrian Popescu Julien Tourille Armelle Brun Olivier Hamon 40 0 0 20 Aug 2024
Goldfish: Monolingual Language Models for 350 Languages Tyler A. Chang Catherine Arnett Zhuowen Tu Benjamin Bergen LRM 44 4 0 19 Aug 2024
Difficulty Estimation and Simplification of French Text Using LLMs Henri Jamet Yash Raj Shrestha Michalis Vlachos 30 2 0 25 Jul 2024
Retrieve, Generate, Evaluate: A Case Study for Medical Paraphrases Generation with Small Language Models Ioana Buhnila Aman Sinha Mathieu Constant LM&MA 37 1 0 23 Jul 2024
Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment Yongxin Huang Kexin Wang Goran Glavaš Iryna Gurevych 46 0 0 20 Jul 2024
Domain-specific or Uncertainty-aware models: Does it really make a difference for biomedical text classification? Aman Sinha Timothee Mickus Marianne Clausel Mathieu Constant X. Coubez 36 0 0 17 Jul 2024
An Empirical Comparison of Vocabulary Expansion and Initialization Approaches for Language Models Nandini Mundra Aditya Nanda Kishore Raj Dabre Ratish Puduppully Anoop Kunchukuttan Mitesh Khapra 30 3 0 08 Jul 2024
Classification of Geological Borehole Descriptions Using a Domain Adapted Large Language Model Hossein Ghorbanfekr P. Kerstens K. Dirix 28 0 0 24 Jun 2024
Growing Trees on Sounds: Assessing Strategies for End-to-End Dependency Parsing of Speech Adrien Pupier Maximin Coavoux Jérôme Goulian Benjamin Lecouteux 13 0 0 18 Jun 2024
Tag and correct: high precision post-editing approach to correction of speech recognition errors Tomasz Ziętkiewicz 31 0 0 11 Jun 2024
MTEB-French: Resources for French Sentence Embedding Evaluation and Analysis Mathieu Ciancone Imene Kerboua Marion Schaeffer W. Siblini 42 2 0 30 May 2024
Multi-objective Representation for Numbers in Clinical Narratives: A CamemBERT-Bio-Based Alternative to Large-Scale LLMs Boammani Aser Lompo Thanh-Dung Le 33 0 0 28 May 2024
Quantifying the Gain in Weak-to-Strong Generalization Moses Charikar Chirag Pabbaraju Kirankumar Shiragur ELM 42 17 0 24 May 2024
Emotion Identification for French in Written Texts: Considering their Modes of Expression as a Step Towards Text Complexity Analysis A. Étienne Delphine Battistelli Gwénolé Lecorvé 34 1 0 23 May 2024
Code-mixed Sentiment and Hate-speech Prediction Anjali Yadav Tanya Garg Matej Klemen Matej Ulčar Basant Agarwal Marko Robnik-Šikonja 30 2 0 21 May 2024
Self-supervised vision-langage alignment of deep learning representations for bone X-rays analysis A. Englebert Anne-Sophie Collin O. Cornu Christophe De Vleeschouwer 34 1 0 14 May 2024
No Train but Gain: Language Arithmetic for training-free Language Adapters enhancement Mateusz Klimaszewski Piotr Andruszkiewicz Alexandra Birch MoMe 47 4 0 24 Apr 2024
Language Models on a Diet: Cost-Efficient Development of Encoders for Closely-Related Languages via Additional Pretraining Nikola Ljubesic Vít Suchomel Peter Rupnik Taja Kuzman Rik van Noord CLL 35 5 0 08 Apr 2024
A Morphology-Based Investigation of Positional Encodings Poulami Ghosh Shikhar Vashishth Raj Dabre Pushpak Bhattacharyya 34 1 0 06 Apr 2024
CuSINeS: Curriculum-driven Structure Induced Negative Sampling for Statutory Article Retrieval Santosh T.Y.S.S Kristina Kaiser Matthias Grabmair 13 3 0 31 Mar 2024
New Semantic Task for the French Spoken Language Understanding MEDIA Benchmark Nadege Alavoine G. Laperriere Christophe Servan Sahar Ghannay Sophie Rosset VLM 37 0 0 28 Mar 2024
A Benchmark Evaluation of Clinical Named Entity Recognition in French N. Bannour Christophe Servan Aurélie Névéol Xavier Tannier 26 0 0 28 Mar 2024
Opportunities and challenges in the application of large artificial intelligence models in radiology Liangrui Pan Zhenyu Zhao Ying Lu Kewei Tang Liyong Fu Qingchun Liang Shaoliang Peng LM&MA MedIm AI4CE 45 5 0 24 Mar 2024
A Multi-Label Dataset of French Fake News: Human and Machine Insights B. Icard François Maine Morgane Casanova Géraud Faye Julien Chanson Guillaume Gadek G. Atemezing François Bancilhon Paul Égré 21 0 0 24 Mar 2024
VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding Phong Nguyen-Thuan Do Son Quoc Tran Phu Gia Hoang Kiet Van Nguyen Ngan Luu-Thuy Nguyen ELM 50 3 0 23 Mar 2024
Multimodal Chaptering for Long-Form TV Newscast Video Khalil Guetari Yannis Tevissen Frédéric Petitpont AI4TS 17 0 0 20 Mar 2024
A Question on the Explainability of Large Language Models and the Word-Level Univariate First-Order Plausibility Assumption Jérémie Bogaert François-Xavier Standaert AAML LRM 21 2 0 15 Mar 2024
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers Kaichao You Runsheng Bai Meng Cao Jianmin Wang Ion Stoica Mingsheng Long VLM 33 0 0 14 Mar 2024