How does the pre-training objective affect what large language models learn about linguistic properties?

20 March 2022
Ahmed Alajrami, Nikolaos Aletras

Papers citing "How does the pre-training objective affect what large language models learn about linguistic properties?"

16 / 16 papers shown
Linguistic Interpretability of Transformer-based Language Models: a systematic review
Miguel López-Otal, Jorge Gracia, Jordi Bernad, Carlos Bobed, Lucía Pitarch-Ballesteros, Emma Anglés-Herrero
VLM · 0 citations · 09 Apr 2025

Linguistic Blind Spots of Large Language Models
Jiali Cheng, Hadi Amiri
1 citation · 25 Mar 2025

Exploring the Impact of a Transformer's Latent Space Geometry on Downstream Task Performance
Anna C. Marbut, John W. Chandler, Travis J. Wheeler
0 citations · 18 Jun 2024

Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?
Ahmed Alajrami, Katerina Margatina, Nikolaos Aletras
AAML · 1 citation · 26 Oct 2023

Large language models in medicine: the potentials and pitfalls
J. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou
LM&MA · 65 citations · 31 Aug 2023

Generate to Understand for Representation
Changshan Xue, Xiande Zhong, Xiaoqing Liu
VLM · 0 citations · 14 Jun 2023

How does the task complexity of masked pretraining objectives affect downstream performance?
Atsuki Yamaguchi, Hiroaki Ozaki, Terufumi Morishita, Gaku Morio, Yasuhiro Sogawa
2 citations · 18 May 2023

Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu
LM&MA · 619 citations · 26 Apr 2023

Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE
Qihuang Zhong, Liang Ding, Keqin Peng, Juhua Liu, Bo Du, Li Shen, Yibing Zhan, Dacheng Tao
VLM · 13 citations · 18 Feb 2023

Gender Biases Unexpectedly Fluctuate in the Pre-training Stage of Masked Language Models
Kenan Tang, Hanchun Jiang
AI4CE · 1 citation · 26 Nov 2022

HashFormers: Towards Vocabulary-independent Pre-trained Transformers
Huiyin Xue, Nikolaos Aletras
4 citations · 14 Oct 2022

Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz
109 citations · 31 Aug 2022

E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao
27 citations · 30 May 2022

Should You Mask 15% in Masked Language Modeling?
Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
CVBM · 161 citations · 16 Feb 2022

What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
882 citations · 03 May 2018

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM · 6,956 citations · 20 Apr 2018