How does the pre-training objective affect what large language models learn about linguistic properties?

20 March 2022
Ahmed Alajrami, Nikolaos Aletras

Papers citing "How does the pre-training objective affect what large language models learn about linguistic properties?"

16 / 16 papers shown
Linguistic Interpretability of Transformer-based Language Models: a systematic review
Miguel López-Otal, Jorge Gracia, Jordi Bernad, Carlos Bobed, Lucía Pitarch-Ballesteros, Emma Anglés-Herrero
VLM · 0 citations · 09 Apr 2025

Linguistic Blind Spots of Large Language Models
Jiali Cheng, Hadi Amiri
1 citation · 25 Mar 2025

Exploring the Impact of a Transformer's Latent Space Geometry on Downstream Task Performance
Anna C. Marbut, John W. Chandler, Travis J. Wheeler
0 citations · 18 Jun 2024

Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?
Ahmed Alajrami, Katerina Margatina, Nikolaos Aletras
AAML · 1 citation · 26 Oct 2023

Large language models in medicine: the potentials and pitfalls
J. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou
LM&MA · 65 citations · 31 Aug 2023

Generate to Understand for Representation
Changshan Xue, Xiande Zhong, Xiaoqing Liu
VLM · 0 citations · 14 Jun 2023

How does the task complexity of masked pretraining objectives affect downstream performance?
Atsuki Yamaguchi, Hiroaki Ozaki, Terufumi Morishita, Gaku Morio, Yasuhiro Sogawa
2 citations · 18 May 2023

Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu
LM&MA · 619 citations · 26 Apr 2023

Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE
Qihuang Zhong, Liang Ding, Keqin Peng, Juhua Liu, Bo Du, Li Shen, Yibing Zhan, Dacheng Tao
VLM · 13 citations · 18 Feb 2023

Gender Biases Unexpectedly Fluctuate in the Pre-training Stage of Masked Language Models
Kenan Tang, Hanchun Jiang
AI4CE · 1 citation · 26 Nov 2022

HashFormers: Towards Vocabulary-independent Pre-trained Transformers
Huiyin Xue, Nikolaos Aletras
4 citations · 14 Oct 2022

Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz
109 citations · 31 Aug 2022

E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao
27 citations · 30 May 2022

Should You Mask 15% in Masked Language Modeling?
Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
CVBM · 161 citations · 16 Feb 2022

What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
882 citations · 03 May 2018

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM · 6,956 citations · 20 Apr 2018