How does the pre-training objective affect what large language models learn about linguistic properties?
Ahmed Alajrami, Nikolaos Aletras
20 March 2022 · arXiv 2203.10415 · PDF · HTML
Papers citing "How does the pre-training objective affect what large language models learn about linguistic properties?" (16 of 16 papers shown)
Linguistic Interpretability of Transformer-based Language Models: a systematic review
Miguel López-Otal, Jorge Gracia, Jordi Bernad, Carlos Bobed, Lucía Pitarch-Ballesteros, Emma Anglés-Herrero · VLM · 0 citations · 09 Apr 2025
Linguistic Blind Spots of Large Language Models
Jiali Cheng, Hadi Amiri · 1 citation · 25 Mar 2025
Exploring the Impact of a Transformer's Latent Space Geometry on Downstream Task Performance
Anna C. Marbut, John W. Chandler, Travis J. Wheeler · 0 citations · 18 Jun 2024
Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?
Ahmed Alajrami, Katerina Margatina, Nikolaos Aletras · AAML · 1 citation · 26 Oct 2023
Large language models in medicine: the potentials and pitfalls
J. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou · LM&MA · 65 citations · 31 Aug 2023
Generate to Understand for Representation
Changshan Xue, Xiande Zhong, Xiaoqing Liu · VLM · 0 citations · 14 Jun 2023
How does the task complexity of masked pretraining objectives affect downstream performance?
Atsuki Yamaguchi, Hiroaki Ozaki, Terufumi Morishita, Gaku Morio, Yasuhiro Sogawa · 2 citations · 18 May 2023
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu · LM&MA · 619 citations · 26 Apr 2023
Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE
Qihuang Zhong, Liang Ding, Keqin Peng, Juhua Liu, Bo Du, Li Shen, Yibing Zhan, Dacheng Tao · VLM · 13 citations · 18 Feb 2023
Gender Biases Unexpectedly Fluctuate in the Pre-training Stage of Masked Language Models
Kenan Tang, Hanchun Jiang · AI4CE · 1 citation · 26 Nov 2022
HashFormers: Towards Vocabulary-independent Pre-trained Transformers
Huiyin Xue, Nikolaos Aletras · 4 citations · 14 Oct 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz · 109 citations · 31 Aug 2022
E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao · 27 citations · 30 May 2022
Should You Mask 15% in Masked Language Modeling?
Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen · CVBM · 161 citations · 16 Feb 2022
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni · 882 citations · 03 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman · ELM · 6,956 citations · 20 Apr 2018