ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.15096
  4. Cited By
Dynamic Masking Rate Schedules for MLM Pretraining

Dynamic Masking Rate Schedules for MLM Pretraining

24 May 2023
Zachary Ankner
Naomi Saphra
Davis W. Blalock
Jonathan Frankle
Matthew L. Leavitt
ArXivPDFHTML

Papers citing "Dynamic Masking Rate Schedules for MLM Pretraining"

6 / 6 papers shown
Title
EuroBERT: Scaling Multilingual Encoders for European Languages
EuroBERT: Scaling Multilingual Encoders for European Languages
Nicolas Boizard
Hippolyte Gisserot-Boukhlef
Duarte M. Alves
André F. T. Martins
Ayoub Hammal
...
Maxime Peyrard
Nuno M. Guerreiro
Patrick Fernandes
Ricardo Rei
Pierre Colombo
158
1
0
07 Mar 2025
Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text
Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text
Andrei Jarca
Florinel-Alin Croitoru
Radu Tudor Ionescu
53
0
0
18 Feb 2025
GPT or BERT: why not both?
GPT or BERT: why not both?
Lucas Georges Gabriel Charpentier
David Samuel
55
5
0
31 Dec 2024
DEPTH: Discourse Education through Pre-Training Hierarchically
DEPTH: Discourse Education through Pre-Training Hierarchically
Zachary Bamberger
Ofek Glick
Chaim Baskin
Yonatan Belinkov
67
0
0
13 May 2024
Big Bird: Transformers for Longer Sequences
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
288
2,017
0
28 Jul 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018
1