PMI-Masking: Principled masking of correlated spans
arXiv:2010.01825 · 5 October 2020
Yoav Levine, Barak Lenz, Opher Lieber, Omri Abend, Kevin Leyton-Brown, Moshe Tennenholtz, Y. Shoham
Papers citing "PMI-Masking: Principled masking of correlated spans" (23 papers):

- Prediction-powered estimators for finite population statistics in highly imbalanced textual data: Public hate crime estimation. Hannes Waldetoft, Jakob Torgander, Måns Magnusson. 05 May 2025.
- DEPTH: Discourse Education through Pre-Training Hierarchically. Zachary Bamberger, Ofek Glick, Chaim Baskin, Yonatan Belinkov. 13 May 2024.
- Language Model Adaptation to Specialized Domains through Selective Masking based on Genre and Topical Characteristics. Anas Belfathi, Ygor Gallina, Nicolas Hernandez, Richard Dufour, Laura Monceaux. 19 Feb 2024.
- How Useful is Continued Pre-Training for Generative Unsupervised Domain Adaptation? Rheeya Uppaal, Yixuan Li, Junjie Hu. 31 Jan 2024.
- Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules. Zhiyuan Liu, Yaorui Shi, An Zhang, Enzhi Zhang, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua. [AI4CE] 23 Oct 2023.
- Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model. Xiao Wang, Wei Zhou, Qi Zhang, Jie Zhou, Songyang Gao, Junzhe Wang, Menghan Zhang, Xiang Gao, Yunwen Chen, Tao Gui. 22 May 2023.
- What is the best recipe for character-level encoder-only modelling? Kris Cao. 09 May 2023.
- Salient Span Masking for Temporal Understanding. Jeremy R. Cole, Aditi Chaudhary, Bhuwan Dhingra, Partha P. Talukdar. 22 Mar 2023.
- Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling. Ahmed Elnaggar, Hazem Essam, Wafaa Salah-Eldin, Walid Moustafa, Mohamed Elkerdawy, Charlotte Rochereau, B. Rost. 16 Jan 2023.
- Uniform Masking Prevails in Vision-Language Pretraining. Siddharth Verma, Yuchen Lu, Rui Hou, Hanchao Yu, Nicolas Ballas, Madian Khabsa, Amjad Almahairi. [VLM] 10 Dec 2022.
- Language Model Pre-training on True Negatives. Zhuosheng Zhang, Hai Zhao, Masao Utiyama, Eiichiro Sumita. 01 Dec 2022.
- UniMASK: Unified Inference in Sequential Decision Problems. Micah Carroll, Orr Paradise, Jessy Lin, Raluca Georgescu, Mingfei Sun, ..., Stephanie Milani, Katja Hofmann, Matthew J. Hausknecht, Anca Dragan, Sam Devlin. [OffRL] 20 Nov 2022.
- InforMask: Unsupervised Informative Masking for Language Model Pretraining. Nafis Sadeq, Canwen Xu, Julian McAuley. 21 Oct 2022.
- Pre-training Language Models with Deterministic Factual Knowledge. Shaobo Li, Xiaoguang Li, Lifeng Shang, Chengjie Sun, Bingquan Liu, Zhenzhou Ji, Xin Jiang, Qun Liu. [KELM] 20 Oct 2022.
- Learning Better Masking for Better Language Model Pre-training. Dongjie Yang, Zhuosheng Zhang, Hai Zhao. 23 Aug 2022.
- Multilingual Extraction and Categorization of Lexical Collocations with Graph-aware Transformers. Luis Espinosa-Anke, A. Shvets, Alireza Mohammadshahi, James Henderson, Leo Wanner. 23 May 2022.
- Unsupervised Slot Schema Induction for Task-oriented Dialog. Dian Yu, Mingqiu Wang, Yuan Cao, Izhak Shafran, Laurent El Shafey, H. Soltau. 09 May 2022.
- Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers. Micah Carroll, Jessy Lin, Orr Paradise, Raluca Georgescu, Mingfei Sun, ..., Stephanie Milani, Katja Hofmann, Matthew J. Hausknecht, Anca Dragan, Sam Devlin. [OffRL] 28 Apr 2022.
- Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments. Maor Ivgi, Y. Carmon, Jonathan Berant. 13 Feb 2022.
- Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning. Colin Wei, Sang Michael Xie, Tengyu Ma. 17 Jun 2021.
- Which transformer architecture fits my data? A vocabulary bottleneck in self-attention. Noam Wies, Yoav Levine, Daniel Jannai, Amnon Shashua. 09 May 2021.
- Studying Strategically: Learning to Mask for Closed-book QA. Qinyuan Ye, Belinda Z. Li, Sinong Wang, Benjamin Bolte, Hao Ma, Wen-tau Yih, Xiang Ren, Madian Khabsa. [OffRL] 31 Dec 2020.
- GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman. [ELM] 20 Apr 2018.