StructFormer: Document Structure-based Masked Attention and its Impact on Language Model Pre-Training
arXiv:2411.16618, 25 November 2024
Kaustubh Ponkshe, Venkatapathy Subramanian, Natwar Modani, Ganesh Ramakrishnan
Papers citing "StructFormer: Document Structure-based Masked Attention and its Impact on Language Model Pre-Training" (9 of 9 papers shown):
HEGEL: Hypergraph Transformer for Long Document Summarization
Haopeng Zhang, Xiao Liu, Jiawei Zhang (09 Oct 2022)

Should You Mask 15% in Masked Language Modeling?
Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen (16 Feb 2022)

Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed (28 Jul 2020)

Longformer: The Long-Document Transformer
Iz Beltagy, Matthew E. Peters, Arman Cohan (10 Apr 2020)

PubLayNet: largest dataset ever for document layout analysis
Xu Zhong, Jianbin Tang, Antonio Jimeno Yepes (16 Aug 2019)

A Multiscale Visualization of Attention in the Transformer Model
Jesse Vig (12 Jun 2019)

What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning (11 Jun 2019)

Generating Long Sequences with Sparse Transformers
R. Child, Scott Gray, Alec Radford, Ilya Sutskever (23 Apr 2019)

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman (20 Apr 2018)