Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation
arXiv:2106.08823, 16 June 2021
Srinadh Bhojanapalli, Ayan Chakrabarti, Himanshu Jain, Sanjiv Kumar, Michal Lukasik, Andreas Veit
Papers citing "Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation" (23 of 23 papers shown):
1. ThinK: Thinner Key Cache by Query-Driven Pruning. Yuhui Xu, Zhanming Jie, Hanze Dong, Lei Wang, Xudong Lu, Aojun Zhou, Amrita Saha, Caiming Xiong, Doyen Sahoo. 30 Jul 2024. Citations: 20.
2. Long Range Arena: A Benchmark for Efficient Transformers. Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler. 08 Nov 2020. Citations: 717.
3. Rethinking Attention with Performers. Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, ..., Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller. 30 Sep 2020. Citations: 1,570.
4. O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers. Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar. 08 Jun 2020. Citations: 81.
5. Synthesizer: Rethinking Self-Attention in Transformer Models. Yi Tay, Dara Bahri, Donald Metzler, Da-Cheng Juan, Zhe Zhao, Che Zheng. 02 May 2020. Citations: 335.
6. Longformer: The Long-Document Transformer. Iz Beltagy, Matthew E. Peters, Arman Cohan. 10 Apr 2020. Tags: RALM, VLM. Citations: 4,048.
7. Efficient Content-Based Sparse Attention with Routing Transformers. Aurko Roy, Mohammad Saffar, Ashish Vaswani, David Grangier. 12 Mar 2020. Tags: MoE. Citations: 596.
8. A Primer in BERTology: What we know about how BERT works. Anna Rogers, Olga Kovaleva, Anna Rumshisky. 27 Feb 2020. Tags: OffRL. Citations: 1,496.
9. Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation. Alessandro Raganato, Yves Scherrer, Jörg Tiedemann. 24 Feb 2020. Citations: 92.
10. Reformer: The Efficient Transformer. Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya. 13 Jan 2020. Tags: VLM. Citations: 2,307.
11. Are Transformers universal approximators of sequence-to-sequence functions? Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar. 20 Dec 2019. Citations: 352.
12. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu. 23 Oct 2019. Tags: AIMat. Citations: 20,053.
13. What Does BERT Look At? An Analysis of BERT's Attention. Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning. 11 Jun 2019. Tags: MILM. Citations: 1,592.
14. Analyzing the Structure of Attention in a Transformer Language Model. Jesse Vig, Yonatan Belinkov. 07 Jun 2019. Citations: 365.
15. Are Sixteen Heads Really Better than One? Paul Michel, Omer Levy, Graham Neubig. 25 May 2019. Tags: MoE. Citations: 1,060.
16. Generating Long Sequences with Sparse Transformers. Rewon Child, Scott Gray, Alec Radford, Ilya Sutskever. 23 Apr 2019. Citations: 1,894.
17. Star-Transformer. Qipeng Guo, Xipeng Qiu, Pengfei Liu, Yunfan Shao, Xiangyang Xue, Zheng Zhang. 25 Feb 2019. Citations: 264.
18. Pay Less Attention with Lightweight and Dynamic Convolutions. Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli. 29 Jan 2019. Citations: 607.
19. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. 11 Oct 2018. Tags: VLM, SSL, SSeg. Citations: 94,511.
20. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman. 20 Apr 2018. Tags: ELM. Citations: 7,141.
21. Attention Is All You Need. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. 12 Jun 2017. Tags: 3DV. Citations: 130,942.
22. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference. Adina Williams, Nikita Nangia, Samuel R. Bowman. 18 Apr 2017. Citations: 4,473.
23. Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books. Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, Sanja Fidler. 22 Jun 2015. Citations: 2,544.