Fast Monte-Carlo Approximation of the Attention Mechanism
Hyunjun Kim, Jeonggil Ko
arXiv 2201.12854, 30 January 2022
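The title names a concrete technique: estimating the softmax attention output by sampling rather than computing the full quadratic score matrix. As a point of reference for the citing papers below, here is a minimal, hypothetical sketch in NumPy that approximates attention by uniformly subsampling key/value rows; the function names and the uniform sampling scheme are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Exact scaled dot-product attention: softmax(QK^T / sqrt(d)) V.
    d = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d)) @ V

def mc_attention(Q, K, V, m, rng):
    # Hypothetical Monte-Carlo estimator (not the paper's algorithm):
    # attend over a uniform subsample of m key/value rows instead of
    # all n, reducing the score matrix from n x n to n x m.
    n = K.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    d = Q.shape[-1]
    return softmax(Q @ K[idx].T / np.sqrt(d)) @ V[idx]

rng = np.random.default_rng(0)
n, d = 256, 32
Q = rng.standard_normal((n, d))
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d))
approx = mc_attention(Q, K, V, m=64, rng=rng)  # one n x 64 score block
```

With m = n the subsample is a permutation of all keys, so the estimator recovers exact attention; smaller m trades accuracy for a proportional reduction in score-matrix work, which is the general trade-off the sampling-based and low-rank methods cited below also navigate.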
Papers citing "Fast Monte-Carlo Approximation of the Attention Mechanism" (8 of 8 shown)

1. Rethinking Attention with Performers (30 Sep 2020)
   K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, ..., Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller
   116 / 1,548 / 0

2. Linformer: Self-Attention with Linear Complexity (08 Jun 2020)
   Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
   144 / 1,678 / 0

3. Longformer: The Long-Document Transformer (10 Apr 2020)
   Iz Beltagy, Matthew E. Peters, Arman Cohan
   Tags: RALM, VLM
   65 / 3,996 / 0

4. Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting (19 Dec 2019)
   Bryan Lim, Sercan O. Arik, Nicolas Loeff, Tomas Pfister
   Tags: AI4TS
   81 / 1,427 / 0

5. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (02 Oct 2019)
   Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
   94 / 7,386 / 0

6. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism (17 Sep 2019)
   Mohammad Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
   Tags: MoE
   281 / 1,861 / 0

7. Are Sixteen Heads Really Better than One? (25 May 2019)
   Paul Michel, Omer Levy, Graham Neubig
   Tags: MoE
   45 / 1,049 / 0

8. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation (02 Apr 2014)
   Emily L. Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, Rob Fergus
   Tags: FAtt
   98 / 1,682 / 0