2207.06366
Cited By
N-Grammer: Augmenting Transformers with latent n-grams
13 July 2022
Aurko Roy
Rohan Anil
Guangda Lai
Benjamin Lee
Jeffrey Zhao
Shuyuan Zhang
Shibo Wang
Ye Zhang
Shen Wu
Rigel Swavely
Tao Yu
Phuong Dao
Christopher Fifty
Z. Chen
Yonghui Wu
Papers citing
"N-Grammer: Augmenting Transformers with latent n-grams"
9 / 9 papers shown
Title
Scaling Embedding Layers in Language Models
Da Yu
Edith Cohen
Badih Ghazi
Yangsibo Huang
Pritish Kamath
Ravi Kumar
Daogao Liu
Chiyuan Zhang
84
0
0
03 Feb 2025
N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs
Ilya Zisman
Alexander Nikulin
Andrei Polubarov
Nikita Lyubaykin
Vladislav Kurenkov
Igor Kiselev
OffRL
56
2
0
04 Nov 2024
Revisiting N-Gram Models: Their Impact in Modern Neural Networks for Handwritten Text Recognition
Solène Tarride
Christopher Kermorvant
37
1
0
30 Apr 2024
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Lucas D. Lingle
34
15
0
28 Sep 2023
Primer: Searching for Efficient Transformers for Language Modeling
David R. So
Wojciech Mańke
Hanxiao Liu
Zihang Dai
Noam M. Shazeer
Quoc V. Le
VLM
91
153
0
17 Sep 2021
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
288
2,023
0
28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
255
580
0
12 Mar 2020
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
718
6,748
0
26 Sep 2016
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
305
31,280
0
16 Jan 2013