arXiv:2005.02433
Cited By
Stolen Probability: A Structural Weakness of Neural Language Models
David Demeter, Gregory J. Kimmel, Doug Downey
5 May 2020
Papers citing "Stolen Probability: A Structural Weakness of Neural Language Models" (7 of 7 papers shown):
Norm of Mean Contextualized Embeddings Determines their Variance
Hiroaki Yamagiwa, Hidetoshi Shimodaira
17 Sep 2024

LABO: Towards Learning Optimal Label Regularization via Bi-level Optimization
Peng Lu, Ahmad Rashid, I. Kobyzev, Mehdi Rezagholizadeh, Philippe Langlais
8 May 2023

Why do Nearest Neighbor Language Models Work? [RALM]
Frank F. Xu, Uri Alon, Graham Neubig
7 Jan 2023

Low-Rank Softmax Can Have Unargmaxable Classes in Theory but Rarely in Practice
Andreas Grivas, Nikolay Bogoychev, Adam Lopez
12 Mar 2022

Rare Tokens Degenerate All Tokens: Improving Neural Text Generation via Adaptive Gradient Gating for Rare Token Embeddings
Sangwon Yu, Jongyoon Song, Heeseung Kim, SeongEun Lee, Woo-Jong Ryu, Sung-Hoon Yoon
7 Sep 2021

Query-Key Normalization for Transformers
Alex Henry, Prudhvi Raj Dachapally, S. Pawar, Yuxuan Chen
8 Oct 2020

Improving Low Compute Language Modeling with In-Domain Embedding Initialisation [AI4CE]
Charles F Welch, Rada Mihalcea, Jonathan K. Kummerfeld
29 Sep 2020