2501.19399
Scalable-Softmax Is Superior for Attention
31 January 2025
Ken M. Nakanishi
Papers citing "Scalable-Softmax Is Superior for Attention"
3 / 3 papers shown
Scale-invariant Attention
Ben Anson, Xi Wang, Laurence Aitchison
LRM
105
0
0
20 May 2025
Continuity and Isolation Lead to Doubts or Dilemmas in Large Language Models
Hector Pasten, Felipe Urrutia, Hector Jimenez, Cristian B. Calderon, Cristóbal Rojas, Alexander Kozachinskiy
117
0
0
15 May 2025
Multi-Token Attention
O. Yu. Golovneva, Tianlu Wang, Jason Weston, Sainbayar Sukhbaatar
89
1
0
01 Apr 2025