Anisotropy Is Inherent to Self-Attention in Transformers

22 January 2024

Eric Villemonte de la Clergerie

Papers citing "Anisotropy Is Inherent to Self-Attention in Transformers"

Title | Authors | Tags and counts | Date
KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments | Junyoung Park, Dalton Jones, Matthew J Morse, Raghavv Goel, Mingu Lee, Chris Lott | 98 1 0 | 21 Apr 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations | Yize Zhao, Tina Behnia, V. Vakilian, Christos Thrampoulidis | 191 10 0 | 20 Feb 2025
Discrete Speech Unit Extraction via Independent Component Analysis | Tomohiko Nakamura, Kwanghee Choi, Keigo Hojo, Yoshiaki Bando, Satoru Fukayama, Shinji Watanabe | 77 1 0 | 11 Jan 2025
Banyan: Improved Representation Learning with Explicit Structure | Mattia Opper, N. Siddharth | 221 1 0 | 25 Jul 2024
Transformer Layers as Painters | Qi Sun, Marc Pickett, Aakash Kumar Nain, Llion Jones | AI4CE 125 19 0 | 12 Jul 2024