Graph Laplacian Wavelet Transformer via Learnable Spectral Decomposition

Abstract
Existing sequence-to-sequence models for structured language tasks rely heavily on the dot-product self-attention mechanism, which incurs quadratic time and memory complexity in the input length N. We introduce the Graph Wavelet Transformer (GWT), a novel architecture that replaces this bottleneck with a learnable, multi-scale wavelet transform defined over an explicit graph Laplacian derived from syntactic or semantic parses. Our analysis shows that multi-scale spectral decomposition offers an interpretable, efficient, and expressive alternative to quadratic self-attention for graph-structured sequence modeling.
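To make the core idea concrete, below is a minimal sketch (not the authors' code) of a multi-scale spectral wavelet filter over a graph Laplacian. The function name, the heat-kernel choice of wavelet, and the scales parameter are illustrative assumptions; in the GWT the scales (or the spectral filters more generally) would be learnable.

import torch

def graph_wavelet_filter(L, x, scales):
    # Hypothetical sketch: spectral filtering of node features x (N x d)
    # via eigendecomposition of the graph Laplacian L (N x N).
    evals, evecs = torch.linalg.eigh(L)       # L = U diag(evals) U^T
    x_hat = evecs.T @ x                       # graph Fourier transform
    outs = []
    for s in scales:
        kernel = torch.exp(-s * evals)        # heat-kernel wavelet at scale s
        outs.append(evecs @ (kernel.unsqueeze(1) * x_hat))  # inverse transform
    return torch.stack(outs, dim=0)           # (num_scales, N, d)

# Example: Laplacian of a tiny 4-node parse graph (illustrative only).
A = torch.tensor([[0., 1., 0., 0.],
                  [1., 0., 1., 1.],
                  [0., 1., 0., 0.],
                  [0., 1., 0., 0.]])
L = torch.diag(A.sum(dim=1)) - A              # combinatorial Laplacian D - A
x = torch.randn(4, 8)                         # node features
scales = torch.tensor([0.5, 1.0, 2.0])        # learnable in the full model
y = graph_wavelet_filter(L, x, scales)        # -> shape (3, 4, 8)

Note that the full eigendecomposition shown here costs O(N^3); practical graph wavelet methods typically approximate the filters with Chebyshev polynomials of L to avoid it.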
@article{kiruluta2025_2505.07862,
  title   = {Graph Laplacian Wavelet Transformer via Learnable Spectral Decomposition},
  author  = {Andrew Kiruluta and Eric Lundy and Priscilla Burity},
  journal = {arXiv preprint arXiv:2505.07862},
  year    = {2025}
}