Fast Gradient Computation for RoPE Attention in Almost Linear Time

3 January 2025
Yifang Chen, Jiayan Huo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
arXiv: 2412.17316 (PDF, HTML)

Papers citing "Fast Gradient Computation for RoPE Attention in Almost Linear Time"

8 / 8 papers shown
Fast RoPE Attention: Combining the Polynomial Method and Fast Fourier Transform
Josh Alman, Zhao Song
17 May 2025

Theoretical Foundation of Flow-Based Time Series Generation: Provable Approximation, Generalization, and Efficiency
Jiangxuan Long, Zhao Song, Chiwun Yang
AI4TS
18 Mar 2025

Theoretical Guarantees for High Order Trajectory Refinement in Generative Flows
Chengyue Gong, Xiaoyu Li, Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao Song, Yu Tian
12 Mar 2025

Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches
Yifang Chen, Xuyang Guo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
3 Mar 2025

When Can We Solve the Weighted Low Rank Approximation Problem in Truly Subquadratic Time?
Chenyang Li, Yingyu Liang, Zhenmei Shi, Zhao Song
24 Feb 2025

On Computational Limits of FlowAR Models: Expressivity and Efficiency
Chengyue Gong, Yekun Ke, Xiaoyu Li, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song
23 Feb 2025

Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent
Bo Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
15 Oct 2024

HSR-Enhanced Sparse Attention Acceleration
Bo Chen, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song
14 Oct 2024