Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence

26 March 2025
Yijiong Yu
Abstract

Recent reasoning models have significantly improved accuracy, particularly on complex tasks such as mathematical reasoning, by employing detailed and comprehensive reasoning processes. However, generating these lengthy reasoning sequences is computationally expensive and time-consuming. To address this inefficiency, we leverage the inherent parallelizability of certain tasks to accelerate the reasoning process. Specifically, when multiple parallel reasoning branches exist, we decode multiple tokens per step using a specialized attention mask and process them within a single sequence, which avoids additional memory usage. Experimental results show that our method achieves a speedup of over 100% in decoding time while maintaining answer quality.
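The abstract does not spell out the attention mask, so the following is only a minimal sketch of one plausible construction, not the author's implementation. It assumes a shared prefix followed by branch tokens interleaved step by step (one token per branch per decoding step), and lets each branch token attend to the prefix and to earlier tokens of its own branch only. The function name `branch_attention_mask` and its parameters are hypothetical.

```python
import numpy as np


def branch_attention_mask(prefix_len: int, num_branches: int, steps: int) -> np.ndarray:
    """Return a boolean mask where mask[i, j] is True if query i may attend to key j.

    Assumed layout (an illustration, not taken from the paper):
    [prefix tokens][step 0: branch 0..B-1][step 1: branch 0..B-1]...
    """
    total = prefix_len + num_branches * steps

    # Branch id of every position; -1 marks the shared prefix.
    branch_id = np.full(total, -1, dtype=int)
    branch_id[prefix_len:] = np.tile(np.arange(num_branches), steps)

    # Standard causal constraint: a query never sees later positions.
    causal = np.tril(np.ones((total, total), dtype=bool))

    # A key is visible if it belongs to the shared prefix or to the query's own branch.
    key_is_prefix = branch_id[None, :] == -1
    same_branch = branch_id[:, None] == branch_id[None, :]

    return causal & (key_is_prefix | same_branch)


if __name__ == "__main__":
    # Two prefix tokens, two branches, two decoding steps: a 6 x 6 mask.
    print(branch_attention_mask(prefix_len=2, num_branches=2, steps=2).astype(int))
```

Under these assumptions, all branches share one copy of the prefix in the sequence, and each decoding step appends one token per branch, which is how several tokens could be produced per step without duplicating the prefix.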

@article{yu2025_2503.20533,
  title={Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence},
  author={Yijiong Yu},
  journal={arXiv preprint arXiv:2503.20533},
  year={2025}
}