Swish-T: Enhancing Swish Activation with Tanh Bias for Improved Neural Network Performance

1 July 2024
Youngmin Seo
Jinha Kim
Unsang Park
Abstract

We propose the Swish-T family, an enhancement of the existing non-monotonic activation function Swish. Swish-T is defined by adding a Tanh bias to the original Swish function. This modification creates a family of Swish-T variants, each designed to excel in different tasks, showcasing specific advantages depending on the application context. The Tanh bias allows for broader acceptance of negative values during initial training stages, offering a smoother non-monotonic curve than the original Swish. We ultimately propose the Swish-T_C function, while Swish-T and Swish-T_B, byproducts of Swish-T_C, also demonstrate satisfactory performance. Furthermore, our ablation study shows that using Swish-T_C as a non-parametric function can still achieve high performance. The superiority of the Swish-T family has been empirically demonstrated across various models and benchmark datasets, including MNIST, Fashion MNIST, SVHN, CIFAR-10, and CIFAR-100. The code is publicly available at https://github.com/ictseoyoungmin/Swish-T-pytorch.
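
For illustration, below is a minimal PyTorch sketch of the construction the abstract describes: the original Swish, x * sigmoid(beta * x), plus a scaled tanh bias. The functional form alpha * tanh(x), the default values of alpha and beta, and the trainable/non-trainable switch are assumptions made here for clarity; the exact parameterizations of the Swish-T_B and Swish-T_C variants are given in the paper and the authors' repository linked above.

import torch
import torch.nn as nn

class SwishT(nn.Module):
    """Sketch of Swish-T: Swish with an added tanh bias term.

    Assumed form (illustrative, not necessarily the authors' exact
    parameterization): f(x) = x * sigmoid(beta * x) + alpha * tanh(x).
    """

    def __init__(self, beta: float = 1.0, alpha: float = 0.1, trainable: bool = True):
        super().__init__()
        beta_t = torch.tensor(beta)
        alpha_t = torch.tensor(alpha)
        if trainable:
            # Parametric variant: beta and alpha are learned per layer.
            self.beta = nn.Parameter(beta_t)
            self.alpha = nn.Parameter(alpha_t)
        else:
            # Fixed (non-parametric) variant, cf. the ablation study
            # showing Swish-T_C works well without learned parameters.
            self.register_buffer("beta", beta_t)
            self.register_buffer("alpha", alpha_t)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Swish term plus tanh bias: the tanh term admits negative
        # outputs for negative inputs, giving the smoother
        # non-monotonic curve described in the abstract.
        return x * torch.sigmoid(self.beta * x) + self.alpha * torch.tanh(x)

Dropped into a model as a standard nn.Module, e.g. nn.Sequential(nn.Linear(784, 256), SwishT(), nn.Linear(256, 10)), it can replace ReLU or Swish without other changes.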
