arXiv:2306.09939

Towards Better Orthogonality Regularization with Disentangled Norm in Training Deep CNNs

16 June 2023
Changhao Wu
Shenan Zhang
Fangsong Long
Ziliang Yin
Tuo Leng
Abstract

Orthogonality regularization was developed to protect deep CNNs from training instability and feature redundancy. Among existing proposals, kernel orthogonality regularization enforces orthogonality by minimizing the residual between the Gram matrix formed by the convolutional filters and the identity matrix. We propose a novel measure for achieving better orthogonality among filters, which disentangles diagonal and correlation information from the residual. A model equipped with this measure, under the principle of imposing strict orthogonality between filters, comes closer to orthogonality than previous regularization methods. We observe the benefits of improved strict filter orthogonality in relatively shallow models, but as model depth increases, the performance gains of models employing strict kernel orthogonality decrease sharply. Based on this observation of a potential conflict between strict kernel orthogonality and growing model capacity, we propose a relaxation theory for kernel orthogonality regularization. Relaxed kernel orthogonality achieves enhanced performance on models with increased capacity, shedding light on the burden that strict kernel orthogonality places on deep model performance. We conduct extensive experiments with our kernel orthogonality regularization toolkit, which includes both strict and relaxed orthogonality regularization, on ResNet and WideResNet with CIFAR-10 and CIFAR-100. We observe state-of-the-art gains in model performance from the toolkit and obtain more robust models with expressive features. These experiments demonstrate the efficacy of our toolkit and provide insight into the often overlooked challenges posed by strict orthogonality for capacity-rich models.
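To make the residual described in the abstract concrete, the following is a minimal sketch of the standard kernel orthogonality penalty, the squared Frobenius norm of the Gram matrix minus the identity, split into a diagonal part (filter norms) and an off-diagonal part (correlations between filters). The split is only one plausible reading of "disentangles diagonal and correlation information"; the abstract does not give the paper's actual measure, and the function names here are hypothetical. Filters are assumed to be flattened into rows of plain Python lists.

```python
def gram_matrix(filters):
    """Gram matrix G = W W^T, where each row of W is one flattened filter."""
    return [[sum(a * b for a, b in zip(u, v)) for v in filters] for u in filters]


def orthogonality_residual(filters):
    """Split ||W W^T - I||_F^2 into (diagonal_term, correlation_term).

    Illustrative decomposition only, not the paper's measure:
    diagonal_term    penalizes filter norms that differ from 1,
    correlation_term penalizes inner products between distinct filters.
    """
    gram = gram_matrix(filters)
    n = len(gram)
    diagonal_term = sum((gram[i][i] - 1.0) ** 2 for i in range(n))
    correlation_term = sum(
        gram[i][j] ** 2 for i in range(n) for j in range(n) if i != j
    )
    return diagonal_term, correlation_term


def kernel_orthogonality_penalty(filters):
    """Standard penalty: the two terms summed, i.e. ||W W^T - I||_F^2."""
    diagonal_term, correlation_term = orthogonality_residual(filters)
    return diagonal_term + correlation_term


# Two orthonormal filters incur no penalty.
print(orthogonality_residual([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]))
# Two identical unit-norm filters: only the correlation term is non-zero.
print(orthogonality_residual([[1.0, 0.0], [1.0, 0.0]]))
# Orthogonal filters with a wrong norm: only the diagonal term is non-zero.
print(orthogonality_residual([[2.0, 0.0], [0.0, 1.0]]))
```

Keeping the two terms separate makes it possible to weight norm errors and correlation errors differently, which a single Frobenius-norm penalty cannot do. Whether the paper weights them this way is not stated in the abstract.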
