The Effective Depth Paradox: Evaluating the Relationship between Architectural Topology and Trainability in Deep CNNs

Manfred M. Fischer
Joshua Pitts
Main: 15 pages, 7 figures, 1 table; bibliography: 1 page
Abstract

This paper investigates the relationship between convolutional neural network (CNN) depth and image recognition performance through a comparative study of the VGG, ResNet, and GoogLeNet architectural families. By evaluating these models under a unified experimental framework on upscaled CIFAR-10 data, we isolate the effects of depth from confounding implementation variables. We introduce a formal distinction between nominal depth ($D_{\mathrm{nom}}$), the total count of weight-bearing layers, and effective depth ($D_{\mathrm{eff}}$), an operational metric representing the expected number of sequential transformations encountered along all feasible forward paths. As derived in Section 3, $D_{\mathrm{eff}}$ is computed through topology-specific proxies: the total sequential count for plain networks, the arithmetic mean of minimum and maximum path lengths for residual structures, and the sum of average branch depths for multi-branch modules. Our empirical results demonstrate that while sequential architectures such as VGG suffer from diminishing returns and severe gradient attenuation as $D_{\mathrm{nom}}$ increases, architectures with identity shortcuts or branching modules maintain optimization stability. This stability is achieved by decoupling $D_{\mathrm{eff}}$ from $D_{\mathrm{nom}}$, thus ensuring a manageable functional depth for gradient propagation. We conclude that effective depth is a better predictor of a network's scaling potential and practical trainability than traditional layer counts, providing a principled framework for future architectural innovation.
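The three topology-specific proxies sketched in the abstract can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions, not the paper's actual implementation: the function names, the two-layer residual block default, and the convention that an identity shortcut yields a minimum path length of zero are illustrative choices, not details taken from the paper.

```python
def d_eff_plain(n_layers: int) -> float:
    # Plain (sequential) networks, e.g. VGG: every forward path traverses
    # all weight-bearing layers, so effective depth equals nominal depth.
    return float(n_layers)

def d_eff_residual(n_blocks: int, layers_per_block: int = 2) -> float:
    # Residual networks: identity shortcuts allow each block to be
    # bypassed, so forward-path lengths range from the shortest path
    # (every block skipped; assumed 0 here for simplicity) to the
    # longest (every block traversed). The proxy is the arithmetic
    # mean of the minimum and maximum path lengths.
    min_path = 0
    max_path = n_blocks * layers_per_block
    return (min_path + max_path) / 2

def d_eff_multibranch(modules: list[list[int]]) -> float:
    # Multi-branch (Inception-style) networks: each module is given as a
    # list of its branch depths; the proxy sums the average branch depth
    # over all modules.
    return sum(sum(branches) / len(branches) for branches in modules)

# Toy usage: a 16-layer plain net, an 8-block residual net, and two
# Inception-style modules whose branches have depths [1, 2, 2, 1].
print(d_eff_plain(16))                            # 16.0
print(d_eff_residual(8))                          # 8.0
print(d_eff_multibranch([[1, 2, 2, 1]] * 2))      # 3.0
```

The sketch makes the abstract's central point concrete: the residual and multi-branch proxies grow far more slowly than the layer count, so stacking blocks inflates $D_{\mathrm{nom}}$ while $D_{\mathrm{eff}}$ stays at a trainable scale.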
