
Better Neural Network Expressivity: Subdividing the Simplex

Abstract

This work studies the expressivity of ReLU neural networks with a focus on their depth. A sequence of previous works showed that $\lceil \log_2(n+1) \rceil$ hidden layers are sufficient to compute all continuous piecewise linear (CPWL) functions on $\mathbb{R}^n$. Hertrich, Basu, Di Summa, and Skutella (NeurIPS'21) conjectured that this result is optimal in the sense that there are CPWL functions on $\mathbb{R}^n$, like the maximum function, that require this depth. We disprove the conjecture and show that $\lceil \log_3(n-1) \rceil + 1$ hidden layers are sufficient to compute all CPWL functions on $\mathbb{R}^n$. A key step in the proof is that ReLU neural networks with two hidden layers can exactly represent the maximum function of five inputs. More generally, we show that $\lceil \log_3(n-2) \rceil + 1$ hidden layers are sufficient to compute the maximum of $n \geq 4$ numbers. Our constructions almost match the $\lceil \log_3(n) \rceil$ lower bound of Averkov, Hojny, and Merkert (ICLR'25) in the special case of ReLU networks with weights that are decimal fractions. The constructions have a geometric interpretation via polyhedral subdivisions of the simplex into ``easier'' polytopes.
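As a quick illustration of how the two depth bounds compare, the following sketch (not from the paper; the ceil_log helper is introduced here for exact integer arithmetic) evaluates the previously known bound $\lceil \log_2(n+1) \rceil$ against the new bound $\lceil \log_3(n-1) \rceil + 1$ from the abstract for a few dimensions $n$:

# Sketch: compare the old depth bound ceil(log2(n+1)) with the
# new bound ceil(log3(n-1)) + 1 stated in the abstract.

def ceil_log(base: int, x: int) -> int:
    """Smallest k with base**k >= x, computed with integer arithmetic."""
    k, power = 0, 1
    while power < x:
        power *= base
        k += 1
    return k

for n in [4, 5, 10, 25, 100, 1000]:
    old_bound = ceil_log(2, n + 1)      # ceil(log2(n+1)) hidden layers
    new_bound = ceil_log(3, n - 1) + 1  # ceil(log3(n-1)) + 1 hidden layers
    print(f"n={n:5d}  old={old_bound}  new={new_bound}")

For instance, at n = 10 the old bound gives 4 hidden layers while the new bound gives 3; at n = 5 the two bounds coincide at 3.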

@article{bakaev2025_2505.14338,
  title={Better Neural Network Expressivity: Subdividing the Simplex},
  author={Egor Bakaev and Florestan Brunck and Christoph Hertrich and Jack Stade and Amir Yehudayoff},
  journal={arXiv preprint arXiv:2505.14338},
  year={2025}
}