Neural Networks and (Virtual) Extended Formulations

Neural networks with piecewise linear activation functions, such as rectified linear units (ReLU) or maxout, are among the most fundamental models in modern machine learning. We make a step towards proving lower bounds on the size of such neural networks by linking their representative capabilities to the notion of the extension complexity $\mathrm{xc}(P)$ of a polytope $P$. This is a well-studied quantity in combinatorial optimization and polyhedral geometry describing the number of inequalities needed to model $P$ as a linear program. We show that $\mathrm{xc}(P)$ is a lower bound on the size of any monotone or input-convex neural network that solves the linear optimization problem over $P$. This implies exponential lower bounds on such neural networks for a variety of problems, including the polynomially solvable maximum weight matching problem. In an attempt to prove similar bounds also for general neural networks, we introduce the notion of virtual extension complexity $\mathrm{vxc}(P)$, which generalizes $\mathrm{xc}(P)$ and describes the number of inequalities needed to represent the linear optimization problem over $P$ as a difference of two linear programs. We prove that $\mathrm{vxc}(P)$ is a lower bound on the size of any neural network that optimizes over $P$. While it remains an open question to derive useful lower bounds on $\mathrm{vxc}(P)$, we argue that this quantity deserves to be studied independently from neural networks by proving that one can efficiently optimize over a polytope using a small virtual extended formulation.
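
For concreteness, here is a sketch of the two quantities as we read the abstract; the precise definitions, in particular the Minkowski-sum formulation of $\mathrm{vxc}$, are our paraphrase rather than a verbatim quote from the paper. Extension complexity asks for the smallest linear program projecting onto $P$, while virtual extension complexity allows a difference of two such programs:

\[
\mathrm{xc}(P) \;=\; \min\bigl\{\#\text{inequalities of } Q \;:\; Q \text{ a polytope},\ P = \pi(Q) \text{ for some affine map } \pi\bigr\},
\]
\[
\mathrm{vxc}(P) \;=\; \min\bigl\{\mathrm{xc}(Q) + \mathrm{xc}(R) \;:\; Q, R \text{ polytopes with } Q = P + R\bigr\},
\]

where $P + R$ denotes the Minkowski sum. With this reading, $\max_{x \in P} c^\top x = \max_{y \in Q} c^\top y - \max_{z \in R} c^\top z$ for every objective $c$, so linear optimization over $P$ is expressed as a difference of two linear programs; taking $R$ to be a single point recovers an ordinary extended formulation, which is the sense in which $\mathrm{vxc}$ generalizes $\mathrm{xc}$.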
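
As a standard textbook illustration of extension complexity, not taken from the paper: the cross-polytope $\{x \in \mathbb{R}^n : \sum_i |x_i| \le 1\}$ has $2^n$ facet inequalities, yet a lifted linear program with only $2n + 1$ inequalities projects onto it. The sketch below checks this numerically with SciPy; all variable names are ours.

import itertools
import numpy as np
from scipy.optimize import linprog

# Illustrative sketch (not from the paper): the cross-polytope
# P = {x in R^n : sum_i |x_i| <= 1} has 2^n facet inequalities,
# but the extended formulation Q = {(x, y) : -y <= x <= y, sum_i y_i <= 1}
# uses only 2n + 1 inequalities and projects onto P.

n = 8
rng = np.random.default_rng(0)
c = rng.standard_normal(n)

# (1) Direct LP over P: one inequality s^T x <= 1 per sign vector s.
A_facets = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))  # 2^n rows
res_direct = linprog(-c, A_ub=A_facets, b_ub=np.ones(2**n),
                     bounds=[(None, None)] * n, method="highs")

# (2) LP over the extended formulation, variables (x, y) in R^{2n}:
#     x - y <= 0,  -x - y <= 0,  sum_i y_i <= 1.
I = np.eye(n)
A_ext = np.vstack([np.hstack([I, -I]),
                   np.hstack([-I, -I]),
                   np.hstack([np.zeros(n), np.ones(n)])])
b_ext = np.concatenate([np.zeros(2 * n), [1.0]])
res_ext = linprog(np.concatenate([-c, np.zeros(n)]),
                  A_ub=A_ext, b_ub=b_ext,
                  bounds=[(None, None)] * (2 * n), method="highs")

# Both equal max_{x in P} c^T x = max_i |c_i| (the dual norm of c).
print(-res_direct.fun, -res_ext.fun, np.max(np.abs(c)))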
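
The networks in question take the objective vector $c$ as input and output the optimal value $\max_{x \in P} c^\top x$, i.e., they evaluate the support function of $P$. The following is a minimal sketch of our own, assuming $P$ is given by an explicit vertex list (which is not how the paper's lower bounds are phrased), showing how a small ReLU circuit can do this via $\max(a, b) = b + \mathrm{relu}(a - b)$; since the output is a maximum of linear functions of $c$, it is convex in its input, in the spirit of input-convex networks.

import numpy as np

def relu(t):
    return np.maximum(t, 0.0)

def support_function_relu(c, vertices):
    """Evaluate h_P(c) = max_i <c, v_i> with a ReLU circuit (illustrative sketch)."""
    vals = vertices @ c              # linear layer: one value c^T v_i per vertex
    while vals.size > 1:
        if vals.size % 2 == 1:       # pad to even length by repeating the last entry
            vals = np.append(vals, vals[-1])
        a, b = vals[0::2], vals[1::2]
        vals = b + relu(a - b)       # max(a, b) = b + relu(a - b), applied in parallel
    return vals[0]

# P = cross-polytope conv{ +/- e_i }, so h_P(c) = max_i |c_i|.
n = 4
V = np.vstack([np.eye(n), -np.eye(n)])        # 2n vertices
c = np.array([0.3, -1.7, 0.4, 1.1])
print(support_function_relu(c, V), np.max(np.abs(c)))   # both 1.7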
@article{hertrich2025_2411.03006,
  title={Neural Networks and (Virtual) Extended Formulations},
  author={Christoph Hertrich and Georg Loho},
  journal={arXiv preprint arXiv:2411.03006},
  year={2025}
}