We analyze the optimization landscapes of deep learning with wide networks. We highlight the importance of constraints for such networks and show that constrained, as well as unconstrained, empirical-risk minimization over such networks has no confined points, that is, suboptimal parameters that are difficult to escape from. Hence, our theories substantiate the common belief that wide neural networks are not only highly expressive but also comparatively easy to optimize.