We consider regression estimation with modified ReLU neural networks, in which the network weight matrices are first transformed by a function before being multiplied by the input vectors. We give an example of a continuous, piecewise linear function for which the empirical risk minimizers over the classes of modified ReLU networks with $\ell_1$ and squared $\ell_2$ penalties attain, up to a logarithmic factor, the minimax rate of prediction of an unknown $\beta$-smooth function.
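To make the construction concrete, the following is a minimal sketch of a forward pass through a network whose weights are passed entrywise through a modifier before the matrix-vector product, together with a penalized empirical risk. Everything named here is an assumption for illustration: the modifier `clip_weight` (clamping to [-1, 1]) is merely one continuous, piecewise linear function and is not claimed to be the one constructed in the paper, and the choices to penalize the raw weights with an absolute-value penalty and to leave the biases unpenalized are likewise hypothetical.

```python
def clip_weight(w):
    """Hypothetical modifier: a continuous, piecewise linear clamp to [-1, 1]."""
    return max(-1.0, min(1.0, w))


def relu(v):
    """Apply ReLU to each coordinate of a vector."""
    return [max(0.0, t) for t in v]


def modified_layer(W, b, x, modifier=clip_weight):
    """Affine map in which every weight is modified before multiplying x."""
    return [
        sum(modifier(w) * xi for w, xi in zip(row, x)) + bi
        for row, bi in zip(W, b)
    ]


def forward(layers, x, modifier=clip_weight):
    """Evaluate the network; ReLU follows every layer except the last."""
    for W, b in layers[:-1]:
        x = relu(modified_layer(W, b, x, modifier))
    W, b = layers[-1]
    return modified_layer(W, b, x, modifier)


def penalized_risk(layers, data, lam, modifier=clip_weight):
    """Mean squared error plus lam times the sum of absolute raw weights.

    Assumption: the penalty acts on the unmodified weights and ignores biases.
    """
    mse = sum((forward(layers, x, modifier)[0] - y) ** 2 for x, y in data) / len(data)
    penalty = sum(abs(w) for W, _ in layers for row in W for w in row)
    return mse + lam * penalty


# Toy network: 2 inputs -> 1 hidden unit -> 1 output.
layers = [([[2.0, -0.5]], [0.0]), ([[3.0]], [0.1])]
prediction = forward(layers, [2.0, 1.0])[0]
risk = penalized_risk(layers, [([2.0, 1.0], 1.6)], lam=0.1)
```

In this toy network the weights 2.0 and 3.0 lie outside [-1, 1], so the modifier changes the function the network computes while the penalty still sees the raw values. An empirical risk minimizer in this sketch would be any parameter setting that minimizes `penalized_risk` over the network class.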