Distributionally Robust Learning with Weakly Convex Losses: Convergence Rates and Finite-Sample Guarantees

We consider a distributionally robust stochastic optimization problem and formulate it as a stochastic two-level composition optimization problem with the use of the mean--semideviation risk measure. In this setting, we consider a single time-scale algorithm, involving two versions of the inner function value tracking: linearized tracking of a continuously differentiable loss function, and SPIDER tracking of a weakly convex loss function. We adopt the norm of the gradient of the Moreau envelope as our measure of stationarity and show that the sample complexity of is possible in both cases, with only the constant larger in the second case. Finally, we demonstrate the performance of our algorithm with a robust learning example and a weakly convex, non-smooth regression example.
View on arXiv