Almost Linear Constant-Factor Sketching for $\ell_1$ and Logistic Regression

We improve upon previous oblivious sketching and turnstile streaming results for $\ell_1$ and logistic regression, giving a much smaller sketching dimension achieving $O(1)$-approximation and yielding an efficient optimization problem in the sketch space. Namely, we achieve for any constant $c>0$ a sketching dimension of $\tilde{O}(d^{1+c})$ for $\ell_1$ regression and $\tilde{O}(\mu d^{1+c})$ for logistic regression, where $\mu$ is a standard measure that captures the complexity of compressing the data. For $\ell_1$-regression our sketching dimension is near-linear and improves on previous work, which either required $\Omega(\log d)$-approximation with this sketching dimension or required a larger $\operatorname{poly}(d)$ number of rows. Similarly, for logistic regression previous work had worse $\operatorname{poly}(\mu d)$ factors in its sketching dimension. We also give a tradeoff that yields a $(1+\varepsilon)$-approximation in input sparsity time by increasing the total size to $(d\log(n)/\varepsilon)^{O(1/\varepsilon)}$ for $\ell_1$ and to $(\mu d\log(n)/\varepsilon)^{O(1/\varepsilon)}$ for logistic regression. Finally, we show that our sketch can be extended to approximate a regularized version of logistic regression where the data-dependent regularizer corresponds to the variance of the individual logistic losses.
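To illustrate the terms "oblivious sketching" and "turnstile streaming", the following minimal Python sketch uses a generic CountSketch-style matrix. This is not the construction from this work. The function name `countsketch_matrix`, the sizes `n`, `d`, `m`, and the single update are all hypothetical choices made for illustration. The only point is that a linear sketch drawn independently of the data can be maintained under additive updates to individual entries.

```python
import numpy as np

def countsketch_matrix(m, n, rng):
    """Generic CountSketch-style matrix (an illustrative stand-in only):
    every column holds a single random +/-1 entry in a random row."""
    rows = rng.integers(0, m, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    S = np.zeros((m, n))
    S[rows, np.arange(n)] = signs
    return S

rng = np.random.default_rng(0)
n, d, m = 1000, 5, 50  # hypothetical sizes: n data rows, d features, m sketch rows

# Oblivious: S is drawn before, and independently of, the data matrix A.
S = countsketch_matrix(m, n, rng)

A = rng.standard_normal((n, d))
sketch = S @ A  # m x d summary of the n x d data

# Turnstile update: entry (i, j) of A changes by delta (possibly negative).
i, j, delta = 123, 2, -3.5
A[i, j] += delta
# By linearity, only column j of the sketch changes, by delta * S[:, i].
sketch[:, j] += delta * S[:, i]

# The maintained sketch equals the sketch of the updated data.
assert np.allclose(sketch, S @ A)
```

Because the update touches only the sketch and never re-reads `A`, the summary can be maintained in a single pass over a stream of insertions and deletions.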