
Can speed up the convergence rate of stochastic gradient methods to $\mathcal{O}(1/k^2)$ by a gradient averaging strategy?
Papers citing "Can speed up the convergence rate of stochastic gradient methods to $\mathcal{O}(1/k^2)$ by a gradient averaging strategy?"
Title |
---|
No papers |