Stochastic Zeroth Order Gradient and Hessian Estimators: Variance Reduction and Refined Bias Bounds

29 May 2022
Yasong Feng, Tianyu Wang
arXiv:2205.14737
Abstract

We study stochastic zeroth order gradient and Hessian estimators for real-valued functions in $\mathbb{R}^n$. We show that, by taking finite differences along random orthogonal directions, the variance of the stochastic finite-difference estimators can be significantly reduced. In particular, we design estimators for smooth functions such that, if one uses $\Theta(k)$ random directions sampled from the Stiefel manifold $\text{St}(n,k)$ and finite-difference granularity $\delta$, the variance of the gradient estimator is bounded by $\mathcal{O}\left( \left( \frac{n}{k} - 1 \right) + \left( \frac{n^2}{k} - n \right) \delta^2 + \frac{n^2 \delta^4}{k} \right)$, and the variance of the Hessian estimator is bounded by $\mathcal{O}\left( \left( \frac{n^2}{k^2} - 1 \right) + \left( \frac{n^4}{k^2} - n^2 \right) \delta^2 + \frac{n^4 \delta^4}{k^2} \right)$. When $k = n$, the variances become negligibly small. In addition, we provide improved bias bounds for the estimators. The bias of both gradient and Hessian estimators for a smooth function $f$ is of order $\mathcal{O}(\delta^2 \Gamma)$, where $\delta$ is the finite-difference granularity and $\Gamma$ depends on high-order derivatives of $f$. Our results are evidenced by empirical observations.
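
To make the construction concrete, below is a minimal NumPy sketch of estimators of this general shape: finite differences taken along the columns of a frame drawn uniformly from $\text{St}(n,k)$. The difference schemes and scaling constants here (the $n/k$ factor for the gradient, and the sphere-smoothing correction $\frac{n(n+2)}{2}\left(vv^\top - \frac{I}{n+2}\right)$ for the Hessian) are standard choices assumed for illustration, not necessarily the exact estimators analyzed in the paper.

```python
import numpy as np

def sample_stiefel(n, k, rng):
    """Draw a uniformly random orthonormal frame V in St(n, k):
    QR-factorize a Gaussian matrix and fix column signs so the
    distribution is Haar-uniform."""
    A = rng.standard_normal((n, k))
    Q, R = np.linalg.qr(A)
    return Q * np.sign(np.diag(R))

def zo_gradient(f, x, k, delta, rng):
    """Central-difference gradient estimator along k orthogonal
    directions, scaled by n/k; bias is O(delta^2) for smooth f."""
    n = x.size
    V = sample_stiefel(n, k, rng)
    g = np.zeros(n)
    for i in range(k):
        v = V[:, i]
        g += (f(x + delta * v) - f(x - delta * v)) / (2.0 * delta) * v
    return (n / k) * g

def zo_hessian(f, x, k, delta, rng):
    """Second-order central-difference Hessian estimator. Each column of
    the frame is marginally uniform on the sphere, for which
    E[(v' H v)(v v' - I/(n+2))] * n(n+2)/2 = H when f is quadratic."""
    n = x.size
    V = sample_stiefel(n, k, rng)
    fx = f(x)
    H = np.zeros((n, n))
    for i in range(k):
        v = V[:, i]
        d2 = (f(x + delta * v) + f(x - delta * v) - 2.0 * fx) / delta**2
        H += d2 * (np.outer(v, v) - np.eye(n) / (n + 2))
    return (n * (n + 2) / (2.0 * k)) * H

# Example: f(x) = 0.5 ||x||^2, so grad f(x) = x and Hess f(x) = I.
rng = np.random.default_rng(0)
f = lambda x: 0.5 * float(x @ x)
x = np.ones(8)
print(zo_gradient(f, x, k=8, delta=1e-4, rng=rng))  # ~ x
print(zo_hessian(f, x, k=8, delta=1e-2, rng=rng))   # ~ identity matrix
```

Note that when $k = n$ the frame is a full orthogonal basis, so the $n/k$ factor equals 1 and the directional differences reconstruct $\nabla f(x)$ up to the $\mathcal{O}(\delta^2)$ discretization error, consistent with the abstract's claim that the variance becomes negligibly small at $k = n$.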
