In this paper, we propose a general framework for the algorithm New Q-Newton's method Backtracking, developed in the author's previous work. For a symmetric, square real matrix A, we define minsp(A):=\min _{||e||=1}||Ae||. Given a C^2 cost function f:\mathbb{R}^m\rightarrow \mathbb{R} and a real number \tau >0, as well as m+1 fixed real numbers \delta _0,\ldots ,\delta _m, we define for each x\in \mathbb{R}^m with \nabla f(x)\neq 0 the following quantities: \kappa :=\frac{1}{2}\min _{i\neq j}|\delta _i-\delta _j|; A(x):=\nabla ^2f(x)+\delta ||\nabla f(x)||^{\tau }Id, where \delta is the first element in the sequence \{\delta _0,\ldots ,\delta _m\} for which minsp(A(x))\geq \kappa ||\nabla f(x)||^{\tau }; e_1(x),\ldots ,e_m(x), an orthonormal basis of \mathbb{R}^m, chosen appropriately; w(x), the step direction, given by the formula: w(x)=\sum _{i=1}^m\frac{<\nabla f(x),e_i(x)>}{||A(x)e_i(x)||}e_i(x); (we can also normalise by \max \{1,||w(x)||\} when needed) \gamma (x)>0, the learning rate, chosen by Backtracking line search so that Armijo's condition is satisfied: f(x-\gamma (x)w(x))-f(x)\leq -\frac{1}{3}\gamma (x)<\nabla f(x),w(x)>. The update rule for our algorithm is x\mapsto H(x)=x-\gamma (x)w(x). In New Q-Newton's method Backtracking, the choices are \tau =1+\alpha >1, and the e_i(x)'s are eigenvectors of \nabla ^2f(x). In this paper, we allow more flexibility and generality: for example, \tau can be chosen to be <1, or the e_i(x)'s need not be eigenvectors of \nabla ^2f(x). New Q-Newton's method Backtracking (as well as Backtracking gradient descent) is a special case, and some versions have flavours of quasi-Newton's methods. Several versions allow good theoretical guarantees. An application to solving systems of polynomial equations is given.
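The update rule described above can be sketched in NumPy. This is a hypothetical illustration, not the author's reference implementation: it uses the eigenvectors of A(x) as the orthonormal basis (which, for the original choice of basis, coincide with those of the Hessian, since A(x) differs from it by a multiple of the identity), a halving backtracking line search, and illustrative parameter values for the deltas, tau, and kappa.

```python
import numpy as np

def minsp(A):
    # minsp(A) = min over unit vectors e of ||A e||, i.e. the smallest singular value
    return np.linalg.svd(A, compute_uv=False)[-1]

def new_q_newton_backtracking_step(x, f, grad, hess, deltas, tau, kappa):
    """One update x -> x - gamma(x) w(x), assuming grad(x) != 0."""
    g = grad(x)
    gnorm = np.linalg.norm(g)
    # A(x) = Hessian + delta * ||grad||^tau * Id, with delta the first element
    # of the sequence making A(x) invertible enough: minsp(A(x)) >= kappa ||grad||^tau
    for d in deltas:
        A = hess(x) + d * gnorm**tau * np.eye(len(x))
        if minsp(A) >= kappa * gnorm**tau:
            break
    # Orthonormal basis: here the eigenvectors of the symmetric matrix A(x)
    _, eigvecs = np.linalg.eigh(A)
    # Step direction w(x) = sum_i <grad, e_i> / ||A e_i|| * e_i
    w = sum((g @ e) / np.linalg.norm(A @ e) * e for e in eigvecs.T)
    # Backtracking line search for Armijo's condition with constant 1/3
    gamma = 1.0
    while f(x - gamma * w) - f(x) > -(1.0 / 3.0) * gamma * (g @ w):
        gamma *= 0.5
    return x - gamma * w

# Illustrative usage on a convex quadratic, f(x, y) = (x - 1)^2 + 2 (y + 2)^2
f = lambda v: (v[0] - 1) ** 2 + 2 * (v[1] + 2) ** 2
grad = lambda v: np.array([2 * (v[0] - 1), 4 * (v[1] + 2)])
hess = lambda v: np.diag([2.0, 4.0])
x1 = new_q_newton_backtracking_step(
    np.zeros(2), f, grad, hess, deltas=[0.0, 1.0, -1.0], tau=1.0, kappa=0.1
)
```

On this quadratic the basis vectors are coordinate axes and delta = 0 already passes the minsp test, so the step reduces to a full Newton step and lands on the minimiser (1, -2) immediately; the sign flip built into dividing by ||A(x)e_i(x)|| (rather than the signed eigenvalue) only matters near saddle points.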