Global convergence properties of the two new dependent Fletcher–Reeves conjugate gradient methods

https://doi.org/10.1016/j.amc.2006.01.078

Abstract

In this paper, we propose two new dependent Fletcher–Reeves conjugate gradient methods arising from different choices of the scalar $\beta_k$. We derive two different kinds of upper-bound estimates for $|\beta_k|$ with respect to $\beta_k^{FR}$, based on Abel's theorem on divergent series with positive terms. Global convergence results are established for the two new methods under several different line searches; these results extend those for the previous dependent Fletcher–Reeves conjugate gradient methods.

Introduction

Consider the unconstrained optimization problem
$$\min_{x\in\mathbb{R}^n} f(x), \tag{1.1}$$
where $f:\mathbb{R}^n\to\mathbb{R}$ is continuously differentiable. Conjugate gradient methods for solving (1.1) are iterative methods of the form
$$x_{k+1} = x_k + \alpha_k d_k, \tag{1.2}$$
$$d_k = \begin{cases} -g_k, & k = 1,\\ -g_k + \beta_k d_{k-1}, & k \ge 2,\end{cases} \tag{1.3}$$
where $g_k = \nabla f(x_k)$, $\alpha_k$ is a step-length obtained by a one-dimensional line search and $\beta_k$ is a scalar. The best-known formulas for $\beta_k$ are the Fletcher–Reeves (FR) [18] formula
$$\beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2} \tag{1.4}$$
and the Polak–Ribière–Polyak (PRP) [19], [20] formula
$$\beta_k^{PRP} = \frac{g_k^T(g_k - g_{k-1})}{\|g_{k-1}\|^2}, \tag{1.5}$$
where $\|\cdot\|$ denotes the Euclidean norm. The conjugate gradient method is well suited to large-scale unconstrained optimization because its storage requirement is relatively small. Numerical results in [17] show that if $f$ is inexpensive to compute and its dimension $n$ is very large, the conjugate gradient method is still the best choice for solving (1.1). Convergence properties of the FR conjugate gradient method have been studied by many authors, including [1], [3], [4], [5], [6], [7], [8], [9], [10], [13], [15], [16], [18], [19], [20].
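For concreteness, the iteration (1.2)–(1.3) with the FR choice (1.4) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the `line_search` argument is an assumed placeholder for any of the strategies discussed below (strong Wolfe, generalized Wolfe, or Armijo).

```python
import numpy as np

def cg_fr(f, grad, x0, line_search, max_iter=1000, tol=1e-8):
    """Sketch of the FR conjugate gradient iteration (1.2)-(1.3):
    x_{k+1} = x_k + alpha_k d_k,  d_1 = -g_1,  d_k = -g_k + beta_k^FR d_{k-1}."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                    # d_1 = -g_1
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:          # a stationary point has been found
            break
        alpha = line_search(f, grad, x, d)    # assumed: returns a step-length alpha_k
        x_new = x + alpha * d                 # (1.2)
        g_new = grad(x_new)
        beta_fr = (g_new @ g_new) / (g @ g)   # (1.4): ||g_k||^2 / ||g_{k-1}||^2
        d = -g_new + beta_fr * d              # (1.3)
        x, g = x_new, g_new
    return x
```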

Many efforts have been devoted to investigating the global convergence properties of the FR conjugate gradient method. Powell [8] and Zoutendijk [13] proved that the FR conjugate gradient method with exact line searches is globally convergent on general functions. Al-Baali [1] extended this result to inexact line searches. However, the numerical performance of the FR conjugate gradient method is often much slower than that of the PRP conjugate gradient method. The PRP conjugate gradient method performs well numerically [9] but lacks such a strong convergence property [8]. For this reason, several dependent FR conjugate gradient methods have been proposed [5], [6], [7], [15], [16]. In these methods, a region for $\beta_k$ relative to $\beta_k^{FR}$ is introduced; if $\beta_k$ is chosen suitably within this region, the method not only possesses the global convergence property but also performs well numerically.

Gilbert and Nocedal [5] investigated global convergence properties of the dependent FR conjugate gradient method with $\beta_k$ satisfying $|\beta_k| \le \beta_k^{FR}$, provided that the line search satisfies the strong Wolfe conditions
$$f(x_k) - f(x_k + \alpha_k d_k) \ge -\delta\alpha_k g_k^T d_k, \tag{1.6}$$
$$|g(x_k + \alpha_k d_k)^T d_k| \le \sigma |g_k^T d_k|, \tag{1.7}$$
where $0 < \delta < \sigma < 1$. If $\beta_k$ satisfies $|\beta_k| \le c\,\beta_k^{FR}$ with $c > 1$, they gave an example showing that even an exact line search cannot guarantee the global convergence of the dependent FR conjugate gradient method.
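As an illustration (not part of the paper), the strong Wolfe pair (1.6)–(1.7) can be verified for a trial step-length as follows; `f` and `grad` are assumed callables.

```python
def satisfies_strong_wolfe(f, grad, x, d, alpha, delta, sigma):
    """Check the strong Wolfe conditions (1.6)-(1.7), 0 < delta < sigma < 1."""
    g_d = grad(x) @ d                                     # g_k^T d_k
    x_new = x + alpha * d
    decrease = f(x) - f(x_new) >= -delta * alpha * g_d    # (1.6)
    curvature = abs(grad(x_new) @ d) <= sigma * abs(g_d)  # (1.7)
    return decrease and curvature
```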

Hu and Storey [6] proved that the dependent FR conjugate gradient method is globally convergent with the strong Wolfe line search if $\beta_k$ satisfies
$$-\sigma \le \frac{\beta_k}{\beta_k^{FR}} \le \bar\sigma, \qquad \|g_k\|^4 \sum_{j=1}^{k}\prod_{i=j}^{k-1} l_i \le c\,k, \tag{1.8}$$
where $0 < \sigma < 1$, $0 < \bar\sigma < \frac{1}{2}$, $l_i = |\beta_i/\beta_i^{FR}|^2$ and $c > 0$. Liu et al. [7], [16] obtained the same result for $0 < \bar\sigma \le \frac{1}{2}$. It should be pointed out that global convergence of these earlier dependent FR conjugate gradient methods has been obtained only under the strong Wolfe line search.
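Condition (1.8) is cheap to monitor along the iteration, since the double sum $T_k = \sum_{j=1}^{k}\prod_{i=j}^{k-1} l_i$ obeys the recursion $T_k = l_{k-1}T_{k-1} + 1$ with $T_1 = 1$ (empty product). A hedged sketch, with `l` and `g_norms` assumed to hold the values $l_i$ and $\|g_i\|$:

```python
def hu_storey_bound_holds(l, g_norms, c):
    """Check ||g_k||^4 * T_k <= c*k for all k, where
    T_k = sum_{j=1}^{k} prod_{i=j}^{k-1} l_i, via T_k = l_{k-1}*T_{k-1} + 1."""
    T = 1.0                                   # T_1 = 1 (empty product)
    for k, gnorm in enumerate(g_norms, start=1):
        if k > 1:
            T = l[k - 2] * T + 1.0            # l_{k-1} is l[k-2] in 0-based indexing
        if gnorm ** 4 * T > c * k:            # condition (1.8) violated at index k
            return False
    return True
```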

Dai and Yuan [4] investigated global convergence properties of the FR conjugate gradient method with the generalized Wolfe conditions, i.e. (1.6) and
$$\sigma_1 g_k^T d_k \le g(x_k + \alpha_k d_k)^T d_k \le -\sigma_2 g_k^T d_k, \tag{1.9}$$
where $0 < \delta < \sigma_1 < 1$ and $\sigma_2 > 0$. They also proved that $\sigma_1 + \sigma_2 \le 1$ is a sufficient condition for ensuring $g_k^T d_k < 0$ for all $k$. It is easy to see that (1.7) can be viewed as a special case of (1.9) with $\sigma_1 = \sigma_2$.
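A sketch of the generalized curvature test (1.9), again for illustration only; setting `sigma1 == sigma2` recovers the strong Wolfe check above.

```python
def satisfies_generalized_wolfe(grad, x, d, alpha, sigma1, sigma2):
    """Check (1.9): sigma1*g_k^T d_k <= g(x_k+alpha_k d_k)^T d_k <= -sigma2*g_k^T d_k."""
    g_d = grad(x) @ d                         # g_k^T d_k, negative for a descent direction
    new_g_d = grad(x + alpha * d) @ d
    return sigma1 * g_d <= new_g_d <= -sigma2 * g_d
```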

Dai [3] also investigated global convergence properties of the FR conjugate gradient method with the Armijo line search, which determines the smallest integer $m \ge 0$ such that, setting $\alpha_k = \lambda^m$,
$$f(x_k + \alpha_k d_k) - f(x_k) \le \delta\alpha_k g_k^T d_k, \tag{1.10}$$
where $\lambda \in (0, 1)$ and $\delta \in (0, 1)$.
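The Armijo rule (1.10) amounts to a simple backtracking loop. A minimal sketch, with a `max_backtracks` safeguard added for practicality (the paper itself imposes no such cap):

```python
def armijo_step(f, grad, x, d, lam=0.5, delta=1e-4, max_backtracks=50):
    """Smallest integer m >= 0 with alpha = lam**m satisfying (1.10):
    f(x + alpha*d) - f(x) <= delta * alpha * grad(x)^T d."""
    fx = f(x)
    g_d = grad(x) @ d
    alpha = 1.0                               # m = 0
    for _ in range(max_backtracks):
        if f(x + alpha * d) - fx <= delta * alpha * g_d:   # (1.10) holds
            return alpha
        alpha *= lam                          # increase m by one
    return alpha                              # fallback after max_backtracks
```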

Wang and Zhang [15] proposed an $s$-dependent GFR conjugate gradient method and derived two different kinds of upper-bound estimates for $\beta_k$ with respect to $\beta_k^{FR}$:
$$\beta_k^2 \le \hat\sigma_k(s)\,(\beta_k^{FR})^2, \tag{1.11}$$
where $\hat\sigma_2(s) = 1$,
$$\hat\sigma_k(s) = \sum_{j\in\Omega_1} \frac{\Gamma_{k-1}^{(j)}(s)}{\Gamma_{k-2}^{(j)}(s)}, \quad s \in \{1, 2\},\ k \ge 3, \tag{1.12}$$
$\Omega_1$ is a partition of the set $\{1, 2, \ldots, t\}$, $t \ge 1$ is an integer, and $\Gamma_k^{(j)}(s)$ is defined as follows:
$$\gamma_i = \frac{|g_i^T d_i|}{\|g_i\|^2}, \qquad \Gamma_k^{(0)}(s) = \sum_{i=1}^{k}\gamma_i^s, \qquad \Gamma_k^{(t)}(s) = \sum_{i=1}^{k}\gamma_i^s \prod_{j=0}^{t-1}\Gamma_i^{(j)}(s). \tag{1.13}$$
They proved global convergence of the $s$-dependent GFR conjugate gradient method under several different line searches.
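The quantities $\Gamma_k^{(j)}(s)$ in (1.13) can be built up level by level. A sketch under the reconstructed definitions above, using the recursion $\Gamma_k^{(j)} = \Gamma_{k-1}^{(j)} + \gamma_k^s\prod_{m=0}^{j-1}\Gamma_k^{(m)}$:

```python
import numpy as np

def gamma_table(gammas, t, s):
    """G[j, k] ~ Gamma_k^{(j)}(s) for j = 0..t, k = 1..K (column 0 unused),
    where gammas[k-1] = |g_k^T d_k| / ||g_k||^2 as in (1.13)."""
    K = len(gammas)
    G = np.zeros((t + 1, K + 1))
    for k in range(1, K + 1):
        prod = 1.0                            # empty product for j = 0
        for j in range(t + 1):
            G[j, k] = G[j, k - 1] + gammas[k - 1] ** s * prod
            prod *= G[j, k]                   # now equals prod_{m <= j} Gamma_k^{(m)}(s)
    return G
```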

In this paper, we place a new restriction on $\beta_k$ that is wider than the previous ones, namely $\beta_k$ satisfies
$$\sum_{j=1}^{k}\ \prod_{i=j+1}^{k}\left(\frac{\beta_i}{\beta_i^{FR}}\right)^{2} \gamma_j^s \le c \sum_{j=0}^{t}\Gamma_k^{(j)}(s), \tag{1.14}$$
where $c > 0$ is a constant, $t \ge 1$ is an integer and $s \in \{1, 2\}$. Further, we adopt the convention $\prod_{i\in\Omega}\left(\beta_i/\beta_i^{FR}\right)^{2} = 1$ when $\Omega = \emptyset$.
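Both sides of (1.14) can be tracked with O(1) work per iteration: writing $r_k = \beta_k/\beta_k^{FR}$, the left side obeys $A_k = r_k^2 A_{k-1} + \gamma_k^s$ (so $A_1 = \gamma_1^s$ by the empty-product convention), while the right side is $c\sum_{j=0}^{t}\Gamma_k^{(j)}(s)$. A sketch reusing `gamma_table` from the previous block:

```python
def restriction_1_14_holds(ratios, gammas, c, t, s):
    """Check (1.14) for all k. ratios[k-1] = beta_k / beta_k^FR (the k = 1
    entry is never used, since d_1 = -g_1); gammas as in (1.13)."""
    G = gamma_table(gammas, t, s)             # from the previous sketch
    A = 0.0                                   # left side of (1.14); A_0 = 0
    for k in range(1, len(gammas) + 1):
        r = ratios[k - 1] if k > 1 else 0.0   # r_1 only multiplies A_0 = 0
        A = r * r * A + gammas[k - 1] ** s    # A_k = r_k^2 * A_{k-1} + gamma_k^s
        if A > c * G[:, k].sum():             # compare with c * sum_j Gamma_k^{(j)}(s)
            return False
    return True
```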

Generally, neither of the two regions of $\beta_k$ defined by (1.14) (corresponding to $s = 1$ and $s = 2$) contains the other.

In Section 2, we investigate global convergence properties of the two new dependent FR conjugate gradient methods with the generalized Wolfe line search (1.6), (1.9), the Armijo line search (1.10) and the Wolfe line search, respectively. In Section 3, we show that several of the global convergence results in [3], [4], [5], [6], [7], [15], [16] can be viewed as special cases of the results in Section 2.


Main results

In this section, we always assume that $\|g_k\| \ne 0$ for all $k$, for otherwise a stationary point has already been found. We make the following assumption on the objective function.

Assumption 2.1

  • (A)

    $f$ is bounded below on the level set $L = \{x \in \mathbb{R}^n : f(x) \le f(x_1)\}$.

  • (B)

    The level set $L = \{x \in \mathbb{R}^n : f(x) \le f(x_1)\}$ is bounded.

  • (C)

    In some neighborhood $N$ of $L$, $f$ is differentiable and its gradient $g$ is Lipschitz continuous, namely, there exists a constant $L > 0$ such that

$$\|g(x) - g(y)\| \le L\|x - y\| \quad \text{for all } x, y \in N.$$

We state a general convergence result as follows.

Some remarks

From the results in Section 2, we can deduce some known results. First, letting $c = 1$ in (1.14), we see that $\beta_k$ satisfies (1.14) whenever it satisfies $|\beta_k| \le \beta_k^{FR}$. Thus Theorem 2.3 in [4] and Theorem 3.2 in [5] are special cases of Theorem 2.1, and Theorem 1 in [3] is a corollary of Theorem 2.3.

Second, we assume that $\beta_k$ satisfies
$$-\sigma \le \frac{\beta_k}{\beta_k^{FR}} \le \bar\sigma, \tag{3.1}$$
$$\sum_{j=1}^{k}\prod_{i=j}^{k-1} l_i \le c\,k, \tag{3.2}$$
where $0 < \sigma < 1$, $0 < \bar\sigma \le \frac{1}{2}$, $l_i = \left(\beta_i/\beta_i^{FR}\right)^2$ and $c > 0$. Under assumption (B), Eqs. (3.1), (3.2) can be used instead of (1.8). With the strong Wolfe line search …

Numerical results

In this section, we report numerical results for the two dependent FR conjugate gradient methods. For comparison purposes, we also coded the FR conjugate gradient method. Eight test problems of the form $f(x) = \sum_{i=1}^{m} f_i(x)^2$ are selected from Ref. [21]. Numerical results are given in Tables 1 and 2 (where $m$ is the number of functions $f_i$ and $n$ is the dimension of $x$). In Tables 1 and 2, “Iter” denotes the number of iterations, “NF” the number of function evaluations, and “NG” the number of gradient evaluations.

References (21)

  • B.T. Polyak

    The conjugate gradient method in extremal problems

    USSR Comput. Math. Math. Phys.

    (1969)
  • M. Al-Baali

    Descent property and global convergence of the Fletcher–Reeves method with inexact line search

    IMA J. Numer. Anal.

    (1985)
  • R.H. Byrd et al.

    A tool for the analysis of quasi-Newton methods with application to unconstrained minimization

    SIAM J. Numer. Anal.

    (1989)
  • Y.H. Dai

    Further insight into the convergence of the Fletcher–Reeves method

    Sci. China

    (1999)
  • Y.H. Dai et al.

    Convergence properties of the Fletcher–Reeves method

    IMA J. Numer. Anal.

    (1996)
  • J.C. Gilbert et al.

    Global convergence properties of conjugate gradient methods for optimization

    SIAM J. Optim.

    (1992)
  • Y.F. Hu et al.

    Global convergence result for conjugate gradient methods

    J. Optim. Theory Appl.

    (1991)
  • G. Liu et al.

    Global convergence of the Fletcher–Reeves algorithm with inexact line search

    Appl. Math. J. Chinese Univ. Ser. B

    (1995)
  • M.J.D. Powell

    Nonconvex minimization calculations and the conjugate gradient method

    (1984)
  • M.J.D. Powell

    Restart procedures for the conjugate gradient method

    Math. Program.

    (1977)


This work is supported by the N.S.F. of China (10571106) and the Foundation of Qufu Normal University.
