Skip to main content
Log in

A stabilized GMRES method for singular and severely ill-conditioned systems of linear equations

  • Original Paper
  • Published:
Japan Journal of Industrial and Applied Mathematics Aims and scope Submit manuscript

Abstract

Consider using the right-preconditioned GMRES (AB-GMRES) for obtaining the minimum-norm solution of inconsistent underdetermined systems of linear equations. Morikuni (Ph.D. thesis, 2013) showed that for some inconsistent and ill-conditioned problems, the iterates may diverge. This is mainly because the Hessenberg matrix in the GMRES method becomes very ill-conditioned so that the backward substitution of the resulting triangular system becomes numerically unstable. We propose a stabilized GMRES based on solving the normal equations corresponding to the above triangular system using the standard Cholesky decomposition. This has the effect of shifting upwards the tiny singular values of the Hessenberg matrix which lead to an inaccurate solution. We analyze why the method works. Numerical experiments show that the proposed method is robust and efficient, not only for applying AB-GMRES to underdetermined systems, but also for applying GMRES to severely ill-conditioned range-symmetric systems of linear equations.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13

Similar content being viewed by others

References

  1. ADVANPIX LLC.: Multiprecision Computing Toolbox for MATLAB. https://www.advanpix.com/. Version 4.4.5.12711

  2. Björck, Å.: Numerical Methods for Least Squares Problems. SIAM. Philadelphia, PA (1996)

    Book  Google Scholar 

  3. Brezinski, C., Rodriguez, G., Seatzu, S.: Error estimates for the regularization of least squares problems. Numer. Algorithms 51(1), 61–76 (2009)

    Article  MathSciNet  Google Scholar 

  4. Brown, P., Walker, H.: GMRES on (nearly) singular systems. SIAM J. Matrix Anal. Appl. 18(1), 37–51 (1997)

    Article  MathSciNet  Google Scholar 

  5. Calvetti, D., Lewis, B., Reichel, L.: GMRES-type methods for inconsistent systems. Linear Algebra Appl. 316(1–3), 157–169 (2000)

    Article  MathSciNet  Google Scholar 

  6. Davis, T., Hu, Y.: The University of Florida sparse matrix collection. ACM Trans. Math. Software 38(1), 1–25 (2011)

    MathSciNet  MATH  Google Scholar 

  7. Fong, D.C.L., Saunders, M.: LSMR: An iterative algorithm for sparse least-squares problems. SIAM J. Sci. Comput. 33(5), 2950–2971 (2011)

    Article  MathSciNet  Google Scholar 

  8. Foster, L.: San Jose State University Singular Matrix Database. http://www.math.sjsu.edu/singular/matrices/

  9. Hansen, P.: Discrete Inverse Problems: Insight and Algorithms. SIAM. Philadelphia, PA (2010)

    Book  Google Scholar 

  10. Hayami, K., Sugihara, M.: A geometric view of Krylov subspace methods on singular systems. Numer. Linear Algebra Appl. 18(3), 449–469 (2011)

    Article  MathSciNet  Google Scholar 

  11. Hayami, K., Yin, J., Ito, T.: GMRES methods for least squares problems. SIAM J. Matrix Anal. Appl. 31(5), 2400–2430 (2010)

    Article  MathSciNet  Google Scholar 

  12. Hestenes, M., Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Research Nat. Bur. Standards 49(6), 409–436 (1952)

    Article  MathSciNet  Google Scholar 

  13. Higham, N.J.: The Test Matrix Toolbox for MATLAB (version 3.0). University of Manchester, Manchester (1995)

  14. Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM. Philadelphia, PA (2002)

    Book  Google Scholar 

  15. Horn, R.A., Johnson, C.R.: Matrix Analysis, 2nd edn. Cambridge University Press, New York, NY (2012)

    Book  Google Scholar 

  16. Iri, M.: General Theory of Linear Algebra. Asakura, (in Japanese) (2009)

  17. Meza, J.C., Symes, W.W.: Deflated Krylov subspace methods for nearly singular linear systems. J. Optim. Theory Appl. 72(3), 441–457 (1992)

    Article  MathSciNet  Google Scholar 

  18. Morikuni, K.: Inner-iteration Preconditioning for Least Squares Problems. Doctoral Thesis, Department of Informatics, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies (2013)

  19. Morikuni, K., Hayami, K.: Convergence of inner-iteration GMRES methods for rank-deficient least squares problems. SIAM J. Matrix Anal. Appl. 36(1), 225–250 (2015)

    Article  MathSciNet  Google Scholar 

  20. Morikuni, K., Rozložník, M.: On GMRES for singular EP and GP systems. SIAM J. Matrix Anal. Appl. 39(2), 1033–1048 (2018)

    Article  MathSciNet  Google Scholar 

  21. Neuman, A., Reichel, L., Sadok, H.: Algorithms for range restricted iterative methods for linear discrete ill-posed problems. Numer. Algorithms 59(2), 325–331 (2012)

    Article  MathSciNet  Google Scholar 

  22. Neuman, A., Reichel, L., Sadok, H.: Implementations of range restricted iterative methods for linear discrete ill-posed problems. Linear Algebra Appl. 436(10), 3974–3990 (2012)

    Article  MathSciNet  Google Scholar 

  23. Paige, C., Rozložník, M., Strakoš, Z.: Modified Gram-Schmidt (mgs), least squares, and backward stability of MGS-GMRES. SIAM J. Matrix Anal. Appl. 28(1), 264–284 (2006)

    Article  MathSciNet  Google Scholar 

  24. Paige, C.C., Saunders, M.A.: Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal. 12(4), 617–629 (1975)

    Article  MathSciNet  Google Scholar 

  25. Paige, C.C., Saunders, M.A.: LSQR: An algorithm for sparse linear equations and sparse least squares. ACM Trans. Math. Software 8(1), 43–71 (1982)

    Article  MathSciNet  Google Scholar 

  26. Reichel, L., Ye, Q.: Breakdown-free GMRES for singular systems. SIAM J. Matrix Anal. Appl. 26(4), 1001–1021 (2005)

    Article  MathSciNet  Google Scholar 

  27. Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. SIAM. Philadelphia, PA (2003)

    Book  Google Scholar 

  28. Saad, Y., Schultz, M.H.: GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 7(3), 856–869 (1986)

    Article  MathSciNet  Google Scholar 

  29. Sugihara, K., Hayami, K., Liao, Z.: GMRES using pseudo-inverse for range symmetric singular systems. (in revision)

  30. Tebbens, J.D., Tůma, M.: On incremental condition estimators in the 2-norm. SIAM J. Anal. Appl. 35(1), 174–197 (2014)

    Article  MathSciNet  Google Scholar 

  31. Yamamoto, Y., Nakatsukasa, Y., Yanagisawa, Y., Fukaya, T.: Roundoff error analysis of the CholeskyQR2 algorithm. Electron. Trans. Numer. Anal. 44(01), 306–326 (2015)

    MathSciNet  MATH  Google Scholar 

Download references

Acknowledgements

We would like to thank Dr. Hiroshi Murakami and Professor Lothar Reichel for valuable comments. Ken Hayami was supported by JSPS KAKENHI Grant Number JP15K04768. Keiichi Morikuni was supported by JSPS KAKENHI Grant Number JP16K17639 and JP20K14356 and Hattori Hokokai Foundation. Jun-Feng Yin was supported by the National Natural Science Foundation of China (No. 11971354).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Zeyu Liao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1 Proof of statement in section 2.3

Lemma 11

Assume \(\mathrm {\mathcal {N}}(\hat{A})\) \(\cap\) \(\mathrm {\mathcal {R}} (\hat{{A}})=\{0\}\), and \(\mathrm {grade} (\hat{A},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=k\). Then, \(\mathcal {K}_{k+1}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})\) = \(\hat{A}\mathcal {K}_{k}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})\) holds.

Proof

Note that

$$\begin{aligned} \hat{{A}}\mathcal {K}_{k}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})&=\mathrm {span}\{\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}, \hat{{A}}^\mathrm{2}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}, \cdots ,\hat{{A}}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\}\\&\subseteqq \mathrm {span}\{b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\cdots ,\hat{A}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\}=\mathcal {K}_{k+1}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}}). \end{aligned}$$

grade\((\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=k\) implies that

$$\begin{aligned} \mathcal {K}_{k+1}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=\mathcal {K}_{k}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})= \mathrm {span}\{b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\cdots ,\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\}. \end{aligned}$$

Hence,

$$\begin{aligned} \hat{{A}}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}=c_0b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+c_1\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\quad c_i\in \mathbb {R}, i=0, 1, 2, \cdots , k-1. \end{aligned}$$

If \(c_0=0,\)

$$\begin{aligned} \hat{{A}}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}=c_1\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+c_2\hat{{A}}^2b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}. \end{aligned}$$

Hence,

$$\begin{aligned}&c_1\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+c_2\hat{{A}}^2b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}-\hat{{A}}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\\&\quad =\hat{{A}}(c_1b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-2} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}-\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=0. \end{aligned}$$

Hence,

$$\begin{aligned} c_1b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+c_2\hat{{A}}^2b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-2} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}-\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\in \mathrm {\mathcal {N}}(\hat{{A}})\cap \mathrm {\mathcal {R}}(\hat{{A}}) = \{0\}. \end{aligned}$$

which implies

$$\begin{aligned} \hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}=c_1b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+c_2\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-2} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}. \end{aligned}$$

Thus,

$$\begin{aligned} \mathcal {K}_{k}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=\mathcal {K}_{k-1}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}}), \end{aligned}$$

which contradicts with grade\((\hat{A},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=k\). Hence, \(c_0\ne 0\), and

$$\begin{aligned} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}=d_1\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+d_2\hat{{A}}^2b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +d_{k-1}\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+d_k\hat{{A}}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}. \end{aligned}$$

Hence,

$$\begin{aligned}&\mathcal {K}_{k+1}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=\mathrm {span}\{b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\hat{A}b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\cdots ,\hat{A}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\}\\&\quad \subseteqq \mathrm {span}\{\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}, \hat{{A}}^\mathrm{2 }b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}, \cdots ,\hat{A}^k b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}\}=\hat{A}\mathcal {K}_{k}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}). \end{aligned}$$

Thus,

$$\begin{aligned} \mathcal {K}_{k+1}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})=\hat{A}\mathcal {K}_{k}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}). \end{aligned}$$

Corollary 12

Assume \(\mathrm {\mathcal {N}}(\hat{A})= \mathrm {\mathcal {N}}(\hat{ {A}}^{\mathsf {T}})\), and \(\mathrm {grade}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})=k.\) Then, \(\mathcal {K}_{k+1}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})=\hat{A}\mathcal {K}_{k}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})\) holds.

Proof

\(\mathrm {\mathcal {N}}(\hat{A}) = \mathrm {\mathcal {N}}(\hat{ {A}}^{\mathsf {T}})\) implies that

$$\begin{aligned} \mathrm {\mathcal {N}}(\hat{ {A}})\cap \mathrm {\mathcal {R}}(\hat{ {A}})=\mathrm {\mathcal {N}}(\hat{ {A}}^{\mathsf {T}})\cap \mathrm {\mathcal {R}}(\hat{ {A}})=\mathrm {\mathcal {R}}(\hat{ {A}})^{\bot }\cap \mathrm {\mathcal {R}}(\hat{ {A}})=\{0\}. \end{aligned}$$

Hence, from Lemma 11, Corollary 12 holds.

Appendix 2 Proof of statement in section 2.3

Lemma 13

Assume \(\mathrm {\mathcal {N}}(\hat{A})\cap \mathrm {\mathcal {R}}(\hat{ {A}})=\{0\}\), \(\mathrm {grade}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})=k\), and \(b\notin \mathcal {R}(\hat{A})\). Then, \(\mathrm {dim}(\mathcal {K}_{k+1}(\hat{A},b))=k+1\) holds.

Proof

Let \(c_0, c_1, \cdots , c_k\in \mathbb {R}\) satisfy

$$\begin{aligned} c_0b+c_1\hat{ {A}}b+\cdots +c_{k}\hat{ {A}}^{k} b=0. \end{aligned}$$

Since \(\mathrm {\mathcal {N}}(\hat{A})\) \(\cap\) \(\mathrm {\mathcal {R}}(\hat{ {A}})=\{0\}\),

$$\begin{aligned} b=b|_\mathrm{{\mathcal {R}}(\hat{ {A}})}\oplus b|_\mathrm{{\mathcal {N}}(\hat{ {A}})}, \end{aligned}$$

where \(b|_\mathrm{{\mathcal {N}}(\hat{ {A}})}\) denotes the orthogonal projection of b onto \(\mathcal {N}(\hat{ {A}}).\) Hence,

$$\begin{aligned} c_0b|_\mathrm{{\mathcal {N}(\hat{ {A}})}}+c_0b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}+c_1\hat{ {A}}b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}+\cdots +c_{k}\hat{ {A}}^{k} b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}=0. \end{aligned}$$

If \(c_0\ne 0\)

$$\begin{aligned} b|_\mathrm{{\mathcal {N}(\hat{ {A}})}}=-b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}-\frac{c_1}{c_0}\hat{A}b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}-\cdots -\frac{c_{k}}{c_0}\hat{A}^{k} b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}\in \mathrm {\mathcal {R}(\hat{ {A}})}. \end{aligned}$$

Hence,

$$\begin{aligned} b|_\mathrm{{\mathcal {N}(\hat{ {A}})}}\in \mathrm {\mathcal {N}}(\hat{ {A}})\cap \mathrm {\mathcal {R}}(\hat{ {A}})=\{0\}. \end{aligned}$$

Thus, \(b|_\mathrm{{\mathcal {N}(\hat{ {A}})}}=0,\) which contradicts \(b\notin \mathcal {R}(\hat{ {A}})\). Hence, we have \(c_0= 0\), and

$$\begin{aligned} c_1\hat{A}b+c_2\hat{A}^2b+\cdots +c_{k}\hat{A}^{k} b=c_1\hat{A}b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}+c_2\hat{A}^2b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}+\cdots +c_{k}\hat{A}^{k} b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}=0. \end{aligned}$$

But, since

$$\begin{aligned}&{\mathrm{dim}}( {\mathrm{span}}\{\hat{{A}}b|_{{\mathcal {R}(\hat{{A}})}}, \hat{A}^{2}b|_{{\mathcal {R}(\hat{{A}})}}, \cdots ,\hat{A}^k b|_{{\mathcal {R}(\hat{{A}})}}\})\\&\quad ={\mathrm{dim}}( \hat{{A}}\, \mathrm{span}\{b|_{{\mathcal {R}(\hat{{A}})}},\hat{A}b|_{{\mathcal {R}(\hat{{A}})}}\cdots ,\hat{A}^{k-1} b|_{{\mathcal {R}(\hat{{A}})}}\})={\mathrm{dim}}( \hat{{A}}\mathcal {K}_{k}(\hat{A},b|_{{\mathcal {R}(\hat{{A}})}}))=k \end{aligned}$$

holds from Lemma 11, we have \(c_1=c_2=\cdots =c_k=0,\) which implies \(\mathrm dim( \mathcal {K}_{k+1}(\hat{ {A}},b))=k+1\).

Corollary 14

Assume \(\mathrm {\mathcal {N}}(\hat{ {A}})=\) \(\mathrm {\mathcal {N}}(\hat{A}^{\mathsf {T}})\), \(\mathrm {grade}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})=k\), and \(b\notin \mathcal {R}(\hat{A})\).

Then, \(\mathrm {dim}(\mathcal {K}_{k+1}(\hat{A},b))=k+1\) holds.

Proof

\(\mathrm {\mathcal {N}}(\hat{ {A}})=\) \(\mathrm {\mathcal {N}}(\hat{A}^{\mathsf {T}})\) implies \(\mathrm {\mathcal {N}}(\hat{A})\) \(\cap\) \(\mathrm {\mathcal {R}}(\hat{{A}})=\{0\}\). Hence, the corollary follows from Lemma 13.

Appendix 3 Proof of Theorem 9 in section 4.6

Theorom 9 Let \(R_k=(r_{pq})\in \mathbb {R}^{k\times k}\) be an upper-triangular matrix and

$$\begin{aligned} R_{k+\mathrm {1}}=\left( \begin{array}{cc} R_k &{} d \\ 0^{\mathsf {T}} &{} r_{k+\mathrm {1} ,k+\mathrm {1} } \\ \end{array} \right) \in \mathbb {R}^{(k+\mathrm {1} )\times (k+\mathrm {1} )}.\end{aligned}$$
(54)

Assume \(R_k\) is numerically nonsingular, and \(R_k=\mathcal {O}(\mathrm {1})\), \(R_k^{-\mathrm {1} }=\mathcal {O}(\mathrm {1} )\), \(M(R_k)^{-\mathrm {1} }=\mathcal {O}(\mathrm {1} )\), \(d=\mathbb {O}(\mathrm {1} )\) and \(O(k)=O(k^2)=O(\mathrm {1}).\) Then, the following holds:

$$\begin{aligned} {\mathrm{f\!l}} (R_{k+{1}}^{\mathsf {T}}R_{k+{1}})~is~numerically~nonsingular \quad \Longleftrightarrow \quad {\mathrm{f\!l }} (r^2_{k+{1},k+{1}})> {\mathrm{f\!l}} (d^{\mathsf {T}}d)O(u). \end{aligned}$$

Proof

Note that

$$\begin{aligned} R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }=\left( \begin{array}{cc} R_k &{} 0 \\ d^{\mathsf {T}} &{} r_{k+\mathrm {1} ,k+\mathrm {1} } \\ \end{array} \right) \left( \begin{array}{cc} R_k &{} d \\ 0^{\mathsf {T}} &{} r_{k+\mathrm {1} ,k+\mathrm {1} } \\ \end{array} \right) =\left( \begin{array}{cc} R_k^{\mathsf {T}}R_k &{} R_k^{\mathsf {T}}d \\ d^{\mathsf {T}}R_k &{} d^{\mathsf {T}}d+r_{k+\mathrm {1} ,k+\mathrm {1} }^2 \\ \end{array} \right) . \end{aligned}$$

Proof of (\(\Rightarrow\))

Assume fl\((r^2_{k+\mathrm {1} ,k+\mathrm {1} })\) \(\le\) fl\((d^{\mathsf {T}}d)O(u).\) Then, since

$$\begin{aligned} {\mathrm{f\!l}} (d^{\mathsf {T}}d)&=d^{\mathsf {T}}d+O(ku)d^{\mathsf {T}}d=({1}+O(ku))d^{\mathsf {T}}d,\\ {\mathrm{f\!l}} (d^{\mathsf {T}}d+r^{2}_{k+{1},k+{1}})&=(d^{\mathsf {T}}d+r^2_{k+{1},k+{1}})({1}+O(ku))=d^{\mathsf {T}}d({1}+O(ku)),\\ R_k&=\mathcal {O}({1}),\quad {{ \mathrm and}}\quad d=\mathbb {O}({1}), \end{aligned}$$

we have

$$\begin{aligned} \mathrm f\!l (R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} })&=\left( \begin{array}{cc} R_k^{\mathsf {T}}R_k +O(ku)|R_k|^{\mathsf {T}}|R_k|&{} R_k^{\mathsf {T}}d+O(ku)|R_k|^{\mathsf {T}}|d| \\ d^{\mathsf {T}}R_k+O(ku)|d|^{\mathsf {T}}|R_k| &{} d^{\mathsf {T}}d+O(ku)d^{\mathsf {T}}d\\ \end{array} \right) \nonumber \\&=\left( \begin{array}{c}R_k^{\mathsf {T}}\\ d^{\mathsf {T}}\end{array}\right) \left( \begin{array}{cc}R_k&d\end{array}\right) +\mathcal {O}(ku). \end{aligned}$$
(55)

Note

$$\begin{aligned}\left( \begin{array}{cc}R_k&d\end{array}\right) \left( \begin{array}{c}-R_k^{-\mathrm {1} }d\\ \mathrm {1} \end{array}\right) =-R_k R_k^{-\mathrm {1} }d+d=0, \end{aligned}$$

since \(R_k\) is nonsingular.

Hence,

$$\begin{aligned} {\mathrm{f\!l}}\left(\left( \begin{array}{cc}R_k&d\end{array}\right) \left( \begin{array}{c}-R_k^{-{1}}d\\ {1}\end{array}\right) \right)={\mathrm{f\!l}} \{R_k {\mathrm{f\!l}}(-R_k^{-{1}}d)+d\} = [{\mathrm{f\!l}} \{R_k{\mathrm{f\!l}}(-R_k^{-{1}}d)\}+d]\{{1}+O(u)\}. \end{aligned}$$

Note here that

$$\begin{aligned} \mathrm f\!l \{R_k\mathrm f\!l (-R_k^{-\mathrm {1} }d)\}=R_k\mathrm f\!l (-R_k^{-\mathrm {1} }d)+O(ku)|R_k||R_k^{-\mathrm {1} }d|, \end{aligned}$$

and

$$\begin{aligned} \mathrm f\!l (-R_k^{-\mathrm {1} }d)= -R_k^{-\mathrm {1} }d+O(k^\mathrm{2 }u)M(R_k)^{-\mathrm {1} }|d| \end{aligned}$$
(56)

from Theorem 8. Hence,

$$\begin{aligned} \mathrm f\!l (\left( \begin{array}{cc}R_k&d\end{array}\right) \left( \begin{array}{c}-R_k^{-\mathrm {1} }d\\ \mathrm {1} \end{array}\right) ) = O(k^\mathrm{2 }u)R_sM(R_k)^{-\mathrm {1} }|d|+O(ku)|R_k||R_k^{-\mathrm {1} }d| =\mathbb {O}(k^\mathrm{2 }u), \end{aligned}$$

since \(R_k^{-\mathrm {1} }=\mathcal {O}(\mathrm {1} )\) and \(M(R_k)^{-\mathrm {1} }=\mathcal {O}(\mathrm {1} ).\)

Then,

$$\begin{aligned} &\mathrm f\!l (R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }\left( \begin{array}{c}-R_k^{-\mathrm {1} }d\\ \mathrm {1} \end{array}\right) )\\ = \mathrm f\!l \left(\{\left( \begin{array}{c}R_k^{\mathsf {T}}\\ d^{\mathsf {T}}\end{array}\right) \left( \begin{array}{cc}R_k&d\end{array}\right) +\mathcal {O}(ku)\}\left( \begin{array}{c}-R_k^{-\mathrm {1} }d+\mathcal {O}(k^2u)M(R_k)^{-\mathrm {1} }|d|\\ \mathrm {1} \end{array}\right) \right)=\mathbb {O}(k^\mathrm{2 }u)=\mathbb {O}(u), \end{aligned}$$

since (55), (56), and \(O(k^2)=O(\mathrm {1} ).\) Since \(\left( \begin{array}{c}-R_k^{-\mathrm {1} }d\\ \mathrm {1} \end{array}\right) =\mathbb {O}(\mathrm {1} ),\) \(R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }\) is numerically singular. By contraposition, (\(\Rightarrow\)) holds.

Proof of (\(\Leftarrow\))

Assume \(R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }\) is not numerically nonsingular. Then, there exists a vector \(\left( \begin{array}{c} z \\ w \\ \end{array} \right) \in \mathbb {R}^{k+\mathrm {1} }\) such that \(\left| \left( \begin{array}{c} z \\ w \\ \end{array} \right) \right| >\mathbb {O}(u),\) and

$$\begin{aligned} \mathrm f\!l \{R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }\left( \begin{array}{c} z \\ w \\ \end{array} \right) \}= R_{k+\mathrm {1} }^{\mathsf {T}}\left( R_{k+\mathrm {1} }\left( \begin{array}{c} z \\ w \\ \end{array} \right) +|R_{k+\mathrm {1} }|\left| \left( \begin{array}{c} z \\ w \\ \end{array} \right) \right| O((k+\mathrm {1} )u)\right) +\\ \left| R_{k+\mathrm {1} }^{\mathsf {T}}\right| \left| R_{k+\mathrm {1} }\left( \begin{array}{c} z \\ w \\ \end{array} \right) +|R_{k+\mathrm {1} }|\left| \left( \begin{array}{c} z \\ w \\ \end{array} \right) \right| O((k+\mathrm {1} )u)\right| O((k+\mathrm {1} )u)=\mathbb {O}(u) \end{aligned}$$

assuming \(O(k+\mathrm {1} )=O(\mathrm {1} ).\)

Hence,

$$\begin{aligned} \mathrm f\!l \{R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }\left( \begin{array}{c} z \\ w \\ \end{array} \right) \}=\left( \begin{array}{cc} R_k^{\mathsf {T}}R_k &{} R_k^{\mathsf {T}}d \\ d^{\mathsf {T}}R_k &{} d^{\mathsf {T}}d+r_{k+\mathrm {1} ,k+\mathrm {1} }^2 \\ \end{array} \right) \left( \begin{array}{c} z \\ w \\ \end{array} \right) +\mathbb {O}(u)=\mathbb {O}(u).\end{aligned}$$

Thus,

$$\begin{aligned}&R_k^{\mathsf {T}}R_sz+wR_k^{\mathsf {T}}d=\mathbb {O}(u), \end{aligned}$$
(57)
$$\begin{aligned}&d^{\mathsf {T}}R_sz+(d^{\mathsf {T}}d+r_{k+\mathrm {1} ,k+\mathrm {1} }^2)w=\mathbb {O}(u). \end{aligned}$$
(58)

(57) can be expressed as \(R_k^{\mathsf {T}}(R_sz+wd)=\mathbb {O}(u).\) From Lemma 15, \(R_k^\mathsf {T}\) is numerically nonsingular, so that

$$\begin{aligned} R_sz+wd=\mathbb {O}(u). \end{aligned}$$
(59)

Hence, from (58), \(d^{\mathsf {T}}R_sz+w(d^{\mathsf {T}}d+r_{k+\mathrm {1} ,k+\mathrm {1} }^2)=d^{\mathsf {T}}(R_sz+wd)+wr_{k+\mathrm {1} ,k+\mathrm {1} }^2=O(u).\) Thus, \(wr_{k+\mathrm {1} ,k+\mathrm {1} }^2=O(u)\). If \(w=O(u),\) \(R_sz=\mathbb {O}(u)\) from (59). Since \(R_k\) is numerically nonsingular, \(z=\mathbb {O}(u),\) which contradicts with the assumption.

Hence, \(|w|>O(u),\) so that \(r_{k+\mathrm {1} ,k+\mathrm {1} }^2=O(u),\) which gives

$$\begin{aligned} \mathrm f\!l (r_{k+\mathrm {1} ,k+\mathrm {1} }^\mathrm{2 }) = O(u)\le \mathrm f\!l ( d^{\mathsf {T}}d)O(u).\end{aligned}$$

Lemma 15

Let \(n=O(\mathrm {1} )\). If \(A\in \mathbb {R}^{n\times n}\) is numerically nonsingular, and \(A^{-\mathrm {1} }=\mathcal {O}(\mathrm {1} )\), then \(A^{\mathsf {T}}\) is numerically nonsingular.

Proof

If

$$\begin{aligned} \mathrm f\!l (A^{\mathsf {T}}x) = A^{\mathsf {T}}x+\mathbb {O}(nu)|A^{\mathsf {T}}||x|=\mathbb {O}(nu), \end{aligned}$$

then

$$\begin{aligned} \mathrm f\!l (x^{\mathsf {T}}A) = x^{\mathsf {T}}A+\mathbb {O}^{\mathsf {T}}(nu)=\mathbb {O}^{\mathsf {T}}(nu). \end{aligned}$$

Thus,

$$\begin{aligned} \mathrm{f\!l}(x^{\mathsf {T}}Ay) = \mathrm{f\!l} (x^{\mathsf {T}}A)y+O(nu)|\mathrm{f\!l}(x^{\mathsf {T}}A)||y|=O(nu) \end{aligned}$$

holds for all \(y=\mathbb {O}(\mathrm {1} ).\)

For arbitrary \(z=\mathbb {O}(\mathrm {1} )\in \mathbb {R}^{n},\) let

$$\begin{aligned}y=A^{-\mathrm {1} }z=\mathbb {O}(\mathrm {1} ).\end{aligned}$$

Then,

$$\begin{aligned} \mathrm f\!l( Ay) = Ay+O(nu)|A||y|=z+O(nu)|A||y|. \end{aligned}$$

Hence,

$$\begin{aligned} z=\mathrm f\!l (Ay) + O(nu)|A||y| = \mathrm f\!l (Ay) + \mathbb {O}(nu). \end{aligned}$$

Thus, we have

$$\begin{aligned} \mathrm f\!l( x^{\mathsf {T}}z) = x^{\mathsf {T}}z + O(nu)|x|^{\mathsf {T}}|z| =\mathrm f\!l (x^{\mathsf {T}}Ay) + O(nu)= O(nu) \end{aligned}$$

for arbitrary \(z=\mathbb {O}(\mathrm {1} )\in \mathbb {R}^n.\) Hence, \(x=\mathbb {O}(u),\) so that \(A^{\mathsf {T}}\) is numerically nonsingular.

Proof of Theorem 10 in section 5.1.2

Proof

Let the singular value decomposition of \(R_k\) be given by \(R_k=U\Sigma V^{\mathsf {T}}\in \mathbb {R}^{k\times k},\) where UV are orthogonal matrices and \(\Sigma =\mathrm diag (\sigma _\mathrm{1}, \sigma _2, \dots , \sigma _k).\) Let \(\rm{I}_k\in \mathbb {R}^{k\times k}\) be the identity matrix. Then, we have \(R_k'=\left( \begin{array}{c} R_k\\ \sqrt{\lambda }\rm{I}_k\\ \end{array}\right) =U'\Sigma 'V^{\mathsf {T}}\), where \(U'=\left( \begin{array}{cc} U &{} 0\\ 0 &{} V \end{array}\right)\) and \(\Sigma '=\left( \begin{array}{c} \Sigma \\ \sqrt{\lambda }\rm{I}_k\\ \end{array}\right) .\) Since \(\Sigma '^{\mathsf {T}}\Sigma '=\Sigma ^2+\lambda I_k=\mathrm diag (\sigma _1^2+\lambda , \sigma _2^2+\lambda , \dots , \sigma _{ k}^\mathrm{2}+\lambda ),\) the singular values of \(\left( \begin{array}{c} R_k\\ \sqrt{\lambda }\rm{I}_k\\ \end{array}\right)\) are \(\sqrt{\sigma _1^2+\lambda }\ge \sqrt{\sigma _2^2+\lambda }\ge \cdots \ge \sqrt{\sigma _k^2+\lambda }.\)

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Liao, Z., Hayami, K., Morikuni, K. et al. A stabilized GMRES method for singular and severely ill-conditioned systems of linear equations. Japan J. Indust. Appl. Math. 39, 717–751 (2022). https://doi.org/10.1007/s13160-022-00505-2

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13160-022-00505-2

Keywords

Mathematics Subject Classification

Navigation