A stabilized GMRES method for singular and severely ill-conditioned systems of linear equations

Liao, Zeyu; Hayami, Ken; Morikuni, Keiichi; Yin, Jun-Feng

doi:10.1007/s13160-022-00505-2

A stabilized GMRES method for singular and severely ill-conditioned systems of linear equations

Original Paper
Published: 28 March 2022

Volume 39, pages 717–751, (2022)
Cite this article

Japan Journal of Industrial and Applied Mathematics Aims and scope Submit manuscript

Zeyu Liao ORCID: orcid.org/0000-0002-7539-9760¹,
Ken Hayami²,
Keiichi Morikuni³ &
…
Jun-Feng Yin⁴

286 Accesses
1 Citation
1 Altmetric
Explore all metrics

Abstract

Consider using the right-preconditioned GMRES (AB-GMRES) for obtaining the minimum-norm solution of inconsistent underdetermined systems of linear equations. Morikuni (Ph.D. thesis, 2013) showed that for some inconsistent and ill-conditioned problems, the iterates may diverge. This is mainly because the Hessenberg matrix in the GMRES method becomes very ill-conditioned so that the backward substitution of the resulting triangular system becomes numerically unstable. We propose a stabilized GMRES based on solving the normal equations corresponding to the above triangular system using the standard Cholesky decomposition. This has the effect of shifting upwards the tiny singular values of the Hessenberg matrix which lead to an inaccurate solution. We analyze why the method works. Numerical experiments show that the proposed method is robust and efficient, not only for applying AB-GMRES to underdetermined systems, but also for applying GMRES to severely ill-conditioned range-symmetric systems of linear equations.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Weighted and deflated global GMRES algorithms for solving large Sylvester matrix equations

Article 18 September 2018

Convergence of a generalized PMHSS method for a class of singular block two-by-two linear systems

Article 05 November 2020

On GSOR-based iteration methods for solving weakly nonlinear systems with complex symmetric coefficient matrices

Article 03 April 2021

References

ADVANPIX LLC.: Multiprecision Computing Toolbox for MATLAB. https://www.advanpix.com/. Version 4.4.5.12711
Björck, Å.: Numerical Methods for Least Squares Problems. SIAM. Philadelphia, PA (1996)
Book Google Scholar
Brezinski, C., Rodriguez, G., Seatzu, S.: Error estimates for the regularization of least squares problems. Numer. Algorithms 51(1), 61–76 (2009)
Article MathSciNet Google Scholar
Brown, P., Walker, H.: GMRES on (nearly) singular systems. SIAM J. Matrix Anal. Appl. 18(1), 37–51 (1997)
Article MathSciNet Google Scholar
Calvetti, D., Lewis, B., Reichel, L.: GMRES-type methods for inconsistent systems. Linear Algebra Appl. 316(1–3), 157–169 (2000)
Article MathSciNet Google Scholar
Davis, T., Hu, Y.: The University of Florida sparse matrix collection. ACM Trans. Math. Software 38(1), 1–25 (2011)
MathSciNet MATH Google Scholar
Fong, D.C.L., Saunders, M.: LSMR: An iterative algorithm for sparse least-squares problems. SIAM J. Sci. Comput. 33(5), 2950–2971 (2011)
Article MathSciNet Google Scholar
Foster, L.: San Jose State University Singular Matrix Database. http://www.math.sjsu.edu/singular/matrices/
Hansen, P.: Discrete Inverse Problems: Insight and Algorithms. SIAM. Philadelphia, PA (2010)
Book Google Scholar
Hayami, K., Sugihara, M.: A geometric view of Krylov subspace methods on singular systems. Numer. Linear Algebra Appl. 18(3), 449–469 (2011)
Article MathSciNet Google Scholar
Hayami, K., Yin, J., Ito, T.: GMRES methods for least squares problems. SIAM J. Matrix Anal. Appl. 31(5), 2400–2430 (2010)
Article MathSciNet Google Scholar
Hestenes, M., Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Research Nat. Bur. Standards 49(6), 409–436 (1952)
Article MathSciNet Google Scholar
Higham, N.J.: The Test Matrix Toolbox for MATLAB (version 3.0). University of Manchester, Manchester (1995)
Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM. Philadelphia, PA (2002)
Book Google Scholar
Horn, R.A., Johnson, C.R.: Matrix Analysis, 2nd edn. Cambridge University Press, New York, NY (2012)
Book Google Scholar
Iri, M.: General Theory of Linear Algebra. Asakura, (in Japanese) (2009)
Meza, J.C., Symes, W.W.: Deflated Krylov subspace methods for nearly singular linear systems. J. Optim. Theory Appl. 72(3), 441–457 (1992)
Article MathSciNet Google Scholar
Morikuni, K.: Inner-iteration Preconditioning for Least Squares Problems. Doctoral Thesis, Department of Informatics, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies (2013)
Morikuni, K., Hayami, K.: Convergence of inner-iteration GMRES methods for rank-deficient least squares problems. SIAM J. Matrix Anal. Appl. 36(1), 225–250 (2015)
Article MathSciNet Google Scholar
Morikuni, K., Rozložník, M.: On GMRES for singular EP and GP systems. SIAM J. Matrix Anal. Appl. 39(2), 1033–1048 (2018)
Article MathSciNet Google Scholar
Neuman, A., Reichel, L., Sadok, H.: Algorithms for range restricted iterative methods for linear discrete ill-posed problems. Numer. Algorithms 59(2), 325–331 (2012)
Article MathSciNet Google Scholar
Neuman, A., Reichel, L., Sadok, H.: Implementations of range restricted iterative methods for linear discrete ill-posed problems. Linear Algebra Appl. 436(10), 3974–3990 (2012)
Article MathSciNet Google Scholar
Paige, C., Rozložník, M., Strakoš, Z.: Modified Gram-Schmidt (mgs), least squares, and backward stability of MGS-GMRES. SIAM J. Matrix Anal. Appl. 28(1), 264–284 (2006)
Article MathSciNet Google Scholar
Paige, C.C., Saunders, M.A.: Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal. 12(4), 617–629 (1975)
Article MathSciNet Google Scholar
Paige, C.C., Saunders, M.A.: LSQR: An algorithm for sparse linear equations and sparse least squares. ACM Trans. Math. Software 8(1), 43–71 (1982)
Article MathSciNet Google Scholar
Reichel, L., Ye, Q.: Breakdown-free GMRES for singular systems. SIAM J. Matrix Anal. Appl. 26(4), 1001–1021 (2005)
Article MathSciNet Google Scholar
Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. SIAM. Philadelphia, PA (2003)
Book Google Scholar
Saad, Y., Schultz, M.H.: GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 7(3), 856–869 (1986)
Article MathSciNet Google Scholar
Sugihara, K., Hayami, K., Liao, Z.: GMRES using pseudo-inverse for range symmetric singular systems. (in revision)
Tebbens, J.D., Tůma, M.: On incremental condition estimators in the 2-norm. SIAM J. Anal. Appl. 35(1), 174–197 (2014)
Article MathSciNet Google Scholar
Yamamoto, Y., Nakatsukasa, Y., Yanagisawa, Y., Fukaya, T.: Roundoff error analysis of the CholeskyQR2 algorithm. Electron. Trans. Numer. Anal. 44(01), 306–326 (2015)
MathSciNet MATH Google Scholar

Download references

Acknowledgements

We would like to thank Dr. Hiroshi Murakami and Professor Lothar Reichel for valuable comments. Ken Hayami was supported by JSPS KAKENHI Grant Number JP15K04768. Keiichi Morikuni was supported by JSPS KAKENHI Grant Number JP16K17639 and JP20K14356 and Hattori Hokokai Foundation. Jun-Feng Yin was supported by the National Natural Science Foundation of China (No. 11971354).

Author information

Authors and Affiliations

Department of Informatics, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies (SOKENDAI), 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, 101-8430, Japan
Zeyu Liao
National Institute of Informatics, and The Graduate University for Advanced Studies (SOKENDAI), 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, 101-8430, Japan
Ken Hayami
Faculty of Engineering, Information and Systems, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8573, Japan
Keiichi Morikuni
School of Mathematical Science, Tongji University, Siping Road 1239, Yangpu District, Shanghai, 200092, China
Jun-Feng Yin

Authors

Zeyu Liao
View author publications
You can also search for this author in PubMed Google Scholar
Ken Hayami
View author publications
You can also search for this author in PubMed Google Scholar
Keiichi Morikuni
View author publications
You can also search for this author in PubMed Google Scholar
Jun-Feng Yin
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Zeyu Liao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1 Proof of statement in section 2.3

Lemma 11

Assume $\mathrm {\mathcal {N}}(\hat{A})$ $\cap$ $\mathrm {\mathcal {R}} (\hat{{A}})=\{0\}$, and $\mathrm {grade} (\hat{A},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=k$. Then, $\mathcal {K}_{k+1}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})$ = $\hat{A}\mathcal {K}_{k}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})$ holds.

Proof

Note that

$$\begin{aligned} \hat{{A}}\mathcal {K}_{k}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})&=\mathrm {span}\{\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}, \hat{{A}}^\mathrm{2}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}, \cdots ,\hat{{A}}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\}\\&\subseteqq \mathrm {span}\{b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\cdots ,\hat{A}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\}=\mathcal {K}_{k+1}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}}). \end{aligned}$$

grade$(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=k$ implies that

$$\begin{aligned} \mathcal {K}_{k+1}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=\mathcal {K}_{k}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})= \mathrm {span}\{b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\cdots ,\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\}. \end{aligned}$$

Hence,

$$\begin{aligned} \hat{{A}}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}=c_0b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+c_1\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\quad c_i\in \mathbb {R}, i=0, 1, 2, \cdots , k-1. \end{aligned}$$

If $c_0=0,$

$$\begin{aligned} \hat{{A}}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}=c_1\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+c_2\hat{{A}}^2b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}. \end{aligned}$$

Hence,

$$\begin{aligned}&c_1\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+c_2\hat{{A}}^2b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}-\hat{{A}}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\\&\quad =\hat{{A}}(c_1b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-2} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}-\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=0. \end{aligned}$$

Hence,

$$\begin{aligned} c_1b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+c_2\hat{{A}}^2b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-2} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}-\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\in \mathrm {\mathcal {N}}(\hat{{A}})\cap \mathrm {\mathcal {R}}(\hat{{A}}) = \{0\}. \end{aligned}$$

which implies

$$\begin{aligned} \hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}=c_1b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+c_2\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +c_{k-1}\hat{{A}}^{k-2} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}. \end{aligned}$$

Thus,

$$\begin{aligned} \mathcal {K}_{k}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=\mathcal {K}_{k-1}(\hat{{A}},b|_\mathrm{{\mathcal {R}(\hat{{A}})}}), \end{aligned}$$

which contradicts with grade$(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=k$. Hence, $c_0\ne 0$, and

$$\begin{aligned} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}=d_1\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+d_2\hat{{A}}^2b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+\cdots +d_{k-1}\hat{{A}}^{k-1} b|_\mathrm{{\mathcal {R}(\hat{{A}})}}+d_k\hat{{A}}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}. \end{aligned}$$

Hence,

$$\begin{aligned}&\mathcal {K}_{k+1}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{{A}})}})=\mathrm {span}\{b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\hat{A}b|_\mathrm{{\mathcal {R}(\hat{{A}})}},\cdots ,\hat{A}^k b|_\mathrm{{\mathcal {R}(\hat{{A}})}}\}\\&\quad \subseteqq \mathrm {span}\{\hat{{A}}b|_\mathrm{{\mathcal {R}(\hat{{A}})}}, \hat{{A}}^\mathrm{2 }b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}, \cdots ,\hat{A}^k b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}\}=\hat{A}\mathcal {K}_{k}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}). \end{aligned}$$

Thus,

$$\begin{aligned} \mathcal {K}_{k+1}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})=\hat{A}\mathcal {K}_{k}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}). \end{aligned}$$

Corollary 12

Assume $\mathrm {\mathcal {N}}(\hat{A})= \mathrm {\mathcal {N}}(\hat{ {A}}^{\mathsf {T}})$, and $\mathrm {grade}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})=k.$ Then, $\mathcal {K}_{k+1}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})=\hat{A}\mathcal {K}_{k}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})$ holds.

Proof

$\mathrm {\mathcal {N}}(\hat{A}) = \mathrm {\mathcal {N}}(\hat{ {A}}^{\mathsf {T}})$ implies that

$$\begin{aligned} \mathrm {\mathcal {N}}(\hat{ {A}})\cap \mathrm {\mathcal {R}}(\hat{ {A}})=\mathrm {\mathcal {N}}(\hat{ {A}}^{\mathsf {T}})\cap \mathrm {\mathcal {R}}(\hat{ {A}})=\mathrm {\mathcal {R}}(\hat{ {A}})^{\bot }\cap \mathrm {\mathcal {R}}(\hat{ {A}})=\{0\}. \end{aligned}$$

Hence, from Lemma 11, Corollary 12 holds.

Appendix 2 Proof of statement in section 2.3

Lemma 13

Assume $\mathrm {\mathcal {N}}(\hat{A})\cap \mathrm {\mathcal {R}}(\hat{ {A}})=\{0\}$, $\mathrm {grade}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})=k$, and $b\notin \mathcal {R}(\hat{A})$. Then, $\mathrm {dim}(\mathcal {K}_{k+1}(\hat{A},b))=k+1$ holds.

Proof

Let $c_0, c_1, \cdots , c_k\in \mathbb {R}$ satisfy

$$\begin{aligned} c_0b+c_1\hat{ {A}}b+\cdots +c_{k}\hat{ {A}}^{k} b=0. \end{aligned}$$

Since $\mathrm {\mathcal {N}}(\hat{A})$ $\cap$ $\mathrm {\mathcal {R}}(\hat{ {A}})=\{0\}$,

$$\begin{aligned} b=b|_\mathrm{{\mathcal {R}}(\hat{ {A}})}\oplus b|_\mathrm{{\mathcal {N}}(\hat{ {A}})}, \end{aligned}$$

where $b|_\mathrm{{\mathcal {N}}(\hat{ {A}})}$ denotes the orthogonal projection of b onto $\mathcal {N}(\hat{ {A}}).$ Hence,

$$\begin{aligned} c_0b|_\mathrm{{\mathcal {N}(\hat{ {A}})}}+c_0b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}+c_1\hat{ {A}}b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}+\cdots +c_{k}\hat{ {A}}^{k} b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}=0. \end{aligned}$$

If $c_0\ne 0$

$$\begin{aligned} b|_\mathrm{{\mathcal {N}(\hat{ {A}})}}=-b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}-\frac{c_1}{c_0}\hat{A}b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}-\cdots -\frac{c_{k}}{c_0}\hat{A}^{k} b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}\in \mathrm {\mathcal {R}(\hat{ {A}})}. \end{aligned}$$

Hence,

$$\begin{aligned} b|_\mathrm{{\mathcal {N}(\hat{ {A}})}}\in \mathrm {\mathcal {N}}(\hat{ {A}})\cap \mathrm {\mathcal {R}}(\hat{ {A}})=\{0\}. \end{aligned}$$

Thus, $b|_\mathrm{{\mathcal {N}(\hat{ {A}})}}=0,$ which contradicts $b\notin \mathcal {R}(\hat{ {A}})$. Hence, we have $c_0= 0$, and

$$\begin{aligned} c_1\hat{A}b+c_2\hat{A}^2b+\cdots +c_{k}\hat{A}^{k} b=c_1\hat{A}b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}+c_2\hat{A}^2b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}+\cdots +c_{k}\hat{A}^{k} b|_\mathrm{{\mathcal {R}(\hat{ {A}})}}=0. \end{aligned}$$

But, since

$$\begin{aligned}&{\mathrm{dim}}( {\mathrm{span}}\{\hat{{A}}b|_{{\mathcal {R}(\hat{{A}})}}, \hat{A}^{2}b|_{{\mathcal {R}(\hat{{A}})}}, \cdots ,\hat{A}^k b|_{{\mathcal {R}(\hat{{A}})}}\})\\&\quad ={\mathrm{dim}}( \hat{{A}}\, \mathrm{span}\{b|_{{\mathcal {R}(\hat{{A}})}},\hat{A}b|_{{\mathcal {R}(\hat{{A}})}}\cdots ,\hat{A}^{k-1} b|_{{\mathcal {R}(\hat{{A}})}}\})={\mathrm{dim}}( \hat{{A}}\mathcal {K}_{k}(\hat{A},b|_{{\mathcal {R}(\hat{{A}})}}))=k \end{aligned}$$

holds from Lemma 11, we have $c_1=c_2=\cdots =c_k=0,$ which implies $\mathrm dim( \mathcal {K}_{k+1}(\hat{ {A}},b))=k+1$.

Corollary 14

Assume $\mathrm {\mathcal {N}}(\hat{ {A}})=$ $\mathrm {\mathcal {N}}(\hat{A}^{\mathsf {T}})$, $\mathrm {grade}(\hat{A},b|_\mathrm{{\mathcal {R}(\hat{ {A}})}})=k$, and $b\notin \mathcal {R}(\hat{A})$.

Then, $\mathrm {dim}(\mathcal {K}_{k+1}(\hat{A},b))=k+1$ holds.

Proof

$\mathrm {\mathcal {N}}(\hat{ {A}})=$ $\mathrm {\mathcal {N}}(\hat{A}^{\mathsf {T}})$ implies $\mathrm {\mathcal {N}}(\hat{A})$ $\cap$ $\mathrm {\mathcal {R}}(\hat{{A}})=\{0\}$. Hence, the corollary follows from Lemma 13.

Appendix 3 Proof of Theorem 9 in section 4.6

Theorom 9 Let $R_k=(r_{pq})\in \mathbb {R}^{k\times k}$ be an upper-triangular matrix and

$$\begin{aligned} R_{k+\mathrm {1}}=\left( \begin{array}{cc} R_k &{} d \\ 0^{\mathsf {T}} &{} r_{k+\mathrm {1} ,k+\mathrm {1} } \\ \end{array} \right) \in \mathbb {R}^{(k+\mathrm {1} )\times (k+\mathrm {1} )}.\end{aligned}$$

(54)

Assume $R_k$ is numerically nonsingular, and $R_k=\mathcal {O}(\mathrm {1})$, $R_k^{-\mathrm {1} }=\mathcal {O}(\mathrm {1} )$, $M(R_k)^{-\mathrm {1} }=\mathcal {O}(\mathrm {1} )$, $d=\mathbb {O}(\mathrm {1} )$ and $O(k)=O(k^2)=O(\mathrm {1}).$ Then, the following holds:

$$\begin{aligned} {\mathrm{f\!l}} (R_{k+{1}}^{\mathsf {T}}R_{k+{1}})~is~numerically~nonsingular \quad \Longleftrightarrow \quad {\mathrm{f\!l }} (r^2_{k+{1},k+{1}})> {\mathrm{f\!l}} (d^{\mathsf {T}}d)O(u). \end{aligned}$$

Proof

Note that

$$\begin{aligned} R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }=\left( \begin{array}{cc} R_k &{} 0 \\ d^{\mathsf {T}} &{} r_{k+\mathrm {1} ,k+\mathrm {1} } \\ \end{array} \right) \left( \begin{array}{cc} R_k &{} d \\ 0^{\mathsf {T}} &{} r_{k+\mathrm {1} ,k+\mathrm {1} } \\ \end{array} \right) =\left( \begin{array}{cc} R_k^{\mathsf {T}}R_k &{} R_k^{\mathsf {T}}d \\ d^{\mathsf {T}}R_k &{} d^{\mathsf {T}}d+r_{k+\mathrm {1} ,k+\mathrm {1} }^2 \\ \end{array} \right) . \end{aligned}$$

Proof of ($\Rightarrow$)

Assume fl$(r^2_{k+\mathrm {1} ,k+\mathrm {1} })$ $\le$ fl$(d^{\mathsf {T}}d)O(u).$ Then, since

$$\begin{aligned} {\mathrm{f\!l}} (d^{\mathsf {T}}d)&=d^{\mathsf {T}}d+O(ku)d^{\mathsf {T}}d=({1}+O(ku))d^{\mathsf {T}}d,\\ {\mathrm{f\!l}} (d^{\mathsf {T}}d+r^{2}_{k+{1},k+{1}})&=(d^{\mathsf {T}}d+r^2_{k+{1},k+{1}})({1}+O(ku))=d^{\mathsf {T}}d({1}+O(ku)),\\ R_k&=\mathcal {O}({1}),\quad {{ \mathrm and}}\quad d=\mathbb {O}({1}), \end{aligned}$$

we have

$$\begin{aligned} \mathrm f\!l (R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} })&=\left( \begin{array}{cc} R_k^{\mathsf {T}}R_k +O(ku)|R_k|^{\mathsf {T}}|R_k|&{} R_k^{\mathsf {T}}d+O(ku)|R_k|^{\mathsf {T}}|d| \\ d^{\mathsf {T}}R_k+O(ku)|d|^{\mathsf {T}}|R_k| &{} d^{\mathsf {T}}d+O(ku)d^{\mathsf {T}}d\\ \end{array} \right) \nonumber \\&=\left( \begin{array}{c}R_k^{\mathsf {T}}\\ d^{\mathsf {T}}\end{array}\right) \left( \begin{array}{cc}R_k&d\end{array}\right) +\mathcal {O}(ku). \end{aligned}$$

(55)

Note

$$\begin{aligned}\left( \begin{array}{cc}R_k&d\end{array}\right) \left( \begin{array}{c}-R_k^{-\mathrm {1} }d\\ \mathrm {1} \end{array}\right) =-R_k R_k^{-\mathrm {1} }d+d=0, \end{aligned}$$

since $R_k$ is nonsingular.

Hence,

$$\begin{aligned} {\mathrm{f\!l}}\left(\left( \begin{array}{cc}R_k&d\end{array}\right) \left( \begin{array}{c}-R_k^{-{1}}d\\ {1}\end{array}\right) \right)={\mathrm{f\!l}} \{R_k {\mathrm{f\!l}}(-R_k^{-{1}}d)+d\} = [{\mathrm{f\!l}} \{R_k{\mathrm{f\!l}}(-R_k^{-{1}}d)\}+d]\{{1}+O(u)\}. \end{aligned}$$

Note here that

$$\begin{aligned} \mathrm f\!l \{R_k\mathrm f\!l (-R_k^{-\mathrm {1} }d)\}=R_k\mathrm f\!l (-R_k^{-\mathrm {1} }d)+O(ku)|R_k||R_k^{-\mathrm {1} }d|, \end{aligned}$$

and

$$\begin{aligned} \mathrm f\!l (-R_k^{-\mathrm {1} }d)= -R_k^{-\mathrm {1} }d+O(k^\mathrm{2 }u)M(R_k)^{-\mathrm {1} }|d| \end{aligned}$$

(56)

from Theorem 8. Hence,

$$\begin{aligned} \mathrm f\!l (\left( \begin{array}{cc}R_k&d\end{array}\right) \left( \begin{array}{c}-R_k^{-\mathrm {1} }d\\ \mathrm {1} \end{array}\right) ) = O(k^\mathrm{2 }u)R_sM(R_k)^{-\mathrm {1} }|d|+O(ku)|R_k||R_k^{-\mathrm {1} }d| =\mathbb {O}(k^\mathrm{2 }u), \end{aligned}$$

since $R_k^{-\mathrm {1} }=\mathcal {O}(\mathrm {1} )$ and $M(R_k)^{-\mathrm {1} }=\mathcal {O}(\mathrm {1} ).$

Then,

$$\begin{aligned} &\mathrm f\!l (R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }\left( \begin{array}{c}-R_k^{-\mathrm {1} }d\\ \mathrm {1} \end{array}\right) )\\ = \mathrm f\!l \left(\{\left( \begin{array}{c}R_k^{\mathsf {T}}\\ d^{\mathsf {T}}\end{array}\right) \left( \begin{array}{cc}R_k&d\end{array}\right) +\mathcal {O}(ku)\}\left( \begin{array}{c}-R_k^{-\mathrm {1} }d+\mathcal {O}(k^2u)M(R_k)^{-\mathrm {1} }|d|\\ \mathrm {1} \end{array}\right) \right)=\mathbb {O}(k^\mathrm{2 }u)=\mathbb {O}(u), \end{aligned}$$

since (55), (56), and $O(k^2)=O(\mathrm {1} ).$ Since $\left( \begin{array}{c}-R_k^{-\mathrm {1} }d\\ \mathrm {1} \end{array}\right) =\mathbb {O}(\mathrm {1} ),$ $R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }$ is numerically singular. By contraposition, ($\Rightarrow$) holds.

Proof of ($\Leftarrow$)

Assume $R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }$ is not numerically nonsingular. Then, there exists a vector $\left( \begin{array}{c} z \\ w \\ \end{array} \right) \in \mathbb {R}^{k+\mathrm {1} }$ such that $\left| \left( \begin{array}{c} z \\ w \\ \end{array} \right) \right| >\mathbb {O}(u),$ and

$$\begin{aligned} \mathrm f\!l \{R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }\left( \begin{array}{c} z \\ w \\ \end{array} \right) \}= R_{k+\mathrm {1} }^{\mathsf {T}}\left( R_{k+\mathrm {1} }\left( \begin{array}{c} z \\ w \\ \end{array} \right) +|R_{k+\mathrm {1} }|\left| \left( \begin{array}{c} z \\ w \\ \end{array} \right) \right| O((k+\mathrm {1} )u)\right) +\\ \left| R_{k+\mathrm {1} }^{\mathsf {T}}\right| \left| R_{k+\mathrm {1} }\left( \begin{array}{c} z \\ w \\ \end{array} \right) +|R_{k+\mathrm {1} }|\left| \left( \begin{array}{c} z \\ w \\ \end{array} \right) \right| O((k+\mathrm {1} )u)\right| O((k+\mathrm {1} )u)=\mathbb {O}(u) \end{aligned}$$

assuming $O(k+\mathrm {1} )=O(\mathrm {1} ).$

Hence,

$$\begin{aligned} \mathrm f\!l \{R_{k+\mathrm {1} }^{\mathsf {T}}R_{k+\mathrm {1} }\left( \begin{array}{c} z \\ w \\ \end{array} \right) \}=\left( \begin{array}{cc} R_k^{\mathsf {T}}R_k &{} R_k^{\mathsf {T}}d \\ d^{\mathsf {T}}R_k &{} d^{\mathsf {T}}d+r_{k+\mathrm {1} ,k+\mathrm {1} }^2 \\ \end{array} \right) \left( \begin{array}{c} z \\ w \\ \end{array} \right) +\mathbb {O}(u)=\mathbb {O}(u).\end{aligned}$$

Thus,

$$\begin{aligned}&R_k^{\mathsf {T}}R_sz+wR_k^{\mathsf {T}}d=\mathbb {O}(u), \end{aligned}$$

(57)

$$\begin{aligned}&d^{\mathsf {T}}R_sz+(d^{\mathsf {T}}d+r_{k+\mathrm {1} ,k+\mathrm {1} }^2)w=\mathbb {O}(u). \end{aligned}$$

(58)

(57) can be expressed as $R_k^{\mathsf {T}}(R_sz+wd)=\mathbb {O}(u).$ From Lemma 15, $R_k^\mathsf {T}$ is numerically nonsingular, so that

$$\begin{aligned} R_sz+wd=\mathbb {O}(u). \end{aligned}$$

(59)

Hence, from (58), $d^{\mathsf {T}}R_sz+w(d^{\mathsf {T}}d+r_{k+\mathrm {1} ,k+\mathrm {1} }^2)=d^{\mathsf {T}}(R_sz+wd)+wr_{k+\mathrm {1} ,k+\mathrm {1} }^2=O(u).$ Thus, $wr_{k+\mathrm {1} ,k+\mathrm {1} }^2=O(u)$. If $w=O(u),$ $R_sz=\mathbb {O}(u)$ from (59). Since $R_k$ is numerically nonsingular, $z=\mathbb {O}(u),$ which contradicts with the assumption.

Hence, $|w|>O(u),$ so that $r_{k+\mathrm {1} ,k+\mathrm {1} }^2=O(u),$ which gives

$$\begin{aligned} \mathrm f\!l (r_{k+\mathrm {1} ,k+\mathrm {1} }^\mathrm{2 }) = O(u)\le \mathrm f\!l ( d^{\mathsf {T}}d)O(u).\end{aligned}$$

Lemma 15

Let $n=O(\mathrm {1} )$. If $A\in \mathbb {R}^{n\times n}$ is numerically nonsingular, and $A^{-\mathrm {1} }=\mathcal {O}(\mathrm {1} )$, then $A^{\mathsf {T}}$ is numerically nonsingular.

Proof

If

$$\begin{aligned} \mathrm f\!l (A^{\mathsf {T}}x) = A^{\mathsf {T}}x+\mathbb {O}(nu)|A^{\mathsf {T}}||x|=\mathbb {O}(nu), \end{aligned}$$

then

$$\begin{aligned} \mathrm f\!l (x^{\mathsf {T}}A) = x^{\mathsf {T}}A+\mathbb {O}^{\mathsf {T}}(nu)=\mathbb {O}^{\mathsf {T}}(nu). \end{aligned}$$

Thus,

$$\begin{aligned} \mathrm{f\!l}(x^{\mathsf {T}}Ay) = \mathrm{f\!l} (x^{\mathsf {T}}A)y+O(nu)|\mathrm{f\!l}(x^{\mathsf {T}}A)||y|=O(nu) \end{aligned}$$

holds for all $y=\mathbb {O}(\mathrm {1} ).$

For arbitrary $z=\mathbb {O}(\mathrm {1} )\in \mathbb {R}^{n},$ let

$$\begin{aligned}y=A^{-\mathrm {1} }z=\mathbb {O}(\mathrm {1} ).\end{aligned}$$

Then,

$$\begin{aligned} \mathrm f\!l( Ay) = Ay+O(nu)|A||y|=z+O(nu)|A||y|. \end{aligned}$$

Hence,

$$\begin{aligned} z=\mathrm f\!l (Ay) + O(nu)|A||y| = \mathrm f\!l (Ay) + \mathbb {O}(nu). \end{aligned}$$

Thus, we have

$$\begin{aligned} \mathrm f\!l( x^{\mathsf {T}}z) = x^{\mathsf {T}}z + O(nu)|x|^{\mathsf {T}}|z| =\mathrm f\!l (x^{\mathsf {T}}Ay) + O(nu)= O(nu) \end{aligned}$$

for arbitrary $z=\mathbb {O}(\mathrm {1} )\in \mathbb {R}^n.$ Hence, $x=\mathbb {O}(u),$ so that $A^{\mathsf {T}}$ is numerically nonsingular.

Proof of Theorem 10 in section 5.1.2

Proof

Let the singular value decomposition of $R_k$ be given by $R_k=U\Sigma V^{\mathsf {T}}\in \mathbb {R}^{k\times k},$ where U, V are orthogonal matrices and $\Sigma =\mathrm diag (\sigma _\mathrm{1}, \sigma _2, \dots , \sigma _k).$ Let $\rm{I}_k\in \mathbb {R}^{k\times k}$ be the identity matrix. Then, we have $R_k'=\left( \begin{array}{c} R_k\\ \sqrt{\lambda }\rm{I}_k\\ \end{array}\right) =U'\Sigma 'V^{\mathsf {T}}$, where $U'=\left( \begin{array}{cc} U &{} 0\\ 0 &{} V \end{array}\right)$ and $\Sigma '=\left( \begin{array}{c} \Sigma \\ \sqrt{\lambda }\rm{I}_k\\ \end{array}\right) .$ Since $\Sigma '^{\mathsf {T}}\Sigma '=\Sigma ^2+\lambda I_k=\mathrm diag (\sigma _1^2+\lambda , \sigma _2^2+\lambda , \dots , \sigma _{ k}^\mathrm{2}+\lambda ),$ the singular values of $\left( \begin{array}{c} R_k\\ \sqrt{\lambda }\rm{I}_k\\ \end{array}\right)$ are $\sqrt{\sigma _1^2+\lambda }\ge \sqrt{\sigma _2^2+\lambda }\ge \cdots \ge \sqrt{\sigma _k^2+\lambda }.$

About this article

Cite this article

Liao, Z., Hayami, K., Morikuni, K. et al. A stabilized GMRES method for singular and severely ill-conditioned systems of linear equations. Japan J. Indust. Appl. Math. 39, 717–751 (2022). https://doi.org/10.1007/s13160-022-00505-2

Download citation

Received: 08 December 2021
Revised: 06 February 2022
Accepted: 28 February 2022
Published: 28 March 2022
Issue Date: May 2022
DOI: https://doi.org/10.1007/s13160-022-00505-2

Keywords

Mathematics Subject Classification

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A stabilized GMRES method for singular and severely ill-conditioned systems of linear equations

Abstract

Access this article

Similar content being viewed by others

Weighted and deflated global GMRES algorithms for solving large Sylvester matrix equations

Convergence of a generalized PMHSS method for a class of singular block two-by-two linear systems

On GSOR-based iteration methods for solving weakly nonlinear systems with complex symmetric coefficient matrices

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendices

Appendix 1 Proof of statement in section 2.3

Lemma 11

Proof

Corollary 12

Proof

Appendix 2 Proof of statement in section 2.3

Lemma 13

Proof

Corollary 14

Proof

Appendix 3 Proof of Theorem 9 in section 4.6

Proof

Lemma 15

Proof

Proof of Theorem 10 in section 5.1.2

Proof

About this article

Cite this article

Keywords

Mathematics Subject Classification

Navigation

A stabilized GMRES method for singular and severely ill-conditioned systems of linear equations

Abstract

Access this article

Similar content being viewed by others

Weighted and deflated global GMRES algorithms for solving large Sylvester matrix equations

Convergence of a generalized PMHSS method for a class of singular block two-by-two linear systems

On GSOR-based iteration methods for solving weakly nonlinear systems with complex symmetric coefficient matrices

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendices

Appendix 1 Proof of statement in section 2.3

Lemma 11

Proof

Corollary 12

Proof

Appendix 2 Proof of statement in section 2.3

Lemma 13

Proof

Corollary 14

Proof

Appendix 3 Proof of Theorem 9 in section 4.6

Proof

Lemma 15

Proof

Proof of Theorem 10 in section 5.1.2

Proof

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation