Exact solutions in low-rank approximation with zeros

https://doi.org/10.1016/j.laa.2022.01.021

Abstract

Low-rank approximation with zeros aims to find a matrix of fixed rank and with a fixed zero pattern that minimizes the Euclidean distance to a given data matrix. We study the critical points of this optimization problem using algebraic tools. In particular, we describe special linear, affine, and determinantal relations satisfied by the critical points. We also investigate the number of critical points and how this number is related to the complexity of the nonnegative matrix factorization problem.

Introduction

The best rank-r approximation problem aims to find a real rank-r matrix that minimizes the Euclidean distance to a given real data matrix. This problem is completely solved by the Eckart-Young-Mirsky theorem, which states that the best rank-r approximation is given by the first r components of the singular value decomposition (SVD) of the data matrix.
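The Eckart-Young-Mirsky statement above can be illustrated with a short NumPy sketch. The function name `best_rank_r` and the diagonal test matrix are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def best_rank_r(U, r):
    """Truncated SVD: keep the r largest singular triples of U."""
    W, s, Vt = np.linalg.svd(U, full_matrices=False)
    # Scale the first r left singular vectors by the singular values,
    # then multiply by the first r right singular vectors.
    return (W[:, :r] * s[:r]) @ Vt[:r, :]

# Hypothetical data matrix with singular values 3, 2, 1.
U = np.diag([3.0, 2.0, 1.0])
X = best_rank_r(U, 2)
squared_distance = np.sum((X - U) ** 2)
```

For this diagonal example, the truncation simply drops the smallest singular value, so the squared distance equals the square of that value.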

We study the structured best rank-r approximation problem, namely we consider additional linear constraints on rank-r matrices. We focus on coordinate subspaces, i.e., linear spaces that are defined by setting some entries to zero. Let $S \subseteq [m] \times [n]$ denote the indices of zero entries. Given $U = (u_{ij}) \in \mathbb{R}^{m \times n}$, our optimization problem becomes

$$\min_{X} \ d_U(X) := \sum_{i=1}^{m} \sum_{j=1}^{n} (x_{ij} - u_{ij})^2 \quad \text{s.t.} \quad x_{ij} = 0 \ \ \forall (i,j) \in S \ \text{ and } \ \operatorname{rank}(X) \le r. \tag{1.1}$$

The structured low-rank approximation problem has been studied in [4], [20], [21]; see also [13] for low-rank approximations with weights. Exact solutions to this problem have been investigated by Golub, Hoffman and Stewart [14], and by Ottaviani, Spaenlehauer and Sturmfels [24].

In [14], rank-r critical points are studied under the constraint that entries in a set of rows or in a set of columns of a matrix stay fixed. This situation is more general than ours in that the fixed entries are not required to be zero, but more restrictive when it comes to the indices of the entries that are fixed. In [24], rank-r critical points restricted to generic subspaces of matrices are studied. In our paper, the linear spaces set some entries equal to zero and hence are not generic. Because of this, we cannot use many powerful tools from algebraic geometry and intersection theory, and we have to come up with algebraic and computational techniques that exploit this special structure. For some properties of determinantal ideals of matrices with 0 entries and their relations to problems in graph theory, we refer the reader to [9] and references therein. Horobet and Rodriguez study the problem of when at least one solution of a certain family of optimization problems satisfies given polynomial conditions, and address structured low-rank approximation as a particular case [17, Example 15].

The global minimum of the optimization problem (1.1) always exists: we can select any point $X$ in the feasible region and consider the feasible region intersected with the closed ball $B_{d_U(X)}(U)$ centered at $U$ and with radius $d_U(X)$. Since the feasible region is a closed semialgebraic set, the intersection is closed and bounded, and hence compact. The distance function is continuous, and thus achieves its minimum on this set. This minimum is a global minimum of (1.1). The optimization problem (1.1) is nonconvex, and local methods are often used to solve it. They return a local minimum of the optimization problem. There are heuristics for finding a global minimum, but these heuristics do not guarantee that a local minimum is indeed a global minimum. We refer to [20] for various algorithms and to [31] for an algorithm with locally quadratic convergence. Cifuentes recently introduced convex relaxations for structured low-rank approximation that under certain assumptions have provable guarantees [6]. Another interesting direction, closely related but not directly applicable to our problem, is to employ recent optimization techniques for simultaneously sparse and low-rank approximation [27], [30].
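To make the notion of a local method concrete, here is a minimal sketch of one possible heuristic: alternating least squares on a factorization X = AB that keeps the zero pattern satisfied. This is an assumption chosen for illustration and is not one of the algorithms from [20] or [31]; the names `als_with_zeros`, `constrained_lstsq` and `null_space` are hypothetical, and the result is in general only a local solution that depends on the random start.

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Orthonormal basis (as columns) of {v : M v = 0}."""
    if M.shape[0] == 0:
        return np.eye(M.shape[1])
    _, s, Vt = np.linalg.svd(M)
    return Vt[int(np.sum(s > tol)):].T

def constrained_lstsq(F, target, M):
    """Minimize ||F v - target|| over v subject to M v = 0."""
    N = null_space(M)
    if N.shape[1] == 0:
        return np.zeros(F.shape[1])
    c = np.linalg.lstsq(F @ N, target, rcond=None)[0]
    return N @ c

def d_U(X, U):
    """Squared Euclidean distance, the objective of (1.1)."""
    return np.sum((X - U) ** 2)

def als_with_zeros(U, S, r, iters=200, seed=0):
    """Heuristic: alternate exact updates of A and B with X = A @ B."""
    m, n = U.shape
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, r))
    B = rng.standard_normal((r, n))
    for _ in range(iters):
        for i in range(m):
            cols = [j for (i2, j) in S if i2 == i]
            # Row i of A must be orthogonal to the columns B[:, j], j in cols.
            A[i] = constrained_lstsq(B.T, U[i], B[:, cols].T)
        for j in range(n):
            rows = [i for (i, j2) in S if j2 == j]
            # Column j of B must be orthogonal to the rows A[i], i in rows.
            B[:, j] = constrained_lstsq(A, U[:, j], A[rows, :])
    return A @ B

# Hypothetical 3x3 data matrix, zero pattern S = {(0, 0)} (0-based), rank 2.
U = np.random.default_rng(1).standard_normal((3, 3))
S = {(0, 0)}
X = als_with_zeros(U, S, r=2)
```

Each inner update is an exact least-squares solve over one row of A or one column of B, so the iterates stay feasible and the objective does not increase, but nothing here certifies global optimality.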

To compute a global minimizer of (1.1) algebraically, we need to look at all the complex critical points of the polynomial function $d_U \colon \mathbb{C}^{m \times n} \to \mathbb{C}$ on the intersection $\mathcal{L}_r^S := \mathcal{X}_r \cap \mathcal{L}^S$, where

$$\mathcal{X}_r := \{X \in \mathbb{C}^{m \times n} \mid \operatorname{rank}(X) \le r\}, \qquad \mathcal{L}^S := \{X \in \mathbb{C}^{m \times n} \mid x_{ij} = 0 \ \ \forall (i,j) \in S\},$$

and then select the real solution that minimizes the Euclidean distance. The problem of finding critical points of $d_U$ on $\mathcal{L}_r^S$ can be considered in the more general setting when $U$ is a complex data matrix. This setting includes the practically meaningful setting when $U$ has real entries. If $U \in \mathbb{C}^{m \times n}$ is generic, namely if it belongs to the complement of a Zariski closed set, then the number of critical points is constant and is called the Euclidean Distance degree (ED degree) of $\mathcal{L}_r^S$. We denote this invariant by $\operatorname{EDdegree}(\mathcal{L}_r^S)$. The importance of the ED degree is that it measures the algebraic complexity of writing the optimal solution as a function of $U$. More generally, the ED degree of an algebraic variety is introduced in [10]. The main goal of this paper is to study the critical points and the ED degree of the minimization problem (1.1).

When the rank is one, characterizing critical points becomes a combinatorial problem. More precisely, listing all critical points translates to the problem of listing minimal vertex covers of a bipartite graph. Counting vertex covers in a bipartite graph is known to be #P-complete [25]. Our main result about rank-one critical points is Proposition 3.3, which gives the ED degree of $\mathcal{L}_1^S$ in terms of the minimal covers. For row/column and diagonal zero patterns this results in explicit formulas (Corollary 3.8, Corollary 3.9).
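The link to vertex covers can be sketched as follows. A rank-one matrix has entries $a_i b_j$, so a zero at position $(i,j)$ forces $a_i = 0$ or $b_j = 0$; the vanishing rows and columns therefore cover every edge of the bipartite graph with edge set $S$. The brute-force enumeration below is an illustrative assumption (the function name `minimal_vertex_covers` and the 0-based indices are not from the paper) and runs in time exponential in the number of rows touched by $S$.

```python
from itertools import combinations

def minimal_vertex_covers(S):
    """List minimal vertex covers (R, C) of the bipartite graph with edge set S.

    S is a set of (row, column) pairs. A cover chooses rows R and columns C
    such that every (i, j) in S has i in R or j in C.
    """
    rows = sorted({i for i, _ in S})
    covers = []
    for k in range(len(rows) + 1):
        for chosen in combinations(rows, k):
            R = set(chosen)
            # Columns forced into the cover by edges whose row is not in R.
            C = {j for i, j in S if i not in R}
            # R has no redundant row iff each row in R has an edge not covered by C.
            if all(any(i2 == i and j not in C for i2, j in S) for i in R):
                covers.append((frozenset(R), frozenset(C)))
    return covers

# Hypothetical zero pattern: the first two diagonal entries (0-based).
S = {(0, 0), (1, 1)}
covers = minimal_vertex_covers(S)
for R, C in covers:
    print("zero rows:", sorted(R), "zero columns:", sorted(C))
```

Each cover $(R, C)$ leaves a submatrix of the data matrix, obtained by deleting the rows in $R$ and the columns in $C$, on which the rank-one problem is unstructured.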

Our first main result for rank-r critical points is Theorem 4.3, which studies the linear span of rank-r critical points of $d_U$. We call it the critical space in the structured setting. This is motivated by the notion of critical space of a tensor in the unstructured setting defined by Draisma, Ottaviani and Tocino [12]. From the algebraic perspective, Theorem 4.3 provides a lower bound on the minimal number of generators of degree one in the zero-dimensional ideal of rank-r critical points of $d_U$. When $\mathcal{L}_r^S$ is an irreducible variety, we expect this lower bound to be also an upper bound, as stated in Conjecture 4.4.

In the unstructured setting, the rank-one critical points form a basis of the critical space and the rank-r critical points are linear combinations of the basis vectors with coefficients in {0,1}. In the structured setting, there are not enough rank-one critical points to give a basis of the critical space. We leave it as an open question whether there is a natural extension to a basis and whether the coefficients that give rank-r critical points as linear combinations of basis elements have a nice description.
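The unstructured picture can be checked numerically. In the sketch below, which assumes a data matrix with distinct nonzero singular values, each rank-one critical point is one singular triple $\sigma_k w_k v_k^T$, and each rank-r critical point is a sum of r of them, that is, a {0,1}-combination. The function name and the random test matrix are illustrative assumptions.

```python
from itertools import combinations
import numpy as np

def unstructured_critical_points(U, r):
    """Sums of r singular triples of U, indexed by the chosen triples."""
    W, s, Vt = np.linalg.svd(U, full_matrices=False)
    points = []
    for idx in combinations(range(len(s)), r):
        sel = list(idx)
        X = (W[:, sel] * s[sel]) @ Vt[sel, :]
        points.append((idx, X))
    return points

# Hypothetical 3x4 data matrix; there are C(3, 2) = 3 candidates for r = 2.
U = np.random.default_rng(1).standard_normal((3, 4))
points = unstructured_critical_points(U, 2)

# Criticality on the rank-r locus: the residual U - X is orthogonal
# to both the column space and the row space of X.
residual_checks = [
    np.allclose((U - X) @ X.T, 0) and np.allclose(X.T @ (U - X), 0)
    for _, X in points
]
best_idx = min(points, key=lambda p: np.sum((p[1] - U) ** 2))[0]
```

The candidate with the smallest distance is the one built from the largest singular values, in agreement with the Eckart-Young-Mirsky theorem.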

Our second main result is Proposition 4.12, which describes affine linear relations that are satisfied by the rank-r critical points of $d_U$ in the unstructured setting. In the structured setting, we conjecture the affine linear relations satisfied by the rank-r critical points of $d_U$. The last kind of constraints satisfied by the rank-r critical points that we consider are nonlinear determinantal constraints given in Proposition 4.18. The ED degree of $\mathcal{L}_r^S$ is studied in Section 5. Our experiments indicate that the ED degree is exponential in $|S|$.

The optimization problem (1.1) is motivated by the nonnegative matrix factorization (NMF) problem. Given a nonnegative matrix $X \in \mathbb{R}_{\ge 0}^{m \times n}$, the nonnegative rank of $X$ is the smallest $r$ such that $X = AB$, where $A \in \mathbb{R}_{\ge 0}^{m \times r}$ and $B \in \mathbb{R}_{\ge 0}^{r \times n}$. NMF aims to find a matrix $X$ of nonnegative rank at most $r$ that minimizes the Euclidean distance to a given data matrix $U \in \mathbb{R}_{\ge 0}^{m \times n}$; see [15] for further details.
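For orientation, the following sketch runs the classical multiplicative-update heuristic for NMF on a small matrix that has a nonnegative factorization of size two by construction. This is a standard local method used here as an assumption for illustration; it is not the exact approach of this paper and carries no global guarantee. The function name, iteration count and the small constant `eps` are hypothetical choices.

```python
import numpy as np

def nmf_multiplicative(U, r, iters=500, seed=0, eps=1e-12):
    """Multiplicative updates for min ||U - A @ B||^2 with A, B >= 0."""
    m, n = U.shape
    rng = np.random.default_rng(seed)
    # Strictly positive start so that the updates never divide by zero.
    A = rng.random((m, r)) + 0.1
    B = rng.random((r, n)) + 0.1
    for _ in range(iters):
        B *= (A.T @ U) / (A.T @ A @ B + eps)
        A *= (U @ B.T) / (A @ B @ B.T + eps)
    return A, B

# Hypothetical nonnegative factors; U = A0 @ B0 has nonnegative rank at most 2.
A0 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
B0 = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])
U = A0 @ B0
A, B = nmf_multiplicative(U, 2)
err = np.sum((A @ B - U) ** 2)
```

The factors stay entrywise nonnegative because every update multiplies by a nonnegative ratio; the value of `err` reached depends on the starting point.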

In Section 6, we apply the structured best rank-two approximation problem to NMF. Let $\mathcal{M}_2$ be the set of matrices of nonnegative rank at most two and consider a matrix $U \in \mathbb{R}_{\ge 0}^{3 \times 3}$. In order to compute the best nonnegative rank-2 approximation of $U$, we need to compute the critical points of the Euclidean distance function $d_U$ over $\mathcal{M}_2 \cap \mathcal{L}^S$ for all zero patterns $S \subseteq [3] \times [3]$. We show that the minimal number of critical points needed to determine the global minimum of $d_U$ is 756 for a generic $U$. For the same case, we show experimentally that the optimal critical point may have a few zeros.

The rest of the paper is organized as follows. In Section 2 we set our notations (Section 2.1), we recall the basics of ED minimization on an algebraic variety (Section 2.2) and we discuss Frobenius distance minimization on a variety of low-rank matrices (Section 2.3). In Section 3 we address the best rank-one approximation problem with assigned zero patterns (Section 3.1) and best rank-r approximation for rectangular and block diagonal matrices (Section 3.2). In Section 4 we investigate special polynomial relations among the critical points of $d_U$. In particular, in Sections 4.1 and 4.3 we concentrate on particular linear and affine relations among critical points respectively, and in Section 4.4 on some special nonlinear relations. Observations for generic linear constraints not necessarily coming from assigned zero patterns are given in Section 4.2. In Section 5 we provide conjectural ED degree formulas for special formats and zero patterns $S$, obtained from computational experiments. In Section 6 we relate the minimization problem (1.1) to nonnegative matrix factorization. The results of Sections 5 and 6 are supported by computations that use the HomotopyContinuation.jl [3] software package as well as the software Macaulay2 [16] and Maple™ 2016 [19]. The code can be found at github.com/kaiekubjas/exact-solutions-in-low-rank-approximation-with-zeros.

Section snippets

Preliminaries

The preliminaries section consists of three subsections on algebra basics and notations (Section 2.1), Euclidean distance minimization (Section 2.2) and unstructured low-rank approximation (Section 2.3).

Rank-one structured approximation and beyond

This section is divided into two subsections: In Section 3.1, we focus on rank-one approximation with zeros, and in Section 3.2, on the simplest cases of rank-r approximation with zeros for rectangular and block-diagonal matrices.

Special relations among critical points

In this section we provide (some of) the generators of the ideal of critical points of $d_U$ on $\mathcal{L}_r^S$. In particular, in Sections 4.1 and 4.3 we concentrate on particular linear and affine relations among critical points respectively, and in Section 4.4 on some special nonlinear relations. Observations for generic linear constraints not necessarily coming from assigned zero patterns are given in Section 4.2.

We stress that in our statements we always consider a real m×n matrix U. However, as we

Computations of Euclidean distance degrees

In this section we present various experiments that study the ED degree of $\mathcal{L}_r^S$ when $r \ge 2$ and the zero pattern $S$ involves only elements on the diagonal.

First, we restrict to square matrices and consider the zero pattern $S = \{(1,1)\}$. Since the number of (complex) critical points of $d_U$ on $\mathcal{L}_{n-1}^S$ is constant for a generic (complex) data matrix $U$, it is reasonable to apply a monodromy technique for computing these critical points numerically. For this, we use the HomotopyContinuation.jl [3] software

Nonnegative low-rank matrix approximation

In this section, we apply rank-two approximation with zeros to the problem of nonnegative rank-two approximation. Our goal is to find the best nonnegative rank-two approximation with a guarantee that we have found the correct solution. There are two options for the critical points of the Euclidean distance function over $\mathcal{M}_2$:

  1. A critical point of the Euclidean distance function over $\mathcal{M}_2$ is a critical point of the Euclidean distance function over the set $\mathcal{X}_2$ of matrices of rank at most two.

  2. A critical

Declaration of Competing Interest

No declaration of competing interest.

Acknowledgements

We thank Giorgio Ottaviani, Grégoire Sergeant-Perthuis, Pierre-Jean Spaenlehauer, and Bernd Sturmfels for helpful discussions and suggestions. We thank two anonymous reviewers for insightful comments which improved the original manuscript. Kaie Kubjas and Luca Sodomaco are partially supported by the Academy of Finland Grant No. 323416. Elias Tsigaridas is partially supported by ANR JCJC GALOP (ANR-17-CE40-0009), the PGMO grant ALMA, and the PHC GRAPE.

References (31)

  • Diego Cifuentes. A convex relaxation to compute the nearest structured rank deficient matrix. SIAM J. Matrix Anal. Appl. (2021)
  • David Cox et al. Ideals, Varieties, and Algorithms. (1992)
  • Aldo Conca et al. Lovász–Saks–Schrijver ideals and coordinate sections of determinantal varieties. Algebra Number Theory (2019)
  • Jan Draisma et al. The Euclidean distance degree of an algebraic variety. Found. Comput. Math. (2016)
  • Dmitriy Drusvyatskiy et al. The Euclidean distance degree of orthogonally invariant matrix varieties. Isr. J. Math. (2017)