1 Introduction and Main Result

Let V be a complex vector space equipped with a nondegenerate symmetric bilinear form \(q\in \text {Sym}^{2}V\), identified in this paper with its associated quadratic form. The orthogonal group SO(V,q) = SO(V) consists of linear transformations of V leaving q invariant. Let \(X\subset V\) be an algebraic variety defined over \(\mathbb {R}\); this includes the case when X is the cone over a projective variety defined over \(\mathbb {R}\). We assume that X is G-invariant for the action of a Lie group \(G\subset SO(V)\). In many cases of interest X is H-invariant for a larger group H and we can take G = SO(V) ∩ H; see Section 2 for the case of partially symmetric tensors.

We denote by \(\mathfrak {g}=T_{e}G\) the Lie algebra of G, where e is the identity element; note that \(\mathfrak {g} \subset \mathfrak {so}(V) \). The tangent space to the orbit \(G\cdot f\) at f is \(f+\mathfrak {g}\cdot f\). Denoting by \(G_{f}=\{g\in G~|~g\cdot f=f\}\) the isotropy group of f, we have \(\dim \mathfrak {g}\cdot f=\dim \mathfrak {g}-\dim G_{f}\).

For \(f\in V\) we define the critical space of f as the subspace

$$ H_{f}:=\left( \mathfrak{g}\cdot f\right)^{\perp}=\left\{v\in V~|~q(v,w)=0\quad\forall w\in\mathfrak{g}\cdot f\right\}. $$
(1.1)

We remark that \(\text {codim} H_{f}=\dim \mathfrak {g}\cdot f\), so we have \(\text {codim} H_{f}\le \dim \mathfrak {g}\). Equality holds for general f in many cases, but it cannot hold when \(\dim \mathfrak {g}\ge \dim V\) (this happens when \(V=\mathbb {C}^{a}\otimes \mathbb {C}^{b}\otimes \mathbb {C}^{c}\) and c is large, in the setting of Section 2.2). Consider the function \(d_{f}\colon X\to \mathbb {C}\), \(d_{f}(x)=q(f-x,f-x)\), which, in the case f is real, extends the squared distance function from f defined over \(\mathbb {R}\).

Note that at a critical point x of \(d_{f}\) we have \(f-x\in (T_{x}X)^{\perp }\).

Lemma 1.1

If \(g\in \mathfrak {g}\) then \(q(g\cdot x,y)=-q(x,g\cdot y)\); in particular \(q(g\cdot x,x)=0\).

Proof

If \(\mathfrak {g}=0\) then the statement is trivial. If \(\mathfrak {g}\neq 0\), let g(t) ⊂ G be a path such that g(0) = e and \(\dot {g}(0)=g\). Taking the derivative at t = 0 of the constant function q(g(t) ⋅ x,g(t) ⋅ y) the thesis follows. □
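Lemma 1.1 can be checked numerically in the simplest situation. The sketch below assumes \(V=\mathbb {C}^{3}\) with \(q(x,y)={\sum }_{i}x_{i}y_{i}\) and takes one skew-symmetric matrix as an element of \(\mathfrak {so}(3)\); the helper names `q` and `apply` and the sample vectors are illustrative choices, not notation from the text.

```python
# Illustrative check of Lemma 1.1 for q(x, y) = sum_i x_i * y_i on C^3.
# The helpers q and apply and the sample data are hypothetical.

def q(x, y):
    """Standard symmetric bilinear form (no complex conjugation)."""
    return sum(a * b for a, b in zip(x, y))

def apply(g, x):
    """Matrix-vector product g . x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in g]

# A skew-symmetric matrix, i.e. an element of so(3).
g = [[0, 2, -1],
     [-2, 0, 3],
     [1, -3, 0]]

x = [1, 2 + 1j, -3]
y = [4, -1, 2j]

lhs = q(apply(g, x), y)           # q(g.x, y)
rhs = -q(x, apply(g, y))          # -q(x, g.y)
self_pairing = q(apply(g, x), x)  # q(g.x, x)

print(lhs, rhs, self_pairing)
```

Both sides agree and the self pairing vanishes, as the lemma predicts for any skew-symmetric `g`.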

Our main result is the following theorem. Its proof is quite simple; nevertheless, we will see in the rest of the paper that it has some nontrivial consequences.

Theorem 1.2

Let X be G-invariant for the action of \(G\subset SO(V)\).

  1. The critical points of \(d_{f}\) on X lie in \(H_{f}\).

  2. When f is real, any closest point to f in \(X_{\mathbb {R}}\) (with respect to q) belongs to \(H_{f}\).

  3. \(f\in H_{f}\).

Proof

Let x be a critical point. We need to prove \(q(x-f,g\cdot f)=0\) \(\forall g\in \mathfrak {g}\). We have \(q(g\cdot f,f)=0\quad \forall g\in \mathfrak {g}\) from Lemma 1.1, so it is enough to show that \(q(x,g\cdot f)=0\) \(\forall g\in \mathfrak {g}\). The crucial remark is that since X is G-invariant, \(\mathfrak {g}\cdot x\subset T_{x}X\). Since x is critical, \(x-f\) is orthogonal to \(T_{x}X\), and we get the chain of equalities (the second and the third one by Lemma 1.1) \(0=q(g\cdot x,x-f)=-q(g\cdot x,f)=q(x,g\cdot f)\), which proves (1).

(2) is an immediate consequence of (1).

(3) follows from \(q(f,g\cdot f)=0\). □
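A small instance of Theorem 1.2 can be verified by hand or by machine. The sketch below assumes \(V=\text {Sym}^{2}\mathbb {C}^{2}\), written as symmetric 2×2 matrices with \(q(S,T)=\text {tr}(ST)\), and G = SO(2) acting by \(F\mapsto gFg^{t}\), so that a skew-symmetric A acts by \(F\mapsto AF-FA\). It also uses the standard fact that the critical points of \(d_{f}\) on rank one symmetric matrices are \(\lambda _{i}v_{i}v_{i}^{t}\) for unit eigenvectors \(v_{i}\). All function names are hypothetical.

```python
# Sketch: Theorem 1.2 (1) and (3) for symmetric 2x2 matrices.
# Assumptions (not from the paper): q(S, T) = trace(S T), and the Lie algebra
# so(2) acts by F -> A F - F A for skew-symmetric A.

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def q(S, T):
    """Trace form; on rank one elements q(vv^T, ww^T) = (v.w)^2."""
    P = matmul(S, T)
    return P[0][0] + P[1][1]

def lie_action(A, F):
    """Infinitesimal action A . F = A F + F A^T = A F - F A for skew A."""
    AF, FA = matmul(A, F), matmul(F, A)
    return [[AF[i][j] - FA[i][j] for j in range(2)] for i in range(2)]

f = [[2, 1], [1, 2]]   # eigenvalues 3 and 1, eigenvectors (1,1) and (1,-1)
A = [[0, 1], [-1, 0]]  # generator of so(2)
Af = lie_action(A, f)  # spans g . f, so H_f = {S : q(S, Af) = 0}

x1 = [[1.5, 1.5], [1.5, 1.5]]    # 3 * v v^T with v = (1,1)/sqrt(2)
x2 = [[0.5, -0.5], [-0.5, 0.5]]  # 1 * v v^T with v = (1,-1)/sqrt(2)
x3 = [[1, 0], [0, 0]]            # rank one, but (1,0) is not an eigenvector

pairings = [q(x1, Af), q(x2, Af), q(f, Af)]
print(Af, pairings, q(x3, Af))
```

The two critical points and f itself pair to zero with \(\mathfrak {g}\cdot f\), while the rank one matrix `x3` does not, so it lies outside \(H_{f}\).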

A partial converse to Theorem 1.2 is the following.

Theorem 1.3

Let X be G-invariant for the action of \(G\subset SO(V)\). Let \(x\in H_{f}\cap X\).

  1. If the orbit \(G\cdot x\) is dense in X then x is a critical point of \(d_{f}\) restricted to X.

  2. If X is a cone, x is not isotropic and the orbit G ⋅ [x] is dense in \(\mathbb {P}X\) then there is \(\lambda \in \mathbb {C}\) such that λx is a critical point of \(d_{f}\) restricted to X.

Proof

We have the equality \(\mathfrak {g}\cdot x = T_{x}X\) by assumption, and with this equality all the steps of the proof of Theorem 1.2 are invertible. This proves (1). The assumption of (2) implies that \(\mathfrak {g}\cdot x +\langle x\rangle = T_{x}X\). Since x is not isotropic there is λ such that \(q(\lambda x,\lambda x-f)=0\), namely \(\lambda =\frac {q(x,f)}{q(x,x)}\), so that orthogonality is guaranteed on the subspace \(\langle x\rangle \subset T_{x}X\). To check orthogonality on the remaining part of \(T_{x}X\) we may replace x with \(\frac {q(x,f)}{q(x,x)} x\) and the same argument as in (1) works. □
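The value of λ used in the proof of (2) comes from expanding by bilinearity; the following display only restates that computation:

$$ q(\lambda x,\lambda x-f)=\lambda^{2}q(x,x)-\lambda q(x,f)=\lambda\left(\lambda q(x,x)-q(x,f)\right). $$

The second factor vanishes exactly for \(\lambda =\frac {q(x,f)}{q(x,x)}\), which is well defined because \(q(x,x)\neq 0\).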

A stronger converse form will be proved for tensors, see Theorems 2.2 and 2.4 and for Grassmann varieties, see Theorem 3.1. In Theorem 3.4 we will compute the Euclidean Distance degree (EDdegree) of a complete flag variety with respect to the Frobenius product.

We recall that EDdegree(X) (introduced in [4] by following an idea by Bernd Sturmfels) is the number of critical points of \(d_{f}\) restricted to X for general f. In many cases of interest it happens that \(H_{f}\cap X\) is finite and reduced for general f; in these cases its cardinality equals EDdegree(X), see (2.4) and Theorem 3.4.

The critical space was introduced for tensors in [14] and for partially symmetric tensors in [5], see also [15]. In Corollary 2.6 we get an alternative proof of the fact proved in [5] by Draisma, Tocino and the author that any best rank q approximation of a partially symmetric tensor f lies in the critical space.

Our approach is somehow dual to the one in [3, 6], where EDdegree was considered in an orthogonally invariant setting, but certain subvarieties of X were constructed in order to cut transversally the orbits.

2 Symmetric and Partially Symmetric Tensors

2.1 Symmetric Tensors

Let W be a vector space of dimension n + 1 and \(V=\text {Sym}^{d}W\). We assume that W is equipped with a nondegenerate quadratic form \(q_{W}\) and we choose coordinates in W such that \(q_{W}={\sum }_{i=0}^{n}{x_{i}^{2}}\). There is a unique nondegenerate bilinear form q such that

$$ q(x^{d},y^{d})=q_{W}(x,y)^{d}\quad\forall x, y\in W, $$
(2.1)

which is called the Frobenius (or Bombieri-Weyl) form. Since every polynomial in \(\text {Sym}^{d}W\) can be written as a sum of powers of linear forms, it is enough to ask (2.1) for any powers \(x^{d}\), \(y^{d}\). The group \(G=SO(W,q_{W})\) acts on V by the analogous rule \(g\cdot (x^{d})=(g\cdot x)^{d}\). We get the inclusion \(G\subset SO(V,q)\), so that we are in the setting of Section 1; our aim is to apply Theorem 1.2. The Frobenius form has the coordinate expression \(q\left ({\sum }_{\alpha }{d\choose \alpha }f_{\alpha } x^{\alpha }, {\sum }_{\alpha }{d\choose \alpha }g_{\alpha } x^{\alpha }\right ) = {\sum }_{\alpha }{d\choose \alpha }f_{\alpha } g_{\alpha }\) which, up to a scalar factor, has the nice M2 [10] implementation


diff(f,g).

Note that SL(W) ∩ SO(V ) = SO(W), but we will not need this fact. The monomials are orthogonal but not orthonormal with respect to q.
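For binary forms (n = 1) the coordinate expression of the Frobenius form is easy to test. The sketch below assumes forms are stored by their normalized coefficients \(f_{k}\), with \(f={\sum }_{k}{d\choose k}f_{k}x_{0}^{d-k}x_{1}^{k}\); the function names are hypothetical. It checks the defining property (2.1) on one pair of linear forms and the norm of a monomial.

```python
from fractions import Fraction
from math import comb

def power_of_linear_form(a, b, d):
    """Normalized coefficients f_k of (a*x0 + b*x1)^d = sum_k C(d,k) f_k x0^(d-k) x1^k."""
    return [a ** (d - k) * b ** k for k in range(d + 1)]

def monomial(k, d):
    """Normalized coefficients of the monomial x0^(d-k) x1^k."""
    return [Fraction(1, comb(d, k)) if j == k else Fraction(0) for j in range(d + 1)]

def frobenius(f, g):
    """q(f, g) = sum_k C(d,k) f_k g_k in normalized coefficients."""
    d = len(f) - 1
    return sum(comb(d, k) * fk * gk for k, (fk, gk) in enumerate(zip(f, g)))

d = 4
x, y = (2, -1), (3, 5)
lhs = frobenius(power_of_linear_form(*x, d), power_of_linear_form(*y, d))
rhs = (x[0] * y[0] + x[1] * y[1]) ** d  # q_W(x, y)^d

print(lhs, rhs)                                     # 1 1
print(frobenius(monomial(1, d), monomial(1, d)))    # 1/4
print(frobenius(monomial(1, d), monomial(2, d)))    # 0
```

The last two lines illustrate that distinct monomials are orthogonal while \(q(x^{\alpha },x^{\alpha })={d\choose \alpha }^{-1}\) is in general different from 1.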

Proposition 2.1

$$ \mathfrak{so}(W)\cdot f=\left\langle x_{j}\frac{\partial f}{\partial x_{i}}-x_{i}\frac{\partial f}{\partial x_{j}}\right\rangle_{0\le i< j\le n}. $$
(2.2)

Proof

It is convenient to denote

$$ D_{ij}(f)= x_{j}\frac{\partial f}{\partial x_{i}}-x_{i}\frac{\partial f}{\partial x_{j}}. $$
(2.3)

For any skew-symmetric matrix A we have that \(e^{A}\) is orthogonal. Then \(f(e^{tA}x)\) is a path in the SO-orbit of f. By taking the derivative at t = 0 we get \({\sum }_{p=0}^{n}\frac {\partial f}{\partial x_{p}}(Ax)_{p}\in \mathfrak {so}(W)\cdot f\). By taking \(A=e_{ij}-e_{ji}\) we get exactly \(D_{ij}f\), and these elements span \(\mathfrak {so}(W)\cdot {f}\). □

The rank one tensors in \(\text {Sym}^{d}W\) have the form \(x^{d}\) and make a cone over the Veronese variety \(v_{d}\mathbb {P}W\), where the origin has been removed from the cone. We recall that the eigenvectors of \(f\in \text {Sym}^{d}W\) are the critical points of the function \(d_{f}(x^{d})=q(f-x^{d})\) restricted to the rank one tensors [11, 12, 16, 17, 19]. In this paper we are interested in the condition \(x^{d}\in H_{f}\), which does not distinguish between x and its scalar multiples, so by abuse of notation we may shift to projective space \(\mathbb {P}W\) and denote by the same symbol the point \(x\in \mathbb {P}W\). The eigenvectors correspond to the non isotropic x (i.e. q(x) ≠ 0) such that ∇f(x) = x in \(\mathbb {P} W\), which means that any representatives of the right and the left hand side differ by a nonzero scalar multiple. The connection with (2.2) and (2.3) is that the eigenvectors of f make the base locus of the linear system \(\langle D_{ij}f\rangle \).

It follows from Theorem 1.2 that the eigenvectors of f lie in \(H_{f}\) (which is obvious from the above description since the \(D_{ij}f\) are the minors of the matrix \(\begin {pmatrix}\nabla f\\ x\end {pmatrix}\)) and moreover that the critical points of \(d_{f}\) on the secant varieties of the d-th Veronese variety lie in \(H_{f}\), which is not obvious from the definition and was first proved in [5, Theorem 1.1]. We will state this claim more precisely in the more general setting of partially symmetric tensors in Corollary 2.6.

We give now a more precise converse to Theorem 1.2 (1) in the case when X is the cone of symmetric tensors of rank one.

Theorem 2.2

For general \(f\in \text {Sym}^{d}W\), \(H_{f}\cap v_{d}\mathbb {P} W\) consists exactly of the critical points of \(d_{f}\) restricted to \(v_{d}\mathbb {P} W\), namely of the eigenvectors of f.

Proof

Let \(v^{d}\in H_{f}\cap v_{d}\mathbb {P} W\). In particular \(q(v^{d},g\cdot f)=0\) for any \(g\in \mathfrak {g}\), which implies that each \(D_{ij}f\) vanishes at v. This is equivalent to the matrix

$$ \begin{pmatrix}\nabla f\\ x\end{pmatrix} $$

having rank one at v, which is the condition that v is an eigenvector of f, if v is not isotropic. By [6, Lemma 4.2] the critical points of \(d_{f}\) for a general f avoid any proper closed subset of \(v_{d}\mathbb {P} W\), so for general f it is guaranteed that no isotropic v is found. □
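The argument can be followed on a binary cubic. The sketch below assumes n = 1, d = 3 and the sample form \(f=x_{0}^{3}+x_{1}^{3}\) (chosen for illustration, not taken from the text), stores a form by its plain coefficients of \(x_{0}^{d-k}x_{1}^{k}\), computes \(D_{01}f\) as in (2.3), and tests the condition \(q(v^{d},D_{01}f)=0\) on a few candidate points. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def D01(c):
    """Coefficients of x1*df/dx0 - x0*df/dx1 for f = sum_k c[k] x0^(d-k) x1^k."""
    d = len(c) - 1
    out = [0] * (d + 1)
    for k, ck in enumerate(c):
        if k < d:
            out[k + 1] += (d - k) * ck  # contribution of x1 * d/dx0
        if k > 0:
            out[k - 1] -= k * ck        # contribution of -x0 * d/dx1
    return out

def evaluate(c, v):
    d = len(c) - 1
    return sum(ck * v[0] ** (d - k) * v[1] ** k for k, ck in enumerate(c))

def frobenius(c1, c2):
    """Frobenius form in plain coefficients: sum_k c1[k] * c2[k] / C(d,k)."""
    d = len(c1) - 1
    return sum(Fraction(a * b, comb(d, k)) for k, (a, b) in enumerate(zip(c1, c2)))

def power(v, d):
    """Plain coefficients of (v0*x0 + v1*x1)^d."""
    return [comb(d, k) * v[0] ** (d - k) * v[1] ** k for k in range(d + 1)]

f = [1, 0, 0, 1]  # x0^3 + x1^3
Df = D01(f)       # 3*x0^2*x1 - 3*x0*x1^2
candidates = [(1, 0), (0, 1), (1, 1), (1, 2)]
in_Hf = [v for v in candidates if frobenius(power(v, 3), Df) == 0]

print(Df)     # [0, 3, -3, 0]
print(in_Hf)  # [(1, 0), (0, 1), (1, 1)]
```

For this f the gradient is \((3x_{0}^{2},3x_{1}^{2})\), so the three points found are exactly the non isotropic v with ∇f(v) proportional to v. The code also reflects the identity \(q(v^{d},h)=h(v)\), which is why membership of \(v^{d}\) in \(H_{f}\) amounts to the vanishing of \(D_{01}f\) at v.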

Remark 2.3

Note that for even d, \(\mathfrak {g}\cdot (f+cq^{d/2})=\mathfrak {g}\cdot f\) for any \(c\in \mathbb {C}\), since \(q^{d/2}\) is G-invariant and hence \(\mathfrak {g}\cdot q^{d/2}=0\). Conversely, if \(\mathfrak {g}\cdot f=\mathfrak {g}\cdot h\) for general f, h then we get \(H_{f}=H_{h}\), so that f, h have the same eigenvectors, and Turatti proves in [20] (generalizing previous results from [1, 2]) that there exists \(c\in \mathbb {C}\) such that \(f+cq^{d/2}=h\).

2.2 Partially Symmetric Tensors

Consider the tensor product \(\text {Sym}^{d_{1}} V_{1}\otimes \cdots \otimes \text {Sym}^{d_{k}}V_{k}=V\). We assume we have nondegenerate symmetric bilinear forms qi on Vi. V is equipped with the Frobenius form q such that on decomposable elements

$$ q\left( v_{1}^{d_{1}}\otimes\cdots\otimes v_{k}^{d_{k}}, w_{1}^{d_{1}}\otimes\cdots\otimes w_{k}^{d_{k}}\right) = {\prod}_{i=1}^{k} q_{i}(v_{i},w_{i})^{d_{i}}. $$

The decomposable elements make a cone over the Segre–Veronese variety \(X\simeq \mathbb {P} V_{1}\times \cdots \times \mathbb {P} V_{k}\) embedded in \(\mathbb {P} V\) with the line bundle \(\mathcal {O}(d_{1},\ldots , d_{k})\). The group \(G=SO(V_{1},q_{1})\times \cdots \times SO(V_{k},q_{k})\) acts on V; we have again the inclusion \(G\subset SO(V,q)\) and Theorem 1.2 applies. Denote by \(x_{i,0},\ldots ,x_{i,n_{i}}\) an orthogonal coordinate system on \(V_{i}\). Analogously to Proposition 2.1 the space \(\left (\mathfrak {so}(V_{1})\times \cdots \times \mathfrak {so}(V_{k})\right )\cdot f\) is spanned by \(x_{p,j}\frac {\partial f}{\partial x_{p,i}}-x_{p,i}\frac {\partial f}{\partial x_{p,j}}\) for \(0\le i<j\le n_{p}\), p = 1,…,k. It follows that the critical space \(H_{f}\) defined according to (1.1) coincides with the one defined in [5].

The critical points of \(d_{f}(x)=q(f-x)\) restricted to the Segre–Veronese variety are the singular t-ples of f [12]; their number is called EDdegree in [4] and it is counted by the formula in [7], see also [4, §8].

The proof of Theorem 2.2 generalizes to this setting and gives

Theorem 2.4

For general \(f\in \text {Sym}^{d_{1}}V_{1}\otimes \cdots \otimes \text {Sym}^{d_{k}}V_{k}=V\), let \(X\subset \mathbb {P} V\) be the Segre–Veronese variety of rank one tensors. Then \(H_{f}\cap X\) consists exactly of the singular t-ples of f.

Since general partially symmetric tensors f have trivial isotropy groups, in the binary case \(X_{\mathbf {d}}=\mathbb {P}^{1}\times \cdots \times \mathbb {P}^{1}\) embedded in \(\mathbb {P}(\text {Sym}^{d_{1}}\mathbb {C}^{2}\otimes \cdots \otimes \text {Sym}^{d_{k}}\mathbb {C}^{2})\) with the line bundle \(\mathcal {O}(d_{1},\ldots , d_{k})\) we have \(G=(\mathbb {C}^{\ast })^{k}\), \(\mathfrak {g}=\mathbb {C}^{k}\) and the nice coincidence \(\text {codim} H_{f}=k=\dim X_{\mathbf {d}}\). Hence the cardinality of the intersection between \(H_{f}\) and \(X_{\mathbf {d}}\) can be counted by the Bezout Theorem, and we obtain an alternative proof of the formula

$$ \text{EDdegree}(X_{\mathbf{d}})=\deg X_{\mathbf{d}}=k!d_{1}{\ldots} d_{k}, $$
(2.4)

already known from [7], [18, Eq. (1.6)]. Our approach explains that the resulting equality between EDdegree and \(\deg \) of \(X_{\mathbf {d}}\) is not a coincidence. Note that the Bezout Theorem applies when the intersection scheme has the expected codimension, without assuming \(H_{f}\) to be general, see (1) in [8, §8.4]. We will apply this approach again to complete flag varieties in Theorem 3.4.
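Formula (2.4) is simple enough to tabulate. The sketch below only evaluates \(k!d_{1}{\ldots } d_{k}\); the function name is hypothetical and the sample multidegrees are chosen for illustration.

```python
from math import factorial, prod

def ed_degree_binary(d):
    """Evaluate formula (2.4): k! * d_1 * ... * d_k for d = (d_1, ..., d_k)."""
    return factorial(len(d)) * prod(d)

for d in [(3,), (1, 1), (1, 1, 1), (2, 3)]:
    print(d, ed_degree_binary(d))
```

The case d = (3) gives 3, the number of eigenvectors of a general binary cubic, and d = (1,1) gives 2, the number of singular pairs of a general 2 × 2 matrix.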

Example 2.5

If \(\dim A=\dim B=\dim C=2\) we denote by \(Q_{A}\) (resp. \(Q_{B}\), \(Q_{C}\)) the isotropic quadric consisting of two points on \(\mathbb {P}(A)\) (resp. \(\mathbb {P}(B)\), \(\mathbb {P}(C)\)).

We have that \(\dim (\mathfrak {so}\cdot f)<3\) if and only if f belongs to one of the following six \(\mathbb {P}^{3}\) linearly embedded in \(\mathbb {P}(A\otimes B\otimes C)\) (each item consists of two \(\mathbb {P}^{3}\)’s)

$$ Q_{A}\times \mathbb{P}(B\otimes C),\quad Q_{B}\times\mathbb{P}(A\otimes C),\quad Q_{C}\times\mathbb{P}(A\otimes B). $$

The following result was proved in [5], joint with J. Draisma and A. Tocino. The proof given here, as a consequence of Theorem 1.2, is maybe simpler.

Corollary 2.6

[5, Theorem 1.1] Let \(X_{q}\) be the q-secant variety to the Segre–Veronese variety in \(\mathbb {P}\left (\text {Sym}^{d_{1}}V_{1}\otimes \cdots \otimes \text {Sym}^{d_{k}}V_{k}\right )\). Then the critical points of the distance function from a tensor f to \(X_{q}\) lie in \(H_{f}\). In particular any best rank q approximation of f (when it exists) lies in \(H_{f}\).

3 Grassmann and Flag Varieties

3.1 Grassmann Varieties

Let \(V=\wedge ^{k}W\). We consider the Grassmann variety Gr(k,W) of k-dimensional subspaces of W; its cone is embedded in V. Again, a nondegenerate quadratic form \(q_{W}\) on W extends to the Frobenius form q on V by requiring \(q(v_{1}\wedge \cdots \wedge v_{k}, w_{1}\wedge \cdots \wedge w_{k})=\det \left (q_{W}(v_{i}, w_{j})\right )\) (the Gram determinant). If \(v=v_{1}\wedge \cdots \wedge v_{k}\in \wedge ^{k}W\) then the derivative \(\frac {\partial v}{\partial x_{i}}\in \wedge ^{k-1}W\) is defined by the Leibniz formula

$$ \frac{\partial v}{\partial x_{i}} = {\sum}_{j=1}^{k} v_{1}\wedge{\cdots} \wedge \frac{\partial v_{j}}{\partial x_{i}}\wedge\cdots\wedge v_{k} $$

and extended by linearity to all \(\wedge ^{k}W\). This is compatible with the inclusion \(\wedge ^{k}W\subset W^{\otimes k}\), and the form q just defined is the restriction of the Frobenius form on \(W^{\otimes k}\) of the previous section. The same formula (2.2) holds formally in case SO(W) acts on \(\wedge ^{k}W\):

$$ \mathfrak{so}(W)\cdot f=\left\langle \frac{\partial f}{\partial x_{i}}\wedge x_{j}-\frac{\partial f}{\partial x_{j}}\wedge x_{i}\right\rangle_{0\le i< j\le n}. $$
(3.1)

The EDdegree of Grassmann varieties with respect to the Frobenius form is still unknown in general.

For a general \(f\in \wedge ^{k}W\), we have that a non isotropic \(v=v_{1}\wedge \cdots \wedge v_{k}\) is a critical point for \(d_{f}\) if \(T(v_{1}\wedge \cdots \widehat {v_{i}}\cdots \wedge v_{k})=q(v_{i},-)\) for i = 1,…,k. Again, the proof of Theorem 2.2 generalizes to this setting and gives

Theorem 3.1

For general \(f\in \wedge ^{k}W\), \(H_{f}\cap Gr(k,W)\) consists exactly of the critical points of \(d_{f}\) restricted to the Grassmann variety Gr(k,W).

3.2 Flag Varieties

For a flag variety X = SL(W)/P, where P is a parabolic subgroup of SL(W) [9, §23.3], embedded by a very ample line bundle L, \(H_{f}\cap X\) consists exactly of the critical points of \(d_{f}\). The embedding space is a Schur module \(S^{\alpha }W\) where the Frobenius form is defined again by restriction of the one on \(W^{\otimes k}\), and again we have G = SO(W).

For complete flag varieties \(\mathbb {F}_{n}\), which parametrize complete flags (L1 ⊂⋯ ⊂ Ln) ⊂ W with \(\dim L_{i}=i\) (partial flags may miss some Li’s), the above principle becomes effective in computing the number of critical points. We recall that \(\dim \mathbb {F}_{n}=n(n+1)/2\) and that \(\mathbb {F}_{n}=SL(n+1)/B\) where B is the Borel subgroup of upper triangular matrices. The following two lemmas are well known, we include the proofs for the convenience of the reader.

Lemma 3.2

\(\chi (\mathbb {F}_{n},\mathbf {Z})=(n+1)!.\)

Proof

A general section of the tangent bundle \(T\mathbb {F}_{n}\) is given by a matrix \(A\in SL(n+1)=SL(W)\) with distinct eigenvalues and corresponding eigenvectors \(v_{1},\ldots ,v_{n+1}\). The zero locus of this section consists of A-invariant complete flags \((L_{1}\subset \cdots \subset L_{n})\) with \(\dim L_{i}=i\). There are (n + 1) choices for \(L_{n}\), obtained as the span of n among the \(v_{i}\). For each \(L_{n}\) there are correspondingly n choices for \(L_{n-1}\), and so on; altogether there are (n + 1)! A-invariant complete flags. The thesis follows from the Gauss–Bonnet theorem. □

Lemma 3.3

  (i) Let \(\mathbb {F}_{n}\) be embedded with the line bundle \(\mathcal {O}(a_{1},\ldots , a_{n})\) in the projective space over \(S^{a_{1},\ldots , a_{n}}\mathbb {C}^{n+1}\), the module with Young diagram having \({\sum }_{i=j}^{n}a_{i}\) boxes in the j-th row. The degree of the embedded variety is

    $$ {{n+1}\choose 2}!{\prod}_{1\le i<j\le n+1}\frac{a_{i}+\cdots+a_{j-1}}{j-i}. $$
    (3.2)

  (ii) When \(a_{i}=1\) for all i we get \(\deg \mathbb {F}_{n}={{n+1}\choose 2}!\).

Proof

We have \(\dim H^{0}(\mathbb {F}_{n},\mathcal {O}(a_{1},\ldots , a_{n}))={\prod }_{1\le i<j\le n+1}\frac {a_{i}+\cdots +a_{j-1}+j-i}{j-i}\) by the Weyl character formula (see [9, equation (15.17)]). Then the Hilbert polynomial is \(\dim H^{0}(\mathbb {F}_{n},\mathcal {O}(ta_{1},\ldots , ta_{n}))={\prod }_{1\le i<j\le n+1}\frac {t(a_{i}+\cdots +a_{j-1})+j-i}{j-i}\), a polynomial in t of degree \({{n+1}\choose 2}=\dim \mathbb {F}_{n}\). The degree is \({{n+1}\choose 2}!\) times its leading coefficient, and computing the leading term we get the thesis. In case (ii) the Hilbert polynomial simplifies to \(\chi (\mathbb {F}_{n},\mathcal {O}(t,\ldots , t))=(t+1)^{{n+1}\choose 2}\). □
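Formula (3.2) and the dimension formula used in the proof can be evaluated directly. The sketch below uses exact rational arithmetic; the function names `pairs`, `h0` and `flag_degree` are hypothetical, and the list `a` stands for \((a_{1},\ldots ,a_{n})\).

```python
from fractions import Fraction
from math import comb, factorial

def pairs(n):
    """All index pairs 1 <= i < j <= n + 1."""
    return [(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 2)]

def h0(a):
    """Dimension formula: prod over i<j of (a_i + ... + a_{j-1} + j - i) / (j - i)."""
    result = Fraction(1)
    for i, j in pairs(len(a)):
        result *= Fraction(sum(a[i - 1:j - 1]) + j - i, j - i)
    return int(result)

def flag_degree(a):
    """Formula (3.2): C(n+1,2)! * prod over i<j of (a_i + ... + a_{j-1}) / (j - i)."""
    n = len(a)
    result = Fraction(factorial(comb(n + 1, 2)))
    for i, j in pairs(n):
        result *= Fraction(sum(a[i - 1:j - 1]), j - i)
    return int(result)

print(flag_degree([1, 1]))     # 6
print(flag_degree([2, 3]))     # 90
print(flag_degree([1, 1, 1]))  # 720
print(h0([1, 1]))              # 8
```

For n = 2 the output agrees with \(3ab(a+b)\), and for \(a_{i}=1\) with \({{n+1}\choose 2}!\), as stated in (ii).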

Theorem 3.4

Let \(B\subset SL(n+1)\) be the Borel subgroup of upper triangular matrices. For a complete flag variety \(\mathbb {F}_{n}=SL(n+1)/B\), embedded by a very ample line bundle \(L=\mathcal {O}(a_{1},\ldots , a_{n})\), with respect to the Frobenius form we have that \(\text {EDdegree} \mathbb {F}_{n}=\deg \mathbb {F}_{n}\) is given by (3.2).

Proof

For general \(f\in H^{0}(SL(n+1)/B)\), we have again the nice coincidence that the codimension of \(H_{f}\) is equal to the dimension of \(\mathbb {F}_{n}\), which is \({{n+1}\choose 2}=\dim SO(n+1)\), so that the critical points are cut by a linear space of complementary dimension. □

Example 3.5

For n = 2, the flag variety SL(3)/B (see I § 3.1 of [13]) embedded with \(\mathcal {O}(a,b)\) has \(\text {EDdegree} \mathbb {F}_{2}=\deg \mathbb {F}_{2}=3ab(a+b)\).