
An operational measure for squeezing


Published 13 October 2016 © 2016 IOP Publishing Ltd
Citation: Martin Idel et al 2016 J. Phys. A: Math. Theor. 49 445304. DOI: 10.1088/1751-8113/49/44/445304


Abstract

We propose and analyse a mathematical measure for the amount of squeezing contained in a continuous variable quantum state. We show that the proposed measure operationally quantifies the minimal amount of squeezing needed to prepare a given quantum state and that it can be regarded as a squeezing analogue of the 'entanglement of formation'. We prove that the measure is convex and subadditive and we provide analytic bounds as well as a numerical convex optimisation algorithm for its computation. By example, we then show that the amount of squeezing needed for the preparation of certain multi-mode quantum states can be significantly lower than naive state preparation suggests.


1. Introduction

The interplay between quantum optics and the field of quantum information processing, in particular via the subfield of continuous variable quantum information, has been developing for several decades and is of interest not least because of its experimental success (see [KL10] for a thorough introduction).

Coherent bosonic states and the broader class of Gaussian bosonic states, i.e. quantum states whose Wigner function is completely characterised by its first and second moments, are of particular interest in the theory of continuous variable quantum information. This interest is also due to the fact that modes of light in optical experiments behave like Gaussian coherent states.

For any bosonic state, its matrix of second moments, the so-called covariance matrix, must fulfil Heisenberg's uncertainty principle in all modes. If the state possesses a mode where, despite this inequality ${\rm{\Delta }}x{\rm{\Delta }}p\geqslant {\hslash }/2$, either ${\rm{\Delta }}x$ or ${\rm{\Delta }}p$ is strictly smaller than $\sqrt{{\hslash }/2}$, it is called squeezed. The production of squeezed states is experimentally possible, but it requires nonlinear optical elements [Bra05], which are more difficult to produce and handle than the usual linear optics (i.e. beam splitters and phase shifters). Nevertheless, squeezed states play a crucial role in many experiments in quantum information processing and beyond. It is therefore natural, both theoretically and practically, to investigate the amount of squeezing necessary to create an arbitrary quantum state.

As a qualitative answer, squeezing is known to be an irreducible resource with respect to linear quantum optics [Bra05]. In the Gaussian case, it is also known to be closely related to entanglement of states [WEP03] and the non-additivity of quantum channel capacities [LGW13]. In addition, quantitative measures of squeezing have been provided on multiple occasions [Kra+03, Lee88], yet none of these measures are operational for more than a single mode in the sense that they do not measure the minimal amount of squeezing necessary to prepare a given state.

The goal of this paper is therefore twofold: first, we define and study operational squeezing measures, especially measures quantifying the amount of squeezing needed to prepare a given state. Second, we re-examine, in a mathematically rigorous manner, to what extent squeezing is a resource, and we study the resulting resource theory by defining preparation measures.

In order to give a brief overview of the results, we assume the reader is familiar with the standard notation of the field, which is also gathered in section 2. In particular, γ will denote a covariance matrix. A squeezed state is a state where at least one of the eigenvalues of γ is smaller than one.

To obtain operational squeezing measures, we first study operational squeezing in section 3: suppose we want to implement an operation on our quantum state corresponding to some unitary U. Any such unitary can be implemented as the time evolution generated by some Hamiltonian. Recall that any quantum-optical Hamiltonian can be split into 'passive' and 'active' parts, where the passive parts are implementable by linear optics and the active parts require nonlinear media. We assume that the active transformations available are single-mode squeezers with Hamiltonian

where the j denotes squeezing in the jth mode. We therefore consider any Hamiltonian of the form

Equation (1)

where ck are complex coefficients, which can be seen as the interaction strength of the medium and ${H}_{\mathrm{passive}}$ is an arbitrary passive Hamiltonian. Then, a natural measure of the squeezing costs to implement this Hamiltonian would be given by

Our squeezing measure for the operation U is then defined as the minimum of ${f}_{\mathrm{squeeze}}(H)$ over all Hamiltonians of the form (1) implementing the operation U. With this definition, we have an operational measure answering the question: given an operation U, what is the most efficient way (in terms of squeezing) to implement it using passive operations and single-mode squeezers?

Instead of working with the generators, which are unbounded operators and therefore introduce a lot of analytic problems, we will work on the level of Wigner functions and therefore with the symplectic group. The unitary U then corresponds to a symplectic matrix S and we prove that the most efficient way to implement it is by using the Euler decomposition, also known as Bloch–Messiah decomposition. We show this result first in the case where the functions ci are step functions and later on in the more general case of measurable c (section 3.2). In particular, the result implies that the minimum amount of squeezing to implement the symplectic matrix $S\in {{\mathbb{R}}}^{2n\times 2n}$ is given by

${\sum }_{i=1}^{n}\mathrm{log}\,{s}_{i}^{\downarrow }(S)\qquad $ Equation (2)

where ${s}_{i}^{\downarrow }$ denotes the ith singular value of S ordered decreasingly.

With this in mind, we define a squeezing measure for preparation procedures where one starts out with a covariance matrix of an unsqueezed state and then performs symplectic (and possibly other) operations to obtain the state. More precisely, we define

Equation (3)

One of the main results of this paper, which will be proven in section 5, is that this measure is indeed operational in that it quantifies the minimal amount of single-mode squeezing necessary to prepare a state with covariance matrix γ, using linear optics with single-mode squeezers, ancillas, measurements, convex combinations and addition of classical noise.

We also define a second squeezing measure, which is a squeezing-analogue of the entanglement of formation, the 'squeezing of formation', i.e. the amount of single-mode squeezed resource states needed to prepare a given state using only passive operations and adding of noise. This is done in section 5.3, where we also prove that this measure is equal to G.

In addition, we prove several structural facts about G in section 4. In particular, G is convex, lower semicontinuous everywhere, continuous on the interior and subadditive. Moreover, we show

with the eigenvalues ${\lambda }_{j}$ of γ. Equality in this lower bound is usually not achievable, although numerical tests have shown that the bound is often very good.

The measure would lose much of its appeal if it could not be computed. Although we cannot give an efficient analytical formula for more than one mode, we provide a numerical algorithm to obtain G for any state. To demonstrate that this works in principle, we calculate G approximately for a state studied in [MK08] (section 6). The calculations also demonstrate that the preparation procedure obtained from minimising G can greatly lower the squeezing costs when compared to naive preparation procedures. Finally, we critically discuss the flexibility and applicability of our measures in section 7. We believe that while we give reasonable measures and useful tools to study the resource theory of squeezing from a theoretical perspective, G might not reflect the experimental reality in all respects. In particular, it becomes extraordinarily difficult to achieve high squeezing in a single mode [And+15], which is not reflected by taking the logarithm of the squeezing parameter. We show that this shortcoming can easily be corrected for a broad class of cost functions. In addition, the form of the active part of the Hamiltonian (1) might not reflect the form of the Hamiltonian in the lab. This cannot be corrected as easily, but in any case our measure gives a lower bound.

2. Preliminaries

In this section, we collect basic notions from continuous variable quantum information and symplectic linear algebra that we need later on. For a broader overview, we refer to [ARL14, BL05].

2.1. Phase space in quantum physics

Consider a bosonic system with n-modes, each of which is characterised by a pair of canonical variables $\{{Q}_{k},{P}_{k}\}$. Setting $R={({Q}_{1},{P}_{1},\ldots ,{Q}_{n},{P}_{n})}^{T}$ the canonical commutation relations (CCRs) take on the form $[{R}_{k},{R}_{l}]={\rm{i}}{\sigma }_{{kl}}$ with the standard symplectic form

Since it will sometimes be convenient, we also introduce another basis of the canonical variables: let $\tilde{R}={({Q}_{1},{Q}_{2},\ldots ,{Q}_{n},{P}_{1},{P}_{2},\ldots ,{P}_{n})}^{T}$, then the symplectic CCRs take on the form $[{\tilde{R}}_{k},{\tilde{R}}_{l}]={{\rm{i}}{J}}_{{kl}}$ with the symplectic form

Clearly, J and σ differ only by a permutation, since R and $\tilde{R}$ differ only by a permutation. From functional analysis, it is well-known that the operators Qk and Pk cannot be represented by bounded operators on a Hilbert space. In order to avoid complications associated to unbounded operators, it is usually easier to work with a representation of the CCR-relations on some Hilbert space ${ \mathcal H }$, instead. The standard representation is known as the Schrödinger representation and defines the Weyl system, a family of unitaries ${W}_{\xi }$ with $\xi \in {{\mathbb{R}}}^{2n}$ and

fulfilling the Weyl relations ${W}_{\xi }{W}_{\eta }={{\rm{e}}}^{-\tfrac{{\rm{i}}}{2}{\xi }^{T}\sigma \eta }{W}_{\xi +\eta }$ for all $\xi ,\eta $. Such a system is unique up to isomorphism under further assumptions of continuity and irreducibility, as guaranteed by the Stone–von Neumann theorem. Given ${W}_{\xi }$ it is important to note that

Equation (4)

In this paper, we will not use many properties of the Weyl system, since instead, we can work with the much simpler moments of the state: given a quantum state $\rho \in {{ \mathcal S }}_{1}({L}^{2}({{\mathbb{R}}}^{2n}))$ (trace-class operators on L2), its first and second centred moments are given by

Equation (5)

Equation (6)

with $\{\cdot ,\,\cdot \}{}_{+}$ the regular anticommutator. We will write Γ instead of γ for the covariance matrix, if we work with $\tilde{R}$ instead of R. Again, a simple permutation relates the two.

An important question one can ask is when a matrix γ can occur as a covariance matrix of a quantum state. The answer is given by Heisenberg's principle, which here takes the form of a matrix inequality:

Proposition 2.1. Let $\gamma \in {{\mathbb{R}}}^{2n\times 2n}$, then there exists a quantum state $\rho $ with covariance matrix $\gamma $ if and only if

where $\geqslant $ denotes the standard partial order on matrices (i.e. $\gamma \geqslant {\rm{i}}\sigma $ if $\gamma -{\rm{i}}\sigma $ is positive semidefinite). Note that we leave out the usual factor of ${\hslash }/2$ to simplify notation.
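
As an illustration, the condition of proposition 2.1 is straightforward to check numerically. The following Python sketch (using numpy) assumes the block form of σ built from 2×2 blocks [[0,1],[-1,0]] in the mode ordering $({Q}_{1},{P}_{1},\ldots ,{Q}_{n},{P}_{n})$ of section 2.1; it is meant as a minimal illustration, not as part of the formal development.

import numpy as np

def is_quantum_covariance(gamma, tol=1e-12):
    """Check the Heisenberg condition gamma >= i*sigma of proposition 2.1,
    i.e. that gamma - i*sigma is positive semidefinite (as a Hermitian matrix)."""
    n = gamma.shape[0] // 2
    sigma = np.kron(np.eye(n), np.array([[0.0, 1.0], [-1.0, 0.0]]))
    return np.min(np.linalg.eigvalsh(gamma - 1j * sigma)) >= -tol

# the vacuum (gamma = identity) is admissible, gamma = 0.3*identity is not
assert is_quantum_covariance(np.eye(2)) and not is_quantum_covariance(0.3 * np.eye(2))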

Another question one might ask is when a covariance matrix belongs to a pure quantum state. This question cannot be answered without more information about the higher order terms. If, however, we require the state to be uniquely determined by its first and second moments, i.e. if we consider the so-called Gaussian states, we have an answer (see [ASI04]):

Proposition 2.2. Let $\rho $ be an $n$-mode Gaussian state (i.e. completely determined by its first and second moments), then $\rho $ is pure if and only if $\det ({\gamma }_{\rho })=1$.

2.2. The linear symplectic group and squeezing

A very important set of operations on a quantum system are those that leave the CCRs invariant, i.e. linear transformations S such that $[{{SR}}_{k},{{SR}}_{l}]={\rm{i}}{\sigma }_{{kl}}$. Such transformations are called symplectic transformations.

Definition 2.3. Given a symplectic form σ on ${{\mathbb{R}}}^{2n}$, the set of matrices $S\in {{\mathbb{R}}}^{2n\times 2n}$ such that ${S}^{T}\sigma S=\sigma $ is called the linear symplectic group and is denoted by ${Sp}(2n,{\mathbb{R}},\sigma )$.

We will usually drop both σ and ${\mathbb{R}}$ in the description of the symplectic group since this will be clear from the context. The linear symplectic group is a Lie group and as such contains a lot of structure. For more information on the linear symplectic group and its connection to physics, we refer the reader to [Gos06, MS98] chapter 2. An overview for physicists is also found in [Arv+95a]. All of the following can be found in that paper:

Definition 2.4. Let $O(2n,{\mathbb{R}})$ be the real orthogonal group. Then we define the following three subsets of ${Sp}(2n)$:

The first subset is the maximal compact subgroup of ${Sp}(2n)$; the second subset is the set of single-mode squeezers, which generates the multiplicative subgroup ${ \mathcal A }(2n)$, a maximal abelian subgroup of ${Sp}(2n)$. The third set is the set of positive definite symplectic matrices.
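
For concreteness, the following Python sketch constructs representative elements of $K(n)$ (a single-mode phase rotation) and $Z(n)$ (a single-mode squeezer of the form $\mathrm{diag}(s,{s}^{-1})$, as also used in section 5) and verifies the defining relation ${S}^{T}\sigma S=\sigma $. The explicit matrices are assumptions about conventions and mode ordering, not taken from the displayed definitions above.

import numpy as np

n = 2
sigma = np.kron(np.eye(n), np.array([[0.0, 1.0], [-1.0, 0.0]]))  # ordering (q1, p1, q2, p2)

def phase_shifter(theta):
    """Rotation in the (q, p) plane of mode 1: orthogonal and symplectic, i.e. in K(n)."""
    S = np.eye(2 * n)
    S[:2, :2] = [[np.cos(theta), np.sin(theta)], [-np.sin(theta), np.cos(theta)]]
    return S

def single_mode_squeezer(s):
    """Element of Z(n): scales q of mode 1 by s and p of mode 1 by 1/s."""
    return np.diag([s, 1.0 / s] + [1.0] * (2 * n - 2))

for S in (phase_shifter(0.3), single_mode_squeezer(2.0)):
    assert np.allclose(S.T @ sigma @ S, sigma)   # S^T sigma S = sigma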

In addition, since ${Sp}(2n)$ is a Lie group, it possesses a Lie algebra. Let us collect a number of relevant facts about the Lie algebra and some subsets:

Proposition 2.5. The Lie algebra ${\mathfrak{sp}}(2n)$ of ${Sp}(2n)$ is given by

together with the commutator as Lie bracket. Certain other Lie algebras or subsets of Lie algebras are of relevance to us:

  • (1)  
    ${\mathfrak{so}}(2n):= \{A\in {{\mathbb{R}}}^{2n\times 2n}| A+{A}^{T}=0\}$ the Lie algebra of ${SO}(2n)$.
  • (2)  
    ${\mathfrak{k}}(n):= \{A\in {{\mathbb{R}}}^{2n\times 2n}| A=\left(\begin{array}{cc}a & b\\ -b & a\end{array}\right),a=-{a}^{T},b={b}^{T}\}$ the Lie algebra of $K(n)$.
  • (3)  
    $\pi (n):= \{A\in {{\mathbb{R}}}^{2n\times 2n}| A=\left(\begin{array}{cc}a & b\\ b & -a\end{array}\right),a={a}^{T},b={b}^{T}\}$ the subspace of the Lie algebra ${\mathfrak{sp}}(2n)$ corresponding to ${\rm{\Pi }}(n)$.

Since the Lie algebra is a vector space, it is spanned by a set of vectors, the generators. A standard decomposition is given by taking the generators of ${\mathfrak{k}}(n)$, the so called passive transformations as one part and the generators of $\pi (n)$, the so called active transformations as the other part. That these two sets together determine the Lie algebra completely can be seen with the polar decomposition:

Proposition 2.6 (Polar decomposition [Arv+95a]). For every symplectic matrix $S\in {Sp}(2n)$ there exists a unique $U\in K(n)$ and a unique $P\in {\rm{\Pi }}(n)$ such that $S={UP}$.

A basis for the Lie algebras ${\mathfrak{k}}(n)$ and $\pi (n)$ therefore characterises the complete Lie algebra ${\mathfrak{sp}}(2n)$. Elements of the Lie algebras are also called generators and a basis of generators therefore fixes the Lie algebra. Via the polar decomposition, this implies that they also generate the whole Lie group. We will need a set of generators ${g}_{{ij}}^{(p)}\in {\mathfrak{k}}(n)$ and ${g}_{{ij}}^{(a)}\in \pi (n)$ later on, which we will fix via the metaplectic representation:

Proposition 2.7 (Metaplectic representation [Arv+95a]). Let ${W}_{\xi }$ be the continuous irreducible Weyl system defined above and let $S\in {Sp}(2n)$. Then there exists a unitary ${U}_{S}$, unique up to a phase, with

Since we have the liberty of a phase, this is not really a representation of the symplectic group, but of its two-fold cover, the metaplectic group. We can also study the generators of this representation, which are given by $1/2{\{{R}_{k},{R}_{l}\}}_{+}$.

For the reader familiar with annihilation and creation operators, if we denote by ${a}_{i},{a}_{i}^{\dagger }$ the annihilation and creation operators of the n bosonic modes, the generators of the metaplectic representation are given by

Equation (7)

Equation (8)

where the p stands for 'passive' and the a for 'active'. The passive generators are also frequently called linear transformations in the literature (see [Kok+07]). We can now define a set of generators of the symplectic group ${Sp}(2n)$ by using the set of metaplectic generators Gij above and taking corresponding generators gij in the Lie algebra ${\mathfrak{sp}}(2n)$ in a consistent way. As one would expect from the name, the passive metaplectic generators correspond to a set of passive generators of ${\mathfrak{k}}(n)$ and the set of active metaplectic generators corresponds to a set of active generators of $\pi (n)$. The details of the correspondence are irrelevant (they are explicitly spelled out in equation (6.6b) in [Arv+95a]), except for the fact that the set ${G}_{{ii}}^{a(3)}$, i = 1,...,n corresponds to the generators ${g}_{{ii}}^{a(3)}$ generating matrices in $Z(n)$.

Given a Hamiltonian, the associated time evolution corresponds to a path on the Lie group: for a (sufficiently regular) path $\gamma \,:[0,1]\to {Sp}(2n)$ we can find a function $A(t)\in {\mathfrak{sp}}(2n)$ such that

Equation (9)

Instead of directly studying Hamiltonians with time-dependent coefficients as in equation (1), it is equivalent to study functions $A\,\,:\,[0,1]\to {\mathfrak{sp}}(2n)$.
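
For the special case of piecewise constant coefficients (the step functions treated first in section 3.2), the corresponding path is simply a time-ordered product of matrix exponentials. The following minimal Python sketch assumes that equation (9) is the usual linear equation $\dot{\alpha }(t)=A(t)\alpha (t)$ with $\alpha (0)={\mathbb{1}}$; it is an illustration of this special case only.

import numpy as np
from scipy.linalg import expm

def propagator(generators, durations):
    """Propagator for a step function A(t) that equals generators[k] (an element
    of sp(2n), given as a numpy array) on a time slot of length durations[k]."""
    U = np.eye(generators[0].shape[0])
    for A, dt in zip(generators, durations):
        U = expm(dt * A) @ U   # later time slots compose from the left
    return U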

There are a number of decompositions of the Lie group and its subgroup in addition to the polar decomposition. We will mostly be concerned with the so called Euler decomposition (sometimes called Bloch–Messiah decomposition) and Williamson's decomposition:

Proposition 2.8 (Euler decomposition [Arv+95a]). Let $S\in {Sp}(2n)$, then there exist $K,{K}^{\prime }\in K(n)$ and $A\in { \mathcal A }(n)$ such that $S={{KAK}}^{\prime }$.

Proposition 2.9 (Williamson's theorem [Wil36]). Let $M\in {{\mathbb{R}}}^{2n\times 2n}$ be a positive definite matrix, then there exists a symplectic matrix $S\in {Sp}(2n,{\mathbb{R}})$ and a diagonal matrix $D\in {{\mathbb{R}}}^{n\times n}$ such that

where $\tilde{D}=\mathrm{diag}(D,D)$ is diagonal. The entries of $D$ are also called symplectic eigenvalues.

In particular, for $M\in {\rm{\Pi }}(n)$, this implies that M has a symplectic square root. Since covariance matrices are always positive definite, this implies also that a Gaussian state is pure if and only if its covariance matrix is symplectic. Heisenberg's uncertainty principle has also a Williamson version:

Corollary 2.10. A positive definite matrix $M$ is a covariance matrix of a quantum state if and only if all its symplectic eigenvalues are greater than or equal to one.

The proof is simple and therefore omitted.
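
Numerically, the symplectic eigenvalues of a positive definite matrix $M$ can be obtained as the moduli of the (purely imaginary) eigenvalues of $\sigma M$, which come in pairs. The short Python sketch below uses this standard recipe (it is not spelled out in the text, and the explicit form of σ is again an assumption about the mode ordering) together with the criterion of corollary 2.10.

import numpy as np

def symplectic_eigenvalues(M):
    n = M.shape[0] // 2
    sigma = np.kron(np.eye(n), np.array([[0.0, 1.0], [-1.0, 0.0]]))
    d = np.sort(np.abs(np.linalg.eigvals(sigma @ M)))
    return d[::2]                  # every symplectic eigenvalue appears twice

def is_covariance_matrix(M, tol=1e-12):
    # corollary 2.10: all symplectic eigenvalues must be at least one
    return np.all(symplectic_eigenvalues(M) >= 1 - tol)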

2.3. Quantum optical operations and squeezing

We have already noted that an important class of operations are those which leave the CCR-relations invariant, namely the symplectic transformations. Given a quantum state ρ, the action of the symplectic group on the canonical variables R descends to a subgroup of unitary transformations on ρ via the metaplectic representation (see [Arv+95b]). Its action on the covariance matrix ${\gamma }_{\rho }$ of ρ is even simpler: given $S\in {Sp}(2n)$,

Equation (10)

In quantum optics, symplectic transformations can be implemented by means of

  • (1)  
    beam splitters and phase shifters, implementing operations in K(n) ([Rec+94])
  • (2)  
    single-mode squeezers, implementing operations in Z(n).

Via the Euler decomposition, this implies that any symplectic transformation can be implemented (approximately) by a combination of beam splitters, phase shifters and single-mode squeezers.

Definition 2.11. An $n$-mode bosonic state $\rho $ is called squeezed, if its covariance matrix ${\gamma }_{\rho }$ possesses an eigenvalue $\lambda \lt 1$.

Especially in the early literature, squeezing is usually defined differently: a state ρ is squeezed if there exists a unitary transformation $K\in K(n)$ such that ${K}^{T}{\gamma }_{\rho }K$ has a diagonal entry smaller than one. This corresponds to the physical definition of squeezed states as states where the variance of some quadrature drops below the vacuum limit in at least one mode. These definitions are, however, well known to be equivalent (see [SMD94]).
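
In terms of the covariance matrix, definition 2.11 is a one-line check. A minimal Python sketch (the example of a squeezed vacuum obtained by applying the single-mode squeezer $\mathrm{diag}(s,{s}^{-1})$ to the vacuum is an illustration added here, not taken from the text):

import numpy as np

def is_squeezed(gamma, tol=1e-12):
    # definition 2.11: squeezed iff some eigenvalue of gamma is smaller than one
    return bool(np.min(np.linalg.eigvalsh(gamma)) < 1 - tol)

s = 2.0
S = np.diag([s, 1.0 / s])     # single-mode squeezer
gamma = S.T @ np.eye(2) @ S   # squeezed vacuum, eigenvalues s**2 and s**(-2)
assert is_squeezed(gamma) and not is_squeezed(np.eye(2))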

3. An operational squeezing measure for symplectic transformations

Throughout this section, we will always use σ as our standard symplectic form.

3.1. Definition and basic properties

We will now define a first operational squeezing measure for symplectic transformations, which will later be used to define a measure for operational squeezing.

Definition 3.1. Define the function $F\ :{{\mathbb{R}}}^{2n\times 2n}\to {\mathbb{R}}$

$F(A):= {\sum }_{i=1}^{n}\mathrm{log}\,{s}_{i}^{\downarrow }(A)\qquad $ Equation (11)

where ${s}_{i}^{\downarrow }$ are the decreasingly ordered singular values of A.

Note that we sum only over half of the singular values. Restricting this function to symplectic matrices will yield an operational squeezing measure for symplectic transformations: recall that the symplectic group is generated by symplectic orthogonal matrices and single-mode squeezers. The orthogonal matrices are easy to implement and will therefore be considered a free resource. The squeezers have singular values s and ${s}^{-1}$; they are experimentally hard to implement and should therefore be assigned a cost that depends on the squeezing parameter s. The amount of squeezing is thus characterised by the largest singular values. Here, we quantify the amount of squeezing by a cost $\mathrm{log}(s)$, which can be seen as the interaction strength of the Hamiltonian needed to implement the squeezing.
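
As a computational aside, F is readily evaluated from a singular value decomposition. The Python sketch below sums the logarithms of the $n$ largest singular values, i.e. it follows the reading of definition 3.1 described above; it is a minimal illustration rather than part of the formal development.

import numpy as np

def F(A):
    """Sum of the logarithms of the n largest singular values of the 2n x 2n matrix A."""
    s = np.linalg.svd(A, compute_uv=False)   # singular values in decreasing order
    n = A.shape[0] // 2
    return float(np.sum(np.log(s[:n])))

# e.g. the single-mode squeezer diag(s, 1/s) has F = log(s)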

Let us make this more precise: define the map

The image of Δ for a given symplectic matrix contains all possible ways to construct S as a product of matrices from K(n) or Z(n). We define:

Definition 3.2. Let $\overline{F}\ :{Sp}(2n)\to {\mathbb{R}}$ be a map defined via

Equation (12)

Proposition 3.3. If $S\in {Sp}(2n)$ then $F(S)=\overline{F}(S)$.

Proof. Let $S={{KAK}}^{\prime }$ be the Euler decomposition of S with $K,{K}^{\prime }\in K(n)$ and $A\in { \mathcal A }(n)$. Assume without loss of generality that $A=\mathrm{diag}({a}_{1},{a}_{1}^{-1},\ldots ,{a}_{n},{a}_{n}^{-1})$ and ${a}_{1}\geqslant {a}_{2}\geqslant \ldots \,\geqslant \,{a}_{n}\geqslant 1$ and define ${A}_{i}=\mathrm{diag}(1,\ldots ,1,{a}_{i},{a}_{i}^{-1},1,\ldots ,1)$. By construction $A={A}_{1}\cdots {A}_{n}$ and ${A}_{i}\in Z(n)$. Since $K,{K}^{\prime }\in K(n)$, $(K,{A}_{1},\ldots ,{A}_{n},{K}^{\prime })\in {\rm{\Delta }}(S)$. Using that ${s}_{i}^{\downarrow }(K)={s}_{i}^{\downarrow }({K}^{\prime })=1$ and the fact that the Euler decomposition is actually equivalent to the singular value decomposition of S, we obtain:

Conversely, consider $({S}_{1},\ldots ,{S}_{m})\in {\rm{\Delta }}(S)$. Using that by definition for each ${S}_{j}\in K(n)\cup Z(n)$ we have ${\prod }_{i=1}^{n}{s}_{i}^{\downarrow }({S}_{j})={s}_{1}^{\downarrow }({S}_{j})$, we conclude:

where in $(\ast )$ we used a special case of a theorem by Gel'fand and Naimark ([Bha96], theorem III.4.5 and equation (III.19)). Taking the infimum on the right-hand side gives $F(S)\leqslant \overline{F}(S)$. □

Let us write the last observation in $(\ast )$ as a small lemma for later use:

Lemma 3.4. Let $S,{S}^{\prime }\in {Sp}(2n)$. Then $F({{SS}}^{\prime })\leqslant F(S)+F({S}^{\prime })$.
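
Lemma 3.4 (and with it the Gel'fand–Naimark step used in the proof above) can be sanity-checked numerically. The sketch below generates random symplectic matrices as exponentials $\exp ({JQ})$ with $Q$ symmetric, a standard parametrisation of ${\mathfrak{sp}}(2n)$ that is an assumption here, and verifies $F({{SS}}^{\prime })\leqslant F(S)+F({S}^{\prime })$ on random samples.

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n = 2
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])

def F(A):
    s = np.linalg.svd(A, compute_uv=False)
    return np.sum(np.log(s[:n]))          # sum over the n largest singular values

def random_symplectic():
    Q = rng.standard_normal((2 * n, 2 * n))
    return expm(J @ (Q + Q.T) / 2)        # exp of an element of sp(2n) is symplectic

for _ in range(100):
    S1, S2 = random_symplectic(), random_symplectic()
    assert F(S1 @ S2) <= F(S1) + F(S2) + 1e-9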

3.2. Lie algebraic definition

Up to now, we have only considered products of symplectic matrices, which would correspond to a chain of beam splitters, phase shifters and single-mode squeezers. The goal of this section is to prove that one cannot improve the results with arbitrary paths on ${Sp}(2n)$, corresponding to general Hamiltonians of the form of equation (1) as described in section 2.

Let ${{ \mathcal C }}^{r}(S)$ be the set of absolutely continuous paths $\alpha \ :[0,1]\to {Sp}(2n)$ with a derivative which is bounded almost everywhere such that $\alpha (0)={\mathbb{1}}$ and $\alpha (1)=S$. Such paths seem to capture most if not all physically relevant cases.

Recall the set of generators g of ${\mathfrak{sp}}(2n)$ defined in section 2 and order them in a single vector. Using equation (9), any $\alpha \in {{ \mathcal C }}^{r}(S)$ corresponds to an $A\in {L}^{\infty }([0,1],{\mathfrak{sp}}(2n))$. Since the generators g form a basis, we can write $A(t)={c}_{\alpha }(t)\cdot g$ with a function ${c}_{\alpha }\in {L}^{\infty }([0,1],{\mathfrak{sp}}(2n))$. Either A or ${c}_{\alpha }$, together with the condition $\alpha (0)={\mathbb{1}}$, uniquely defines α.

The goal of this section is to prove that this does not give us any better way to avoid squeezing:

Theorem 3.5. For any $S\in {Sp}(2n)$, we have

Equation (13)

where we introduced the notation $\vec{c}$ to clarify that ${g}^{p/a}$ are actually vectors containing a set of generators each, and the coefficients might differ for each of these generators.

The proof of this theorem is quite lengthy in its details, thus we split it up into several lemmata. The general idea is easy to state: we first show that paths corresponding to products of symplectic matrices of type Z(n) or K(n) produce the same outcome in (13) and (12). We then use an approximation argument: given any path, we can approximate it by a path of products of symplectic matrices to arbitrary precision.

To start, we prove the following lemma:

Lemma 3.6. Let $A\in {\mathfrak{sp}}(2n)$ and write $A=1/2(A+{A}^{T})+1/2(A-{A}^{T})=: {A}_{+}+{A}_{-}$. Then ${A}_{+}\in \pi (n)$ and ${A}_{-}\in {\mathfrak{k}}(n)$ and we have $F(\exp (A))\leqslant F(\exp ({A}_{+}))$.

Proof. First note that F is continuous in S since the singular values are. Using the Trotter-formula, we obtain:

where we used that $F(\exp ({A}_{-}))=0$ since ${A}_{-}\in {\mathfrak{k}}(n)$ and in $(\ast )$, we used a version of a theorem by Gel'fand and Naimark again (see [Bha96], equation (III.20)). □

Let us define yet another version of F which we call $\hat{F}$ in the following way:

This definition is of course reminiscent of the definition of $\overline{F}$ in equation (12):

Lemma 3.7. For $S\in {Sp}(2n)$, we have $\hat{F}(S)=\overline{F}(S)$.

Proof. To prove $\hat{F}\leqslant \overline{F}$, consider the Euler decomposition $S={K}_{1}{A}_{1}\ldots {A}_{n}{K}_{2}$ with ${A}_{i}\in Z(n)$ and ${K}_{1},{K}_{2}\in K(n)$. Since K(n) is compact, the exponential map is surjective and there exist ${\vec{c}}_{1}^{p}$ and ${\vec{c}}_{2}^{p}$ such that $\exp ({\vec{c}}_{1}^{p}{g}^{p})={K}_{1}$ and $\exp ({\vec{c}}_{2}^{p}{g}^{p})={K}_{2}$. Recall that we ordered the vector ga in such a way that the generators gai generate the matrices in Z(n) for i = 1,...,n, hence we know that there exist ${\vec{c}}_{i}^{a}=(0,\ldots ,0,{({\vec{c}}_{i}^{a})}_{(i)},0,\ldots ,0)$ for i = 1,...,n such that

This implies

Here we used that ${({\vec{c}}_{i}^{a})}_{(i)}$ is also the largest singular value of $\exp ({({\vec{c}}_{i}^{a})}_{(i)}{g}_{i}^{a})\in Z(n)$, as $F(\exp ({({\vec{c}}_{i}^{a})}_{(i)}{g}_{i}^{a}))={({\vec{c}}_{i}^{a})}_{(i)}$ by normalisation of g.

For the other direction $\hat{F}\geqslant \overline{F}$, let S be arbitrary. Let $c\in C(S)$ and consider each vector ${\vec{c}}_{i}$ separately. We drop the index i for readability, since we need to consider the entries of the vector ${\vec{c}}_{i}$. To make the distinction clear, we denote the jth entry of the vector $\vec{c}$ by ${\vec{c}}_{(j)}$. Recall that the active generators are exactly those generating the positive matrices. Then:

where we basically redid the calculations we used to prove lemma 3.6, using the continuity of F and the Trotter formula from matrix analysis. Until now, we have considered only one ${\vec{c}}_{i}$ of $c\in C(S)$. Now, if we define ${S}_{i}=\exp ({\vec{c}}_{i}g)$, then we have ${\prod }_{i}{S}_{i}=S$ and hence, using lemma 3.4, we find:

But this means $\overline{F}(S)\leqslant \hat{F}(S)$, as we claimed.□

We can now prove the first half of the theorem:

Lemma 3.8. For $S\in {Sp}(2n)$ we have

Proof. Let $S\in {Sp}(2n)$ and consider the Euler decomposition $S={K}_{1}{S}_{1}\cdots {S}_{n}{K}_{2}$. We can define a function $A\ :[0,1]\to {\mathfrak{sp}}(2n)$ via:

Equation (14)

where $({\vec{c}}_{1}^{p},0,0,{\vec{c}}_{2}^{a},\ldots ,0,{\vec{c}}_{n+1}^{a},{\vec{c}}_{n+2}^{p},0)$ denotes the element in ${C}^{n+2}(S)$ for the Euler decomposition and vector indices are denoted by a subscript ${}_{(i)}$ as before. Let $U(s,t)$ be the propagator corresponding to A, then for $t\in [0,1/(n+2))$ according to proposition A.1, since A does not depend on t on this interval, it is given by $U(t,s)=\exp ((t-s)A)$. In particular, $U(1/(n+2),0)=\exp ({\vec{c}}_{n+2}^{p}{g}^{p})={K}_{2}$.

Iterating the procedure above, using $U(0,1)=U(0,1/(n+2))$ $\cdots U((n+1)/(n+2),1)$, we can see that by construction, $U(0,1)={K}_{1}{S}_{1}\cdots {S}_{n}{K}_{2}=S$. Hence A defines a continuous path on ${Sp}(2n)$ via $U(s,t)$. We can calculate:

where we used that the integrals over the intervals $[0,1/(n+2))$ and $[(n+1)/(n+2),1]$ vanish because all active components are zero there. In the last step, we used that for the Euler decomposition, which attains the minimum in $\hat{F}$, this value is exactly ${\sum }_{i}| {({\vec{c}}_{i+1}^{a})}_{(i)}| ={\sum }_{i}\parallel {\vec{c}}_{i+1}^{a}{\parallel }_{1}$, since ${({\vec{c}}_{i+1}^{a})}_{(j)}=0$ for $j\ne i$. Taking the infimum on the left-hand side only decreases the value.□

For the other direction, we need some facts about ordinary differential equations that are collected in appendix A.

Lemma 3.9. For $S\in {Sp}(2n)$ we have

Equation (15)

Proof. Let $S\in {Sp}(2n)$ be arbitrary. Combining the proof of lemma 3.8 with proposition 3.3 and lemma 3.7 we have already proved:

The only thing left to prove is that we can drop the step-function assumption. This will be done by a standard approximation argument: let $\tilde{F}(S)$ denote the right-hand side of equation (15). Let $\varepsilon \gt 0$ and consider an arbitrary $A\in {L}^{\infty }$ such that

Equation (16)

i.e. A corresponds to a path that is close to the infimum in the definition of $\tilde{F}$. We can now approximate ${c}_{\alpha }$ by step-functions ${c}_{\alpha ^{\prime} }$ (corresponding to a function ${A}^{\prime }$, see lemma A.2) such that

Equation (17)

Using the fact that the propagators ${U}_{A},{U}_{A^{\prime} }$ are differentiable almost everywhere (proposition A.1) and absolutely continuous when one entry is fixed, we can define a function $f(s):= {U}_{A}(0,s){U}_{{A}^{\prime }}(s,t)$, which is also differentiable almost everywhere. Furthermore, the fundamental theorem of calculus holds for f(s) (see [Rud87], theorems 6.10 and 7.8).

almost everywhere, which implies:

Since U and g are bounded in $\parallel \cdot {\parallel }_{\infty }$, we obtain

Equation (18)

M can explicitly be computed by the bounds given in proposition A.1.

Up to now, we have taken a path α to S close to the infimum and approximated it by a path $\alpha ^{\prime} $. It is immediate by equations (16) and (17) that

Equation (19)

Since ${c}_{\alpha ^{\prime} }\in {C}^{N}({S}^{\prime })$ for some $N\in {\mathbb{N}}$ and ${S}^{\prime }={U}_{{A}^{\prime }}(0,1)$, we would be done if ${S}^{\prime }=S$. To remedy this, we want to extend ${\alpha }^{\prime }$ to a path $\tilde{\alpha }$ such that it ends at S. This is where equation (18) enters: set $\tilde{S}:= {U}_{{A}^{\prime }}{(0,1)}^{-1}{U}_{A}(0,1)$, then

Equation (20)

hence $\tilde{S}\approx {\mathbb{1}}$ for ε small enough. Using the polar decomposition, we can write $\tilde{S}=\exp ({\vec{c}}_{N+1}^{p})\exp ({\vec{c}}_{N+2}^{a})$. A quick calculation yields

Equation (21)

This lets us construct a new $\tilde{A}\,:[0,2]\to {\mathfrak{sp}}(2n)$:

By construction, for the corresponding propagator we have ${U}_{\tilde{A}}(0,2)=S$ and $\tilde{\alpha }$ is a feasible path for $\tilde{F}(S)$ (at least after reparameterisation) fulfilling:

Since ${c}_{\tilde{\alpha }}\in {C}^{N+2}(S)$, $\tilde{\alpha }$ is a valid path for $\hat{F}(S)$, which implies that for any $\epsilon \gt 0$, choosing $\varepsilon := \epsilon /(2+C)$, we have:

Equation (22)

For $\epsilon \to 0$, $\hat{F}(S)\leqslant \tilde{F}(S)$, which implies the lemma via lemma 3.7.□

4. A mathematical measure for squeezing of arbitrary states

Throughout this section, for convenience, we will switch to using J as symplectic form. Having defined the measure F, we will now proceed to define a squeezing measure for creating an arbitrary (mixed) state:

Definition 4.1. Let $\rho $ be an $n$-mode bosonic quantum state with covariance matrix ${\rm{\Gamma }}$. We then define:

$G({\rm{\Gamma }}):= \mathrm{inf}\{F(S)\,| \,S\in {Sp}(2n),\,{S}^{T}S\leqslant {\rm{\Gamma }}\}\qquad $ Equation (23)

Note that G is always finite: for any given covariance matrix Γ, by Williamson's theorem and corollary 2.10, we can find $S\in {Sp}(2n)$ and $\tilde{D}\geqslant {\mathbb{1}}$ such that ${\rm{\Gamma }}={S}^{T}\tilde{D}S\geqslant {S}^{T}S$. Furthermore G is also non-negative since F is non-negative for symplectic S. We will prove in section 5 that this is indeed an operational measure.

4.1. Different reformulations of the measure

We will now give several reformulations of the squeezing measure and prove some of its properties. In particular, G is convex and one of the crucial steps towards proving convexity of G is given by a reformulation of G with the help of the Cayley transform. For the reader unfamiliar with the Cayley transform, a definition and basic properties are provided in appendix B.

Proposition 4.2. Let ${\rm{\Gamma }}\geqslant {\rm{i}}{J}$ and ${\rm{\Gamma }}\in {{\mathbb{R}}}^{2n\times 2n}$ symmetric. Then:

Equation (24)

Equation (25)

where ${ \mathcal H }$ is defined via:

Equation (26)

Proof. First note that the infimum in all three expressions is actually attained. We can see this most easily in the definition (23): the matrix inequalities ${\rm{\Gamma }}\geqslant {S}^{T}S(\geqslant {\rm{i}}{J})$ imply that the set of feasible S in the minimisation is compact, hence its minimum is attained. To see (23) = (24), first note that (24) ≤ (23) since any $S\in {Sp}(2n)$ also fulfils ${S}^{T}S\geqslant {\rm{i}}{J}$, hence ${\rm{\Gamma }}\geqslant {S}^{T}S\geqslant {\rm{i}}{J}$. For equality, note that for any ${\rm{\Gamma }}\geqslant {{\rm{\Gamma }}}_{0}\geqslant {\rm{i}}{J}$, using Williamson's theorem we can find $S\in {Sp}(2n)$ and a diagonal $\tilde{D}\geqslant {\mathbb{1}}$ (via corollary 2.10) such that ${{\rm{\Gamma }}}_{0}={S}^{T}\tilde{D}S\geqslant {S}^{T}S\geqslant {\rm{i}}{J}$. But since $F({{\rm{\Gamma }}}_{0}^{1/2})\geqslant F({({S}^{T}S)}^{1/2})=F(S)$ via the Weyl monotonicity principle, the infimum is achieved on symplectic matrices.

Finally, let us prove equality with (25). First observe that we can replace ${Sp}(2n)$ by ${ \mathcal H }$ using proposition B.1(4).

Using the fact that ${s}_{i}^{\downarrow }(S)={\lambda }_{i}^{\downarrow }{({S}^{T}S)}^{1/2}={\lambda }_{i}^{\downarrow }{({ \mathcal C }(H))}^{1/2}$ and the fact that H is diagonalised by the same unitary matrices as ${ \mathcal C }(H)=({\mathbb{1}}+H)\cdot {({\mathbb{1}}-H)}^{-1}$, whence its eigenvalues are

we have:

Next we claim ${\lambda }_{i}^{\downarrow }(H)={s}_{i}^{\downarrow }(A+{\rm{i}}{B})$ for i = 1,...,n. To see this note:

Equation (27)

The singular values of the matrix on the right-hand side of equation (27) are the eigenvalues of $\mathrm{diag}{({(A+{\rm{i}}{B})}^{\dagger }(A+{\rm{i}}{B}),(A+{\rm{i}}{B}){(A+{\rm{i}}{B})}^{\dagger })}^{1/2}$, which are the singular values of A + iB with double multiplicity. From the structure of H, it is immediate that the eigenvalues of the right-hand side of equation (27) and thus of H come in pairs $\pm {s}_{i}(A+{\rm{i}}{B})$. Hence ${\lambda }_{i}^{\downarrow }(H)={s}_{i}^{\downarrow }(A+{\rm{i}}{B})$ for i = 1,...,n and we have:

To see that the right-hand side equals (25), we only need to use the fact that ${\rm{\Gamma }}\geqslant { \mathcal C }(H)\iff {{ \mathcal C }}^{-1}({\rm{\Gamma }})\geqslant H$ for all $H\in { \mathcal H }$ and ${\rm{\Gamma }}\geqslant {\rm{i}}{J}$, since the Cayley transform and its inverse are operator monotone.□

4.2. Convexity

The reformulation (25) will allow us to prove:

Theorem 4.3.  $G$ is convex on the set of covariance matrices $\{{\rm{\Gamma }}\in {{\mathbb{R}}}^{2n\times 2n}| {\rm{\Gamma }}\geqslant {\rm{i}}{J}\}$.

The crucial part of the proof is the following lemma:

Lemma 4.4. Consider the map $f\,:{{\mathbb{R}}}^{n\times n}\times {{\mathbb{R}}}^{n\times n}\to {\mathbb{R}}$:

Equation (28)

If we restrict $f$ to symmetric matrices $A$ and $B$ such that ${s}_{i}(A+{\rm{i}}{B})\lt 1$ for all $i=1,\ldots ,n$, then $f$ is jointly convex in $A,B$, i.e.

Proof. Let $\tilde{A}:= {tA}+(1-t){A}^{\prime }$ and $\tilde{B}:= {tB}+(1-t){B}^{\prime }$. Note that $\tilde{A}$ and $\tilde{B}$ are also symmetric, and the largest singular value of $\tilde{A}+{\rm{i}}\tilde{B}$ fulfils ${s}_{1}^{\downarrow }(\tilde{A}+{\rm{i}}\tilde{B})\leqslant {{ts}}_{1}^{\downarrow }(A+{\rm{i}}{B})+(1-t){s}_{1}^{\downarrow }({A}^{\prime }+{{\rm{i}}{B}}^{\prime })$. Therefore, the singular values of any convex combination of A + iB and ${A}^{\prime }+{{\rm{i}}{B}}^{\prime }$ also lie in the interval $[0,1)$. This makes our restriction well-defined under convex combinations.

For any j = 1,...,n, by Thompson's theorem (see [Tho76]), which states that for every complex $A,B$, there exist unitary matrices $U,V$ such that $| A+B| \leqslant U| A| {U}^{* }+V| B| {V}^{* }$, we have

Using Lidskii's theorem ([Bha96], chapter III with explicit formulation in exercise III.4.3), we have

Equation (29)

with ${p}_{\pi }\geqslant 0$ and ${\sum }_{\pi }{p}_{\pi }=1$. In $(\ast )$, we used that unitaries do not change the spectrum. Now each summand in equation (28) is the Cayley transform of a singular value. We can use the log-convexity of the Cayley-transform to prove the joint convexity of f:

where in $(\ast \ast )$ we use that the sum of all eigenvalues is of course not dependent on the order of the eigenvalues.□

This lemma will later allow us to calculate G as a convex programme.

Proof of theorem 4.3. We can now finish the proof of the convexity of G.

First note that using the definition of f in lemma 4.4 we can reformulate (25) to

Equation (30)

Let ${\rm{\Gamma }}\geqslant {\rm{i}}{J},{{\rm{\Gamma }}}^{\prime }\geqslant {\rm{i}}{J}$ be two covariance matrices and let $H,{H}^{\prime }\in { \mathcal H }$ be the matrices that attain the minimum of $G({\rm{\Gamma }}),G({{\rm{\Gamma }}}^{\prime })$ respectively. Then, in particular, ${tH}+(1-t){H}^{\prime }\in { \mathcal H }$. Furthermore, since ${{ \mathcal C }}^{-1}({\rm{\Gamma }})\geqslant H$ and ${{ \mathcal C }}^{-1}({{\rm{\Gamma }}}^{\prime })\geqslant {H}^{\prime }$ we have

where we used the operator concavity of ${{ \mathcal C }}^{-1}$ in $(\ast )$. This means that ${tH}+(1-t){H}^{\prime }$ is a feasible matrix for the minimisation in G, which implies using equation (30)

The convexity now follows directly from lemma 4.4 and the fact that we chose H and ${H}^{\prime }$ to attain $G({\rm{\Gamma }})$ and $G({{\rm{\Gamma }}}^{\prime })$.□

4.3. Continuity properties

From the convexity of G on the set of covariance matrices, it follows from general arguments in convex analysis that G is continuous on the interior of the set of covariance matrices (see [Roc97], theorem 10.1). What more can we say about the boundary?

Theorem 4.5.  $G$ is lower semicontinuous on the set of covariance matrices $\{{\rm{\Gamma }}\in {{\mathbb{R}}}^{2n\times 2n}| {\rm{\Gamma }}\geqslant {\rm{i}}{J}\}$ and continuous on its interior. Moreover, $G({\rm{\Gamma }}+\varepsilon {\mathbb{1}})\to G({\rm{\Gamma }})$ for $0\lt \varepsilon \to 0$ for any ${\rm{\Gamma }}\geqslant {\rm{i}}{J}$.

The ultimate goal is to extend continuity from the interior to the boundary, which we do not know how to do at present. The proof will need a few notions from set-valued analysis that we review in appendix C.

Proof of theorem 4.5. As already observed, G is continuous on the interior. Let ${{\rm{\Gamma }}}_{0}\geqslant {\rm{i}}{J}$ be arbitrary and suppose

By definition, ${ \mathcal A }$ is compact and convex for any Γ. Moreover, it defines a set-valued function on the set of covariance matrices with non-empty values. Let $\varepsilon \gt 0$, then for all ${\rm{\Gamma }}\geqslant {\rm{i}}{J}$ with $\parallel {\rm{\Gamma }}-{{\rm{\Gamma }}}_{0}\parallel \lt \varepsilon $, we have that for any $\hat{{\rm{\Gamma }}}\in { \mathcal A }({\rm{\Gamma }})$, $\tilde{{\rm{\Gamma }}}:= \hat{{\rm{\Gamma }}}+({\rm{\Gamma }}-{{\rm{\Gamma }}}_{0})\in { \mathcal A }({{\rm{\Gamma }}}_{0})$ and $\parallel \hat{{\rm{\Gamma }}}-\tilde{{\rm{\Gamma }}}\parallel \lt \varepsilon $. This is the condition in lemma C.2 hence the set-valued function defined by ${ \mathcal A }$ is upper semicontinuous at ${{\rm{\Gamma }}}_{0}$, which implies that ${ \mathcal A }({\rm{\Gamma }})\cap \{X| {\rm{i}}{J}\leqslant X\}$ is also upper semicontinuous by proposition C.3. If ε is small enough (e.g. $\varepsilon \lt 1$), this implies

hence this set is upper semicontinuous at ${{\rm{\Gamma }}}_{0}$.

Since F is continuous on positive definite matrices, it is uniformly continuous if we restrict it to a small neighbourhood of the covariance matrix ${{\rm{\Gamma }}}_{0}$. This means that for every $\varepsilon \gt 0$ there exists an $\epsilon \gt 0$ such that

Equation (31)

for all $\parallel \tilde{{\rm{\Gamma }}}-\hat{{\rm{\Gamma }}}\parallel \lt \epsilon $ and all $\tilde{{\rm{\Gamma }}},\hat{{\rm{\Gamma }}}\in {\bigcup }_{\parallel {\rm{\Gamma }}-{{\rm{\Gamma }}}_{0}\parallel \lt 1}{ \mathcal G }({\rm{\Gamma }})$.

Assuming without loss of generality that $\parallel {\rm{\Gamma }}-{{\rm{\Gamma }}}_{0}\parallel \lt 1$, the set ${ \mathcal G }({\rm{\Gamma }})$ is exactly the set for the minimisation in the definition of G. The upper semicontinuity of ${ \mathcal G }({\rm{\Gamma }})$ implies by lemma C.2 that for every $\epsilon \gt 0$ there exists a $\delta \gt 0$ such that for all $\parallel {\rm{\Gamma }}-{{\rm{\Gamma }}}_{0}\parallel \lt \delta $ we have: for all $\hat{{\rm{\Gamma }}}\in { \mathcal G }({\rm{\Gamma }})$ there exists a $\tilde{{\rm{\Gamma }}}\in { \mathcal G }({{\rm{\Gamma }}}_{0})$ such that $\parallel \hat{{\rm{\Gamma }}}-\tilde{{\rm{\Gamma }}}\parallel \lt \epsilon $. In particular, this is true for all minimisers $\hat{{\rm{\Gamma }}}$ with $G({\rm{\Gamma }})=F({\hat{{\rm{\Gamma }}}}^{1/2})$, where $\hat{{\rm{\Gamma }}}$ and $\tilde{{\rm{\Gamma }}}\in {\bigcup }_{\parallel {\rm{\Gamma }}-{{\rm{\Gamma }}}_{0}\parallel \lt 1}{ \mathcal G }({\rm{\Gamma }})$. Using equation (31) we obtain: for every $\varepsilon \gt 0$ there exists a $\delta \gt 0$ such that for all $\parallel {\rm{\Gamma }}-{{\rm{\Gamma }}}_{0}\parallel \lt \delta $, we have a pair $\hat{{\rm{\Gamma }}},\tilde{{\rm{\Gamma }}}$ with $\hat{{\rm{\Gamma }}}\in { \mathcal G }({\rm{\Gamma }})$ minimising $G({\rm{\Gamma }})$ and $\tilde{{\rm{\Gamma }}}\in { \mathcal G }({{\rm{\Gamma }}}_{0})$ such that

This implies that for all $\varepsilon \gt 0$ there exists a $\delta \gt 0$ such that

for all $\parallel {\rm{\Gamma }}-{{\rm{\Gamma }}}_{0}\parallel \lt \delta $.

Taking the limit inferior on both sides implies that G is lower semicontinuous at ${{\rm{\Gamma }}}_{0}$. Upper semicontinuity would follow for instance if ${ \mathcal G }({{\rm{\Gamma }}}_{0})$ is also lower semicontinuous.

Finally, let us prove that $G({{\rm{\Gamma }}}_{0}+\varepsilon {\mathbb{1}})\to G({{\rm{\Gamma }}}_{0})$ for $\varepsilon \to 0$. To see this, consider the closed sets

for any $n\in {\mathbb{N}}$. It is easy to see that ${C}_{n+1}\subseteq {C}_{n}$ and that ${\bigcap }_{n\in {\mathbb{N}}}{C}_{n}={ \mathcal G }({{\rm{\Gamma }}}_{0})$. Moreover, ${C}_{1}$ is compact. Now let ${{\rm{\Gamma }}}_{n}$ be the sequence of minimisers for $G({{\rm{\Gamma }}}_{0}+1/n{\mathbb{1}})$, then ${{\rm{\Gamma }}}_{n}\in {C}_{n}$ for all $n\in {\mathbb{N}}$. By compactness, a subsequence will converge to

Therefore, $G({{\rm{\Gamma }}}_{0})\leqslant {\mathrm{lim}}_{\varepsilon \to 0}G({{\rm{\Gamma }}}_{0}+\varepsilon {\mathbb{1}})$, but since ${{\rm{\Gamma }}}_{0}\leqslant {{\rm{\Gamma }}}_{0}+\varepsilon {\mathbb{1}}$ for all $\varepsilon \gt 0$ we also have $G({{\rm{\Gamma }}}_{0})\geqslant {\mathrm{lim}}_{\varepsilon \to 0}G({{\rm{\Gamma }}}_{0}+\varepsilon {\mathbb{1}})$.□

4.4. Additivity properties

Now we consider additivity properties of G. We switch our basis again and use γ and σ.

Proposition 4.6. For any covariance matrices ${\gamma }_{A}\in {{\mathbb{R}}}^{2{n}_{1}\times 2{n}_{1}}$ and ${\gamma }_{B}\in {{\mathbb{R}}}^{2{n}_{2}\times 2{n}_{2}}$, we have

In particular, $G$ is subadditive.

Proof. For subadditivity, let ${S}^{T}S\leqslant {\gamma }_{A}$ and ${S}^{\prime T}{S}^{\prime }\leqslant {\gamma }_{B}$ attain the minimum in $G({\gamma }_{A})$ and $G({\gamma }_{B})$ respectively. Then $S\oplus {S}^{\prime }$ is symplectic and ${(S\oplus {S}^{\prime })}^{T}(S\oplus {S}^{\prime })\leqslant {\gamma }_{A}\oplus {\gamma }_{B}$, hence $G({\gamma }_{A}\oplus {\gamma }_{B})\leqslant G({\gamma }_{A})+G({\gamma }_{B})$.

To prove the lower bound, we need the following equation that we will only prove later on (see equation (46)):

Equation (32)

Assuming this inequality, let $a\geqslant 1$ be such that $a{{\mathbb{1}}}_{{n}_{2}}\geqslant {\gamma }_{B}$, then

hence $G({\gamma }_{A})\leqslant G({\gamma }_{A}\oplus {\gamma }_{B})$ and since we can do the same reasoning for ${\gamma }_{B}$, we have $G({\gamma }_{A})+G({\gamma }_{B})\leqslant 2G({\gamma }_{A}\oplus {\gamma }_{B})$.□

We do not know whether G is also superadditive, which would make it additive. At present, we can only prove:

Corollary 4.7. Let ${\gamma }_{A}\in {{\mathbb{R}}}^{2{n}_{1}\times 2{n}_{1}}$ and ${\gamma }_{B}\in {Sp}(2{n}_{2})$ be two covariance matrices (i.e. ${\gamma }_{B}$ is a covariance matrix of a pure state). Then $G$ is additive, i.e. $G({\gamma }_{A}\oplus {\gamma }_{B})=G({\gamma }_{A})+G({\gamma }_{B})$.

Proof. Subadditivity has already been proven in the lemma. For superadditivity, we use the second reformulation of the squeezing measure in equation (24): note that there is only one matrix ${\gamma }_{B}\geqslant \gamma \geqslant {\rm{i}}\sigma $, namely ${\gamma }_{B}$ itself. Now write

for $\tilde{A}\in {{\mathbb{R}}}^{2{n}_{1}\times 2{n}_{1}}$ and $\tilde{B}\in {{\mathbb{R}}}^{2{n}_{2}\times 2{n}_{2}}$. Then in particular ${\gamma }_{B}-\tilde{B}\geqslant 0$, but also $\tilde{B}\geqslant {\rm{i}}\sigma $, hence ${\gamma }_{B}\geqslant \tilde{B}\geqslant {\rm{i}}\sigma $ and therefore $\tilde{B}={\gamma }_{B}$. But then

hence also C = 0 and the matrix that takes the minimum in $G({\gamma }_{A}\oplus {\gamma }_{B})$ must be block-diagonal. Then ${\gamma }_{A}\oplus {\gamma }_{B}\geqslant \tilde{A}\oplus {\gamma }_{B}\geqslant 0$ and $\tilde{A}$ is in the feasible set of $G({\gamma }_{A})$. □

Corollary 4.8. For any covariance matrices ${\gamma }_{A}\in {{\mathbb{R}}}^{2{n}_{1}\times 2{n}_{1}}$ and ${\gamma }_{B}\in {{\mathbb{R}}}^{2{n}_{2}\times 2{n}_{2}}$,

If $G$ is superadditive, then this inequality holds without the factor of two.

Proof. 

Here we used proposition 4.6 and then convexity of G in $(\ast )$. Finally, in $(\ast \ast )$ we used that for every

Equation (33)

we also have:

Equation (34)

and vice versa. Since the two matrices on the right-hand side of equations (33) and (34) have equal spectrum, the two squeezing measures of the matrices on the left-hand side need to be equal. □

4.5. Bounds

Let us give a few simple bounds on G.

Proposition 4.9 (Spectral bounds). Let ${\rm{\Gamma }}\geqslant {\rm{i}}{J}$ be a valid covariance matrix and ${\lambda }^{\downarrow }({\rm{\Gamma }})$ be the vector of eigenvalues in decreasing order. Then:

Equation (35)

Proof. According to the Euler decomposition, a symplectic positive definite matrix has positive eigenvalues that come in pairs $s,{s}^{-1}$ and we can find $O\in {SO}(2n)$ such that for any ${S}^{T}S\leqslant {\rm{\Gamma }}$

But then, ${\lambda }_{k}^{\downarrow }({\rm{\Gamma }})\geqslant {\lambda }_{k}^{\downarrow }(\mathrm{diag}({s}_{1},\ldots ,{s}_{n},{s}_{1}^{-1},\ldots ,{s}_{n}^{-1}))$ via the Weyl inequalities ${\lambda }_{i}^{\downarrow }(A)\geqslant {\lambda }_{i}^{\downarrow }(B)$ for all i and $A-B\geqslant 0$ (see [Bha96], theorem III.2.3). This implies:

For the lower bound, given an optimal matrix S with eigenvalues si, we have

If ${S}^{T}S={O}^{T}\mathrm{diag}({s}_{1}^{2},\ldots ,{s}_{n}^{2},{s}_{1}^{-2},\ldots ,{s}_{n}^{-2})O$ with $O\in {SO}(2n)$ is the diagonalisation of ${S}^{T}S$, we can write:

and again by Weyl's inequalities, we can find for all $k\leqslant n$:

Equation (36)

Now, $-\tfrac{1}{2}{\sum }_{i=2n-k+1}^{2n}\mathrm{log}\,{\lambda }_{i}^{\downarrow }({\rm{\Gamma }})$ can be upper bounded by restricting the sum to eigenvalues ${\lambda }_{i}^{\downarrow }({\rm{\Gamma }})\lt 1$. This implies

using that the number of eigenvalues ${\lambda }_{i}({\rm{\Gamma }})\lt 1$ can be at most n (hence $k\leqslant n$ in the inequality of equation (36)), since ${\rm{\Gamma }}\geqslant {S}^{T}S$ and ${S}^{T}S$ has at least n eigenvalues greater than or equal to one. □
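
Since the displayed equation (35) is not reproduced above, the following Python sketch should be read with care: it implements the bounds in the form suggested by the proof and by the thermal-state example below, namely a lower bound summing $-\tfrac{1}{2}\mathrm{log}$ over the eigenvalues of Γ below one and an upper bound summing $\tfrac{1}{2}\mathrm{log}$ over the $n$ largest eigenvalues. This reading is an assumption and should be checked against the published equation.

import numpy as np

def spectral_bounds(Gamma):
    lam = np.sort(np.linalg.eigvalsh(Gamma))[::-1]   # eigenvalues, decreasing
    n = Gamma.shape[0] // 2
    lower = -0.5 * np.sum(np.log(lam[lam < 1]))      # only eigenvalues below one
    upper = 0.5 * np.sum(np.log(lam[:n]))            # the n largest eigenvalues
    return lower, upper

# for a thermal state Gamma = (2N+1)*identity this gives (0, n/2*log(2N+1))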

Numerics suggest that the lower bound is often very good for low dimensions. In fact, it can sometimes be achieved:

Proposition 4.10. Let ${\rm{\Gamma }}\geqslant {\rm{i}}{J}$ be a covariance matrix, then $G$ achieves the lower bound in equation (35) if there exists an orthonormal eigenvector basis ${\{{v}_{i}\}}_{i=1}^{2n}$ of Γ with ${v}_{i}^{T}{{Jv}}_{j}={\delta }_{i,n+j}$. Conversely, if G achieves the lower bound, then ${v}_{i}^{T}{{Jv}}_{j}=0$ for all normalised eigenvectors ${v}_{i},{v}_{j}$ of ${\rm{\Gamma }}$ with ${\lambda }_{i},{\lambda }_{j}\lt 1$.

Proof. Suppose that the lower bound in equation (35) is achieved. Via Weyl's inequalities (see [Bha96] theorem III.2.3), for all ${S}^{T}S\leqslant {\rm{\Gamma }}$ in the definition of G we have ${\lambda }_{i}^{\downarrow }({S}^{T}S)\leqslant {\lambda }_{i}^{\downarrow }({\rm{\Gamma }})$. For the particular S achieving G, this implies that for all ${\lambda }_{i}^{\downarrow }({\rm{\Gamma }})\lt 1$ we have ${\lambda }_{i}^{\downarrow }({S}^{T}S)={\lambda }_{i}^{\downarrow }({\rm{\Gamma }})$. But then ${\rm{\Gamma }}\geqslant {S}^{T}S$ implies that ${S}^{T}S$ and Γ share all eigenvectors corresponding to the smallest eigenvalue. Iteratively, every eigenvector of Γ with ${\lambda }_{i}({\rm{\Gamma }})\lt 1$ must be an eigenvector of ${S}^{T}S$ with the same eigenvalue.

Since the matrix diagonalising ${S}^{T}S$ also diagonalises ${{ \mathcal C }}^{-1}({S}^{T}S)$, the eigenvectors of the two matrices are the same. Now, since ${{ \mathcal C }}^{-1}({S}^{T}S)\in { \mathcal H }$ by reformulation (25), for any eigenvector ${v}_{i}$ of any eigenvalue ${{ \mathcal C }}^{-1}({\lambda }_{i})\lt 0$, ${{Jv}}_{i}$ is also an eigenvector of ${{ \mathcal C }}^{-1}({S}^{T}S)$ to the eigenvalue $-{{ \mathcal C }}^{-1}({\lambda }_{i})$, implying ${v}_{i}^{T}{{Jv}}_{j}=0$ for all $i,j$. By definition, this means that $\{{v}_{i},{{Jv}}_{j}\}$ forms a symplectic basis. Above, we already saw that the eigenvectors of Γ for ${\lambda }_{i}({\rm{\Gamma }})\lt 1$ are also eigenvectors of ${S}^{T}S$, hence ${v}_{i}^{T}{{Jv}}_{j}=0$ for all i such that ${\lambda }_{i}({\rm{\Gamma }})\lt 1$.

Conversely, suppose we have an orthonormal basis ${\{{v}_{i}\}}_{i=1}^{2n}$ such that ${v}_{i}^{T}{{Jv}}_{j}={\delta }_{i,j+n}$ (modulo $2n$ if necessary) for all eigenvectors of Γ, i.e. Γ is diagonalisable by a symplectic orthogonal matrix $\tilde{O}\in U(n)$. Then

Since ${\rm{\Gamma }}\geqslant {\rm{i}}{J}$ we have ${\lambda }_{i}{\lambda }_{n+i}\geqslant 1$. Assume that ${\lambda }_{i}\geqslant {\lambda }_{n+i}$ for all i = 1,...,n and that the ${\lambda }_{n+i}$ are ordered in decreasing order. Then ${\lambda }_{n+r}\lt 1\leqslant {\lambda }_{n+r-1}$ for some $r\leqslant n$ and

fulfils ${S}^{T}S\leqslant {\rm{\Gamma }}$ and obviously achieves the lower bound in equation (35).□

In contrast to this, the upper bound can be arbitrarily bad. For instance, consider the thermal state ${\rm{\Gamma }}=(2N+1)\cdot {\mathbb{1}}$ for increasing N. It can easily be seen that $G({\rm{\Gamma }})=0$, since ${\rm{\Gamma }}\geqslant {\mathbb{1}}\in {\rm{\Pi }}(n)$ and $F({\mathbb{1}})=0$, hence $G({\rm{\Gamma }})\leqslant 0$. However, the upper bound in equation (35) is $n/2\mathrm{log}(2N+1)\to \infty $ for $N\to \infty $, therefore arbitrarily bad.

We can achieve better upper bounds by using Williamson's normal form:

Proposition 4.11 (Williamson bounds).  Let ${\rm{\Gamma }}\in {{\mathbb{R}}}^{2n\times 2n}$ be such that ${\rm{\Gamma }}\geqslant {\rm{i}}J$ and consider its Williamson normal form ${\rm{\Gamma }}={S}^{T}{DS}$. Then:

Equation (37)

Proof. Since $D\geqslant {\mathbb{1}}$ via ${\rm{\Gamma }}\geqslant {\rm{i}}{J}$, the upper bound follows directly from the definition. Also, $F(S)\leqslant F({{\rm{\Gamma }}}^{1/2})$, which makes this bound trivially better than the spectral upper bound in equation (35).

The lower bound follows from:

using Weyl's inequalities once again, implying that since ${S}^{T}S\leqslant {\rm{\Gamma }}$, we also have $F{(S)}^{2}=F({S}^{T}S)\leqslant F({\rm{\Gamma }})$.□

The upper bound here can also be arbitrarily bad. One just has to consider ${\rm{\Gamma }}:= {S}^{T}(N\cdot {\mathbb{1}})S$ with ${S}^{2}=\mathrm{diag}(N-1,\ldots ,N-1,{(N-1)}^{-1},\ldots ,{(N-1)}^{-1})\in {Sp}(2n)$. Then ${\rm{\Gamma }}\geqslant {\mathbb{1}}$, i.e. $G({\rm{\Gamma }})=0$, but $F(S)\to \infty $ for $N\to \infty $.

Proposition 4.12. Let ${\rm{\Gamma }}\geqslant {\rm{i}}{J}$ be a covariance matrix. Then

Equation (38)

where $\pi (n)$ was defined in proposition 2.5 as the Lie algebra of the positive semidefinite symplectic matrices. This infimum can be computed efficiently as a semidefinite programme.

Proof. Recall that the logarithm is operator monotone on positive definite matrices. Using this, we have:

The last step is valid, because the eigenvalues of matrices ${\gamma }_{0}\in \pi (n)$ come in pairs $\pm {\lambda }_{i}$. Since the sum of all the singular values is just the trace-norm, we are done.

It remains to see that this can be computed by a semidefinite programme. First note that since the matrices $H\in \pi (n)$ are those symmetric matrices with ${HJ}+{JH}=0$, the constraints are already linear semidefinite matrix inequalities. The trace norm can be expressed as a semidefinite programme by standard reasoning [RFP10, VB96]:

which is clearly a semidefinite programme.□

Numerics for small dimensions suggest that this bound is mostly smaller than the spectral lower bounds.
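
A sketch of this semidefinite relaxation in Python using cvxpy is given below. The objective prefactor is an assumption on our part: it is chosen as 1/4 because F(S) sums the logarithms of the n largest singular values of S while the eigenvalues of an element of $\pi (n)$ come in ± pairs, so that the sum of its positive eigenvalues equals half its trace norm; the exact normalisation of equation (38) should be taken from the displayed formula, which is not reproduced above.

import numpy as np
import cvxpy as cp
from scipy.linalg import logm

def sdp_lower_bound(Gamma):
    d = Gamma.shape[0]
    n = d // 2
    J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])
    logG = np.real(logm(Gamma))
    logG = (logG + logG.T) / 2                 # symmetrise against numerical noise
    H = cp.Variable((d, d), symmetric=True)
    constraints = [H @ J + J @ H == 0,         # H lies in pi(n)
                   logG - H >> 0]              # relaxed constraint from log(S^T S) <= log(Gamma)
    problem = cp.Problem(cp.Minimize(cp.normNuc(H)), constraints)
    problem.solve()
    return 0.25 * problem.value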

5. An operational definition of the squeezing measure

We claim that G answers the question: given a state, what is the minimal amount of single-mode squeezers needed to prepare it? In other words, it quantifies the amount of squeezing needed for the preparation of a state.

5.1. Operations for state preparation and an operational measure for squeezing

We first specify the preparation procedure. Since we want to quantify squeezing, it seems natural that we may freely draw initial states from the vacuum or a thermal bath. Furthermore, we can perform an arbitrary number of the following operations for free:

  • (1)  
    Add ancillary states also from a thermal bath or the vacuum.
  • (2)  
    Add Gaussian noise.
  • (3)  
    Implement any gate from linear optics.
  • (4)  
    Perform Weyl-translations of the state.
  • (5)  
    Perform selective or non-selective Gaussian measurements such as homodyne or heterodyne detection.
  • (6)  
    Forget part of the state.
  • (7)  
    Create convex combinations of ensembles.

In addition, the following operation comes with an associated cost:
  • (8)  
    Implement single-mode squeezers at a cost of $\mathrm{log}(s)$, where s is the squeezing parameter.

All these operations are standard operations in quantum optics and they should capture all important Gaussian operations except for squeezing.

It is well-known that all of these operations are captured by the following set of operations on the covariance matrix (for a justification, see appendix D):

  • We can always draw N-mode states with $\gamma \in {{\mathbb{R}}}^{2N\times 2N}$ for any dimension N from the vacuum $\gamma ={\mathbb{1}}$ or a bath fulfiling $\gamma \geqslant {\mathbb{1}}$.
  • We can always add ancillary modes from the vacuum ${\gamma }_{\mathrm{anc}}={\mathbb{1}}$ or a bath $\gamma \geqslant {\mathbb{1}}$ and consider $\gamma \oplus {\gamma }_{\mathrm{anc}}$.
  • We can freely add noise with ${\gamma }_{\mathrm{noise}}\geqslant 0$ to our state, which is simply added to the covariance matrix of a state.
  • We can perform any beam splitter or phase shifter and in general any operation $S\in K(n)$, which translates to a map $\gamma \mapsto {S}^{T}\gamma S$ on covariance matrices of states.
  • We can perform any single-mode squeezer $S=\mathrm{diag}(1,\ldots ,1,s,{s}^{-1},1,\ldots ,1)$ for some $s\in {{\mathbb{R}}}_{+}$.
  • We can perform any Weyl-translation leaving the covariance matrix invariant.
  • Given two states with covariance matrices ${\gamma }_{1}$ and ${\gamma }_{2}$, we can always take their convex combination $p{\gamma }_{1}+(1-p){\gamma }_{2}$ for any $p\in [0,1]$.
  • At any point, we can perform a selective measurement of the system corresponding to a projection onto a finitely or infinitely squeezed state. Given a state with covariance matrix $\gamma =\left(\begin{array}{cc}A & B\\ {B}^{T} & C\end{array}\right)$ with the measured mode in the block C, homodyne detection of that mode maps $\gamma \mapsto A-B{(\pi C\pi )}^{{\rm{MP}}}{B}^{T}$ with $\pi =\mathrm{diag}(1,0)$,
    where MP denotes the Moore–Penrose pseudoinverse; the general case of a projection onto a finitely squeezed Gaussian state is given in lemma D.2 of appendix D.

Only operation (O4) comes at a cost of $\mathrm{log}(s)$; all other operations are free.
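For concreteness, the following Python sketch illustrates how the free operations act on covariance matrices (an illustration only, not the implementation used in this paper; the mode ordering and the block convention for the measurement follow the list above and are otherwise assumptions).

```python
# Minimal sketch of how the operations above act on covariance matrices.
# (O5), the Weyl translations, leave the covariance matrix unchanged and need no function.
import numpy as np
from scipy.linalg import block_diag

def add_ancilla(gamma, gamma_anc):
    """(O1): append ancillary modes, gamma -> gamma (+) gamma_anc (gamma_anc >= 1)."""
    return block_diag(gamma, gamma_anc)

def add_noise(gamma, gamma_noise):
    """(O2): add classical Gaussian noise, gamma -> gamma + gamma_noise (gamma_noise >= 0)."""
    return gamma + gamma_noise

def apply_symplectic(gamma, S):
    """(O3)/(O4): gamma -> S^T gamma S for a symplectic matrix S."""
    return S.T @ gamma @ S

def convex_combination(gamma1, gamma2, p):
    """(O6): mix two (centred) states with weights p and 1 - p."""
    return p * gamma1 + (1 - p) * gamma2

def homodyne_last_mode(gamma):
    """(O7): homodyne detection of the last mode, [[A, B], [B^T, C]] -> A - B (pi C pi)^MP B^T."""
    A, B, C = gamma[:-2, :-2], gamma[:-2, -2:], gamma[-2:, -2:]
    pi = np.diag([1.0, 0.0])                         # projection onto the measured quadrature
    return A - B @ np.linalg.pinv(pi @ C @ pi) @ B.T
```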

We are now ready to state our main theorem, which says that the minimal squeezing cost over all possible preparation procedures consisting of operations (1)–(8) is given by G.

Theorem 5.1. Let $\rho $ be a quantum state with covariance matrix $\gamma $. Consider arbitrary sequences

where ${\gamma }_{0}$ fulfils (O0) and every arrow corresponds to an arbitrary operation (O1)–(O5) or (O7). Using (O6), we can merge two sequences ${\vec{\gamma }}_{{N}_{1}}$ and ${\vec{\gamma }}_{{N}_{2}}$ to one resulting tree with ${\gamma }_{{N}_{1}+{N}_{2}+1}=\lambda {\gamma }_{{N}_{1}}+(1-\lambda ){\gamma }_{{N}_{2}}$ for some $\lambda \in (0,1)$. Iteratively, we can construct trees of any depth and width using operations (O1)–(O7).

Let ${{\mathfrak{O}}}_{N}(\gamma )$ be the set of such trees with $N$ operations ending with γ (i.e. ${\gamma }_{N}=\gamma $). Let ${\mathfrak{O}}(\gamma )={\bigcup }_{N=1}^{\infty }{{\mathfrak{O}}}_{N}(\gamma )$.

Furthermore, for any tree $\hat{\gamma }\in {{\mathfrak{O}}}_{N}(\gamma )$, let $\vec{s}={\{{s}_{i}\}}_{i=1}^{M}$ be the sequence of the largest singular values of any single-mode squeezer (O4) implemented along the sequence (in particular, $M\leqslant N$). Then

Equation (39)

5.2. Proof of the main theorem

Since we consider many different operations, the proof is rather lengthy; the main difficulty lies in showing that measurements do not squeeze. In order to increase readability, the proof is split into several lemmata.

Lemma 5.2. Let $\gamma \in {{\mathbb{R}}}^{2n\times 2n}$ be a covariance matrix, ${\gamma }_{0}\geqslant {\mathbb{1}}$, let $N\in {\mathbb{N}}$ and

Equation (40)

be any sequence of actions (O1)–(O5) or (O7). If we denote the cost (sum of the logarithm of the largest singular values of any symplectic matrix involved) of this sequence by c, then one can replace this sequence by:

Equation (41)

with ${\gamma }_{\mathrm{anc}}\geqslant {\mathbb{1}}$, ${\gamma }_{\mathrm{noise}}\geqslant 0$, $S\in {Sp}(2n)$ and ${ \mathcal M }$ a partial Gaussian measurement of the type specified in (O7). For this sequence, $c\geqslant F(S)$.

Proof. We prove the proposition by proving that given any chain ${\gamma }_{0}\to {\gamma }_{1}\to \cdots \,\to \,{\gamma }_{N}=\gamma $ as in (40), we can interchange all operations and obtain a chain as in equation (41). For readability, we will not always specify the size of the matrices and we will assume that $\gamma \geqslant {\rm{i}}\sigma $, ${\gamma }_{\mathrm{anc}}\geqslant {\mathbb{1}}$, ${\gamma }_{\mathrm{noise}}\geqslant 0$, and S a symplectic matrix, whenever the symbols arise:

  • (1)  
    We can combine any sequence ${\gamma }_{i}\to {\gamma }_{i+1}\to \cdots \,\to \,{\gamma }_{i+m}$ for some $m\in {\mathbb{N}}$, where each of the arrows corresponds to a symplectic transformation Sj, j = 1,...,m as in (O3) or (O4), into a single symplectic matrix $S\in {Sp}(2n)$ such that ${\gamma }_{i+m}={S}^{T}{\gamma }_{i}S$. Furthermore, lemma 3.4 implies $F(S)\leqslant {\sum }_{j}\mathrm{log}\,{s}_{1}^{\downarrow }({S}_{j})$, hence this recombination of steps only lowers the amount of squeezing.
  • (2)  
    Any sequence $\gamma \to {S}^{T}\gamma S\to {S}^{T}\gamma S+{\gamma }_{\mathrm{noise}}$ can be converted into a sequence $\gamma \to {S}^{T}(\gamma +{\tilde{\gamma }}_{\mathrm{noise}})S$ with the same S and hence the same costs by setting ${\tilde{\gamma }}_{\mathrm{noise}}:= {S}^{-T}{\gamma }_{\mathrm{noise}}{S}^{-1}\geqslant 0$.
  • (3)  
    Any sequence $\gamma \to {S}^{T}\gamma S\to {S}^{T}\gamma S\oplus {\gamma }_{\mathrm{anc}}$ can be converted into a sequence $\gamma \to \gamma \oplus {\gamma }_{\mathrm{anc}}\to {\tilde{S}}^{T}(\gamma \oplus {\gamma }_{\mathrm{anc}})\tilde{S}$ by setting $\tilde{S}=S\oplus {\mathbb{1}}$ with ${\mathbb{1}}$ of the same dimension as ${\gamma }_{\mathrm{anc}}$. Since we only add the identity, we have $F(\tilde{S})={\sum }_{i}\mathrm{log}{s}_{i}^{\downarrow }(\tilde{S})=F(S)$ and the costs do not increase.
  • (4)  
    Any sequence $\gamma \to \gamma +{\gamma }_{\mathrm{noise}}\to (\gamma +{\gamma }_{\mathrm{noise}})\oplus {\gamma }_{\mathrm{anc}}$ can be converted into a sequence $\gamma \to \gamma \oplus {\gamma }_{\mathrm{anc}}\to \gamma \oplus {\gamma }_{\mathrm{anc}}+{\tilde{\gamma }}_{\mathrm{noise}}$ by setting ${\tilde{\gamma }}_{\mathrm{noise}}={\gamma }_{\mathrm{noise}}\oplus 0\geqslant 0$, which is again a valid noise matrix. As no operation of type (O4) is involved, the squeezing costs do not change.

In the next step we consider measurements. We will only consider homodyne detection, since the proof is exactly the same for arbitrary Gaussian measurements of type (O7). Given a covariance matrix γ, we assume a decomposition

as in the definition of (O7) with $\pi =\mathrm{diag}(1,0)$.

  • (5)  
    Any sequence $\gamma \to { \mathcal M }(\gamma )\to {S}^{T}{ \mathcal M }(\gamma )S$ can be converted into a sequence $\gamma \to {\tilde{S}}^{T}\gamma \tilde{S}\to { \mathcal M }({\tilde{S}}^{T}\gamma \tilde{S})$ by setting $\tilde{S}=S\oplus {{\mathbb{1}}}_{2}$. To see this, write ${S}^{T}{ \mathcal M }(\gamma )S={S}^{T}{AS}-{S}^{T}C{(\pi B\pi )}^{{\rm{MP}}}{C}^{T}S$ and
    hence the final covariance matrices are the same. By the same reasoning as in (3), the costs are equivalent.
  • (6)  
    Any sequence $\gamma \to { \mathcal M }(\gamma )\to { \mathcal M }(\gamma )+{\gamma }_{\mathrm{noise}}$ can be converted into a sequence $\gamma \to \gamma +{\tilde{\gamma }}_{\mathrm{noise}}\to { \mathcal M }(\gamma +{\tilde{\gamma }}_{\mathrm{noise}})$ by setting ${\tilde{\gamma }}_{\mathrm{noise}}={\gamma }_{\mathrm{noise}}\oplus 0$, with 0 on the last mode being measured. Since no symplectic matrices are involved, the costs are equivalent.
  • (7)  
    Any sequence $\gamma \to { \mathcal M }(\gamma )\to { \mathcal M }(\gamma )\oplus {\gamma }_{\mathrm{anc}}$ can be changed into a sequence $\gamma \to \gamma \oplus {\gamma }_{\mathrm{anc}}\to \tilde{{ \mathcal M }}(\gamma \oplus {\gamma }_{\mathrm{anc}})$, where the measurement $\tilde{{ \mathcal M }}$ measures the last mode of γ, i.e.
    Clearly, the resulting covariance matrices of the two sequences are the same and the costs are equivalent.

We can now easily prove the lemma. Let ${\gamma }_{0}\to \cdots \to \,{\gamma }_{n}$ be an arbitrary sequence with operations of type (O1)–(O5) or (O7). We can first move all measurements to the right of the sequence, i.e. we first perform all operations of type (O1)–(O5) and then all measurements. This is done using the observations above. Note also that this step is similar to the quantum circuit idea to 'perform all measurements last' (see [NC00], chapter 4).

Similarly, we can combine operations of type (O3) and (O4) and rearrange the other operations to obtain a new sequence as in equation (41) with at most the costs of the sequence ${\gamma }_{1}\to \cdots \,\to \,{\gamma }_{m}$ we started with.□

We can now slowly work towards theorem 5.1:

Lemma 5.3. Let $\gamma \in {{\mathbb{R}}}^{2n\times 2n}$ be a covariance matrix, then

Equation (42)

Proof. First note that for any $\gamma \geqslant {\rm{i}}\sigma $, we can find $S\in {Sp}(2n)$, ${\gamma }_{0}\in {{\mathbb{R}}}^{2n\times 2n}$ with ${\gamma }_{0}\geqslant {\mathbb{1}}$ and ${\gamma }_{\mathrm{noise}}\in {{\mathbb{R}}}^{2n\times 2n}$ with ${\gamma }_{\mathrm{noise}}\geqslant 0$ such that $\gamma ={S}^{T}({\gamma }_{0}+{\gamma }_{\mathrm{noise}})S$ by using Williamson's theorem; hence the feasible set is never empty. The lemma then follows from two observations: for any $\gamma ={S}^{T}({\gamma }_{0}\oplus {\gamma }_{\mathrm{anc}}+{\gamma }_{\mathrm{noise}})S$ we have $\gamma \geqslant {S}^{T}S$, since $({\gamma }_{0}\oplus {\gamma }_{\mathrm{anc}}+{\gamma }_{\mathrm{noise}})\geqslant {\mathbb{1}}$; conversely, for any $\gamma \geqslant {S}^{T}S$, defining ${\gamma }_{0}:= {S}^{-T}\gamma {S}^{-1}\geqslant {\mathbb{1}}$, we have $\gamma ={S}^{T}{\gamma }_{0}S$. □

As an intermediate step we introduce the following notation:

Equation (43)

Then we have:

Lemma 5.4. For $\gamma \in {{\mathbb{R}}}^{2n\times 2n}$ a covariance matrix, we have

Equation (44)

Proof. This follows from lemma 5.3:

Equation (45)

by taking the infimum over all measurements last. □

Note here that equation (45), together with proposition 5.5 below, finishes the proof of proposition 4.6 via:

Equation (46)

for $a\geqslant 1$, using that measuring the last modes yields ${ \mathcal M }(\gamma \oplus a{{\mathbb{1}}}_{{n}_{2}})=\gamma $, so that $\gamma \oplus a{{\mathbb{1}}}_{{n}_{2}}$ lies in the feasible set of $\tilde{G}(\gamma )=G(\gamma )$.

Proposition 5.5. For $\gamma \in {{\mathbb{R}}}^{2n\times 2n}$ a covariance matrix we have

This proposition shows that G is operational if we exclude convex combinations (and therefore also non-selective measurements).

Proof. Using lemma 5.4, the proof of this proposition reduces to the question whether:

Equation (47)

Since we do not need to use measurements, ≤ is obvious.

Let $\tilde{\gamma }\geqslant \hat{\gamma }\geqslant {\rm{i}}\sigma $ with ${ \mathcal M }(\tilde{\gamma })=\gamma $ for some measurement ${ \mathcal M }$. Our first claim is that

Equation (48)

${ \mathcal M }(\hat{\gamma })\geqslant {\rm{i}}\sigma $ is clear from the fact that $\hat{\gamma }$ is a covariance matrix and a measurement takes states to states. $\gamma \geqslant { \mathcal M }(\hat{\gamma })$ is proved using Schur complements. Let ${ \mathcal M }$ be a Gaussian measurement as in equation (68) with ${\gamma }_{G}=\mathrm{diag}(d,1/d)$ with $d\in {{\mathbb{R}}}^{+}$. It is well-known that

where S denotes the Schur complement of the block in the lower-right corner of the matrix. For homodyne measurements, we take the limit $d\to \infty $. Since for any $\tilde{\gamma }\geqslant \hat{\gamma }\geqslant 0$, the Schur complements of the lower right block fulfil ${\tilde{\gamma }}^{S}\geqslant {\hat{\gamma }}^{S}\geqslant 0$ (see [Bha07], exercise 1.5.7), we have $\gamma \geqslant { \mathcal M }(\hat{\gamma })$ as claimed in equation (48).

Next, we claim

Equation (49)

To prove this claim, note that via the monotonicity of the exponential function on ${\mathbb{R}}$, it suffices to prove

when we assume $\hat{\gamma }\in {{\mathbb{R}}}^{2n\times 2n}$ and ${ \mathcal M }(\hat{\gamma })\in {{\mathbb{R}}}^{2m\times 2m}$ with $m\leqslant n$. If we write

then the state after measurement is given by ${ \mathcal M }(\hat{\gamma })=\hat{A}-\hat{C}{(\hat{B}+\mathrm{diag}(d,1/d))}^{-1}{\hat{C}}^{T}$ or the limit $d\to \infty $ for homodyne measurements. In any case $\hat{C}{(\hat{B}+\mathrm{diag}(d,1/d))}^{-1}{\hat{C}}^{T}\geqslant 0$ and ${ \mathcal M }(\hat{\gamma })\leqslant \hat{A}$ and therefore, by Weyl's inequalities, also

Now we use Cauchy's interlacing theorem (see [Bha96], corollary III.1.5): as $\hat{A}$ is a submatrix of $\hat{\gamma }$, we have ${\lambda }_{i}^{\downarrow }(\hat{A})\leqslant {\lambda }_{i}^{\downarrow }(\hat{\gamma })$ for all $i=1,\ldots ,2m$. Since at least m eigenvalues of $\hat{A}$ are greater than or equal to one and at least n eigenvalues of $\hat{\gamma }$ are greater than or equal to one, this implies

Equation (50)

In particular, this proves equation (49).

We can then complete the proof: let $\tilde{\gamma }\geqslant \hat{\gamma }\geqslant {\rm{i}}\sigma $ with ${ \mathcal M }(\tilde{\gamma })=\gamma $ as in equation (47). We have just seen that this implies $\gamma \geqslant { \mathcal M }(\hat{\gamma })\geqslant {\rm{i}}\sigma $ via equation (48) and furthermore that $F({\hat{\gamma }}^{1/2})\geqslant F({ \mathcal M }{(\hat{\gamma })}^{1/2})$ via equation (49). But this means that we have found $\overline{\gamma }:= { \mathcal M }(\hat{\gamma })$ such that $\gamma \geqslant \overline{\gamma }\geqslant {\rm{i}}\sigma $. Hence $\overline{\gamma }$ is in the feasible set of the right-hand side of (47) and $F({\tilde{\gamma }}^{1/2})\geqslant F({\overline{\gamma }}^{1/2})$, which implies ≥ in equation (47).□

Finally, we can prove theorem 5.1 by also covering convex combinations:

Proof. Let $\gamma \in {{\mathbb{R}}}^{2n\times 2n}$ be a covariance matrix. First consider only sequences $\vec{\gamma }$: we replace any sequence by the special type of sequence of lemma 5.3. For these sequences, we have seen that the minimum cost is given by $G(\gamma )$ in proposition 5.5.

However, we explicitly excluded convex combinations (O6) by considering only sequences and not trees $\hat{\gamma }$: consider a tree of operations (O1)–(O7) which has γ at its root and ${\gamma }_{0}={\mathbb{1}}$ as leaves. Let us consider any node closest to the leaves. At such a node, we start with two covariance matrices ${\gamma }_{1}$ and ${\gamma }_{2}$ that were previously constructed without using convex combinations and with costs $G({\gamma }_{1})$ and $G({\gamma }_{2})$. The combined matrix would be $\tilde{\gamma }:= \lambda {\gamma }_{1}+(1-\lambda ){\gamma }_{2}$ for some $\lambda \in (0,1)$ and the costs would be $\lambda G({\gamma }_{1})+(1-\lambda )G({\gamma }_{2})$.

By convexity of G (see theorem 4.3):

which means that we can find a sequence (without any convex combinations) producing $\lambda {\gamma }_{1}+(1-\lambda ){\gamma }_{2}$ that is at most as expensive as first producing ${\gamma }_{1}$ and ${\gamma }_{2}$ and then taking the convex combination. Iterating this argument, we can eliminate every node from the tree and replace the tree by a sequence of operations (O1)–(O5) and (O7) that is at most as expensive as the tree; hence allowing trees does not lower the cost.□

5.3. The squeezing measure as a resource measure

We have now seen that the measure G can be interpreted as a measure of the amount of single-mode squeezing needed to create a state ρ. Let us now take a different perspective, which is the analogue of the entanglement of formation for squeezing: consider covariance matrices of the form

Equation (51)

These are single-mode squeezed states with squeezing parameter $s\geqslant 1$. We will now allow these states as resources and ask: given a (Gaussian) state ρ with covariance matrix γ, what is the minimal amount of these resources needed to construct γ, if we can freely transform the state by the same operations as before except squeezing, i.e. (O1)–(O7) without (O4)?

The corresponding measure is once again G:

Theorem 5.6. Let $\rho $ be an $n$-mode state with covariance matrix $\gamma \in {{\mathbb{R}}}^{2n\times 2n}$. Then

Equation (52)

where ${ \mathcal T }\,:{{\mathbb{R}}}^{2m\times 2m}\to {{\mathbb{R}}}^{2n\times 2n}$ is a combination of the operations (1)–(6) above.

Proof. ≤: Note that for any feasible $S\in {Sp}(2n)$ in $G(\gamma )$, i.e. any S with ${S}^{T}S\leqslant \gamma $, we can find $O\in {Sp}(2n)\cap O(2n)$ and $D={\bigoplus }_{i=1}^{n}{\gamma }_{{s}_{i}}$ with ${S}^{T}S={O}^{T}{DO}$ via the Euler decomposition. Using that the Euler decomposition minimises F, we have $F(S)=\tfrac{1}{2}F(D)={\sum }_{i=1}^{n}\tfrac{1}{2}\mathrm{log}({s}_{i})$. But then, since we can find ${\gamma }_{\mathrm{noise}}\geqslant 0$ such that $\gamma ={O}^{T}{\bigoplus }_{i=1}^{n}{\gamma }_{{s}_{i}}O+{\gamma }_{\mathrm{noise}}$, we have that D is a feasible resource state to produce γ. This implies ${G}^{\mathrm{resource}}(\gamma )\leqslant G(\gamma )$.

≥: For the other direction, the proof proceeds exactly as the proof of theorem 5.1. First, we exclude convex combinations. Then, we realise that we can change the order of the different operations (even if we include adding resource states during any stage of the preparation process) according to lemma 5.2, making sure that any preparation procedure can be implemented via:

where $O\in {Sp}(2m+2{m}^{\prime })\cap O(2m+2{m}^{\prime })$, ${\gamma }_{\mathrm{noise}}\in {{\mathbb{R}}}^{(2m+2{m}^{\prime })\times (2m+2{m}^{\prime })}$ with ${\gamma }_{\mathrm{noise}}\geqslant 0$ and ${ \mathcal M }$ a measurement. Now the only difference to the proof of theorem 5.1 is that we had the vacuum ${\mathbb{1}}$ instead of ${\bigoplus }_{i=1}^{m}{\gamma }_{{s}_{i}}\oplus {{\mathbb{1}}}_{2{m}^{\prime }}$ and an arbitrary symplectic matrix S instead of O, but the two ways of writing the maps are completely interchangeable, so that the proof proceeds as in theorem 5.1.□

We could call this measure the '(Gaussian) squeezing of formation', as it is the analogue of the Gaussian entanglement of formation. Note also that the measure is similar to the Gaussian entanglement of formation as defined in [Wol+04]. One natural further question is whether 'distillation of squeezing' is possible with Gaussian operations. For the minimal eigenvalue it is impossible in a certain sense by [Kra+03], while for non-Gaussian states it is possible and has been investigated in many papers (see [Fil13, Hee+06] and references therein). In our case, it is not immediately clear whether extraction of single-mode squeezed states with less squeezing is possible or not. This could be investigated in future work.

6. Calculating the squeezing measure

We have seen that the measure G is operational. However, to be useful, we need a way to compute it.

6.1. Analytical solutions

Proposition 6.1. Let $n=1$, then $G({\rm{\Gamma }})=-\tfrac{1}{2}\,{\min }_{i}\,\mathrm{log}({\lambda }_{i}({\rm{\Gamma }}))$ for all ${\rm{\Gamma }}\in {{\mathbb{R}}}^{2n\times 2n}$.

Proof. Note that this is the lower bound in proposition 4.9, hence $-\tfrac{1}{2}\,{\min }_{i}\,\mathrm{log}({\lambda }_{i}({\rm{\Gamma }}))\leqslant G({\rm{\Gamma }})$. Now consider the diagonalisation ${\rm{\Gamma }}=O\,\mathrm{diag}({\lambda }_{1},{\lambda }_{2}){O}^{T}$ with $O\in {SO}(2)$ and assume ${\lambda }_{1}\geqslant {\lambda }_{2}$. Then ${\lambda }_{2}^{-1}\leqslant {\lambda }_{1}$, since otherwise ${\rm{\Gamma }}\not\geqslant {\rm{i}}{J}$.

Consider $\mathrm{diag}({\lambda }_{1},{\lambda }_{2})\geqslant {O}^{-T}{S}^{T}{{SO}}^{-1}$ for some $S\in {Sp}(2)$ such that ${S}^{T}S$ has eigenvalues $s\geqslant 1$ and ${s}^{-1}$. Since $\mathrm{diag}({\lambda }_{1},{\lambda }_{2})\geqslant {O}^{-T}{S}^{T}{{SO}}^{-1}$, this implies in particular that ${s}^{-1}\leqslant {\lambda }_{2}$ by Weyl's inequality. Since $F({S}^{T}S)=\mathrm{log}s$, in order to minimise F(S) over ${S}^{T}S\leqslant {\rm{\Gamma }}$, we need to maximise ${s}^{-1}$. Setting ${s}^{-1}={\lambda }_{2}$ we obtain $s={\lambda }_{2}^{-1}\leqslant {\lambda }_{1}$ and $\mathrm{diag}({\lambda }_{1},{\lambda }_{2})\geqslant \mathrm{diag}(s,{s}^{-1})$. Since ${SO}(2)=K(1)$, ${S}^{T}S:= {O}^{T}\,\mathrm{diag}(s,{s}^{-1})O\leqslant {\rm{\Gamma }}$ is the minimising matrix in G and $G({\rm{\Gamma }})=F(S)=\tfrac{1}{2}\mathrm{log}{\lambda }_{2}^{-1}$.□
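As a quick numerical illustration of proposition 6.1 (a minimal sketch; clipping the value at zero for unsqueezed states, where all eigenvalues are at least one, reflects $G\geqslant 0$):

```python
# Sketch: the single-mode formula of proposition 6.1, G(Gamma) = -1/2 * log(min eigenvalue).
import numpy as np

def G_single_mode(Gamma):
    lam_min = np.linalg.eigvalsh(Gamma).min()
    return max(0.0, -0.5 * np.log(lam_min))   # zero for unsqueezed states (lam_min >= 1)

# A rotated single-mode squeezed state with squeezing parameter s
s, theta = 2.0, 0.3
O = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
Gamma = O @ np.diag([s, 1.0 / s]) @ O.T
print(G_single_mode(Gamma))                    # approximately 0.5 * log(2)
```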

Proposition 6.2. Let $\rho $ be a pure, Gaussian state with covariance matrix ${\rm{\Gamma }}\in {{\mathbb{R}}}^{2n\times 2n}$. Then $G({\rm{\Gamma }})=F({{\rm{\Gamma }}}^{1/2})$.

Proof. From proposition 2.2, we know that $\det ({\rm{\Gamma }})=1$ in particular. Therefore, the bounds in proposition 4.11 are tight and $G({\rm{\Gamma }})=F({{\rm{\Gamma }}}^{1/2})$. □

6.2. Numerical calculations using Matlab

The crucial observation for computing the squeezing measure numerically is given in lemma 4.4: if we use G in the form of equation (25), we know that the function to be minimised is convex on ${ \mathcal H }$. In general, convex optimisation with convex constraints is efficiently implementable and there is a large literature on the topic (see [BV04] for an overview).

In our case, several problems occur when performing the convex optimisation:

  • (1)  
    The function f in equation (28) is highly nonlinear. It is also not differentiable at eigenvalue crossings of A + iB or $H\in { \mathcal H }$. In particular, it is not differentiable when one of the eigenvalues becomes zero, which is to be expected at the minimum.
  • (2)  
    While the constraints ${{ \mathcal C }}^{-1}(\gamma )\geqslant H$ and ${\mathbb{1}}\gt H\gt -{\mathbb{1}}$ are linear in matrices, they are nonlinear in simple parameterisations of matrices.
  • (3)  
    For γ on the boundary of the set of allowed density operators, the set of feasible solutions might not have an inner point.

The first and second problems imply that most optimisation methods are unsuitable, as they are either gradient-based or need more problem structure. It also means that there is no guarantee for good stability of the solutions. The third problem implies that interior point methods become unsuitable on the boundary, which limits applications; for instance, the example of the next section (see equation (53)) lies on the boundary. As a proof-of-principle implementation, we used the Matlab-based solver SolvOpt (for details see the manual [KK97]). We believe our implementation could be made more efficient and more stable, but it seems to work well in most cases for fewer than ten modes. More information on the programme is provided in appendix E.
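To make the structure of the optimisation concrete, the following Python sketch evaluates an objective and the two constraints in the Cayley picture. It is reconstructed from the constraints quoted above and from $G(\gamma )=\mathrm{inf}\{F(S):{S}^{T}S\leqslant \gamma \}$; the exact parameterisation of equation (28) used by the Matlab code may differ.

```python
# Sketch only: objective and constraints of the convex programme in the Cayley picture.
# Reconstructed from the constraints -1 < H < 1 and C^{-1}(gamma) >= H quoted above; the
# exact form of equation (28) and its parameterisation are not reproduced here.
import numpy as np

def cayley(H):
    I = np.eye(H.shape[0])
    return (I + H) @ np.linalg.inv(I - H)

def cayley_inv(S):
    I = np.eye(S.shape[0])
    return (S - I) @ np.linalg.inv(S + I)

def objective(H):
    """F(C(H)^{1/2}): half the sum of the logarithms of the eigenvalues of C(H) above one."""
    lam = np.linalg.eigvalsh(cayley(H))
    return 0.5 * np.sum(np.log(lam[lam > 1.0]))

def constraint_margins(H, gamma):
    """Non-negative values mean the constraints -1 < H < 1 and H <= C^{-1}(gamma) hold."""
    margin_unit = 1.0 - np.abs(np.linalg.eigvalsh(H)).max()
    margin_gamma = np.linalg.eigvalsh(cayley_inv(gamma) - H).min()
    return margin_unit, margin_gamma

gamma = np.diag([2.0, 0.6, 1.5, 1.0])
print(objective(np.zeros((4, 4))), constraint_margins(np.zeros((4, 4)), gamma))
```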

6.3. Squeezing-optimal preparation for certain three-mode separable states

Let us now work with a particular example that has been studied in the quantum information literature. In [MK08], Mišta Jr and Korolkova define the following three-parameter family of three-mode states, where the modes are labelled $A,B,C$:

Equation (53)

with

where $a=\cosh (2r)$, $c=\sinh (2r)$, $\tan \phi ={e}^{-2r}\sinh (2d)+\sqrt{1+{e}^{-4r}{\sinh }^{2}(2d)}$. The remaining parameters are $d\geqslant r\gt 0$ and $x\geqslant 0$. For

the state becomes fully separable [MK08]. The state as such is a special case of a bigger family described in [Gie+01]. In [MK08], it was used to entangle two systems at distant locations using fully separable mediating ancillas (here the system labelled C). Therefore, Mišta Jr and Korolkova also considered an LOCC procedure to prepare the state characterised by (53). For our purposes, this is less relevant and we allow for arbitrary preparations of the state. Such an unrestricted preparation was also given in [MK08], by first preparing modes A and B each in a pure squeezed state with position quadratures ${e}^{2(d-r)}$ and ${e}^{2(d+r)}$; a vacuum mode C is added and $x({q}_{1}{q}_{1}^{T}+{q}_{2}{q}_{2}^{T})$ is added as random noise. Therefore, the squeezing needed to produce this state with this protocol is given by

Equation (54)

We numerically approximated the squeezing measure for ${\gamma }_{{ABC}}$, choosing $x={x}_{\mathrm{sep}}$, which leaves a two-parameter family of states. We chose parameters d and r according to

Equation (55)

with $i,j\in \{1,\ldots ,30\}$ for a total of 900 data points. Since, as described above, the algorithm is not an interior point method, we checked the result by re-preparing the state in the following way:

  • (1)  
    Let S be the symplectic matrix at the optimum found by SolvOpt for a covariance matrix ${\gamma }_{{ABC}}$.
  • (2)  
    Calculate ${S}^{-T}{\gamma }_{{ABC}}{S}^{-1}$ and its lowest eigenvalue ${\lambda }_{2n}$.
  • (3)  
    Define $\tilde{\gamma }:= {S}^{-T}{\gamma }_{{ABC}}{S}^{-1}+(1-\min \{1,{\lambda }_{2n}\}){\mathbb{1}}\geqslant {\mathbb{1}}$. Calculate the largest singular value of ${S}^{T}\tilde{\gamma }S-\gamma $.

If S is a feasible point, then ${S}^{T}\tilde{\gamma }S=\gamma $. Since it is obvious how to prepare $\tilde{\gamma }$ with the operations specified in section 5, the largest singular value of ${S}^{T}\tilde{\gamma }S-\gamma $ is an indicator of how well we can approximate the state we want to prepare by a state with comparably low squeezing costs.
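A sketch of this check (for illustration only; the corresponding functionality is part of the Matlab code of appendix E, and the function name below is ours):

```python
# Sketch of the feasibility/re-preparation check described above (illustrative only).
import numpy as np

def preparation_error(S, gamma_abc):
    # Steps 1-2: undo the candidate symplectic S and find the smallest eigenvalue
    Sinv = np.linalg.inv(S)
    core = Sinv.T @ gamma_abc @ Sinv
    lam_min = np.linalg.eigvalsh(core).min()
    # Step 3: lift the state above the vacuum and re-apply S
    gamma_tilde = core + (1.0 - min(1.0, lam_min)) * np.eye(core.shape[0])
    # Largest singular value of the difference serves as the preparation error
    return np.linalg.norm(S.T @ gamma_tilde @ S - gamma_abc, ord=2)
```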

The results of the numerical computation are shown in figure 1. We computed the minimum both with numerical and with analytical subgradients and took the value with the better approximation error. On rare occasions, one of the two variants failed to find a minimum; possible reasons for this are discussed in appendix E. The optimal values computed by the algorithm are close to the lower bound and considerably better than the upper bound and the costs obtained from equation (54). One can easily see that ${\gamma }_{{ABC}}$ cannot achieve the spectral lower bound as the assumptions of lemma 4.10 are not met.

Figure 1. Results of numerical calculations (formulas for d and r in equation (55)). On the upper figure, the lower range of points (green in the online version) are the best lower bound, the middle points (blue) denote the value of the objective function at the minimum found by SolvOpt and the upper points (red) denote the squeezing costs of the preparation protocol of [MK08] (equation (54)). The lower figure shows the preparation error, which is mostly below 10−6.

7. Discussion of modifications to allowed operations

In experiments, squeezing of a state is most commonly measured by the logarithm of the smallest eigenvalue (up to a constant) and the unit is usually referred to as decibel (dB) [Lvo15]. We know of no operational interpretation of this measure similar to the one given in section 5, and the measure is not natural for multimode states.

In contrast, G is a natural measure for multimode states. However, squeezing is not only experimentally challenging in itself; it becomes much harder if we want to achieve a larger amount of single-mode squeezing. Currently, the highest amount of squeezing obtained in quantum optical systems seems to be about $13\,\mathrm{dB}$ (see [And+15]). In other words, the two states ρ and ${\rho }^{\prime }$ with covariance matrices

Equation (56)

will not be equally hard to prepare although $G(\gamma )=G({\gamma }^{\prime })$. This is due to the fact that we quantified the cost of a single-mode squeezer by $\mathrm{log}s$.

To amend this, one could propose an easy modification to the definition of F in equation (11):

Equation (57)

by inserting another function $g\ :{\mathbb{R}}\to {\mathbb{R}}$ to make sure that for the corresponding measure ${G}_{g}(\rho )\equiv {G}_{g}(\gamma )$, we have ${G}_{g}(\gamma )\ne {G}_{g}({\gamma }^{\prime })$ in equation (56). We impose the following natural restrictions on g:

  • We need $g(1)=1$ since ${G}_{g}(\rho )$ should be zero for unsqueezed states.
  • Squeezing should get harder with larger parameter, hence g should be monotonically increasing.
  • For simplicity, we assume g to be differentiable.

Let us first consider squeezing operations and the measure Fg. We proved in proposition 3.3 and theorem 3.5 that F is minimised by the Euler decomposition. A crucial part was given by lemma 3.4. In order to be useful for applications, we must require the same to be true for Fg, i.e.

This puts quite strong constraints on g: considering n = 1 and assuming that S and ${S}^{\prime }$ are diagonal with ordered singular values, this implies that g must fulfil $g({xy})\leqslant g(x)g(y)$ for $x,y\geqslant 1$. This submultiplicativity constraint rules out all interesting classes of functions: assume for instance that $g(2)=c$, then $g({2}^{n})\leqslant {c}^{n}$, where equality is attained if $g(x)=c\cdot x$. Therefore, all submultiplicative functions g(x) for $x\geqslant 1$ must lie below $g(x)=c\cdot x$ at least periodically. Hence, lemma 3.4 does not hold if we consider faster-growing functions g. This implies that one could make the measure arbitrarily small by splitting a single-mode squeezer into many successive single-mode squeezers with smaller squeezing parameters, which does not reflect experimental reality.
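As a concrete illustration (with hypothetical numbers), assume the modified cost of a single squeezer with parameter $s$ is $\mathrm{log}\,g(s)$ and take the faster-than-multiplicatively growing choice $g(x)={{\rm{e}}}^{x-1}$, which satisfies $g(1)=1$: a single squeezer with $s=4$ would be assigned the cost $\mathrm{log}\,g(4)=3$, whereas two successive squeezers with $s=2$ each, which compose to the same operation, would only be assigned $2\,\mathrm{log}\,g(2)=2$. Splitting the squeezer thus lowers the assigned cost even though the physical operation has not become any easier.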

A way to circumvent the failure of lemma 3.4 would be to work with the 'squeezing of formation' measure. Likewise, one could require that there be only one operation of type (O4), as specified in section 5, in any preparation procedure. In that case we have:

Proposition 7.1. If $g\,:{\mathbb{R}}\to {\mathbb{R}}$ fulfils

  • (1)  
    $\mathrm{log}\,\circ \,g\,\circ \,{ \mathcal C }$ is convex on $(1,\infty )$,
  • (2)  
    $\mathrm{log}(g(\exp (t)))$ is convex and monotone increasing in $t$,

then the squeezing of formation measure Gg is still operational, i.e. theorem 5.1 still holds.

Proof. The first condition replaces the $\mathrm{log}$-convexity of the Cayley transform in the proof of theorem 4.3, making the measure convex. Using [Bha96], II.3.5 (v), the second condition makes sure that equation (50) still holds. The second condition can probably be relaxed while the proof of theorem 5.1 is still applicable. A function g fulfilling these prerequisites is $g(x)=\exp (x)$, which would correspond to a squeezing cost increasing linearly in the squeezing parameter. One could even introduce a cutoff after which g would be infinite. □

A simpler way to address the problem illustrated by equation (56) would be to consider the measures G and ${G}_{\mathrm{minEig}}$ together (calculating ${G}_{\mathrm{minEig}}$ both for the state and for the minimal preparation procedure in G).

Another problem is associated with the form of the Hamiltonian (1). In the lab, the Hamiltonians that can be implemented might not be single-mode squeezers, but other operations such as symmetric two-mode squeezers (e.g. [SZ97], chapter 2.8). It is clear how to define a measure ${G}^{\prime }$ for these kinds of squeezers. Using the Euler decomposition, G is a lower bound to ${G}^{\prime }$, but we did not investigate this any further.

Acknowledgments

MI thanks Konstantin Pieper for discussions about convex optimisation and Alexander Müller-Hermes for discussions about Matlab. MI is supported by the Studienstiftung des deutschen Volkes.

Appendix A.: Preliminaries for the proof of theorem 3.5

Let us collect facts about ordinary differential equations needed in the proof:

Proposition A.1. Consider the following system of differential equations for $x\,:[0,1]\to {{\mathbb{R}}}^{2n}$:

Equation (58)

where $A\in {L}^{\infty }([0,1],{\mathfrak{sp}}(2n))$. Then this system has a unique solution, which is linear in ${x}_{s}$ and defined on all of $[0,1]$ such that we can define a map

via $x{(t)}^{T}={x}_{s}^{T}U(s,t)$ called the propagator of (58) that fulfils:

  • (1)  
    U is continuous and differentiable almost everywhere.
  • (2)  
    $U(s,\cdot )$ is absolutely continuous in t.
  • (3)  
    $U(t,t)={\mathbb{1}}$ and $U(s,r)U(r,t)=U(s,t)$ for all $r,s,t\in [0,1]$.
  • (4)  
    $U{(s,t)}^{-1}=U(t,s)$ for all $s,t\in [0,1]$.
  • (5)  
    U is the unique generalised (i.e. almost everywhere) solution to the initial value problem
    Equation (59)
    on $C({[0,1]}^{2},{{\mathbb{R}}}^{2n\times 2n})$.
  • (6)  
    If $A(t)=A$ does not depend on t, then $S(r)=\exp ({rA})$ solves equation (59) with $U(s,t):= S(t-s)$.
  • (7)  
    for all $s,t\in [0,1]$:
  • (8)  
    $U(s,t)\in {Sp}(2n)$ for all $t,s\in [0,1]$ and $\gamma (t)=U(0,t)$ fulfils equation (9) with $\gamma (0)={\mathbb{1}}$.

Proof. The proof of this (except for the part about $U(s,t)\in {Sp}(2n)$) can be found in [Son98] (theorem 55 and lemma C.4.1) for the transposed differential equation $\dot{x}(t)=A(t)x(t)$.

For the last part, note that since $U(s,s)={\mathbb{1}}\in {Sp}(2n)$, we have $U{(s,s)}^{T}{JU}(s,s)=J$. We can now calculate almost everywhere:

since $A(t)\in {\mathfrak{sp}}(2n)$ and therefore ${A}^{T}(t)J-{JA}(t)=0$.

But this implies $U{(t,s)}^{T}{JU}(t,s)=J$, hence U is symplectic. Obviously, $U(0,t)$ solves equation (9).□

We will also need another well-known lemma from functional analysis:

Lemma A.2. Let $A\ :[0,1]\to {\mathfrak{sp}}(2n)$, $A\in {L}^{\infty }([0,1],{{\mathbb{R}}}^{2n\times 2n})$. Then A can be approximated in $\parallel \cdot {\parallel }_{1}$-norm by step-functions, which we can assume to map to ${\mathfrak{sp}}(2n)$ without loss of generality.

The approximation by step-functions can be found e.g. in [Rud87] (chapter 2, exercise 24).
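As a small illustration of items (3), (6) and (8) combined with lemma A.2 (a sketch only; the ordering and transposition conventions of the propagator are not tracked here, only the symplecticity check matters): for a step-function A the propagator is simply an ordered product of matrix exponentials.

```python
# Sketch: propagator for a step-function generator A(t), as a product of matrix exponentials.
import numpy as np
from scipy.linalg import expm

def symplectic_form(n):
    return np.kron(np.eye(n), np.array([[0.0, 1.0], [-1.0, 0.0]]))

def propagator(steps):
    """steps: list of (A_k, dt_k) with A_k in sp(2n); returns the ordered product of exp(dt_k A_k)."""
    U = np.eye(steps[0][0].shape[0])
    for A, dt in steps:
        U = U @ expm(dt * A)     # item (6): constant generator on each step
    return U

n = 1
J = symplectic_form(n)
rng = np.random.default_rng(0)
steps = []
for _ in range(3):
    M = rng.normal(size=(2 * n, 2 * n)); M = (M + M.T) / 2
    steps.append((J @ M, 1.0 / 3))   # A = J M lies in sp(2n) for symmetric M
U = propagator(steps)
print(np.allclose(U.T @ J @ U, J))   # item (8): the propagator is symplectic
```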

Appendix B.: The Cayley trick for matrices

In this appendix, we give an introduction to the Cayley-transform. The definition and properties needed in the main text are summarised by the following proposition:

Proposition B.1. Define the Cayley transform and its inverse via:

Equation (60)

Equation (61)

${ \mathcal C }$ is a diffeomorphism onto its image with inverse ${{ \mathcal C }}^{-1}$. Furthermore, it has the following properties:

  • (1)  
    ${ \mathcal C }$ is operator monotone and operator convex on matrices A with $\mathrm{spec}(A)\subset (-1,1)$.
  • (2)  
    ${{ \mathcal C }}^{-1}$ is operator monotone and operator concave on matrices A with $\mathrm{spec}(A)\subset (-1,\infty )$.
  • (3)  
    ${ \mathcal C }\ :{\mathbb{R}}\to {\mathbb{R}}$ with ${ \mathcal C }(x)=(1+x)/(1-x)$ is log-convex on $[0,1)$.
  • (4)  
    For $n=2m$ even, $H\in {{\mathbb{R}}}^{2m\times 2m}$ and $H\in { \mathcal H }$ if and only if ${ \mathcal C }(H)\in {Sp}(2m,{\mathbb{R}})$ and ${ \mathcal C }(H)\geqslant {\rm{i}}{J}$.

where ${ \mathcal H }$ is defined via:

The definition and the fact that this maps the upper half plane of positive definite matrices to matrices inside the unit circle is present in [AG88] (I.4.2) and [MS98] (proposition 2.51, proof 2). Since no proofs are given in these references and they do not cover the whole proposition, we provide them here.

We start with well-definedness:

Lemma B.2.  ${ \mathcal C }$ and ${{ \mathcal C }}^{-1}$ are well-defined and inverses of each other. Moreover, ${ \mathcal C }$ is a diffeomorphism onto its image $\mathrm{dom}({{ \mathcal C }}^{-1})$.

Proof. If $\mathrm{spec}(H)\cap \{+1\}=\varnothing $, then ${\mathbb{1}}-H$ is invertible and $H\mapsto ({\mathbb{1}}+H)/({\mathbb{1}}-H)$ is well-defined, as $[{\mathbb{1}}+H,{\mathbb{1}}-H]=0$. Now let $H\in {{\mathbb{R}}}^{m\times m}$ be such that $\mathrm{spec}(H)\cap \{+1\}=\varnothing $. We will show that ${ \mathcal C }(H)$ contains no eigenvalue −1. To see this, let

Equation (62)

be the Jordan normal form with block sizes ni and eigenvalues ${\lambda }_{i}$. Let us here consider the complex Jordan decomposition, i.e. ${\lambda }_{i}$ are allowed to be complex. Then:

Equation (63)

and thus

For the inverse of the Jordan blocks, we can use the well-known formula:

In particular, this is still upper triangular. Then $J({n}_{i},1+{\lambda }_{i})J{({n}_{i},1-{\lambda }_{i})}^{-1}$ is still upper triangular with diagonal entries $(1+{\lambda }_{i})/(1-{\lambda }_{i})$. Since $(1+{\lambda }_{i})/(1-{\lambda }_{i})\ne -1$ for all ${\lambda }_{i}\in {\mathbb{C}}$, we find that $J({n}_{i},1+{\lambda }_{i})J{({n}_{i},1-{\lambda }_{i})}^{-1}$ cannot have eigenvalue −1 for any i, hence $\mathrm{spec}({ \mathcal C }(H))\cap \{-1\}=\varnothing $.

Finally, we observe:

Moreover, set ${f}_{1}(A)=-2A-{\mathbb{1}}$ for all matrices $A\in {{\mathbb{R}}}^{m\times m}$, ${f}_{2}(A)={A}^{-1}$ for all invertible matrices $A\in {{\mathbb{R}}}^{m\times m}$ and ${f}_{3}(A)=A-{\mathbb{1}}$ for all matrices $A\in {{\mathbb{R}}}^{m\times m}$. Then we have

Equation (64)

Since ${f}_{i}$ is differentiable for each $i=1,2,3$, we have that ${ \mathcal C }$ is differentiable.

The same considerations with a few signs reversed also lead us to conclude that ${{ \mathcal C }}^{-1}$ is well-defined and indeed the inverse of ${ \mathcal C }$. We can similarly decompose ${{ \mathcal C }}^{-1}$ to show that it is differentiable, making ${ \mathcal C }$ a diffeomorphism. Here, we define ${g}_{1}(A)=2A+{\mathbb{1}}$ for all $A\in {{\mathbb{R}}}^{m\times m}$, ${g}_{2}(A)=-{A}^{-1}$ for all invertible matrices $A\in {{\mathbb{R}}}^{m\times m}$ and ${g}_{3}(A)=A+{\mathbb{1}}$ for all $A\in {{\mathbb{R}}}^{m\times m}$. A quick calculation shows

Equation (65)

Denote by ${ \mathcal H }$ the set

Equation (66)

where $H\lt {\mathbb{1}}$ means that ${\mathbb{1}}-H$ is positive definite (not just positive semidefinite). We can then prove the Cayley trick:

Proposition B.3. Let $H\in {{\mathbb{R}}}^{2n\times 2n}$. Then $H\in { \mathcal H }\iff ({ \mathcal C }(H)\in {Sp}(2n)\wedge { \mathcal C }(H)\geqslant {\rm{i}}{J})$.

Proof. Note that for $H\in { \mathcal H }$, $1\notin \mathrm{spec}(H)$, hence ${ \mathcal C }(H)$ is always well-defined. We have ${ \mathcal C }(H)=({\mathbb{1}}+H){({\mathbb{1}}-H)}^{-1}\geqslant 0$, since ${\mathbb{1}}+H\geqslant 0$ and ${({\mathbb{1}}-H)}^{-1}\geqslant 0$ as $-{\mathbb{1}}\lt H\lt {\mathbb{1}}$. Observe:

Then we can calculate:

hence ${ \mathcal C }(H)J=J{ \mathcal C }{(H)}^{-1}$ and as ${ \mathcal C }(H)$ is Hermitian, we have ${ \mathcal C }{(H)}^{T}J{ \mathcal C }(H)=J$ and ${ \mathcal C }(H)$ is symplectic. Via corollary 2.10, as ${ \mathcal C }(H)$ is symplectic and positive definite, we can conclude that ${ \mathcal C }(H)\geqslant {\rm{i}}{J}$.

Conversely, let $S\in {Sp}(2n)$ and $S\geqslant {\rm{i}}{J}$. Then S ≥ −iJ by complex conjugation and $S\geqslant 0$ after averaging the two inequalities. Since any element of ${Sp}(2n)$ is invertible, this implies $S\gt 0$. From this we obtain:

Write $(S-{\mathbb{1}}){(S+{\mathbb{1}})}^{-1}=\left(\begin{array}{cc}A & B\\ C & D\end{array}\right)$. As S is Hermitian, ${A}^{T}=A$, $C={B}^{T}$ and ${D}^{T}=D$.

We have on the one hand

and on the other hand

Put together this implies $B={B}^{T}$ and $D=-A$, hence ${{ \mathcal C }}^{-1}(S)\in { \mathcal H }$, which is what we claimed.□
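A quick numerical sanity check of proposition B.3 (a sketch only, assuming the characterisation of ${ \mathcal H }$ as the symmetric matrices $H$ with ${HJ}+{JH}=0$ and $-{\mathbb{1}}\lt H\lt {\mathbb{1}}$, which is equivalent to the block form derived in the proof):

```python
# Sketch: check numerically that C(H) is symplectic and positive definite for H in the set H.
import numpy as np

def symplectic_form(n):
    return np.kron(np.eye(n), np.array([[0.0, 1.0], [-1.0, 0.0]]))

def cayley(H):
    I = np.eye(H.shape[0])
    return (I + H) @ np.linalg.inv(I - H)

n, rng = 2, np.random.default_rng(1)
J = symplectic_form(n)

# Random symmetric H with HJ + JH = 0 and spectrum inside (-1, 1)
X = rng.normal(size=(2 * n, 2 * n)); X = (X + X.T) / 2
H = (X + J @ X @ J) / 2                              # this projection anticommutes with J
H *= 0.9 / np.abs(np.linalg.eigvalsh(H)).max()       # scale the spectrum into (-1, 1)

S = cayley(H)
print(np.allclose(S.T @ J @ S, J))                   # C(H) is symplectic
print(np.linalg.eigvalsh(S).min() > 0)               # C(H) is positive definite
```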

Proposition B.4. The Cayley transform ${ \mathcal C }$ is operator monotone and operator convex on the set of $A={A}^{T}\in {{\mathbb{R}}}^{m\times m}$ with $\mathrm{spec}(A)\subset (-1,1)$. ${{ \mathcal C }}^{-1}$ is operator monotone and operator concave on the set of $A={A}^{T}\in {{\mathbb{R}}}^{m\times m}$ with $\mathrm{spec}(A)\subset (-1,\infty )$.

Proof. Recall equation (64) and the definition of ${f}_{1},{f}_{2},{f}_{3}$. f1 and f3 are affine and thus for all $X\geqslant Y$: ${f}_{3}(X)\geqslant {f}_{3}(Y)$ and ${f}_{1}(X)\leqslant {f}_{1}(Y)$. For $X\geqslant Y\geqslant 0$, we also have ${f}_{2}(Y)\ \geqslant {f}_{2}(X)\geqslant 0$ since matrix inversion is antimonotone. Now let $-{\mathbb{1}}\lt Y\leqslant X\lt {\mathbb{1}}$, then $-2{\mathbb{1}}\leqslant {f}_{3}(Y)\leqslant {f}_{3}(X)\lt 0$ and $-\tfrac{1}{2}{\mathbb{1}}\geqslant {f}_{2}\,\circ \,{f}_{3}(Y)\geqslant {f}_{2}\,\circ \,{f}_{3}(X)$ and finally ${ \mathcal C }(X)\ \geqslant { \mathcal C }(Y)\geqslant 0$, proving monotonicity of ${ \mathcal C }$. Similarly, one can prove that ${{ \mathcal C }}^{-1}$ is monotone using equation (65).

For the convexity of ${ \mathcal C }$, we note that since ${f}_{1},{f}_{3}$ are affine they are both convex and concave. It is well-known that $1/x$ is operator convex for positive definite and operator concave for negative definite matrices (to prove this, consider convexity/concavity of the functions $\langle \psi ,{X}^{-1}\psi \rangle $ for all ψ). It follows that for $-{\mathbb{1}}\leqslant H\leqslant {\mathbb{1}}$ we have ${f}_{3}(x)\leqslant 0$, hence ${f}_{2}\,\circ \,{f}_{3}$ is operator concave on $-{\mathbb{1}}\leqslant H\leqslant {\mathbb{1}}$. As ${f}_{1}(A)=-2A-{\mathbb{1}}$, this implies that ${ \mathcal C }={f}_{1}\,\circ \,{f}_{2}\,\circ \,{f}_{3}$ is operator convex.

For the concavity of ${{ \mathcal C }}^{-1}$, recall equation (65) and the definitions of ${g}_{1},{g}_{2},{g}_{3}$. Then, given $-{\mathbb{1}}\leqslant X$, we have ${g}_{3}(X)$ is positive definite and concave as an affine map. g2 is concave on positive definite matrices, as $1/x$ is convex and $(-1)$ is order-reversing, hence $-1/x$ is concave on positive definite matrices. Since g1 is concave as an affine map, ${g}_{1}\,\circ \,{g}_{2}\,\circ \,{g}_{3}={{ \mathcal C }}^{-1}$ is operator concave for all $-{\mathbb{1}}\leqslant X$.□

Lemma B.5.  ${ \mathcal C }\ :{\mathbb{R}}\to {\mathbb{R}}$ is $\mathrm{log}$-convex on $[0,1)$.

Proof. We need to see that the function $h(x)=\mathrm{log}\tfrac{1+x}{1-x}$ is convex for $x\in [0,1)$. Since h is differentiable on $[0,1)$, this is true iff the second derivative is non-negative. A direct computation gives ${h}^{\prime }(x)=2/(1-{x}^{2})$ and ${h}^{\prime\prime }(x)=4x/{(1-{x}^{2})}^{2}$, which is clearly non-negative on $[0,1)$; hence h is convex and ${ \mathcal C }$ is therefore $\mathrm{log}$-convex. □

Appendix C.: Continuity of set-valued functions

Here, we provide some definitions and lemmata from set-valued analysis for the reader's convenience. This branch of mathematics deals with functions $f\ :X\to {2}^{Y}$ where X and Y are topological spaces and ${2}^{Y}$ denotes the power set of Y.

In order to state the results interesting to us we define:

Definition C.1. Let $X,Y\subseteq {{\mathbb{R}}}^{n\times m}$ and $f\ :X\to {2}^{Y}$ be a set-valued function. Then we say that a function is upper semicontinuous (often also called upper hemicontinuous to distinguish it from other notions of continuity) at ${x}_{0}\in X$ if for all open neighbourhoods $Q$ of $f({x}_{0})$ there exists an open neighbourhood $W$ of ${x}_{0}$ such that $W\subseteq \{x\in X| f(x)\subset Q\}$. Likewise, we call it lower semicontinuous (often called lower hemicontinuous) at a point ${x}_{0}$ if for any open set $V$ intersecting $f({x}_{0})$, we can find a neighbourhood $U$ of ${x}_{0}$ such that $f(x)\cap V\ne \varnothing $ for all $x\in U$.

Note that the definitions are valid in all topological spaces, but we only need the case of finite dimensional normed vector spaces. Using the metric, we can give the following characterisation of upper semicontinuity:

Lemma C.2. Let $X,Y\subseteq {{\mathbb{R}}}^{n\times m}$ and $f\ :X\to {2}^{Y}$ be a set-valued function such that $f(x)$ is compact for all $x$. Then $f$ is upper semicontinuous at ${x}_{0}$ if and only if for all $\varepsilon \gt 0$ there exists a $\delta \gt 0$ such that for all $x\in X$ with $\parallel x-{x}_{0}\parallel \lt \delta $ we have: for all $y\in f(x)$ there exists a $\tilde{y}\in f({x}_{0})$ such that $\parallel y-\tilde{y}\,\parallel \lt \varepsilon $.

Proof.  $\Rightarrow $: Let f be upper semicontinuous at ${x}_{0}$. For any $\varepsilon \gt 0$ the set

Equation (67)

is an open neighbourhood of $f({x}_{0})$. Hence there exists an open neighbourhood W of ${x}_{0}$, which contains a ball of radius $\delta \gt 0$ such that ${B}_{\delta }({x}_{0})\subseteq W\subseteq \{x\in X| f(x)\subset B(\varepsilon ,f({x}_{0}))\}$. Clearly this implies the statement.

⇐: Let Q be a neighbourhood of $f({x}_{0})$. Since $f({x}_{0})$ is compact, this implies that there is an $\varepsilon \gt 0$ such that $B(\varepsilon ,f({x}_{0}))\subseteq Q$, where this set is defined as in equation (67). If this were not the case, for every $n\in {\mathbb{N}}$ there would be a ${y}_{n}\in Y\setminus Q$ such that ${\inf }_{\hat{y}\in f({x}_{0})}\parallel {y}_{n}-\hat{y}\,\parallel \lt 1/n$. Since by construction ${y}_{n}\in B(1,f({x}_{0}))$, whose closure is compact, a subsequence of the yn must converge to some y. As Q is open, $Y\setminus Q$ is closed and hence $y\in Y\setminus Q$. However, ${\inf }_{\hat{y}\in f({x}_{0})}\parallel y-\hat{y}\,\parallel =0$ by construction and since $f({x}_{0})$ is compact, the infimum is attained, which implies $y\in f({x}_{0})$. This contradicts the fact that Q is a neighbourhood of $f({x}_{0})$.

Hence we know that for any open Q containing $f({x}_{0})$ there exists an $\varepsilon \gt 0$ such that $B(\varepsilon ,f({x}_{0}))\subseteq Q$. By assumption, this implies that there exists a $\delta \gt 0$ such that ${B}_{\delta }({x}_{0})\subseteq \{x\in X| f(x)\subset B(\varepsilon ,f({x}_{0}))\}$. Since clearly $\{x\in X| f(x)\subset B(\varepsilon ,f({x}_{0}))\}\ \subseteq \{x\in X| f(x)\subset Q\}$ we can choose $W:= {B}_{\delta }({x}_{0})$ to finish the proof.□

This second characterisation is sometimes called upper Hausdorff semicontinuity and it can equally be defined in any metric space. Clearly, the notions can differ for set-valued functions with non-compact values or in spaces which are not finite dimensional. With these two definitions, we can state the following classic result:

Proposition C.3 ([DR79]). Let $Y$ be a complete metric space, $X$ a topological space and $f\,:X\to {2}^{Y}$ a compact-valued set-valued function. The following statements are equivalent:

  • $f$ is upper semicontinuous at ${x}_{0}$.
  • for each closed $K\subseteq Y$, the set-valued function $x\mapsto K\cap f(x)$ is upper semicontinuous at ${x}_{0}$.

An interesting question would be whether the converse is also true. Even if f(x) is always convex, this need not be the case if $K\cap f({x}_{0})$ has empty interior, as simple counterexamples show. In case the interior is non-empty, another classic result guarantees a converse in many cases:

Proposition C.4 ([Mor75]). Let $X$ be a compact interval and $Y$ a normed space. Let $f\,:X\to {2}^{Y}$ and $g\,:X\to {2}^{Y}$ be two convex-valued set-valued functions. Suppose that $\mathrm{diam}(f(t)\cap g(t))\lt \infty $ and $f(t)\cap \mathrm{int}(g(t))\ne \varnothing $ for all $t$. Then if $f,g$ are continuous (in the sense above), so is $f\cap g$.

Appendix D.: Reduction of the set of necessary operations for state preparation

In this section, we give a justification of why the operations (O0)–(O7) are enough to implement all operations described in section 5. All of this is known albeit scattered throughout the literature, hence we collect it here.

In order to prepare a state, we could start with the vacuum $\gamma ={\mathbb{1}}$ or alternatively a thermal state for some bath ($\gamma =(1/2+N){\mathbb{1}}$ with photon number N, see e.g. [Oli12]). Of course, we should be able to draw arbitrary ancillary modes of this system, too. The effect of Gaussian noise on the covariance matrix is given in [Lin00]. Since any $\gamma \geqslant {\mathbb{1}}$ can be decomposed as $\gamma ={\mathbb{1}}+{\gamma }_{\mathrm{noise}}$, the operations (O0)–(O2) are enough to implement operations (1) and (2) of section 5.

As with other squeezing measures, passive transformations should not change the measure, while single-mode squeezers are not free. The effect of symplectic transformations on the covariance matrix has already been observed in equation (10), hence (O3) and (O4) implement operations (3) and (8).

Since we have the Weyl-system at our disposal, we can also consider its action on a quantum state (translation in phase space). Direct computation shows that it does not affect the covariance matrix. Including it as operation (O5) is beneficial when we consider convex combinations of states. In an experiment, a convex combination can be realised by preparing ensembles of the individual states and mixing them in the ratio prescribed by the convex combination. On the level of covariance matrices, we have the following lemma:

Lemma D.1. Let $\rho $ and ${\rho }^{\prime }$ be two states with displacements ${d}^{\rho }$ and ${d}^{\rho ^{\prime} }$ and (centred) covariance matrices ${\gamma }^{\rho }$ and ${\gamma }^{\rho ^{\prime} }$. For $\lambda \in (0,1)$, the covariance matrix of $\tilde{\rho }:= \lambda \rho +(1-\lambda ){\rho }^{\prime }$ is given by $\tilde{\gamma }=\lambda {\gamma }^{\rho }+(1-\lambda ){\gamma }^{\rho ^{\prime} }+2\lambda (1-\lambda )({d}^{\rho }-{d}^{\rho ^{\prime} }){({d}^{\rho }-{d}^{\rho ^{\prime} })}^{T}$.

A proof of this statement can be found in [WW01] (in the proof of proposition 1). Note that for centred states with ${d}^{\rho }=0$ and ${d}^{\rho ^{\prime} }=0$, a convex combination of states translates to a convex combination of covariance matrices. Since in particular $2\lambda (1-\lambda )({d}^{\rho }-{d}^{\rho ^{\prime} }){({d}^{\rho }-{d}^{\rho ^{\prime} })}^{T}\geqslant 0$, any convex combination of ρ and ${\rho }^{\prime }$ is, on the level of covariance matrices, equivalent to

  • centring the states (no change in the covariance matrices),
  • taking a convex combination of the states (resulting in a convex combination of covariance matrices),
  • performing a Weyl translation to undo the centring in the first step (no change in the covariance matrix),
  • adding noise $2\lambda (1-\lambda )({d}^{\rho }-{d}^{\rho ^{\prime} }){({d}^{\rho }-{d}^{\rho ^{\prime} })}^{T}\geqslant 0$.

This implies that the effect of any convex combination of states (operation (7)) on the covariance matrix can equivalently be obtained from operations (O2), (O5) and (O6). Finally, we consider measurements. Homodyne detection is the measurement of Q or P in one of the modes, which corresponds to the measurement of an infinitely squeezed pure state in lemma D.2. A broader class of measurements, known as heterodyne detection, measures arbitrary coherent states [Wee+12]. Let us focus our attention on the even broader class of projections onto Gaussian pure states.

Lemma D.2. Let $\rho $ be an $(n+1)$-mode quantum state with covariance matrix $\gamma $ and $| {\gamma }_{G},d\rangle \langle {\gamma }_{G},d| $ be a pure single-mode Gaussian state with covariance matrix ${\gamma }_{G}\in {{\mathbb{R}}}^{2\times 2}$ and displacement $d$. Let

then the selective measurement of $| {\gamma }_{G},d\rangle $ in the last mode results in a change of the covariance matrix of $\rho $ according to:

Equation (68)

where MP denotes the Moore–Penrose pseudoinverse. Homodyne detection corresponds to the case where ${\gamma }_{G}$ is an infinitely squeezed state.

This can most easily be seen on the level of Wigner functions, as demonstrated in [ESP02, GIC02]. The generalisation to multiple modes is straightforward.

Since the covariance matrix of a Gaussian pure state is a symplectic matrix (see proposition 2.2), using the Euler decomposition we can implement a selective Gaussian measurement by

  • (1)  
    a passive symplectic transformation $S\in K(n+1)$,
  • (2)  
    a measurement in the Gaussian state $\mathrm{diag}(d,1/d)$ for some $d\in {{\mathbb{R}}}_{+}$ according to lemma D.2.

A non-selective measurement (forgetting the information obtained from measurement) would then be a convex combination of such projected states. A measurement of a multi-mode state can be seen as successive measurements of single-mode states since the Gaussian states we measure are diagonal.

For homodyne detection, since an infinitely squeezed single-mode state is given by the covariance matrix ${\mathrm{lim}}_{d\to \infty }\mathrm{diag}(1/d,d)$, we have

Equation (69)

where $\pi =\mathrm{diag}(1,0)$ is a projection and MP denotes the Moore–Penrose-pseudoinverse. It has been shown (see [Wee+12] E.2 and E.3 as well as [ESP02, GIC02]) that any (partial or total) Gaussian measurement is a combination of passive transformations, discarding subsystems, projection onto Gaussian states and homodyne detection.

Therefore, we should also allow discarding part of the system, i.e. taking the partial trace. However, this can be expressed as a combination of operations (O1)–(O6) and homodyne detection:

Lemma D.3. Given a covariance matrix $\gamma =\left(\begin{array}{cc}A & C\\ {C}^{T} & B\end{array}\right)$, a partial trace over the second system translates to the map $\gamma \mapsto A$. The partial trace can then be implemented by measurements and adding noise.

Proof. When measuring the modes B, we note that since $C{(\pi B\pi )}^{{\rm{MP}}}{C}^{T}\geqslant 0$ in equation (69), a partial trace is equivalent to first performing a homodyne detection on the B-modes of the system and then adding noise. □

Given the discussion above, lemmas D.2 and D.3 put together imply: on the level of covariance matrices, in order to allow for general Gaussian measurements, it suffices to consider Gaussian measurements of the state $| {\gamma }_{d},0\rangle \langle {\gamma }_{d},0| $ with covariance matrix ${\gamma }_{d}=\mathrm{diag}(1/d,d)$ for $d\in {{\mathbb{R}}}_{+}\cup \{+\infty \}$. All Gaussian measurements are then just combinations of these special measurements and operations (O1)–(O6).
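The homodyne limit in equation (69) can be illustrated numerically (a sketch only; the block convention $\gamma =\left(\begin{array}{cc}A & C\\ {C}^{T} & B\end{array}\right)$ with the measured mode in B follows lemma D.3, and the choice $\mathrm{diag}(1/d,d)$ for the squeezed quadrature is a convention):

```python
# Sketch: homodyne detection as the d -> infinity limit of projecting onto diag(1/d, d).
import numpy as np

def measure_last_mode(gamma, gamma_G):
    """Project the last mode onto a Gaussian state with covariance gamma_G (cf. lemma D.2)."""
    A, C, B = gamma[:-2, :-2], gamma[:-2, -2:], gamma[-2:, -2:]
    return A - C @ np.linalg.inv(B + gamma_G) @ C.T

def homodyne_last_mode(gamma):
    """Equation (69): the infinitely squeezed limit, via the Moore-Penrose pseudoinverse."""
    A, C, B = gamma[:-2, :-2], gamma[:-2, -2:], gamma[-2:, -2:]
    pi = np.diag([1.0, 0.0])
    return A - C @ np.linalg.pinv(pi @ B @ pi) @ C.T

rng = np.random.default_rng(2)
X = rng.normal(size=(4, 4))
gamma = X @ X.T + np.eye(4)              # a valid two-mode covariance matrix, gamma >= 1

d = 1e8
approx = measure_last_mode(gamma, np.diag([1.0 / d, d]))
print(np.allclose(approx, homodyne_last_mode(gamma), atol=1e-5))
```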

Appendix E.: Numerical implementation and documentation

Here, we provide a short documentation of the programme, written in Matlab (version R2014a) and used for the numerical computations in section 6. The source code can be found on GitHub at https://github.com/Martin-Idel/operationalsqueezing.

The programme tries to minimise the function f defined in equation (28) over the set ${ \mathcal H }$. Throughout, suppose we are given a covariance matrix γ.

Let us first describe the implementation of f: as parameterisation of ${ \mathcal H }$, we choose the simplest parameterisation such that for matrices with symplectic eigenvalues larger than one, the set of feasible points has non-empty interior: we parameterise $A,B$ via matrix units ${E}_{i},{E}_{{jk}}$ with $i\in \{1,\ldots ,n\}$, $k\in \{1,\ldots ,n-1\}$ and $j\lt k$, where ${({E}_{i})}_{{jk}}={\delta }_{{ij}}{\delta }_{{ik}}$ and ${({E}_{{jk}})}_{{lm}}={\delta }_{{jl}}{\delta }_{{km}}+{\delta }_{{jm}}{\delta }_{{kl}}$. This parameterisation might not be very robust, but it is good enough for our purpose. Instead of working with complex parameters, we compute ${s}_{i}(A+{\rm{i}}B)$ as ${\lambda }_{i}^{\downarrow }(H)$ for the matrix

Equation (70)

The evaluation of f is done in the function objective.m. Since f is not convex for (A, B) with the corresponding H having eigenvalues $\geqslant 1$ or $\leqslant -1$, the function first checks whether this constraint is satisfied and otherwise outputs a value that is ${10}^{7}$ times larger than the value of the objective function at the starting point.

The constraints are implemented in function maxresidual.m. Via symmetry, it is enough to check that for any H tested, ${\lambda }_{2n}^{\downarrow }(H)\geqslant 1$. The second constraint is given by ${{ \mathcal C }}^{-1}(\gamma )\geqslant H$ and this is tested by computing the smallest eigenvalue of the difference.

The function which is most important for users is minimum.m, which takes a covariance matrix ${\rm{\Gamma }}\geqslant {\rm{i}}{J}$, its dimension n and a number of options as arguments and outputs the minimum. Note that the programme checks whether the covariance matrix is valid. For the minimisation, we use the Matlab-based solver SolvOpt ([KK97], latest version 1.1). SolvOpt uses a subgradient-based method and the method of exact penalisation to compute (local) minima. For convex programming, any minimum found by the solver is therefore an absolute minimum. For this to work, the objective function has to be differentiable outside a set of measure zero, and it is allowed to be non-differentiable at the minimum. Since f is differentiable for all H with non-degenerate eigenvalues, this condition is met. In addition, SolvOpt needs f to be defined everywhere, as it is not an interior point method. Since f is well-defined but not convex for $H\notin { \mathcal H }$ with $\mathrm{spec}(H)\cap \{1\}=\varnothing $, we remedy this by changing the output of objective.m to be very large when $H\notin { \mathcal H }$ as described above. Constraints are handled via the method of exact penalisation. We let SolvOpt compute the penalisation functions on its own.

It is possible (and for speed purposes advisable) to implement analytical gradients of both the objective and the constraint functions. Following [Mag85], for diagonalisable matrices A with no eigenvalue multiplicities, the derivative of an eigenvalue ${\lambda }_{i}(A)$ is given by:

Equation (71)

where vi(A) is the eigenvector corresponding to ${\lambda }_{i}(A)$ and ${\partial }_{E}(A)={\mathrm{lim}}_{h\to 0}((A+{hE})-A)/h=E$ is the directional derivative of A in the direction E. Luckily, if ${\lambda }_{i}(A)$ is not differentiable, this still provides at least one subgradient. An easy calculation shows that a subgradient of the objective function f for matrices H with $-{\mathbb{1}}\lt H\lt {\mathbb{1}}$ in the parameterisation of the matrix units Eij is given by

Equation (72)

with F being the matrices corresponding to the chosen parameterisation. The gradient of the constraint function is very similar and given by equation (71) for $A=\gamma -H$ or $A=2{\mathbb{1}}-H$ depending on which constraint is violated. This is implemented in functions objectivegrad.m and maxresidualgrad.m.
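The eigenvalue-derivative formula can be checked against finite differences (a small self-contained sketch, independent of the Matlab code; for symmetric matrices with a simple eigenvalue the derivative in direction E is ${v}_{i}^{T}E\,{v}_{i}$):

```python
# Sketch: derivative of a simple eigenvalue along a direction E, checked by finite differences.
import numpy as np

def eig_directional_derivative(A, E, i):
    """d/dh lambda_i(A + h E) at h = 0 for a simple eigenvalue of a symmetric matrix A."""
    w, V = np.linalg.eigh(A)
    order = np.argsort(w)[::-1]            # descending order, matching lambda_i (downarrow)
    v = V[:, order[i]]
    return v @ E @ v

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4)); A = (A + A.T) / 2
E = np.zeros((4, 4)); E[0, 1] = E[1, 0] = 1.0       # a symmetric matrix unit
i, h = 0, 1e-6
fd = (np.sort(np.linalg.eigvalsh(A + h * E))[::-1][i]
      - np.sort(np.linalg.eigvalsh(A))[::-1][i]) / h
print(np.allclose(eig_directional_derivative(A, E, i), fd, atol=1e-4))
```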

SolvOpt needs a starting point. Given Γ, via Williamson's theorem, ${\rm{\Gamma }}={S}^{T}{DS}\geqslant {S}^{T}S$, hence ${S}^{T}S$ provides a good starting point. The function williamson.m computes the Williamson normal form of γ and returns S, D and ${S}^{T}S$, the latter of which is used as the starting point. It computes S and D essentially by computing the Schur decomposition of ${{\rm{\Gamma }}}^{-1/2}J{{\rm{\Gamma }}}^{-1/2}$ (in the σ-basis instead of the J-basis). S is then given by ${S}^{T}={{\rm{\Gamma }}}^{1/2}{{KD}}^{-1/2}$ (see the proof of [SCS99]), where K is the Schur transformation matrix.

A number of comments are in order:

  • (1)  
    All functions use global variables instead of function handles. This is required by the fact that SolvOpt has not been adapted to the use of function handles. The user should therefore always reset all variables before running the programme.
  • (2)  
    SolvOpt is not an interior point method, i.e. the results can at times violate the constraints. We use the default value for the accuracy of constraints, which is 10−8 and can be modified by option six. The preparation error should be of the same order as the accuracy of constraints as long as the largest eigenvalue of the minimising symplectic matrix is of order one.
  • (3)  
    For our numerical tests, we used bounds on the minimal step-size and the minimal error in f (SolvOpt options two and three) of the order 10−6 and 10−8, which seemed sufficient.
  • (4)  
    All functions called by SolvOpt (the functions objective.m, objectivegrad.m, maxresidual.m, maxresidualgrad.m and xtoH.m) are properly vectorised to ensure maximal speed.

Finally, bounds.m contains all lower and upper bounds described in section 4.5. The semidefinite programme was solved using CVX (version SDPT3 4.0), a Matlab toolbox for disciplined convex programming including semidefinite programming [GB08, GB14]. The third bound is not described in section 4.5: it is an iteration of corollary 4.8 assuming superadditivity, hence in principle it could be violated. A violation would immediately disprove superadditivity; none was ever observed in our tests.

Issues and further suggestions: it occasionally happens that the algorithm does not converge to a minimum inside or near the feasible set. We believe that this is due to instabilities in the parameterisation and implementation. The behaviour can occur with numerical as well as analytical subgradients, although it occurs more often with analytical ones. For every example where we observed a failure with either numerical or analytical subgradients, one of the other methods (numerical subgradients, analytical subgradients or a mixture thereof) worked fine. In cases of failure, the routine issues several warnings and the result usually lies below the lower bound. A different type of implementation might lead to a more stable algorithm, but we did not pursue this any further. It might also be worthwhile to compute the penalty function analytically.

In terms of performance, the algorithm is generally fast for small numbers of modes. When analytical subgradients are not implemented, the performance bottleneck is the function xtoH.m, which is called most often. When analytical subgradients are provided, the performance is naturally much better. This is particularly important when the number of modes increases. While for five modes the calculation is done within seconds, for ten modes, depending on the matrix, it can already take a minute on an ordinary laptop (the algorithm then spends most of its time on eigenvalue computations, which seems unavoidable). For even larger matrices, it might be advisable to switch from the Matlab function Eig to Eigf, but for our examples this did not lead to a time gain.
