
Higher-order interference and single-system postulates characterizing quantum theory


Published 10 December 2014 © 2014 IOP Publishing Ltd and Deutsche Physikalische Gesellschaft
Citation: Howard Barnum et al 2014 New J. Phys. 16 123029, DOI 10.1088/1367-2630/16/12/123029


Abstract

We present a new characterization of quantum theory in terms of simple physical principles that is different from previous ones in two important respects: first, it only refers to properties of single systems without any assumptions on the composition of many systems; and second, it is closer to experiment by having absence of higher-order interference as a postulate, which is currently the subject of experimental investigation. We give three postulates—no higher-order interference, classical decomposability of states, and strong symmetry—and prove that the only non-classical operational probabilistic theories satisfying them are real, complex, and quaternionic quantum theory, together with three-level octonionic quantum theory and ball state spaces of arbitrary dimension. Then we show that adding observability of energy as a fourth postulate yields complex quantum theory as the unique solution, relating the emergence of the complex numbers to the possibility of Hamiltonian dynamics. We also show that there may be interesting non-quantum theories satisfying only the first two of our postulates, which would allow for higher-order interference in experiments while still respecting the contextuality analogue of the local orthogonality principle.


Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Quantum theory currently underpins much of modern physics and is essential in many other scientific fields and countless technological applications. However, by most accounts quantum phenomena remain rather mysterious: there is no generally accepted intuitive picture of the underlying reality, and the standard textbook introductions of the mathematical formalism lack a simple conceptual motivation.

With the rise of quantum information processing and the ever more refined control of quantum phenomena, there has recently been a surge of diverse attempts to tackle such foundational questions. These range from studies of the information processing capabilities of theories similar to quantum theory [10, 15, 35–37], to reconstructions of the formalism from information-theoretic principles [42, 43, 48, 50, 51, 71], to no-go theorems regarding interpretations and generalizations of the formalism [8, 9, 58], to novel experiments testing various predictions of the theory [57].

In this paper we give several closely related reconstructions of the mathematical structure—Hilbert space, Hermitian observables, positive operator-valued measures—of finite-dimensional quantum theory from simple postulates with clear physical significance and generality.

Providing such an explanation for the Hilbert space structure of quantum theory in terms of physically (not just mathematically) natural postulates is important for several reasons. First, deeper and more reasonable principles can help to dissolve the mysteries of quantum phenomena and make them more intelligible and easier to teach. Two well-known examples of this approach are Keplerʼs laws of planetary motion and their explanation through Newtonʼs laws of motion and gravitation, and the Lorentz transformations and their explanation in Einsteinʼs two relativity postulates. Second, it can be argued that this approach will be essential in making progress on problems such as formulating a theory unifying quantum and gravitational physics, as well as for developing potentially more accurate and more fundamental theories. In the absence of a picture of the underlying reality, we can use first principles to proceed toward the next physical theory in a careful, conceptual fashion. More practically, this approach can shed light on what is responsible for the power of quantum information processing and cryptography.

Because quantum theory applies to an extremely broad range of physical systems and phenomena, and its probabilistic structure seems essential, we work within a broad framework for studying probabilistic physical theories (usually called operational probabilistic theories). These are theories that succinctly describe sets of experiments and assign probabilities to measurement outcomes. More precisely, we imagine that physicists, or nature, prepare physical systems in various states, and then observe these systems in various ways. The outcomes of these observations occur with certain probabilities, which are predicted by the theory. It is important to emphasize that we do not assume that these probabilities are described by quantum theory; instead our postulates will allow us to derive their structure as represented by quantum theory.

Our postulates are as follows:

  • 1  
    Classical decomposability: every state of a physical system can be represented as a probabilistic mixture of perfectly distinguishable states of maximal knowledge ('pure states').
  • 2  
    Strong symmetry: every set of perfectly distinguishable pure states of a given size can be reversibly transformed to any other such set of the same size.
  • 3  
    No higher-order interference: the interference pattern between mutually exclusive 'paths' in an experiment is exactly the sum of the patterns which would be observed in all two-path sub-experiments, corrected for overlaps.
  • 4  
    Observability of energy: there is non-trivial continuous reversible time evolution, and the generator of every such evolution can be associated to an observable ('energy') which is a conserved quantity.

Before discussing their physical interpretation and motivation in more detail, we point out that all of our postulates refer to single systems only. This is in contrast to earlier reconstructions of quantum theory [42, 43, 48, 50] which rely heavily on properties of composite systems. Our motivation to rely on single systems is as follows. It is not clear that the notion of subsystems and their composition, as it is often used in information-theoretic circuit diagrams and category-theoretic considerations, applies to physics without change in its full operational interpretation. For example, if a composite quantum system consists of spacelike separated subsystems, then the causal spacetime structure of special relativity imposes additional complications when describing the possible joint measurements on the composite system [77]. These additional restrictions are usually not captured by operational approaches, which just declare a set of states and measurements for the composite system, and postulate that these can in principle be implemented to arbitrary accuracy. Therefore, a safe strategy for an operational approach seems to be to avoid making assumptions about the state space structure of composite systems, and to talk only about stand-alone systems. These may or may not correspond to effective physical subsystems that can be controlled by an agent in a laboratory.

Moreover, there has recently been a surge of interest in finding compelling physical principles that explain the specific contextuality behavior of quantum theory as compared to other probabilistic theories. This line of research aims at analyzing the single-system analogue of quantum non-locality, and understanding its specific characteristics in terms of principles such as 'consistent exclusivity' [54]. Our results also contribute to this line of research by showing that postulates 1 and 2 are sufficient to guarantee that systems satisfy consistent exclusivity.

We do not claim that our postulates are the only reasonable ones, but we think that they—like other recent reconstructions—are more natural than the usual abstract formulations which simply presume Hilbert spaces, complex numbers, and operators. Moreover, as we discuss below, we think that our formulation is especially suitable for the search for interesting and physically reasonable modifications of quantum theory; that is, state spaces that are not described by the Hilbert space formalism but are otherwise consistent and physically plausible.

Comparison to other reconstructions can help uncover logical relations between various physical structures of our world. For example, our fourth postulate (observability of energy) is used to rule out non-complex Hilbert spaces in this work, while in other reconstructions this role is usually played by the postulate of tomographic locality, which states that joint states on composite systems are uniquely determined by local measurement statistics and their correlations. Thus, one may argue that there is a logical relationship between tomographic locality and observability of energy, and thus ultimately with the fact that we observe Hamiltonian mechanics in our world.

We will now give a short discussion of the interpretation of our postulates. To clarify the terms in postulate 1, a set of states is perfectly distinguishable if there is a measurement whose outcomes can be paired one-to-one with the states so that each measurement outcome has probability one when its corresponding state has been prepared, and probability zero when any of the other states have been prepared. A state of maximal knowledge (a 'pure state') is a state ω which cannot be written as a nontrivial convex combination of states, i.e. as $\omega =p\sigma +q\tau $ where $p+q=1$, $p,q\gt 0$, and $\sigma \ne \tau $. That is, it cannot be viewed as arising from a lack of knowledge about which of two distinct states has been prepared.

Postulate 1 can be viewed as a generalization of the spectral decomposition of every quantum density matrix as a convex combination of orthogonal rank-one projectors onto orthogonal eigenstates of the density matrix. However, our postulate is stated purely in terms of the convex structures of the set of states and of measurement outcomes; the notion of spectrum of an operator is not involved. An important part of the physical significance of this postulate is that it appears likely to be needed for an information theory and probably a statistical mechanics that share desirable and physically fundamental properties with those supported by quantum theory. In particular, it is a plausible conjecture that this postulate implies the correspondence of two natural ways of defining entropies for states in generalized probabilistic theories [66, 67]: the first as the minimal entropy of the outcomes of a fine-grained measurement made on the state, and the second as the minimal entropy of a preparation of the state as a mixture of pure states.

Postulate 2 expresses a fundamental symmetry: given any integer n, all n-level systems are informationally equivalent. That is, we can transmit (not necessarily copy) the state of any n-level system to any other system without losing information, at least in principle. This implies a certain minimal amount of possible reversible dynamics or computational power.

Postulate 3, that the system exhibits at most 'second-order interference,' is based on the notion of multi-slit interference introduced by Rafael Sorkin [4]. This is a manifestly physical assumption which is currently under experimental investigation [5, 6]. The precise notion of an interference experiment will be defined in section 5 below; an illustration is given in figure 1.


Figure 1. Higher-order interference. Consider a particle which can pass through one of M (here: M = 4) slits, where some of the slits may be blocked by the experimenter (indicated by the black bars). After passing the multi-slit setup, the particle may trigger a certain event, for example the click of a detector localized in a certain area of the screen. We are interested in the probability $p_J$ of this event, given that slits $J\subset \{1,2,\ldots ,M\}$ are open (for example $p_{23}$ in the depicted setup). Classically, the probability of such an event given that all four slits are open, $p_{1234}$, equals ${{p}_{1}}+{{p}_{2}}+{{p}_{3}}+{{p}_{4}}$, where $p_i$ is the probability assuming that only slit i is open. This is violated in quantum theory due to interference. However, even in quantum theory, the total probability can be computed from contributions of pairs of slits only: we have ${{p}_{1234}}={{p}_{12}}+{{p}_{13}}+{{p}_{14}}+{{p}_{23}}+{{p}_{24}}+{{p}_{34}}-2{{p}_{1}}-2{{p}_{2}}-2{{p}_{3}}-2{{p}_{4}}$. It is in this sense that quantum theory has second-, but no third- or higher-order interference. The definition of interference that we use is not restricted to spatially arranged slits, but is formulated generally for any set of M perfectly distinguishable alternatives in a probabilistic theory.
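The second-order identity stated in the caption of figure 1 can be checked numerically. The following is a minimal sketch, assuming that each slit contributes a complex amplitude and that the detection probability is given by the Born rule; the amplitude values and the helper `prob` are arbitrary illustrative choices, not taken from the paper.

```python
import cmath
from itertools import combinations

# Hypothetical complex amplitudes (modulus, phase) for reaching the detector
# through each of the M = 4 slits; the numbers are arbitrary.
amplitudes = {
    1: cmath.rect(0.20, 0.4),
    2: cmath.rect(0.15, 1.9),
    3: cmath.rect(0.10, -0.7),
    4: cmath.rect(0.25, 2.6),
}

def prob(open_slits):
    """Born-rule detection probability when exactly these slits are open."""
    return abs(sum(amplitudes[i] for i in open_slits)) ** 2

slits = sorted(amplitudes)
p_all = prob(slits)                                   # p_1234
classical_sum = sum(prob([i]) for i in slits)         # p_1 + p_2 + p_3 + p_4
pair_sum = sum(prob(pair) for pair in combinations(slits, 2))
# Pair contributions corrected for overlaps: each single-slit term is
# counted (M - 1) times in pair_sum, so subtract it (M - 2) times.
second_order = pair_sum - (len(slits) - 2) * classical_sum

print("p_1234           :", p_all)
print("classical sum    :", classical_sum)
print("second-order sum :", second_order)
```

In this sketch the classical sum differs from `p_all` because of the cross terms between amplitudes, while the pairwise expression reproduces `p_all` up to rounding, since the squared modulus of a sum of amplitudes contains only products of two amplitudes at a time.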


This postulate suggests a possible route towards obtaining concrete predictions for conceivable third-order interference in experiments: drop the third postulate, and work out the new set of theories that satisfy only postulates 1 and 2 (and possibly 4). As we will show, any system of this kind—if it exists—has a set of 'filtering' operations that represent an orthomodular lattice known from quantum logic [38], but these filters do not necessarily preserve the purity of states as they do in quantum theory (equivalently, the lattice does not satisfy the 'covering law'). However, these systems still satisfy the principle of 'consistent exclusivity' [54], bringing their contextuality behavior close to quantum theory, despite the appearance of (non-quantum) third-order interference.

In this way, our results hint at possible physical properties of conceivable alternative theories against which quantum theory can be tested in interference experiments, and which may be of independent mathematical interest. In particular, the existence of theories exhibiting higher-order interference and containing quantum theory as a subtheory has been conjectured for several years. Preliminary results indicate interesting physical properties of those theories [56], but the concrete construction of the corresponding state spaces is still an open problem. We hope that our approach can help to make progress on this question.

We obtain our main result by first showing that the first three postulates bring us very close to quantum theory: they imply that systems are described by finite-dimensional irreducible (simple) formally real Jordan algebras, or are classical. Moreover, these three postulates precisely characterize this class of theories, since classical systems and irreducible Jordan algebras all satisfy postulates 1–3. As Jordan et al [13] showed, the formally real irreducible Jordan algebras are the real, complex, and quaternionic quantum theories (for all finite dimensions), one exceptional case (the 3 × 3 octonionic 'density matrices') and the spin factors (ball-shaped state spaces) of all finite dimensions. Standard complex quantum theory is the only one among these which also satisfies the fourth postulate.

The association of energy with a conserved physical quantity is an important principle of both quantum and classical theory, exhibited for example in the Lagrangian formulation of classical mechanics in the guise of Noetherʼs theorem; this provides some motivation for our energy observability postulate.

Further, postulates 1, 2 and 4 seem likely to be necessary—or at least sufficient—to run standard statistical mechanics arguments, a possibility we will explore in further work. We have already mentioned the conjecture that postulate 1 implies the equivalence of measurement and preparation entropy, which likely has relevance to thermodynamic processes and Maxwellʼs demon arguments. Reversible processes, the subject of postulate 2, are even more crucial in classical and quantum thermodynamics.

2. Operational probabilistic theories

In this section, we summarize the standard mathematical framework for operational probabilistic theories, and give needed definitions and facts about convexity and cones. References for the mathematics include [14] and [34]. More details on the framework can be found in e.g. [16, 35–37, 50]; also, [75, 76] offer accessible introductions. This review is primarily to fix notation and clarify the specific version used here.

The primitive elements of operational probabilistic theories are experimental devices and probabilities. In particular, experimental devices can be classified into preparations, transformations, and measurements. With each use, a preparation device (such as an oven, antenna, or laser) outputs an instance of a physical system, denoted by A, in some state ω specified by the type of device and its various settings. The system then passes through a transformation device (such as a beam splitter, or Stern–Gerlach magnet) which modifies the state of the system, in a potentially non-deterministic fashion. Finally, a measurement device takes in the system, and one of a distinct set of outputs (such as a light flashing, or a pointer being in some range of possible positions) signals the measurement outcome. Even though we motivate the formalism by example of such laboratory devices, the resulting operational framework is not restricted to this setting and may also be used to describe other physical processes.

A main purpose of a physical theory in this framework is to specify the probabilities of the outcomes of any measurement made on a system that has been prepared in a given state. To this end, single measurement outcomes, called effects, will be denoted by lowercase letters such as e. The probability of obtaining an outcome e, given state ω, will be denoted $e(\omega )$.

By standard arguments, each state can be specified by a minimal list of measurement outcome probabilities, which contains sufficient information to predict the probabilities of all measurements that can be in principle performed on the system. Using this idea and a further convexity argument, states can therefore be represented as elements of a real linear space of some finite dimension ${{K}_{A}}$, which we denote also by A. Further, for each system A there is a convex compact subset, ${{\Omega }_{A}}\subset A$, of normalized states in a real affine space of dimension ${{K}_{A}}-1$ which is embedded in A as an affine plane not intersecting the origin. The non-negative multiples of elements of ${{\Omega }_{A}}$ form a cone ${{A}_{+}}\subset A$, of unnormalized states. This cone has several useful properties: first, it is topologically closed; second, it has full dimension, i.e. its linear span is all of A; and third, it is pointed, which means that the only linear subspace it contains is $\{0\}$. Cones with these three properties are also called regular.

Effects then become linear functionals from A to $\mathbb{R}$ such that $0\leqslant e(\omega )\leqslant 1$ for all $\omega \in {{\Omega }_{A}},$ i.e. they give valid probabilities on normalized states. As linear functionals from the vector space A to the field $\mathbb{R}$ over which it is defined, effects are elements of the dual space A*, which is the vector space of all such functionals. The non-negative multiples of effects constitute the dual cone $A_{+}^{*}:=\{e\in {{A}^{*}}:\forall \gamma \in {{A}_{+}}\;e(\gamma )\geqslant 0\}$. Given our embedding of ${{\Omega }_{A}}$ in A, there is a unique unit functional ${{u}_{A}}\in {{A}^{*}}$ that evaluates to 1 on every element of ${{\Omega }_{A}}$. The set of all effects is the unit order interval, $[0,{{u}_{A}}]\;:=\;\{e\in A_{+}^{*}:0\leqslant e\leqslant {{u}_{A}}\}\subset A_{+}^{*}$. This notation uses the ordering obtained from the regular cone $A_{+}^{*}$, writing $x\leqslant y$ for $y-x\in A_{+}^{*}$.

For a given system, not all mathematically valid effects may be 'operationally possible' measurement outcomes, so we define a subset $\mathcal{E}$ of the full set of effects $[0,{{u}_{A}}]$, which we call the allowed effects. Thus we are not making the assumption sometimes called the 'no-restriction hypothesis' [41, 50, 63] or 'local saturation' [64], nor the equivalent dual requirement (discussed, e.g., in [65], where it is considered as a kind of analogue, for effect algebras, of Gleasonʼs theorem) that the set of states be the full set of mathematically consistent states on the set of effects. The reader should bear in mind that some authors use just 'effects' to refer to what we call 'allowed effects', and say something like 'mathematically consistent effects' to refer to what we are just calling effects. We make weak, operationally natural assumptions on the subset $\mathcal{E}$: it is convex and topologically closed, contains ${{u}_{A}}$, and for every $x\in \mathcal{E}$, ${{u}_{A}}-x$ is also in $\mathcal{E}$ (so that x can be part of at least one complete measurement, namely $\{x,{{u}_{A}}-x\}$). We also assume that $\mathcal{E}$ has full dimension (otherwise, there would be states $\varphi \ne \omega $ that give the same outcome probabilities for all allowed measurements, which means that we would not have called them 'different states' to start with).

We define a measurement as any collection of allowed effects ${{e}_{i}}$ such that ${{\sum }_{i}}{{e}_{i}}={{u}_{A}}$. Since we can imagine post-processing the output of such a measurement such that a chosen pair ${{e}_{i}}$ and ${{e}_{j}}$ of outcomes are grouped together as a single outcome (a 'coarse-graining' of the measurement), we also assume that ${{e}_{i}}+{{e}_{j}}$ is allowed. In brief, we assume that whenever ${{e}_{i}},{{e}_{j}}$ are allowed effects with ${{e}_{i}}+{{e}_{j}}\leqslant {{u}_{A}}$, ${{e}_{i}}+{{e}_{j}}$ is allowed. From our assumptions, it follows that the set of allowed effects is the unit order interval $[0,{{u}_{A}}]$ in a regular subcone $A_{+}^{\sharp }$ (containing ${{u}_{A}}$) of the dual cone. If $A_{+}^{\sharp }=A_{+}^{*}$, we say that all effects are allowed; in our framework, this is equivalent to the 'no-restriction hypothesis', or 'local saturation', mentioned above.
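To make the definitions of effects, measurements and coarse-graining concrete, here is a small sketch for a classical three-outcome system. The choice of system, the specific numbers, and the helper names (`evaluate`, `is_effect`, `add`) are illustrative assumptions; states are probability vectors and effects are vectors acting through the dot product.

```python
from fractions import Fraction as F

# Classical 3-outcome system (illustrative assumption): a state is a
# probability vector, an effect e acts as e(omega) = sum_k e_k * omega_k.
u = (F(1), F(1), F(1))                       # unit functional u_A

def evaluate(e, omega):
    """Probability e(omega) of the outcome e on the state omega."""
    return sum(ek * wk for ek, wk in zip(e, omega))

def is_effect(e):
    # On the simplex, 0 <= e(omega) <= 1 for all normalized states holds
    # exactly when it holds on the vertices, i.e. each component is in [0, 1].
    return all(0 <= ek <= 1 for ek in e)

def add(e, f):
    return tuple(ek + fk for ek, fk in zip(e, f))

# Three effects that sum to u_A and therefore form a measurement.
e1 = (F(1, 2), F(0), F(1, 4))
e2 = (F(1, 2), F(1, 3), F(1, 4))
e3 = (F(0), F(2, 3), F(1, 2))
measurement = [e1, e2, e3]

omega = (F(1, 5), F(3, 10), F(1, 2))          # a normalized state
probs = [evaluate(e, omega) for e in measurement]

# Coarse-graining: outcomes 1 and 2 are reported as a single outcome.
coarse = add(e1, e2)
coarse_measurement = [coarse, e3]

print(probs, sum(probs))
print(coarse, is_effect(coarse))
```

Because the effects sum to the unit functional, the outcome probabilities sum to one on every normalized state, and the coarse-grained pair `[coarse, e3]` is again a measurement.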

We will need the notion, standard in linear algebra, of the dual (sometimes called adjoint) ${{T}^{*}}$ of a linear map $T:A\to A$. This is the linear map ${{T}^{*}}:{{A}^{*}}\to {{A}^{*}}$ defined by the condition $(f,Tx)=({{T}^{*}}f,x)$, where $(\cdot ,\cdot ):{{A}^{*}}\times A\to \mathbb{R}$ is the canonical 'dual pairing' of ${{A}^{*}}$ and A, sometimes called the 'evaluation map': $(f,x)\;:=\;f(x)$.

Associated with every system there is also a set of allowed transformations, which are linear maps $T:A\to A$, taking states to states, i.e. satisfying $T({{A}_{+}})\subseteq {{A}_{+}}$ (a property called positivity). Transformations are required to be normalization-nonincreasing, i.e. ${{u}_{A}}(T(\omega ))\leqslant 1$ for all $\omega \in {{\Omega }_{A}}$. The set of allowed transformations is also closed topologically and under composition. If all effects are allowed, it follows from positivity and normalization that $e\;\circ \;T\in \mathcal{E}$ for all allowed effects e (all elements of $\mathcal{E}$); otherwise we explicitly require this (i.e., that ${{T}^{*}}(\mathcal{E})\subseteq \mathcal{E}$). Since $\mathcal{E}$ is the unit order interval in $A_{+}^{\sharp }$, it is equivalent (for normalization-nonincreasing T) to require that ${{T}^{*}}(A_{+}^{\sharp })\subseteq A_{+}^{\sharp }$. We note also that the normalization-nonincrease condition is equivalent to the dual condition ${{T}^{*}}({{u}_{A}})\leqslant {{u}_{A}}$. An allowed transformation T is called reversible if its inverse ${{T}^{-1}}$ exists and is also an allowed transformation. It follows that reversible transformations T preserve normalization: ${{u}_{A}}(T(\omega ))={{u}_{A}}(\omega )$ for all $\omega \in {{A}_{+}}$ (though these are not in general the only normalization-preserving transformations). The set of all reversible transformations on a system A is a compact group ${{\mathcal{G}}_{A}}$ with Lie algebra ${{\mathfrak{g}}_{A}}$. For a transformation T, the number ${{u}_{A}}(T(\omega ))$ can be interpreted as the probability of transformation T occurring, if a system prepared in state ω is subjected to a process that has as a possible outcome the occurrence of T. In other words, transformations can be part of an instrument in the sense of [72].
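The conditions on transformations and their dual formulation can be illustrated on a classical three-outcome system, where (as an assumption made only for this sketch) a transformation is a matrix with non-negative entries acting on probability vectors, and its dual is the transposed matrix acting on effects. The matrix `T` and all helper names below are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical substochastic matrix: non-negative entries and every
# column sums to at most 1.
T = [[F(1, 2), F(0),    F(1, 4)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0),    F(1, 4), F(1, 2)]]
u = [F(1), F(1), F(1)]                 # unit functional u_A

def apply(M, x):
    """Matrix-vector product."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def pair(f, x):
    """Dual pairing (f, x) = f(x)."""
    return sum(fi * xi for fi, xi in zip(f, x))

T_dual = transpose(T)                  # matrix of T* acting on functionals
omega = [F(1, 5), F(3, 10), F(1, 2)]   # a normalized state
f = [F(1), F(1, 3), F(0)]              # an effect

positive = all(entry >= 0 for row in T for entry in row)
success_probability = pair(u, apply(T, omega))   # u_A(T(omega))
dual_unit = apply(T_dual, u)                     # T*(u_A): column sums of T

print(positive, success_probability, dual_unit)
print(pair(f, apply(T, omega)) == pair(apply(T_dual, f), omega))
```

Here `dual_unit` has all components at most one, which is the dual condition $T^{*}(u_A)\leqslant u_A$, and `success_probability` equals `pair(dual_unit, omega)`, the probability that the transformation occurs on the state `omega`.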

A system described by standard complex n-dimensional quantum theory fits into this framework. Its ambient real vector space A is the ${{n}^{2}}$-dimensional space of complex Hermitian $n\times n$-matrices, the cone of states ${{A}_{+}}$ is the set of positive semidefinite matrices, ${{\Omega }_{A}}$ is the set of density matrices (the intersection of ${{A}_{+}}$ with the affine plane $\{\rho :{\rm tr}\rho =1\}$), the order unit is the functional ${\bf 1}:\rho \mapsto {\rm tr}\rho $, and the allowed effects are the unit order interval in the dual cone, i.e., the functionals $\rho \mapsto {\rm tr}(E\rho )$ where $0\leqslant E\leqslant {\bf 1}$. The allowed transformations are the trace-nonincreasing completely positive maps $A\to A$, and the reversible transformations are the maps $\rho \mapsto U\rho {{U}^{\dagger }}$ for unitary matrices U.
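As a concrete instance of this dictionary, the following sketch treats a single qubit (n = 2) using plain nested lists. The density matrix, the effect and the rotation angle are arbitrary illustrative values, and the helper functions are written only for this example.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(A):
    return A[0][0] + A[1][1]

def probability(E, rho):
    """Effect evaluated on a state: rho -> tr(E rho)."""
    return complex(trace(matmul(E, rho))).real

# Hypothetical density matrix: Hermitian, trace one, positive semidefinite
# (its determinant 0.75 * 0.25 - 0.25**2 is positive).
rho = [[0.75, 0.25j], [-0.25j, 0.25]]
# An effect 0 <= E <= 1, here a rank-one projector.
E = [[1, 0], [0, 0]]
# Unitary U = exp(-i * theta * sigma_x), giving a reversible transformation.
theta = 0.3
U = [[math.cos(theta), -1j * math.sin(theta)],
     [-1j * math.sin(theta), math.cos(theta)]]

rho_rotated = matmul(matmul(U, rho), dagger(U))   # rho -> U rho U^dagger
p_before = probability(E, rho)
p_after = probability(E, rho_rotated)
norm_after = complex(trace(rho_rotated)).real     # order unit on the new state

print(p_before, p_after, norm_after)
```

The unitary conjugation changes the outcome probability of `E` but leaves the trace, and hence the normalization, unchanged, as expected for a reversible transformation.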

We now describe some further important notions and facts about this type of theory and the relevant mathematical structures that will be used in our discussion.

A cone ${{A}_{+}}$ is reducible if the ambient space decomposes into two nontrivial subspaces such that every extremal ray of the cone lies in one or the other of these subspaces. A system is called reducible if its cone of unnormalized states is reducible. Intuitively, information about which of these two summands the state is in is classical information. Every cone in finite dimension has a decomposition as a finite sum $\oplus _{i=1}^{n}{{A}_{i}}$ of irreducible cones, and if these irreducible components are all one-dimensional, any base for the cone is affinely isomorphic to the simplex of probability measures over n outcomes, so we say the system is classical. Its faces are the subsimplices generated by the subsets of outcomes, its reversible transformations are the permutations of the vertices, and more general transformations are given by substochastic matrices.

One can identify A* with A by introducing an inner product $\langle .,.\rangle $ on A, and interpreting the inner product as functional evaluation: $e(\omega )=\langle e,\omega \rangle $. Via this isomorphism the dual cone $A_{+}^{*}$ is identified with the 'internal dual cone' relative to the given inner product, $A_{+}^{*\operatorname{int}}:=\{y\in A:\forall x\in {{A}_{+}}\;\langle y,x\rangle \geqslant 0\}$. Often, such an inner-product-space formulation is used as the basic framework for presenting probabilistic systems and theories; see for example [48, 70]. If an inner product can be introduced in such a way that $A_{+}^{*\operatorname{int}}={{A}_{+}}$, the cone is said to be self-dual and the inner product self-dualizing; a cone in an inner product space is said to be manifestly self-dual if the inner product is one that identifies the cone with its dual.

A set of states ${{\omega }_{1}},\ldots ,{{\omega }_{n}}\in {{\Omega }_{A}}$ is called perfectly distinguishable if there are allowed effects ${{e}_{1}},\ldots ,{{e}_{n}}\in A_{+}^{\sharp }$ which can appear in a common measurement, i.e. ${{e}_{1}}+\ldots +{{e}_{n}}\leqslant {{u}_{A}}$, such that ${{e}_{i}}({{\omega }_{j}})={{\delta }_{ij}}$, that is, 1 if i = j and 0 otherwise.
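The definition of perfect distinguishability can be turned into a direct check. The sketch below again assumes a classical three-outcome system with states and effects represented as vectors; the function `perfectly_distinguishes` and the example data are hypothetical and only verify a given candidate list of effects rather than searching for one.

```python
from fractions import Fraction as F

def evaluate(e, omega):
    return sum(ek * wk for ek, wk in zip(e, omega))

def perfectly_distinguishes(effects, states):
    """Check e_1 + ... + e_n <= u_A and e_i(omega_j) = delta_ij (classical case)."""
    dim = len(states[0])
    total = [sum(e[k] for e in effects) for k in range(dim)]
    jointly_measurable = all(0 <= e[k] for e in effects for k in range(dim)) \
        and all(t <= 1 for t in total)
    kronecker = all(
        evaluate(e, w) == (1 if i == j else 0)
        for i, e in enumerate(effects)
        for j, w in enumerate(states)
    )
    return jointly_measurable and kronecker

# Two states with disjoint supports and effects that tell them apart.
omega_1 = (F(1, 2), F(1, 2), F(0))
omega_2 = (F(0), F(0), F(1))
e_1 = (F(1), F(1), F(0))
e_2 = (F(0), F(0), F(1))

# A state overlapping with omega_1 cannot be separated by the same effects.
omega_3 = (F(0), F(1, 2), F(1, 2))

print(perfectly_distinguishes([e_1, e_2], [omega_1, omega_2]))
print(perfectly_distinguishes([e_1, e_2], [omega_1, omega_3]))
```

Note that `omega_1` is a mixed state here: the definition applies to arbitrary states, while frames, introduced in the next section, additionally require purity.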

A face F of a convex set C is a convex subset of C such that $\alpha \in F$ and $\alpha ={{\sum }_{i}}{{\lambda }_{i}}{{\omega }_{i}}$, ${{\omega }_{i}}\in C,{{\lambda }_{i}}\gt 0,{{\sum }_{i}}{{\lambda }_{i}}=1$ implies that all ${{\omega }_{i}}\in F$. In other words F is closed under inclusion of anything that can appear in a convex decomposition of an element of F. An exposed face of a convex set is the intersection of a supporting hyperplane with the set, easily seen to be a face.

The faces of ${{A}_{+}}$ and those of ${{\Omega }_{A}}$ are in 1-1 correspondence: the face of ${{A}_{+}}$ corresponding to face F of ${{\Omega }_{A}}$ is just $\{\lambda \omega :\omega \in F,\lambda \geqslant 0\}$. The relation 'is a face of' is transitive: if G is a face of C, and F is a face of G, then F is a face of C. The orderings of the set of faces and of the set of exposed faces by subset inclusion each form a lattice, with greatest lower bound $F\wedge G=F\cap G$, and least upper bound $F\vee G$, which is the smallest face containing both F and G. The face generated by a subset S of a convex set is the smallest face containing S. If a lattice has an upper bound, this is conventionally called 1, and a lower bound is called 0; for ${{\Omega }_{A}}$ we have $1={{\Omega }_{A}}$ and $0=\varnothing $, while for ${{A}_{+}}$, $1={{A}_{+}}$ and $0=\{0\}$, where 0 is the 0 of the vector space A. (We adopt the convention that the empty set ∅ is not counted as a face of ${{A}_{+}}$.) An atom is a minimal non-zero element of the lattice; the atoms of the face lattice of a regular finite-dimensional cone are the extremal rays, ${\rm Ray}(\omega )\;:=\;\{\lambda \omega :\lambda \geqslant 0\}$ for ω extremal in ${{\Omega }_{A}}$. An element of ${{A}_{+}}$ may be called ray-extremal if it is a non-negative multiple of a pure state of ${{\Omega }_{A}}$.

Quantum systems are self-dual, with all effects allowed, and with the self-dualizing inner product usually chosen to be $\langle X,Y\rangle ={\rm tr}(XY)$. (For this reason, the dual cone is often identified with the positive semidefinite operators, and the effects with operators E such that $0\leqslant E\leqslant {\bf 1}$, rather than with the functionals $\rho \mapsto {\rm tr}E\rho $ associated with such operators.) The faces of a quantum system, which are all exposed, correspond to the subspaces S of the underlying Hilbert space: the face ${{F}_{S}}$ of Ω corresponding to such a subspace S consists of the density matrices ρ whose images, when viewed as linear operators on that Hilbert space, are contained in S. Equivalently, they are those density matrices whose convex decompositions into rank-one projectors involve nonzero probabilities only for projectors onto subspaces of S.

3. Consequences of postulates 1 + 2

We call a list of n perfectly distinguishable pure states a frame, of size n, or n-frame. The convex hull of such a set of states is a simplex, isomorphic to the space of probability measures on n alternatives, which we call a 'classical subspace' of the state space. For every finite-dimensional system A, there is a largest frame size NA; frames of this size are called maximal. In quantum theory, a frame corresponds to a set of mutually orthogonal pure states, and it is maximal if the corresponding state vectors are an orthonormal basis of the underlying Hilbert space.

Using the concepts we have introduced, our first two postulates can be stated as follows:

Postulate 1. Every state $\omega \in \Omega $ has a decomposition of the form $\omega ={{\sum }_{i}}{{p}_{i}}{{\omega }_{i}}$, for some probabilities ${{p}_{i}}\geqslant 0$, ${{\sum }_{i}}{{p}_{i}}=1$, and some n-frame ${{\omega }_{1}},\ldots ,{{\omega }_{n}}$, for some $n\in \mathbb{N}$.

Postulate 2. If ${{\omega }_{1}},\ldots ,{{\omega }_{n}}$ and ${{\varphi }_{1}},\ldots ,{{\varphi }_{n}}$ are n-frames for some $n\in \mathbb{N}$, then there is a reversible transformation T such that $T{{\omega }_{i}}={{\varphi }_{i}}$ for all i.

We could paraphrase postulate 1 as 'every state lies in some classical subspace', and postulate 2 as 'all classical subspaces of a given size are equivalent'.
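For a qubit, postulate 1 reduces to the familiar spectral decomposition, which can be written in closed form using the Bloch representation $\rho =\frac{1}{2}({\bf 1}+\vec{r}\cdot \vec{\sigma })$ with $|\vec{r}|\leqslant 1$. The sketch below is an illustration under that assumption: it returns a 2-frame (two antipodal points on the Bloch sphere, i.e. orthogonal pure states) and the mixing weights. The function names and the example vector are hypothetical.

```python
import math

def bloch_decomposition(r):
    """Return [(p, n), (1 - p, -n)]: weights and unit Bloch vectors of two
    orthogonal pure states whose mixture has Bloch vector r."""
    norm = math.sqrt(sum(c * c for c in r))
    if norm == 0:
        n = (0.0, 0.0, 1.0)        # maximally mixed state: any antipodal pair works
    else:
        n = tuple(c / norm for c in r)
    p = (1 + norm) / 2
    return [(p, n), (1 - p, tuple(-c for c in n))]

def overlap(n1, n2):
    """tr(rho_1 rho_2) for pure qubit states with unit Bloch vectors n1, n2."""
    return (1 + sum(a * b for a, b in zip(n1, n2))) / 2

r = (0.3, -0.2, 0.6)               # Bloch vector of a mixed state (|r| = 0.7)
parts = bloch_decomposition(r)
recombined = tuple(sum(p * n[k] for p, n in parts) for k in range(3))
weights = [p for p, _ in parts]

print(weights)
print(recombined)
print(overlap(parts[0][1], parts[1][1]))
```

In the same picture, postulate 2 for 2-frames corresponds to the fact that any antipodal pair on the Bloch sphere can be rotated onto any other by a unitary.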

Proposition 1. Postulates 1 and 2 imply that all effects are allowed.

Proof. We show that every effect $e\in A_{+}^{*}$ that generates an exposed ray of $A_{+}^{*}$ is allowed, i.e. an element of $A_{+}^{\sharp }$. It follows that all effects are allowed, since the exposed rays generate $A_{+}^{*}$ via convex combinations and closure.

Thus, let $e\in A_{+}^{*}$ be an effect with ${{{\rm max} }_{\omega \in {{\Omega }_{A}}}}e(\omega )=1$ such that the set of non-negative multiples of e is an exposed ray of $A_{+}^{*}$. By the definition of exposed ray, there is an $x\in {{A}_{+}}$ such that every effect $f\in A_{+}^{*}$ with $f(x)=0$ must be a non-negative multiple of e; consequently, if $f(x)=0$, $f\in A_{+}^{*}$ and ${{{\rm max} }_{\omega \in {{\Omega }_{A}}}}f(\omega )=1$ then f = e. We may choose x to be normalized.

According to postulate 1, there is some $n\in \mathbb{N}$ and some frame ${{\omega }_{1}},\ldots ,{{\omega }_{n}}$ such that $x=\sum _{j=1}^{n}{{\lambda }_{j}}{{\omega }_{j}}$; we may choose the ${{\lambda }_{j}}$ to be non-zero. The corresponding effects will be denoted ${{e}_{1}},\ldots ,{{e}_{n}}$, i.e. ${{e}_{i}}({{\omega }_{j}})={{\delta }_{ij}}$. Since $e(x)=0$ we have $e({{\omega }_{j}})=0$ for all j = 1,..., n.

We define the maximally mixed state μ by integrating with the Haar measure over the group of reversible transformations; that is, choose any pure state ω, and set $\mu :={{\int }_{{{\mathcal{G}}_{A}}}}G\omega \;{\rm d}G$. This state also has a frame decomposition $\mu =\sum _{i=1}^{N}{{\eta }_{i}}{{\varphi }_{i}}$ with $N\in \mathbb{N}$, ${{\eta }_{i}}\gt 0$, and ${{\varphi }_{1}},\ldots ,{{\varphi }_{N}}$ a frame with corresponding effects ${{f}_{1}},\ldots ,{{f}_{N}}$ such that ${{f}_{i}}({{\varphi }_{j}})={{\delta }_{ij}}$.

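According to postulate 2, there is a reversible transformation $T\in {{\mathcal{G}}_{A}}$ such that $T{{\varphi }_{i}}={{\omega }_{i}}$ for all $i=1,\ldots ,{\rm min} \{n,N\}$. Suppose that $n\geqslant N$. Then, since μ is invariant under all reversible transformations,

$$e(\mu )=e(T\mu )=\sum _{i=1}^{N}{{\eta }_{i}}\,e(T{{\varphi }_{i}})=\sum _{i=1}^{N}{{\eta }_{i}}\,e({{\omega }_{i}})=0,$$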

hence $e(\mu )=0={{\int }_{{{\mathcal{G}}_{A}}}}e(G\omega ){\rm d}G$. Since $G\mapsto e(G\omega )$ is a continuous non-negative function on ${{\mathcal{G}}_{A}}$, we must have $e(G\omega )=0$ for all $G\in {{\mathcal{G}}_{A}}$, and thus $e(\omega ^{\prime} )=0$ for all pure states $\omega ^{\prime} $. Since the pure states span the full linear space, we obtain e = 0, which is a contradiction.

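Thus we have $n\lt N$. Consider the allowed effect ${{f}_{N}}\;\circ \;{{T}^{-1}}$. It satisfies

$$({{f}_{N}}\circ {{T}^{-1}})(x)=\sum _{j=1}^{n}{{\lambda }_{j}}\,{{f}_{N}}({{T}^{-1}}{{\omega }_{j}})=\sum _{j=1}^{n}{{\lambda }_{j}}\,{{f}_{N}}({{\varphi }_{j}})=0,$$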

and since ${{{\rm max} }_{\omega \in {{\Omega }_{A}}}}{{f}_{N}}\;\circ \;{{T}^{-1}}(\omega )=1$, we have ${{f}_{N}}\;\circ \;{{T}^{-1}}=e$; in particular, e is an allowed effect.

For the following proposition, recall that a set of states is said to generate a face F if F is the smallest face that contains these states.

Proposition 2. Postulates 1 and 2 imply that every face of Ω is generated by a frame. Any two frames that generate the same face F have the same size, called the rank of F, and denoted $|F|$. Moreover, if $G\;\subsetneq \;F$ then $|G|\lt |F|$, and every frame of size $|F|$ in F generates F.

Proof. A face is generated by any element of its relative interior. By postulate 1, such an element is in the convex hull of a frame; this frame also generates the face.

Let F be any face, and suppose there are two frames ${{\varphi }_{1}},\ldots ,{{\varphi }_{m}}$ and ${{\omega }_{1}},\ldots ,{{\omega }_{n}}$ with $m\lt n$ that both generate F, and ${{e}_{1}},\ldots ,{{e}_{n}}$ effects such that ${{e}_{i}}({{\omega }_{j}})={{\delta }_{ij}}$ and ${{\sum }_{i}}{{e}_{i}}\leqslant u$. Let $F^{\prime} $ be the face generated by ${{\omega }_{1}},\ldots ,{{\omega }_{m}}$, then $G:=\{x\in \Omega \;\;|\;\;{{e}_{n}}(x)=0\}$ is a face of Ω containing $F^{\prime} $ but not containing ${{\omega }_{n}}$, so $F^{\prime} \;\subsetneq \;F$. Due to postulate 2, there is a reversible transformation T with $T{{\varphi }_{i}}={{\omega }_{i}}$ for i = 1,..., m, so $TF\subseteq F^{\prime} \;\subsetneq \;F$. Since TF is a proper face of F, it must have smaller dimension, which contradicts the invertibility and thus reversibility of T. Similarly, if we had $G\;\subsetneq \;F$ and $|G|\geqslant |F|$, then a reversible transformation could map F into G, which is a contradiction, too.

If ${{\omega }_{1}},\ldots ,{{\omega }_{|F|}}$ is any frame on F, and G the face that it generates, then $G\subseteq F$, and some reversible transformation T will map it to some other frame of the same size that generates F. Hence TG = F, and this contradicts $G\;\subsetneq \;F$.

Proposition 3. Postulates 1 and 2 imply that ${{A}_{+}}$ is self-dual, with a corresponding self-dualizing inner product that satisfies $\langle T\varphi ,T\omega \rangle =\langle \varphi ,\omega \rangle $ for all reversible transformations T, i.e. such that all reversible transformations are orthogonal. The inner product can be chosen so that the corresponding norm $\parallel \omega \parallel :=\sqrt{\langle \omega ,\omega \rangle }$ attains the value 1 on all pure states, and is strictly less than 1 for all mixed states.

Proof. Reference [32] shows that bit symmetry and the fact that all effects are allowed imply this proposition. Bit symmetry is the two-frame case of postulate 2, and we have shown that all effects are allowed in proposition 1.

Henceforth, except when we explicitly state otherwise, we identify A* with A via an inner product satisfying the conditions in the above proposition. Since reversible transformations T are normalized, we have ${{T}^{*}}({{u}_{A}})={{u}_{A}}$. Moreover, ${{T}^{*}}={{T}^{-1}}$ by orthogonality. T* is also a reversible transformation; thus, if we regard uA now as an element of A, we obtain that ${{T}^{-1}}{{u}_{A}}={{u}_{A}}$ for all ${{T}^{-1}}$. This proves the following:

Proposition 4. Postulates 1 and 2 imply that uA is invariant under all reversible transformations.

Proposition 5. Postulates 1 and 2 imply that every frame ${{\omega }_{1}},\ldots ,{{\omega }_{n}}$ can be extended to a frame ${{\omega }_{1}},\ldots ,{{\omega }_{n}},\ldots ,{{\omega }_{N}}$ which generates ${{A}_{+}}$, i.e. $N=|{{A}_{+}}|$.

Proof. Let ${{\varphi }_{1}},\ldots ,{{\varphi }_{N}}$ be any frame that generates all of ${{A}_{+}}$, with effects ${{e}_{1}},\ldots ,{{e}_{N}}$ such that ${{e}_{j}}({{\varphi }_{i}})={{\delta }_{ij}}$ and ${{\sum }_{j}}{{e}_{j}}={{u}_{A}}$. Then ${{\varphi }_{1}},\ldots ,{{\varphi }_{n}}$ is itself a frame of size n; thus, according to postulate 2, there is a reversible transformation T with $T{{\varphi }_{i}}={{\omega }_{i}}$ for i = 1, ..., n. For $i\gt n$, define ${{\omega }_{i}}:=T{{\varphi }_{i}}$. Set $e_{j}^{\prime }:={{e}_{j}}\;\circ \;{{T}^{-1}}$, then $e_{j}^{\prime }({{\omega }_{i}})={{\delta }_{ij}}$ and ${{\sum }_{j}}e_{j}^{\prime }={{u}_{A}}$, and so we have extended ${{\omega }_{1}},\ldots ,{{\omega }_{n}}$ to a frame with N elements.

The following proposition will turn out to be useful in several proofs.

Proposition 6. Postulates 1 and 2 imply that if ${{\omega }_{1}},\ldots ,{{\omega }_{n}}$ are mutually orthogonal pure states, then they are a frame, and $\sum _{i=1}^{n}{{\omega }_{i}}\leqslant {{u}_{A}}$.

Proof. We have to find effects ${{e}_{1}},\ldots ,{{e}_{n}}$ with ${{e}_{i}}({{\omega }_{j}})={{\delta }_{ij}}$ and $\sum _{i=1}^{n}{{e}_{i}}\leqslant {{u}_{A}}$. To this end, we will first construct a decomposition of the order unit. By self-duality and proposition 3, $\varphi :={{u}_{A}}/\langle {{u}_{A}},{{u}_{A}}\rangle $ is a state in Ω, hence there is a frame ${{\varphi }_{1}},\ldots ,{{\varphi }_{N}}$ with $N=|{{A}_{+}}|$ and ${{\lambda }_{i}}\geqslant 0$ such that ${{u}_{A}}=\langle {{u}_{A}},{{u}_{A}}\rangle \varphi =\parallel {{u}_{A}}{{\parallel }^{2}}\sum _{i=1}^{N}{{\lambda }_{i}}{{\varphi }_{i}}$. For any permutation $\pi :\{1,\ldots ,N\}\to \{1,\ldots ,N\}$, the states ${{\varphi }_{\pi (1)}},\ldots ,{{\varphi }_{\pi (N)}}$ are again a frame; thus, there is a reversible transformation ${{T}_{\pi }}$ with ${{T}_{\pi }}{{\varphi }_{i}}={{\varphi }_{\pi (i)}}$. Hence (using the invariance of uA under reversible transformations)

$$\sum _{i=1}^{N}{{\lambda }_{i}}{{\varphi }_{i}}=\frac{{{u}_{A}}}{\parallel {{u}_{A}}{{\parallel }^{2}}}=\frac{{{T}_{\pi }}{{u}_{A}}}{\parallel {{u}_{A}}{{\parallel }^{2}}}=\sum _{i=1}^{N}{{\lambda }_{i}}{{\varphi }_{\pi (i)}}.$$

Taking the inner product with ${{\varphi }_{j}}$ shows that ${{\lambda }_{{{\pi }^{-1}}(j)}}={{\lambda }_{j}}$; since this is true for all permutations, all ${{\lambda }_{j}}$ are equal to some $\lambda \gt 0$. Finally, $1=\langle {{u}_{A}},{{\varphi }_{1}}\rangle =\parallel {{u}_{A}}{{\parallel }^{2}}\lambda $, and so ${{u}_{A}}=\sum _{i=1}^{N}{{\varphi }_{i}}$. If ${{\omega }_{1}},\ldots ,{{\omega }_{N}}$ is any other frame of size N, then postulate 2 implies that there is a reversible transformation T such that $T{{\varphi }_{i}}={{\omega }_{i}}$, hence ${{u}_{A}}=T{{u}_{A}}=T\sum _{i=1}^{N}{{\varphi }_{i}}=\sum _{i=1}^{N}{{\omega }_{i}}$. Thus, we have shown that every maximal frame adds up to the order unit.

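Now we show the statement of the proposition by induction on n. Start with n = 1. Any pure state ${{\omega }_{1}}$ is by definition a frame of size 1. Moreover, if $\varphi \in \Omega $, then the Cauchy–Schwarz inequality yields

$$\langle {{\omega }_{1}},\varphi \rangle \leqslant \parallel {{\omega }_{1}}\parallel \,\parallel \varphi \parallel \leqslant 1=\langle {{u}_{A}},\varphi \rangle ,$$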

hence ${{\omega }_{1}}\leqslant {{u}_{A}}$. Now suppose the statement of the proposition is true for some n, and consider pure mutually orthogonal states ${{\omega }_{1}},\ldots ,{{\omega }_{n+1}}$. Set ${{e}_{1}}:={{\omega }_{1}},\ldots ,{{e}_{n}}:={{\omega }_{n}}$, and ${{e}_{n+1}}:={{u}_{A}}-\sum _{i=1}^{n}{{e}_{i}}$. By the induction hypothesis, ${{e}_{n+1}}\geqslant 0$, and so ${{e}_{1}},\ldots ,{{e}_{n+1}}$ is a measurement with ${{e}_{i}}({{\omega }_{j}})={{\delta }_{ij}}$ for $1\leqslant i,j\leqslant n+1$. Thus, ${{\omega }_{1}},\ldots ,{{\omega }_{n+1}}$ is a frame. According to proposition 5, it can be extended to a maximal frame ${{\omega }_{1}},\ldots ,{{\omega }_{N}}$, and then $\sum _{i=1}^{N}{{\omega }_{i}}={{u}_{A}}$ shows that $\sum _{i=1}^{n+1}{{\omega }_{i}}\leqslant {{u}_{A}}$.
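A worked check of propositions 3 and 6 in the simplest non-classical case may help. The sketch below assumes a representation of a three-dimensional ball system by vectors $(t,r)\in \mathbb{R}\times {{\mathbb{R}}^{3}}$ with cone $|r|\leqslant t$ and inner product $\langle (t,r),(s,q)\rangle =(ts+r\cdot q)/2$. This normalization is chosen here for illustration, so that pure states have norm 1; it is not taken from the paper.

```python
def inner(x, y):
    """Self-dualizing inner product <(t, r), (s, q)> = (t*s + r.q) / 2."""
    return sum(a * b for a, b in zip(x, y)) / 2

def pure_state(n):
    """Pure state (1, n) for a unit vector n on the sphere."""
    return (1.0,) + tuple(n)

# With this inner product the order unit, viewed as a vector, is (2, 0, 0, 0):
# <u, (t, r)> = t, which equals 1 on every normalized state.
u = (2.0, 0.0, 0.0, 0.0)

n = (0.6, 0.0, 0.8)
omega_1 = pure_state(n)
omega_2 = pure_state(tuple(-c for c in n))   # antipodal pure state
mixed = (1.0, 0.3, 0.0, 0.4)                 # Bloch vector of length 0.5

# Sum of the two elements of a maximal frame.
frame_sum = tuple(a + b for a, b in zip(omega_1, omega_2))
```

Here the pure states have norm 1, the mixed state has norm strictly below 1, the two antipodal pure states are orthogonal, and their sum equals the order unit, in line with the statement that every maximal frame adds up to ${{u}_{A}}$.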

Recall that for any subset S of an inner product space V its orthogonal complement ${{S}^{\bot }}$ is defined by ${{S}^{\bot }}:=\{x\in V:\forall y\in S\;\langle x,y\rangle =0\}$.

Proposition 7. Postulates 1 and 2 imply that for every face F of ${{A}_{+}}$, the set $F^{\prime} :={{F}^{\bot }}\cap {{A}_{+}}$ is a face of ${{A}_{+}}$ of rank $|F^{\prime} |=N-|F|$, where $N=|{{A}_{+}}|$, and we have $(F^{\prime} )^{\prime} =F$. Furthermore, if ${{\varphi }_{1}},\ldots ,\;{{\varphi }_{n}}$ is any frame that is contained in some face F, then it can be extended to a frame ${{\varphi }_{1}},\ldots ,{{\varphi }_{n}},\ldots ,{{\varphi }_{|F|}}$ that generates F.

Proof. Let $\omega \in F^{\prime} $ be any element, and $0\lt \lambda \lt 1$, ${{\omega }_{1}},{{\omega }_{2}}\in {{A}_{+}}$ such that $\omega =\lambda {{\omega }_{1}}+(1-\lambda ){{\omega }_{2}}$. Then, for every $f\in F$, we have $0=\langle f,\omega \rangle =\lambda \langle f,{{\omega }_{1}}\rangle +(1-\lambda )\langle f,{{\omega }_{2}}\rangle $. Due to self-duality, we have $\langle f,{{\omega }_{i}}\rangle \geqslant 0$ for i = 1,2, hence $\langle f,{{\omega }_{1}}\rangle =\langle f,{{\omega }_{2}}\rangle =0$ for all $f\in F$. This shows that ${{\omega }_{1}},{{\omega }_{2}}\in F^{\prime} $, hence $F^{\prime} $ is a face.

Now we determine the rank of $F^{\prime} $. Let ${{\omega }_{1}},\ldots ,{{\omega }_{|F|}}$ be any frame that generates F, and ${{\varphi }_{1}},\ldots ,{{\varphi }_{|F^{\prime} |}}$ be a frame that generates $F^{\prime} $. Then $\langle {{\omega }_{i}},{{\varphi }_{j}}\rangle =0$ for all $i,j$, and so proposition 6 tells us that both frames taken together are a frame in ${{A}_{+}}$, proving that $|F|+|F^{\prime} |\leqslant N$. Extend ${{\omega }_{1}},\ldots ,{{\omega }_{|F|}}$ to a frame on ${{A}_{+}}$, then ${{\omega }_{|F|+1}},\ldots ,{{\omega }_{N}}$ are orthogonal to F and thus a frame in $F^{\prime} $, showing that $|F^{\prime} |\geqslant N-|F|$, so $|F^{\prime} |=N-|F|$, and the extension is actually a generating frame of $F^{\prime} $. Consequently, ${{\omega }_{1}},\ldots ,{{\omega }_{|F|}}\in (F^{\prime} )^{\prime} $, and since $|(F^{\prime} )^{\prime} |=N-|F^{\prime} |=N-(N-|F|)=|F|$, these states generate $(F^{\prime} )^{\prime} $. Since they also generate F, we must have $F=(F^{\prime} )^{\prime} $.

Now suppose that ${{\varphi }_{1}},\ldots ,{{\varphi }_{n}}$ is any frame contained in F; let ${{\omega }_{1}},\ldots ,{{\omega }_{|F|}}$ be any frame that generates F. According to proposition 5, we can extend the latter to a frame ${{\omega }_{1}},\ldots ,{{\omega }_{|F|}},\ldots ,{{\omega }_{N}}$ that generates all of ${{A}_{+}}$; moreover, the ${{\omega }_{i}}$ with $i\geqslant |F|+1$ generate $F^{\prime} $. But then $\langle \omega ,{{\omega }_{i}}\rangle =0$ for all $i\geqslant |F|+1$ and $\omega \in F$. Thus, the set of states ${{\varphi }_{1}},\ldots ,{{\varphi }_{n}},{{\omega }_{|F|+1}},\ldots ,{{\omega }_{N}}$ is a set of mutually orthogonal pure states and thus, due to proposition 6, a frame. Using proposition 5 again, we can find states ${{\varphi }_{n+1}},\ldots ,{{\varphi }_{|F|}}$ such that ${{\varphi }_{1}},\ldots ,{{\varphi }_{n}},{{\varphi }_{n+1}},\ldots ,{{\varphi }_{|F|}},{{\omega }_{|F|+1}},\ldots ,{{\omega }_{N}}$ is a frame generating ${{A}_{+}}$. For $i\geqslant |F|+1$ and j arbitrary, we have $\langle {{\varphi }_{j}},{{\omega }_{i}}\rangle =0$, and since these ${{\omega }_{i}}$ generate $F^{\prime} $, we have $\langle {{\varphi }_{j}},\omega \rangle =0$ for all $\omega \in F^{\prime} $. Thus ${{\varphi }_{j}}\in (F^{\prime} )^{\prime} =F$, and we have extended ${{\varphi }_{1}},\ldots ,{{\varphi }_{n}}$ to a frame generating F.

As mentioned in section 1, postulates 1 and 2 imply that there is a special transformation called a filter associated with each face of the state space. The next theorem shows that certain projections are positive (recall that a linear map is positive if it maps the cone ${{A}_{+}}$ into itself), and in section 4 we will further show that these projections have the additional properties required of filters.

Theorem 8. Postulates 1 and 2 imply that for every face F of ${{A}_{+}}$, the orthogonal projection PF onto the linear span of F is positive.

Proof. Iochum ([30], see also [31]) has shown that positivity of all PF is equivalent to perfection. (For the readerʼs convenience, and the authors' peace of mind, a proof is included in appendix A.) A cone is called perfect if all faces F of ${{A}_{+}}$, regarded as cones in the linear span lin F, are themselves self-dual with respect to the inner product inherited from A. We will therefore show this property, establishing the claim.

So let F be any face of ${{A}_{+}}$, and $F*\;\subset $ lin F be the dual cone with respect to the inner product inherited from A. Since $F\subseteq {{A}_{+}}=A_{+}^{*}$, for $\omega \in F$ we have $\langle \omega ,\varphi \rangle \geqslant 0$ for all $\varphi \in F$, and so $\omega \in F*$. This proves that $F\subseteq F*$. To see the converse inclusion, let e be any normalized element of F* (i.e. $\langle {{u}_{A}},e\rangle =1$) that generates an exposed ray of F*. This means there exists $\omega \in F$ (which we may choose normalized) with $\langle e,\omega \rangle =0$ such that $f\in {{F}^{*}}$ and $\langle f,\omega \rangle =0$ implies $f=\lambda e$ with $\lambda \in \mathbb{R}$. But $\omega ={{\sum }_{i}}{{\lambda }_{i}}{{\omega }_{i}}$ for some frame ${{\omega }_{1}},\ldots ,{{\omega }_{k}}\in F$ and ${{\lambda }_{i}}\gt 0$. Since ω is in the face $\{\varphi \in F\;\;|\;\;\langle e,\varphi \rangle =0\}\;\subsetneq \;F$, we have $k\lt |F|$, and extending to a frame ${{\omega }_{1}},\ldots ,{{\omega }_{k}},\ldots ,{{\omega }_{|F|}}$ on F gives ${{\omega }_{|F|}}\in F\subseteq {{F}^{*}}$ as well as $\langle {{\omega }_{|F|}},\omega \rangle =0$, hence $e={{\omega }_{|F|}}\in F$. Since the exposed rays generate F*, this proves that ${{F}^{*}}\subseteq F$.

The properties that we have proven so far turn out to give an interesting structure known from the field of quantum logic, indeed sometimes taken as a definition of a quantum logic [52]. As noted above, the set of faces ordered by subset inclusion is a bounded lattice. However, from postulates 1 and 2, we recover more of the logical structure of quantum theory:

Theorem 9. Postulates 1 and 2 imply that the lattice of faces of ${{A}_{+}}$ is an orthomodular lattice.

Before giving the proof, recall that orthomodularity is the property that

$$F\subseteq G\quad \Rightarrow \quad G=F\vee (G\wedge F^{\prime} ).\qquad (1)$$

Note that in [33] it is shown that for self-dual cones, orthomodularity of the face lattice in the above sense is equivalent to the property of perfection mentioned in the proof of theorem 8. Furthermore, in [19] it is shown that orthomodularity of the face lattice, according to an orthocomplementation which agrees with ours in case postulates 1 and 2 hold, follows from a property called projectivity. In the next section we will define projectivity and establish that state spaces satisfying postulates 1 and 2 are projective, giving us an alternative proof of orthomodularity. Here, we proceed with the direct proof.

Proof. Constructing $F^{\prime} $ as the face generated by the extension of a frame generating F shows easily that $(F^{\prime} )^{\prime} =F$ (as already shown in proposition 7), and that $F\subseteq G$ implies $F^{\prime} \supseteq G^{\prime} $, as well as $F\vee F^{\prime} ={\bf 1}\equiv {{A}_{+}}$ and $F\wedge F^{\prime} \equiv {\bf 0}\equiv \{0\}$. These properties mean that the operation ' is an orthocomplementation on the lattice of faces. It remains to show that this orthocomplemented lattice satisfies the orthomodular law, equation (1). To this end, assume $F\subseteq G$, and let ${{\omega }_{1}},\ldots ,{{\omega }_{|F|}}$ be a frame on F. Extend this to a frame on G, and further extend the result to a frame on ${{A}_{+}}$, yielding ${{\omega }_{1}},\ldots ,{{\omega }_{N}}$. Then ${{\omega }_{|F|+1}},\ldots ,{{\omega }_{|G|}}$ is a frame on $G\cap F^{\prime} $; if it did not generate $G\cap F^{\prime} $, it could be extended in $G\cap F^{\prime} $, and to this extension we could append ${{\omega }_{1}},\ldots ,{{\omega }_{|F|}}$ to obtain a frame of size larger than $|G|$ in G, which is a contradiction. Hence $H:=G\cap F^{\prime} $ is generated by ${{\omega }_{|F|+1}},\ldots ,{{\omega }_{|G|}}$. Since $F\vee H$ is the smallest face containing F and H, it is the smallest face containing ${{\omega }_{1}},\ldots ,{{\omega }_{|G|}}$, hence equal to G.

Systems that satisfy postulates 1 and 2 are operationally close to quantum theory also with respect to their contextuality behavior: they satisfy the principle of consistent exclusivity [54], the single-system generalization of the recently introduced postulate of local orthogonality [55]. This is also called Speckerʼs principle [57], and comes in slightly different versions, depending on assumptions of the validity of the principle in situations where one has more than one copy of a state. Here we are interested in the single-system version that is called ${\rm C}{{{\rm e}}^{1}}$ in [54].

In order to talk about contextuality, we need a notion of 'sharp measurements': the analogs of projective measurements in quantum theory. Following [58], we call an effect $0\leqslant e\leqslant {{u}_{A}}$ sharp if it can be written as a sum of normalized ray-extremal effects; that is, if there are pure states ${{\omega }_{1}},\ldots ,{{\omega }_{n}}$ such that

$$e=\sum _{i=1}^{n}{{\omega }_{i}},$$

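and if an analogous decomposition exists for ${{u}_{A}}-e$. This definition does not assume that the ${{\omega }_{i}}$ are mutually orthogonal; however, they have to be as a consequence of postulates 1 and 2. To see this, note that for all j, using $e\leqslant {{u}_{A}}$, $\parallel {{\omega }_{j}}\parallel =1$ and the non-negativity of inner products between states,

$$1\geqslant \langle e,{{\omega }_{j}}\rangle =\sum _{i=1}^{n}\langle {{\omega }_{i}},{{\omega }_{j}}\rangle =1+\sum _{i\ne j}\langle {{\omega }_{i}},{{\omega }_{j}}\rangle ,$$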

hence $\langle {{\omega }_{i}},{{\omega }_{j}}\rangle =0$ for all $i\ne j$. The corresponding effects e can also be characterized in two further ways, namely as projective units and as the extremal points of the unit order interval, giving further weight to the interpretation as the analogue of orthogonal projectors in quantum theory. This is the content of the next lemma. We start with a definition.

Definition 10 (Projective units). Let A be any system satisfying postulates 1 and 2. Then, for every face F of ${{A}_{+}}$, define the projective unit ${{u}_{F}}$ as

$${{u}_{F}}:={{P}_{F}}{{u}_{A}},$$

where ${{P}_{F}}$ is the orthogonal projection onto the linear span of F. A projective unit ${{u}_{F}}$ is called atomic if $|F|=1$.

This is now used in the following lemma:

Lemma 11. Let A be any system satisfying postulates 1 and 2. Then, for every face F of ${{A}_{+}}$, there is a unique effect uF with $0\leqslant {{u}_{F}}\leqslant {{u}_{A}}$ such that ${{u}_{F}}(\omega )=1$ for every $\omega \in F\cap {{\Omega }_{A}}$, and ${{u}_{F}}(\varphi )=0$ for all $\varphi \in F^{\prime} \cap {{\Omega }_{A}}$, namely the projective unit from definition 10. If ${{\omega }_{1}},\ldots ,{{\omega }_{|F|}}$ is any frame that generates F, then

$${{u}_{F}}=\sum _{i=1}^{|F|}{{\omega }_{i}}.\qquad (2)$$

Furthermore, every effect e ∈ A+ with $0\leqslant e\leqslant {{u}_{A}}$ is a convex combination of projective units, and we have ${{u}_{F}}+{{u}_{G}}\leqslant {{u}_{A}}$ if and only if $F\;\bot \;G$, in which case ${{u}_{F}}+{{u}_{G}}={{u}_{F\vee G}}$.

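Proof. As in definition 10, set ${{u}_{F}}:={{P}_{F}}{{u}_{A}}$. Due to theorem 8, ${{u}_{F}}\in {{A}_{+}}$. Thus, $\omega \in F$ implies

$$\langle {{u}_{F}},\omega \rangle =\langle {{P}_{F}}{{u}_{A}},\omega \rangle =\langle {{u}_{A}},{{P}_{F}}\omega \rangle =\langle {{u}_{A}},\omega \rangle ,$$

which equals 1 whenever ω is normalized.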

If $\varphi \in F^{\prime} $, then ${{P}_{F}}\varphi =0$, and an analogous computation shows that $\langle {{u}_{F}},\varphi \rangle =0$. Set ${{\mu }_{F}}:={{u}_{F}}/\langle {{u}_{A}},{{u}_{F}}\rangle $, then ${{\mu }_{F}}\in F\cap {{\Omega }_{A}}$, and so there is a frame ${{\omega }_{1}},\ldots ,{{\omega }_{|F|}}$ of F such that ${{\mu }_{F}}=\sum _{i=1}^{|F|}{{\lambda }_{i}}{{\omega }_{i}}$ with ${{\lambda }_{i}}\geqslant 0$, ${{\sum }_{i}}{{\lambda }_{i}}=1$. For every $j=1,\ldots ,|F|$, we have ${{\omega }_{j}}\in F$, and so

$$1=\langle {{u}_{F}},{{\omega }_{j}}\rangle =\langle {{u}_{A}},{{u}_{F}}\rangle \,\langle {{\mu }_{F}},{{\omega }_{j}}\rangle =\langle {{u}_{A}},{{u}_{F}}\rangle \,{{\lambda }_{j}},$$

so all ${{\lambda }_{j}}$ are equal to ${{\langle {{u}_{A}},{{u}_{F}}\rangle }^{-1}}$, proving that there exists some frame ${{\omega }_{1}},\ldots ,{{\omega }_{|F|}}$ with decomposition (2) of uF, and showing the inequality $0\leqslant {{u}_{F}}\leqslant {{u}_{A}}$. If ${{\varphi }_{1}},\ldots ,{{\varphi }_{|F|}}$ is any other frame on F, then there exists a reversible transformation T with $T{{\omega }_{i}}={{\varphi }_{i}}$. Since both frames generate F, T must preserve the face F (and also its orthogonal complement because T is orthogonal). Hence

$$T{{P}_{F}}={{P}_{F}}T,\qquad \text{and so}\qquad T{{u}_{F}}=T{{P}_{F}}{{u}_{A}}={{P}_{F}}T{{u}_{A}}={{P}_{F}}{{u}_{A}}={{u}_{F}}.$$

Thus ${{u}_{F}}=T{{u}_{F}}=T\sum _{i=1}^{|F|}{{\omega }_{i}}=\sum _{i=1}^{|F|}{{\varphi }_{i}}$, proving that ${{u}_{F}}$ can be decomposed into any frame in the claimed way. If $0\leqslant e\leqslant {{u}_{A}}$ is any effect, then it has a frame decomposition $e=\sum _{i=1}^{|{{A}_{+}}|}{{\lambda }_{i}}{{\omega }_{i}}$, where ${{\omega }_{i}}\in {{\Omega }_{A}}$ are mutually orthogonal pure states, and $0\leqslant {{\lambda }_{i}}\leqslant 1$. Thus, the vector $\lambda :=\;({{\lambda }_{1}},\ldots ,{{\lambda }_{|{{A}_{+}}|}})$ is an element of the $|{{A}_{+}}|$-dimensional unit cube, and can thus be written as a convex combination of extremal points of the (convex) cube, corresponding to vectors $\mu =({{\mu }_{1}},\ldots ,{{\mu }_{|{{A}_{+}}|}})$ where all ${{\mu }_{i}}\in \{0,1\}$. Hence e can correspondingly be decomposed into effects of the form $\sum _{i=1}^{|{{A}_{+}}|}{{\mu }_{i}}{{\omega }_{i}}$, which are projective units. This also shows that the ${{u}_{F}}$ are the unique effects with the properties stated in the lemma. If $F\bot G$ then ${{u}_{F}}+{{u}_{G}}={{u}_{F\vee G}}\leqslant {{u}_{A}}$ is clear from the sum representation of projective units; conversely, if ${{u}_{F}}+{{u}_{G}}\leqslant {{u}_{A}}$, then ${{u}_{F}}+{{u}_{F^{\prime} }}={{u}_{A}}$ implies that ${{u}_{F}}+{{u}_{G}}\leqslant {{u}_{F}}+{{u}_{F^{\prime} }}$, and so ${{u}_{G}}\leqslant {{u}_{F^{\prime} }}$. Thus, if $\omega \in G\cap {{\Omega }_{A}}$, then $1=\langle {{u}_{G}},\omega \rangle \leqslant \langle {{u}_{F^{\prime} }},\omega \rangle \leqslant 1$, and so $0=\langle {{u}_{A}}-{{u}_{F^{\prime} }},\omega \rangle =\langle {{u}_{F}},\omega \rangle =\langle {{P}_{F}}{{u}_{A}},\omega \rangle =\langle {{u}_{A}},{{P}_{F}}\omega \rangle $, which implies that ${{P}_{F}}\omega =0$ and $\omega \;\bot \;F$. Hence $F\;\bot \;G$.
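The unit-cube step in the proof above can be made explicit. The following sketch shows one possible way (a threshold construction chosen here for illustration, not taken from the paper) to write a coefficient vector $\lambda \in {{[0,1]}^{N}}$ as a convex combination of 0/1 vectors μ. Each such μ stands for the effect $\sum _{i}{{\mu }_{i}}{{\omega }_{i}}$; the function name is hypothetical.

```python
from fractions import Fraction

def cube_vertex_decomposition(lam):
    """Write lam in [0, 1]^N as a convex combination of 0/1 vectors.

    Returns a list of (weight, vertex) pairs with positive weights that
    sum to 1 and satisfy sum_k weight_k * vertex_k == lam.
    """
    if any(x < 0 or x > 1 for x in lam):
        raise ValueError("every coefficient must lie in [0, 1]")
    # Distinct threshold levels 0 = t_0 < t_1 < ... < t_m = 1.
    levels = sorted(set(lam) | {0, 1})
    terms = []
    for lo, hi in zip(levels, levels[1:]):
        # Vertex mu with mu_i = 1 exactly when lambda_i reaches the level hi.
        vertex = tuple(1 if x >= hi else 0 for x in lam)
        terms.append((hi - lo, vertex))
    return terms

lam = [Fraction(1, 2), Fraction(1, 4), Fraction(1)]
terms = cube_vertex_decomposition(lam)
```

For this input the decomposition has three terms with weights 1/4, 1/4 and 1/2. If the largest coefficient is below 1, the last vertex is the zero vector, which corresponds to the zero effect.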

Following the definition of [58], expressed in the language of [54], every system satisfying postulates 1 and 2 defines a contextuality scenario given by a hypergraph H, where the vertices of H are the projective units uF ($F\ne \{0\}$ any face of ${{A}_{+}}$), and the edges are collections of effects ${{u}_{{{F}_{1}}}},\ldots ,{{u}_{{{F}_{n}}}}$ with $\sum _{i=1}^{n}{{u}_{{{F}_{i}}}}={{u}_{A}}$. These edges describe contexts, i.e. sharp measurements (given by sets of projective units) that are compatible (i.e. jointly measurable).

Theorem 12. Any system satisfying postulates 1 and 2 also satisfies the principle of consistent exclusivity ${\rm C}{{{\rm e}}^{1}}$ as given in [54, definition 7.1.1] and [59].

Proof. We have to show the following: if I is any set of vertices of the hypergraph H such that every two elements of I belong to a common edge, then ${{\sum }_{e\in I}}e(\omega )\leqslant 1$ for all $\omega \in \Omega $. In the context of postulates 1 and 2, I is then a set of projective units $I=\{{{u}_{{{F}_{1}}}},\ldots ,{{u}_{{{F}_{n}}}}\}$ such that ${{u}_{{{F}_{i}}}}+{{u}_{{{F}_{j}}}}\leqslant u$ for $i\ne j$. But lemma 11 implies that ${{F}_{i}}\bot {{F}_{j}}$. So if ${{\mathcal{F}}_{i}}$ is any frame for Fi, then ${{\mathcal{F}}_{i}}\;\bot \;{{\mathcal{F}}_{j}}$ for $i\ne j$, hence the disjoint union $\mathcal{F}\;:=\;{{\bigcup }_{i}}{{\mathcal{F}}_{i}}$ is a frame on ${{A}_{+}}$, generating some face F. Thus

$$\sum _{e\in I}e(\omega )=\sum _{i=1}^{n}\langle {{u}_{{{F}_{i}}}},\omega \rangle =\Big\langle \sum _{\varphi \in \mathcal{F}}\varphi ,\omega \Big\rangle =\langle {{u}_{F}},\omega \rangle \leqslant \langle {{u}_{A}},\omega \rangle =1.$$

This proves the claim.

As mentioned in section 1, the classification of the set of all state spaces that satisfy postulates 1 and 2 remains an open problem with interesting physical and mathematical implications. Now we show that one additional assumption brings us into the realm of Jordan algebra state spaces. Before postulating the absence of third-order interference, we study another postulate which turns out to be equivalent in our context.

4. Jordan systems from postulates 1 + 2 and purity preservation by filters

In this section, we show that a system satisfying postulates 1 and 2 and a third postulate, that the positive projections of theorem 8 take pure states to multiples of pure states, is either an irreducible Jordan algebraic system or classical.

Jordan algebras were introduced around 1932 by Jordan [1], as a potentially useful algebraic abstraction of the space of observables, i.e. Hermitian operators on a Hilbert space, in the newly minted quantum theory. Since the usual matrix or operator multiplication does not preserve Hermiticity, its physical significance was unclear; Jordan focused on abstracting properties of the symmetrized product $A\;\bullet \;B:=\;(AB+BA)/2$ which does preserve Hermiticity. Like the space of Hermitian operators, a Jordan algebra (as initially defined by Jordan and studied by him, von Neumann, and Wigner) is a real vector space, closed under a commutative bilinear product •. Since the symmetrized product of Hermitian operators is not associative but does satisfy the special case $({{a}^{2}}\;\bullet \;b)\bullet a={{a}^{2}}\;\bullet \;(b\bullet \;a)$ (where ${{a}^{2}}:=a\;\bullet \;a$) of associativity, a Jordan algebra is not assumed associative, but only to satisfy this special case, the 'Jordan property'. For a finite-dimensional Jordan algebra A, at least, the squares (elements of the form ${{a}^{2}}$ for some $a\in A$) form a closed cone of full dimension. Jordan et al investigated the formally real finite-dimensional Jordan algebras, which are precisely those whose cones of squares are pointed. Like the quantum observables, formally real Jordan algebras have a well-behaved spectral theory (see [11, section III.1]), with real spectra and an associated real-valued trace function.$^{7}$ In these algebras, squares have non-negative spectra, and the unit-trace squares form a closed compact convex set as required to be the normalized state space of a system in our context.
As mentioned in the introduction, the finite-dimensional formally real Jordan algebras are already quite close to quantum theory: besides standard quantum theory over the complex numbers they are quantum-like systems over the reals and over the quaternions, systems whose state spaces are balls ('spin factors') and what can be thought of as three-dimensional quantum theory over the octonions [13]. They are also of interest because they are precisely the finite-dimensional systems whose cones of unnormalized states are self-dual and homogeneous [2, 3].
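The algebraic properties of the symmetrized product described above are easy to check on small matrices. The sketch below uses plain nested lists for $2\times 2$ Hermitian matrices; the helper names and the particular matrices are illustrative choices, not taken from the paper.

```python
def matmul(X, Y):
    """Ordinary matrix product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def jordan(X, Y):
    """Symmetrized (Jordan) product (XY + YX) / 2."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[(XY[i][j] + YX[i][j]) / 2 for j in range(n)] for i in range(n)]

def is_hermitian(X, tol=1e-12):
    """True if X equals its conjugate transpose up to tol."""
    n = len(X)
    return all(abs(X[i][j] - complex(X[j][i]).conjugate()) < tol
               for i in range(n) for j in range(n))

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to tol."""
    n = len(X)
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(n) for j in range(n))

# Three Hermitian 2x2 matrices (sx and sz are Pauli matrices).
a = [[1, 1j], [-1j, 2]]
sx = [[0, 1], [1, 0]]
sz = [[1, 0], [0, -1]]

a2 = jordan(a, a)  # a squared with respect to the Jordan product
```

With these matrices, `matmul(sx, sz)` is not Hermitian while `jordan(a, sx)` is. The product is not associative, since `jordan(jordan(sx, sx), sz)` equals `sz` but `jordan(sx, jordan(sx, sz))` is zero, yet the Jordan property holds: `jordan(jordan(a2, sx), a)` agrees with `jordan(a2, jordan(sx, a))`.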

The key tools we will use to establish the main result of this section are theorem 8 and a characterization of the state spaces of certain Jordan algebras by Alfsen and Shultz [19, theorem 9.33], first published in [20]. To state this result requires introducing several somewhat technical notions, which are, however, of considerable physical interest in their own right. These are the notions of a filter on the state space A (and its dual, the notion of a compression on the effect space A*), with its associated notion of a projective state space, and the property of symmetry of transition probabilities.

We first define filters, and begin by introducing some notions used in that definition.

Definition 13. Let A be any state space with cone ${{A}_{+}}$. Projections are linear operators $P:A\to A$ with ${{P}^{2}}=P$; they are positive if $P({{A}_{+}})\subseteq {{A}_{+}}$. Positive projections P and Q are called complementary if ${\rm i}{{{\rm m}}_{+}}P={{{\rm ker} }_{+}}Q$ and vice versa, where ${\rm i}{{{\rm m}}_{+}}P:={\rm im}P\cap {{A}_{+}}$ and ${{{\rm ker} }_{+}}\;Q:={\rm ker} \;Q\cap {{A}_{+}}$. A positive projection P is complemented if there exists a positive projection Q such that P and Q are complementary.

Definition 14 (Filters and projectivity). A filter is a positive linear projection $P:A\to A$ which (i) is complemented, (ii) has a complemented dual P*, and (iii) is normalized, i.e. satisfies ${{u}_{A}}(P\omega )\leqslant {{u}_{A}}(\omega )$ for all $\omega \in {{A}_{+}}$.$^{8}$

The state space A is called projective if every face of ${{A}_{+}}$ is the positive part, ${\rm i}{{{\rm m}}_{+}}\;P$, of the image of a filter P.

We define filters in order to make use of the results in [19], but they are also of great interest in their own right. Actually Alfsen and Shultz define [19, definition 7.22] compressions, acting on the effect space A*. The finite-dimensional specialization of Alfsen and Shultz' notion of compression is just a positive projection $Q:{{A}^{*}}\to {{A}^{*}}$ which is complemented, whose dual is complemented, and whose dual is normalized; it is obvious that a linear map $Q:{{A}^{*}}\to {{A}^{*}}$ is a compression iff ${{Q}^{*}}:A\to A$ is a filter, and similarly P is a filter iff P* is a compression. We defined filters because we are most interested in the transformations that act on the state space A. In fact, in the context of postulates 1 and 2 with A and A* being identified via an appropriate self-dualizing inner product, filters and compressions are represented by precisely the same linear operators.

As described above, in standard quantum theory the face associated with a subspace S of Hilbert space consists of the density matrices whose support is contained in S. Quantum state spaces are projective: there is a filter onto each face, namely the linear map $\rho \mapsto {{P}_{S}}\rho {{P}_{S}}$, where PS is the orthogonal projector onto S. The complementary projection is $\rho \mapsto {{P}_{{{S}^{\bot }}}}\rho {{P}_{{{S}^{\bot }}}}$.

One of several reasons that filters are of great interest for physics and information-processing is that they share with the maps $\rho \mapsto P\rho P$ the property of neutrality [19, definition 7.19]: if a state $\omega \in \Omega $ 'passes the filter with probability 1', i.e. ${{u}_{A}}(P\omega )={{u}_{A}}(\omega )$, then it 'passes the filter undisturbed', i.e. $P\omega =\omega $. (This is immediate from definition 7.19, proposition 7.21, and definition 7.22 of [19].)
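The quantum filter $\rho \mapsto {{P}_{S}}\rho {{P}_{S}}$ and its neutrality can be checked directly on a qutrit. The following sketch uses exact rational arithmetic; the helper names and the chosen states are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction

def matmul(X, Y):
    """Ordinary matrix product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def apply_filter(P, rho):
    """Quantum filter rho -> P rho P for an orthogonal projector P."""
    return matmul(matmul(P, rho), P)

# Projector onto the subspace S spanned by the first two basis vectors.
P_S = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

# A mixed state supported inside S, and the pure state with
# amplitudes (1, 1, 1) / sqrt(3), whose density matrix has all entries 1/3.
inside = [[Fraction(1, 2), 0, 0], [0, Fraction(1, 2), 0], [0, 0, 0]]
third = Fraction(1, 3)
pure = [[third] * 3 for _ in range(3)]

filtered = apply_filter(P_S, pure)
```

The state `inside` passes the filter with probability 1 and is returned unchanged, which is neutrality. The pure state passes with probability 2/3, and its image satisfies ${\rm tr}({{\sigma }^{2}})={{({\rm tr}\,\sigma )}^{2}}$, so it is a multiple of a pure state.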

We now turn to symmetry of transition probabilities, a notion which is defined for systems which are projective in the sense of definition 14.

Observe that in a projective system, for each atomic projective unit p, which is associated [19, proposition 7.28] with a unique filter P for which ${{P}^{*}}u=p$, the associated face $\{x\;\;|\;\;p(x)=1\}$ of Ω contains a single pure state. Call this state $\hat{p}$. The map $p\mapsto \hat{p}$ is a one-to-one map from the set of atoms of the lattice of projective units onto the set of extremal points of Ω. The system is said to satisfy symmetry of transition probabilities [19, definition 9.2(iii)] if for all pairs $a,b$ of atoms of the lattice of projective units, $a(\hat{b})=b(\hat{a})$.

Lemma 15. If a system satisfies postulates 1 and 2, it satisfies symmetry of transition probabilities.

Proof. In the context of postulates 1 and 2, atomic projective units are $u_F$ for $|F|=1$, where F is generated by a pure state (frame of size 1) ${{\omega }_{1}}$, such that ${{u}_{F}}(\varphi )=\langle {{\omega }_{1}},\varphi \rangle $ according to lemma 11, so ${{\hat{u}}_{F}}={{\omega }_{1}}={{u}_{F}}$ in the notation just introduced. Thus $a(\hat{b})=\langle \hat{a},\hat{b}\rangle =\langle \hat{b},\hat{a}\rangle =b(\hat{a})$.

We can now state a version of a theorem from [19] that we will use in proving the main result of this section. One of the conditions in this theorem will be important in its own right in what follows, and we therefore call it postulate 3 $^{\prime }$.

Theorem 16. Let a finite-dimensional system ${{A}_{+}}$ satisfy

  • (a)  
    projectivity,
  • (b)  
    symmetry of transition probabilities, and
  • (c)  
    Postulate 3 $^{\prime }$ : filters P preserve purity. That is, if ω is a pure state, then $P\omega $ is a non-negative multiple of a pure state.

Then ${{A}_{+}}$ is the state space of a formally real Jordan algebra.

The original theorem in [19, 20] is formulated in terms of compressions, with similar results in finite dimensions given by Gunson [44] and by Guz [45–47]. Theorem 16 above is an adaptation to our language and to finite dimension, using the notion of filters instead of compressions. The conjunction of (b) and (c) is what Alfsen and Shultz [19, definition 9.2] call the 'pure state properties' (their (3)), while their (2) is a technical condition that is automatically satisfied in finite dimension, and their (1) follows from our (a).

Theorem 17. In finite dimension, postulates 1 and 2 imply that the system is projective. Assuming in addition postulate 3$^{\prime} $ implies that the system is either irreducible Jordan-algebraic, or classical.

Proof. In theorem 8, we have already shown that the orthogonal projection $P_F$ onto the linear span of the face F is positive, for every face F. Now we show that it is a filter, which establishes that ${{A}_{+}}$ is projective. For any face F, the corresponding projection $P_F$ satisfies ${\rm im}{{P}_{F}}={\rm lin}\;F$, ${\rm i}{{{\rm m}}_{+}}\;{{P}_{F}}=F$, and ${{{\rm ker} }_{+}}\;{{P}_{F}}=F^{\prime} ={{F}^{\bot }}\cap {{A}_{+}}$. So also ${\rm i}{{{\rm m}}_{+}}\;{{P}_{F^{\prime} }}=F^{\prime} $ and ${{{\rm ker} }_{+}}\;{{P}_{F^{\prime} }}=F^{\prime\prime} =F$, and we see that $P_F$ and ${{P}_{F^{\prime} }}$ are complements, establishing property (i) in the definition of filter. Since ${{P}_{F}}=P_{F}^{*}$, $P_F$ has complemented adjoint, property (ii). $P_F$ and ${{P}_{F^{\prime} }}$ are positive by theorem 8. To see property (iii), i.e. normalization of $P_F$, recall from lemma 11 that ${{u}_{A}}({{P}_{F}}\omega )={{u}_{F}}(\omega )\leqslant {{u}_{A}}(\omega )$. Hence for every face F the projection $P_F$ is a filter, so the system is projective.

Projectivity is (a) of theorem 16. Lemma 15 states that condition (b) of theorem 16 follows from postulates 1 and 2. So (a) and (b) of that theorem follow from postulates 1 and 2, whence by the theorem, postulates 1, 2, and purity preservation by filters imply that a system is Jordan algebraic.

To see that the only reducible Jordan-algebraic cones this allows are the classical ones (corresponding to direct sums of the one-dimensional formally real Jordan algebra), note that the cone of a direct sum of Jordan algebras $\mathcal{A}:=\oplus _{i=1}^{n}{{\mathcal{A}}_{i}}$ is the direct sum $C=\oplus _{i=1}^{n}{{C}_{i}}$ of their cones. This is because every $a\in \mathcal{A}$ can then be written $a=({{a}_{1}},\ldots ,{{a}_{n}})$, and the elements of C are the squares ${{a}^{2}}=(a_{1}^{2},\ldots ,a_{n}^{2})$, where the single $a_{i}^{2}$ entries range over all of $C_i$. Suppose one of the summands, say $C_j$, is not one-dimensional. The face generated by two ray-extremal points, ${{\omega }_{j}}\in {{C}_{j}}$ and ${{\omega }_{k}}\in {{C}_{k}}$, with $k\ne j$, is a direct sum of one-dimensional cones, i.e. a classical bit. Since $C_j$ is irreducible and not one-dimensional, it is not classical, so it contains perfectly distinguishable pure states ${{\omega }_{j}}$ and ${{\omega }_{i}}$ that generate a face that is not a direct sum. Since we have another rank-2 face that is a direct sum, in light of proposition 2 this violates postulate 2. Hence either the cone is irreducible, or all summands are one-dimensional (i.e. it is classical).

The following proposition will be needed later.

Proposition 18. Assume postulates 1 and 2. Then, to every face $F_1$ of ${{A}_{+}}$ with complementary face ${{F}_{2}}\equiv F_{1}^{\prime }$ and corresponding projections $P_1$ and $P_2$, the space A has an orthogonal decomposition

$$A={{A}_{1}}\oplus {{A}_{2}}\oplus A_{12}^{c},$$

where ${{A}_{i}}:={\rm im}{{P}_{i}}$, $A_{12}^{c}:={\rm ker} \;{{P}_{1}}\cap {\rm ker} \;{{P}_{2}}$.

Proof. By construction, ${{A}_{1}}={\rm lin}\;{{F}_{1}}\;\bot \;{\rm lin}\;F_{1}^{\prime }={{A}_{2}}$, and by elementary linear algebra, ${{({{A}_{1}}\oplus {{A}_{2}})}^{\bot }}=A_{1}^{\bot }\cap A_{2}^{\bot }={\rm ker} \;{{P}_{1}}\cap {\rm ker} \;{{P}_{2}}=A_{12}^{c}$.

5. Third-order interference

Rafael Sorkin defined a notion of kth order interference [4], which can be manifested in analogues of the two-slit experiment involving k or more slits. This notion was adapted to projective convex systems in [16, 17], and the k = 3 case explored in [18]. Quantum theory exhibits k = 2 interference, but no higher interference. In this section, we show that postulates 1 and 2, plus the assumption of no third-order interference, characterize irreducible Jordan algebraic systems.

We formalize the assumption of no third-order interference using a mathematical definition of M-slit interference experiment given in terms of experimental probabilities. This is motivated by, and abstracted from, specific concrete experimental interference experiments such as those in which a photon passes through physical slits in a barrier, but the probabilistic definition gives a conceptual account of the notion of interference that applies (as does the usual quantum-mechanical concept of interference) far more broadly. Consider the setup depicted in figure 1; we would like to give a formal description of the experimental behavior, given that a certain subset of the slits is open or blocked. First, imagine the case that all slits are open, and consider the state ω of the particle immediately after it has passed the slit arrangement. By preparing the particle in different ways, we can obtain different states ω. Part of the state ω contains the 'which-slit information', encoding through which slit the particle has just passed the arrangement (possibly in a probabilistic mixture or generalized superposition). In an ideal M-slit experiment we would in principle be able to measure through which slit the particle passes, if we put suitable detectors behind the slits.

For every slit $j\in \{1,2,\ldots ,M\}$, there should exist states ω such that the particle is definitely found at slit j, if measured. In our mathematical setting, this means that there is a face $F_j$ of the state space, such that all states $\omega \in {{F}_{j}}$ give unit probability for the 'yes'-outcome of the two-outcome measurement 'is the particle at slit j?' Moreover, the slits should be perfectly distinguishable: if a particle is definitely at slit j, then it is definitely not at slit i for all $i\ne j$. Mathematically, this means that ${{F}_{i}}\;\bot \;{{F}_{j}}$ for $i\ne j$.

We can also ask coarse-grained questions like 'Is the particle found among slits 1 and 2 (rather than somewhere else)?' The set of those states ω that give unit probability for the 'yes'-outcome must contain both $F_1$ and $F_2$; therefore, it must contain ${{F}_{1}}\vee {{F}_{2}}$, the smallest face of the state space that contains both $F_1$ and $F_2$ as subsets. Furthermore, it should be the smallest such face, since we do not want to include further possibilities. Thus, this set will be ${{F}_{12}}:={{F}_{1}}\vee {{F}_{2}}$. More generally, for every subset of slits $J\subseteq \{1,2,\ldots ,M\}$, we have a face ${{F}_{J}}={{\bigvee }_{j\in J}}{{F}_{j}}$, containing those states that describe a particle that will definitely be found to be somewhere among the slits in J if the corresponding effect is measured. If the setup is 'complete' in the sense that every particle must definitely be found at one of the slits if measured, the face ${{F}_{12\ldots M}}$ must be the full state space.

Now imagine an additional detector following the slit arrangement, as depicted in figure 1. It may click or not click; the probability to click in the case that all slits are open is described by some effect e. Suppose we block all slits, except for a subset $J\subset \{1,\ldots ,M\}$ of the slits which are left open. The combination of blockings and detector defines a new measurement, given by some other effect $e_J$, with click probability ${{e}_{J}}(\omega )$ if the state right before the blockings is ω.

If the slits do what we intuitively expect them to do, as they do to a good approximation in quantum-mechanical multi-slit experiments, then the click probabilities should behave as follows. If ω is a state of a particle that would definitely be found at one of the slits among J, i.e. $\omega \in {{F}_{J}}$, then the blockings should have no effect (because the slits J are all open), and the click probability should remain the same: ${{e}_{J}}(\omega )=e(\omega )$. On the other hand, if the particle would definitely not be found among the open slits J, i.e. $\omega \in F_{J}^{\prime }$, then the particle should be blocked and there should definitely be no detector click, and ${{e}_{J}}(\omega )=0$.

These considerations lead to the following definition, which abstracts probabilistic properties of an interference experiment from particular physical realizations involving slits, spatial paths, and so forth. We will soon see that the orthogonal projections $P_J$ onto the faces $F_J$ are of paramount importance, which is why we introduce a name for them as well.

Definition 19 (M-slit experiment). A set of effects $e_J$ and faces $F_J$, $J\subseteq \{1,2,...,M\}$, with ${{F}_{J}}={{\bigvee}_{j\in J}}{{F}_{j}}$ and ${{F}_{i}}\;\bot \;{{F}_{j}}$ for $i\ne j$ is called an M-slit experiment if there is an effect $e\in [0,u]$ such that

  • ${{e}_{J}}(\omega )=e(\omega )$ for all $\omega \in {{F}_{J}}$,
  • ${{e}_{J}}(\varphi )=0$ for all $\varphi \in F_{J}^{\prime }$.

For any given set of faces with the properties stated above, the corresponding set of orthogonal projections ${{P}_{J}}:={{P}_{{{F}_{J}}}}$ will be called an M-slit mask. It is called complete if ${{F}_{12\cdots M}}={{\Omega }_{A}}$, that is, if ${{P}_{12\cdots M}}={\bf 1}$.

Such an experiment exhibits second-order interference (say, for M = 2) if the overall interference pattern ${{e}_{12}}(\omega )$ fails to be the sum of the one-slit patterns ${{e}_{1}}(\omega )$, ${{e}_{2}}(\omega )$. If it exhibits second-order interference, it may in addition exhibit irreducibly third-order interference. Third-order interference occurs if the overall pattern ${{e}_{123}}(\omega )$ fails to be the sum of the double-slit patterns ${{e}_{ij}}(\omega )$, corrected for overcounting by subtracting suitable multiples of the single-slit patterns ${{e}_{i}}(\omega )$. Unless otherwise specified we use the notation ${{\sum }_{i\lt j}}$ to mean the double sum ${{\sum }_{i}}{{\sum }_{j\gt i}}$.
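As an illustration of second-order interference, the following sketch evaluates $e_{12}(\omega )-e_{1}(\omega )-e_{2}(\omega )$ for a quantum two-slit experiment. It is a toy example under stated assumptions: the two slits are the two basis states of a qubit, effects and states are represented as 2 × 2 real matrices with $e(\omega )={\rm tr}(e\omega )$, and the blocked-slit effects are taken to be $e_J=\pi_J e\pi_J$ with diagonal projectors $\pi_J$. The names `mask_effect`, `prob` and `second_order_term` are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def mask_effect(e, open_slits):
    """e_J = pi_J e pi_J for a diagonal projector pi_J: keep only the entries
    whose row and column both belong to an open slit."""
    n = len(e)
    return [[e[a][b] if a in open_slits and b in open_slits else 0.0
             for b in range(n)] for a in range(n)]

def prob(effect, rho):
    """Click probability e(omega) = tr(e rho)."""
    return trace(matmul(effect, rho))

def second_order_term(e, rho):
    """e_12(omega) - e_1(omega) - e_2(omega)."""
    return (prob(mask_effect(e, {0, 1}), rho)
            - prob(mask_effect(e, {0}), rho)
            - prob(mask_effect(e, {1}), rho))

# Detector effect: projector onto (|0> + |1>)/sqrt(2).
e = [[0.5, 0.5], [0.5, 0.5]]

# Coherent superposition of the two slits versus an incoherent mixture.
rho_coherent = [[0.5, 0.5], [0.5, 0.5]]
rho_mixture = [[0.5, 0.0], [0.0, 0.5]]

i2_coherent = second_order_term(e, rho_coherent)
i2_mixture = second_order_term(e, rho_mixture)
```

With these choices the coherent state yields a nonzero interference term, while the incoherent mixture, which behaves like a classical state, yields zero.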

Definition 20 (Third-order interference). We say that a state space exhibits third-order interference if there exists an M-slit experiment (for some $M\geqslant 3$) and a state ω such that

$${{e}_{12\cdots M}}(\omega )\ne \sum_{i\lt j}{{e}_{ij}}(\omega )-(M-2)\sum_{i}{{e}_{i}}(\omega ).\qquad (3)$$

In particular, for M = 3, the condition is

$${{e}_{123}}(\omega )\ne {{e}_{12}}(\omega )+{{e}_{13}}(\omega )+{{e}_{23}}(\omega )-{{e}_{1}}(\omega )-{{e}_{2}}(\omega )-{{e}_{3}}(\omega ).$$

The second term in (3) corrects for the overlaps of the sets $\{i,j\}$: each index occurs in $M-1$ of the pairs $i\lt j$, so without the correction each single-slit pattern would be counted $M-1$ times instead of once. Sorkinʼs [4] original definition, and the discussion in [17, 18], used the M = 3 case as their definition of third-order interference, but the two can straightforwardly if somewhat tediously be shown to be equivalent. Sorkin showed that if a scenario lacks kth order interference, it cannot have lth order interference for any $l\gt k$.

With the previous definition, we can give a concise formal statement of postulate 3:

Postulate 3. State spaces do not exhibit third-order interference, as introduced in definition 20.

Now we show that M-slit experiments are closely related to the positive orthogonal projections introduced in theorem 8.

Proposition 21. Assume postulates 1 and 2. Then, given any M-slit experiment with effects $e,{{e}_{J}},$ we have ${{e}_{J}}={{P}_{J}}e$, where the $P_J$ are the elements of the corresponding M-slit mask. Conversely, given any set of faces $F_J$, $J\subseteq \{1,2,\ldots ,M\}$, with ${{F}_{J}}={{\bigvee}_{j\in J}}{{F}_{j}}$ and ${{F}_{i}}\;\bot \;{{F}_{j}}$ for $i\ne j$, and any effect e, the set of effects ${{e}_{J}}:={{P}_{J}}e$ defines an M-slit experiment.

Proof. Since $\langle e-{{e}_{J}},\omega \rangle =0$ for all $\omega \in {{F}_{J}}$, we have $e-{{e}_{J}}\in F_{J}^{\bot }$ (not necessarily in $F_{J}^{\prime }$, because we do not yet know whether $e-{{e}_{J}}$ is positive). Similarly, $\langle {{e}_{J}},\varphi \rangle =0$ for all $\varphi \in F_{J}^{\prime }$ and ${{e}_{J}}\geqslant 0$ implies that ${{e}_{J}}\in {{F}_{J}}$. Since $P_J$ is the orthogonal projection onto the span of $F_J$, it fixes $e_J$ and annihilates $e-e_J$. Thus

$${{P}_{J}}e={{P}_{J}}{{e}_{J}}+{{P}_{J}}(e-{{e}_{J}})={{e}_{J}}.$$

The converse can be checked by direct calculation.

According to this proposition, absence of third-order interference can be expressed in terms of the orthogonal projections only:

Lemma 22. Consider a state space satisfying postulates 1 and 2. It has no third-order interference if and only if for any M-slit mask $P_J$, $J\subseteq \{1,\ldots ,M\}$, it holds that

$${{P}_{12\cdots M}}=\sum_{i\lt j}{{P}_{ij}}-(M-2)\sum_{i}{{P}_{i}}.\qquad (4)$$

Proof. We have absence of third-order interference if for any choice of faces (as described in the statement of the lemma) and choice of effect e as well as state ω, (3) holds with equality. Since the states span the space, this is equivalent to the statement

$${{e}_{12\cdots M}}=\sum_{i\lt j}{{e}_{ij}}-(M-2)\sum_{i}{{e}_{i}}$$

and, due to proposition 21, to

$${{P}_{12\cdots M}}e=\sum_{i\lt j}{{P}_{ij}}e-(M-2)\sum_{i}{{P}_{i}}e.$$

As this must hold for all effects e, and the effects span the space, we obtain the statement of the lemma.
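The operator form of the no-third-order-interference condition, $P_{12\cdots M}=\sum_{i\lt j}P_{ij}-(M-2)\sum_i P_i$, can be checked numerically in the quantum case. The sketch below assumes that each slit is one basis state of an M-level quantum system, so that $P_J$ acts on a Hermitian matrix by keeping only the entries whose row and column both lie in J (which is what $x\mapsto \pi_J x\pi_J$ does for a diagonal projector $\pi_J$). The test matrix is pseudo-random with a fixed seed; all helper names are illustrative.

```python
import itertools
import random

def mask(x, open_slits):
    """P_J x = pi_J x pi_J for a diagonal projector pi_J."""
    n = len(x)
    return [[x[a][b] if a in open_slits and b in open_slits else 0
             for b in range(n)] for a in range(n)]

def combine(terms):
    """Entrywise sum of coefficient * matrix over (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[a][b] for c, m in terms) for b in range(n)]
            for a in range(n)]

def max_abs(x):
    return max(abs(v) for row in x for v in row)

def random_hermitian(n, rng):
    x = [[0j] * n for _ in range(n)]
    for a in range(n):
        x[a][a] = complex(rng.uniform(-1, 1), 0.0)
        for b in range(a + 1, n):
            z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            x[a][b] = z
            x[b][a] = z.conjugate()
    return x

def third_order_defect(x):
    """Largest entry of P_{1..M} x - sum_{i<j} P_ij x + (M-2) sum_i P_i x."""
    m = len(x)
    terms = [(1, mask(x, set(range(m))))]
    terms += [(-1, mask(x, {i, j}))
              for i, j in itertools.combinations(range(m), 2)]
    terms += [(m - 2, mask(x, {i})) for i in range(m)]
    return max_abs(combine(terms))

def second_order_defect(x):
    """Largest entry of P_{1..M} x - sum_i P_i x (the off-diagonal coherences)."""
    m = len(x)
    terms = [(1, mask(x, set(range(m))))]
    terms += [(-1, mask(x, {i})) for i in range(m)]
    return max_abs(combine(terms))

rng = random.Random(0)
defect_m3 = third_order_defect(random_hermitian(3, rng))
defect_m4 = third_order_defect(random_hermitian(4, rng))
coherence_m3 = second_order_defect(random_hermitian(3, rng))
```

The third-order defect vanishes because every off-diagonal entry belongs to exactly one pair mask, while every diagonal entry is counted $M-1$ times by the pairs and corrected $M-2$ times by the singles. The second-order defect does not vanish, in line with quantum theory exhibiting interference of order two only.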

Now we are ready to prove one of our main results about the absence of third-order interference together with postulates 1 and 2:

Theorem 23. A system satisfies postulates 1, 2 and 3 if and only if it is an irreducible Jordan system or a classical system.

Proof. We begin with the 'if' direction: irreducible Jordan systems and classical systems satisfy postulates 1, 2 and 3. For classical systems it is well-known and easy to see that postulates 1 and 2 are satisfied: indeed, finite-dimensional classical state spaces Ω are often defined as those for which every state has a unique decomposition into extremal points, and in this case postulate 2 follows from the fact that any permutation of the extreme points in this unique maximal frame is an affine automorphism of Ω. Classical systems do not even have second-order interference [4] (the first level that is actually interference), so they cannot have any higher order of interference. It follows directly from a fairly standard orthogonal decomposition in formally real Jordan algebras (see e.g. [11]) that finite-dimensional Jordan systems satisfy postulate 1; and it is also well-known that the Jordan algebra automorphisms are affine automorphisms of the normalized state space, and act transitively on the set of ordered sets of orthogonal extremal states in the irreducible case [11]. In proposition 29 below, we show that in the context of postulates 1 and 2, absence of third-order interference is equivalent to the property that filters preserve purity of states. Since the latter property is well-known for a class of Jordan systems including the finite-dimensional ones [19, theorem 9.38], this shows that they also satisfy postulate 3.

The 'only if' direction is an immediate consequence of proposition 29—to be proved in the remainder of this section—which states that the absence of third-order interference implies that all filters preserve purity, together with theorem 17, which states that postulates 1, 2, and purity-preservation by filters imply that systems are irreducible Jordan, or classical.

We could have defined an M-slit experiment directly in terms of the positive projections $P_J$ onto the faces. These describe the action of the slits on the state. However, referring to the corresponding effects $e_J$ in definition 19 has the advantage that we know for sure that the effects can be implemented (due to proposition 1). On the other hand, there is no analogous statement that guarantees that the projections $P_J$ themselves can actually be implemented as physical transformations. Thus, not referring to positive projections in the definition of an M-slit experiment means that we make fewer assumptions.

The proof of the crucial proposition 29 proceeds via several other propositions and lemmas. The following property is also mentioned in [16] and [18].

Lemma 24. It follows from postulates 1 and 2 that ${{P}_{J}}{{P}_{K}}={{P}_{J\cap K}}$ for any M-slit mask.

Proof. First note that if F, G and H are faces such that $F\;\bot \;H$ and $G\;\bot \;H$, then $(F\vee G)\bot \;H$. This is because $(F\vee G)\cap {{H}^{\bot }}$ is a face which contains F and G, and is also a subset of $F\vee G$, hence equal to $F\vee G$.

Defining the projective units ${{u}_{j}}:={{u}_{{{F}_{j}}}}$ and ${{u}_{J}}:={{u}_{{{F}_{J}}}}$, it follows from lemma 11 that ${{u}_{K}}={{\sum }_{k\in K}}{{u}_{k}}$. Hence

$${{P}_{J}}{{u}_{K}}=\sum_{k\in K}{{P}_{J}}{{u}_{k}}.$$

For $j\in J$ and $l\in K\backslash J$ we have ${{F}_{j}}\;\bot \;{{F}_{l}}$, thus ${{F}_{J}}={{\vee }_{j\in J}}{{F}_{j}}\;\bot \;{{F}_{l}}$, and so ${{P}_{J}}{{u}_{l}}=0$. On the other hand, if $k\in K\cap J$ then ${{P}_{J}}{{u}_{k}}={{u}_{k}}$, so

$${{P}_{J}}{{u}_{K}}=\sum_{k\in K\cap J}{{u}_{k}}={{u}_{J\cap K}}\leqslant {{u}_{K}}.$$

According to [19, proposition 7.39], this implies that ${{P}_{J}}{{P}_{K}}={{P}_{K}}{{P}_{J}}$, which in turn implies [19, theorem 8.3] that ${{P}_{J}}{{P}_{K}}={{P}_{J}}\wedge {{P}_{K}}={{P}_{J\cap K}}$.
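For intuition, the identity $P_JP_K=P_{J\cap K}$ is easy to verify in the quantum case. The sketch below assumes a 4-level system in which slit i is the ith basis state, so that $P_J$ keeps the matrix entries with both indices in J; the symmetric integer test matrix and the helper name `mask` are illustrative choices.

```python
def mask(x, open_slits):
    """P_J x: keep the entries whose row and column both lie in open_slits."""
    n = len(x)
    return [[x[a][b] if a in open_slits and b in open_slits else 0
             for b in range(n)] for a in range(n)]

# A real symmetric (hence Hermitian) test matrix with exact integer entries.
x = [[1, 2, 3, 4],
     [2, 5, 6, 7],
     [3, 6, 8, 9],
     [4, 7, 9, 10]]

J = {0, 1, 2}
K = {1, 2, 3}

pj_pk = mask(mask(x, K), J)   # apply P_K first, then P_J
pk_pj = mask(mask(x, J), K)   # apply P_J first, then P_K
p_meet = mask(x, J & K)       # P_{J intersect K}
```

Both orders of composition give the same result as the single projection onto the slits common to J and K.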

The next proposition uses the decomposition described in proposition 18 to derive a similar decomposition corresponding to a complete M-slit mask.

Proposition 25. Let $P_i$ with $i\in \{1,\ldots ,M\}$ be a complete M-slit mask on a system A satisfying postulates 1 and 2. Then there is an orthogonal decomposition

$$A=\bigoplus_{i}{{A}_{i}}\oplus \bigoplus_{i\lt j}A_{ij}^{c}\oplus {{A}^{(3)}},\qquad (5)$$

where ${{A}_{i}}:={\rm im}{{P}_{i}},A_{ij}^{c}:={\rm ker} \;{{P}_{i}}\cap {\rm ker} \;{{P}_{j}}\cap {\rm im}{{P}_{ij}}$ and ${{A}^{(3)}}:=\;{{\bigcap }_{i\lt j}}{\rm ker} \;{{P}_{ij}}$.

Proof. Using proposition 18 and the fact that each face is itself a system satisfying postulates 1 and 2, we decompose each ${\rm im}{{P}_{ij}}$ as ${{A}_{i}}\oplus {{A}_{j}}\oplus A_{ij}^{c}$. (Note that we still have $A_{ij}^{c}$ orthogonal to ${{A}_{i}}\oplus {{A}_{j}}$ because it is contained in ${\rm ker} \;{{P}_{i}}\cap {\rm ker} \;{{P}_{j}}$.) For $k,l\notin \{i,j\}$ we have $A_{ij}^{c}\bot \;A_{kl}^{c}$, since ${\rm im}{{P}_{ij}}\bot {\rm im}{{P}_{kl}}$. Furthermore, for $i\ne k$, $A_{ij}^{c}\bot \;A_{jk}^{c}$, because for $x\in A_{ij}^{c},y\in A_{jk}^{c}$

$$\langle x,y\rangle =\langle {{P}_{ij}}x,{{P}_{jk}}y\rangle =\langle x,{{P}_{ij}}{{P}_{jk}}y\rangle =\langle x,{{P}_{j}}y\rangle =0,$$

where the first equality follows from $x\in {\rm im}{{P}_{ij}},y\in {\rm im}{{P}_{jk}}$ due to the definitions of $A_{ij}^{c},A_{jk}^{c}$, the last equality from $y\in {\rm ker} \;{{P}_{j}}$ due to the definition of $A_{jk}^{c}$, and the second last equality from lemma 24. Now we just have to show that ${{A}^{(3)}}:=\;{{\bigcap }_{i\lt j}}{\rm ker} \;{{P}_{ij}}$ is the orthogonal complement of ${{\oplus }_{i}}{{A}_{i}}{{\oplus }_{i\lt j}}A_{ij}^{c}$. Since ${{\oplus }_{i}}{{A}_{i}}{{\oplus }_{i\lt j}}A_{ij}^{c}={\rm lin}\{{{\bigcup }_{i\lt j}}{\rm im}{{P}_{ij}}\}$, ${{({{\oplus }_{i}}{{A}_{i}}{{\oplus }_{i\lt j}}A_{ij}^{c})}^{\bot }}={{\bigcap }_{i\lt j}}{\rm ker} \;{{P}_{ij}}$, and we are done.

It is interesting to note that the pairwise intersections ${\rm ker} \;{{P}_{i}}\cap {\rm ker} \;P_{i}^{\prime }$ represent 'coherences' associated with the two-slit experiment ${{P}_{i}},P_{i}^{\prime }$ [18], and that intersecting this with ${\rm im}{{P}_{ij}}$ gives the part associated with the two-slit experiment ${{P}_{i}},{{P}_{j}}$. As an example, consider a quantum 3-level system with orthonormal basis $\{|i\rangle {{\}}_{i=1,2,3}}$, and let i = 1, j = 2 so we have positive projections ${{P}_{i}}={{P}_{1}}:\rho \mapsto \pi \rho \pi $ with $\pi =|1\rangle \langle 1|$, as well as $P_{i}^{\prime }=P_{1}^{\prime }:\rho \mapsto \pi ^{\prime} \rho \pi ^{\prime} $, where $\pi ^{\prime} =|2\rangle \langle 2|+|3\rangle \langle 3|$. The action of these on a 3 × 3 density matrix ρ is to set specific entries of the matrix to zero. More explicitly, ${\rm ker} \;{{P}_{1}}$ is the set of Hermitian matrices of the form $\left( \begin{array}{ccc} 0 & \bullet & \bullet \\ \bullet & \bullet & \bullet \\ \bullet & \bullet & \bullet \\ \end{array} \right)$, where • denotes an arbitrary entry. The •ʼs correspond to the entries set to zero by $P_1$; interchanging the •ʼs with the 0ʼs would give the form of the matrices in ${\rm im}\;{{P}_{1}}$. Similarly ${\rm ker} \;P_{1}^{\prime }$ is the set of Hermitian matrices of the form $\left( \begin{array}{ccc} \bullet & \bullet & \bullet \\ \bullet & 0 & 0 \\ \bullet & 0 & 0 \\ \end{array} \right)$. So ${\rm ker} \;{{P}_{1}}\cap {\rm ker} \;P_{1}^{\prime }$ is the set of all Hermitian matrices of the form $\left( \begin{array}{ccc} 0 & \bullet & \bullet \\ \bullet & 0 & 0 \\ \bullet & 0 & 0 \\ \end{array} \right)$. Since ${{P}_{j}}={{P}_{2}}$ analogously projects onto the span of $|2\rangle $, ${\rm im}{{P}_{ij}}\equiv {\rm im}{{P}_{12}}$ is the set of Hermitian matrices of the form $\left( \begin{array}{ccc} \bullet & \bullet & 0 \\ \bullet & \bullet & 0 \\ 0 & 0 & 0 \\ \end{array} \right)$, and the intersection ${\rm ker} \;{{P}_{1}}\cap {\rm ker} \;P_{1}^{\prime }\cap {\rm im}\;{{P}_{12}}$ yields the off-diagonal elements corresponding to the two-slit experiment $P_i$, $P_j$ as claimed, i.e. the Hermitian matrices of the form $\left( \begin{array}{ccc} 0 & \bullet & 0 \\ \bullet & 0 & 0 \\ 0 & 0 & 0 \\ \end{array} \right)$.

The decomposition of proposition 25 is thus into the spans of the faces ${\rm i}{{{\rm m}}_{+}}\;{{P}_{i}}$, the $M(M-1)/2$ spaces $A_{ij}^{c}$ (one for each pair $i\lt j$) associated with interference between these faces, and a further space, which as the next proposition shows, is associated with three-way interference.
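The decomposition can be made concrete for a quantum 3-level system. The sketch below is an illustration under stated assumptions: slit i is the ith basis state, the inner product on Hermitian matrices is $\langle x,y\rangle ={\rm tr}(xy)$, the component in $A_i$ is $P_ix$, and the component in $A_{ij}^{c}$ is $P_{ij}x-P_ix-P_jx$. The test matrix and helper names are illustrative.

```python
import itertools

def mask(x, open_slits):
    """P_J x: keep the entries whose row and column both lie in open_slits."""
    n = len(x)
    return [[x[a][b] if a in open_slits and b in open_slits else 0
             for b in range(n)] for a in range(n)]

def sub(x, y):
    n = len(x)
    return [[x[a][b] - y[a][b] for b in range(n)] for a in range(n)]

def inner(x, y):
    """<x, y> = tr(x y), which is real for Hermitian x and y."""
    n = len(x)
    return sum(x[a][b] * y[b][a] for a in range(n) for b in range(n)).real

x = [[1, 2 + 1j, 0.5j],
     [2 - 1j, -1, 3],
     [-0.5j, 3, 2]]

slits = range(3)
# Components in A_i (diagonal blocks) and A_ij^c (coherences of each pair).
diagonal_parts = [mask(x, {i}) for i in slits]
coherence_parts = [sub(sub(mask(x, {i, j}), mask(x, {i})), mask(x, {j}))
                   for i, j in itertools.combinations(slits, 2)]
components = diagonal_parts + coherence_parts

# Whatever is left over would be the component in A^(3).
remainder = x
for part in components:
    remainder = sub(remainder, part)
remainder_size = max(abs(v) for row in remainder for v in row)

# Pairwise orthogonality and additivity of squared norms.
max_overlap = max(abs(inner(p, q))
                  for p, q in itertools.combinations(components, 2))
norm_sq_total = sum(inner(p, p) for p in components)
norm_sq_x = inner(x, x)
```

For this quantum example the six components are mutually orthogonal, their squared norms add up to that of x, and nothing is left for $A^{(3)}$.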

Proposition 25 is stated as a decomposition of the vector space A. However, note that every face of ${{A}_{+}}$ (with group of reversible transformations given by the restriction of those global reversible transformations that preserve that face) is itself a state space satisfying postulates 1 and 2. Thus, if we have an incomplete M-slit mask with $F:={\rm im}{{P}_{12...M}}$ and corresponding face ${{F}_{+}}:=F\cap {{A}_{+}}$, we obtain a decomposition

$$F=\bigoplus_{i}{{F}_{i}}\oplus \bigoplus_{i\lt j}F_{ij}^{c}\oplus {{F}^{(3)}},\qquad (6)$$

where ${{F}_{i}}={\rm im}\;{{P}_{i}}\subseteq F$, $F_{ij}^{c}={\rm ker} \;{{P}_{i}}\cap {\rm ker} \;{{P}_{j}}\cap {\rm im}\;{{P}_{ij}}\subseteq F$, and ${{F}^{(3)}}={{\bigcap }_{i\lt j}}{\rm ker} \;{{P}_{ij}}\cap F$. This is used in the following proposition.

Proposition 26. Let A be a state space satisfying postulates 1 and 2. Then there is no third-order interference on A if and only if for every M-slit mask $P_J$, $J\subset \{1,\ldots ,M\}$ with $M\geqslant 2$, and every pure state $\omega \in {\rm im}{{P}_{12...M}}$, the component ${{\omega }^{(3)}}$ of ω in ${{F}^{(3)}}$ in (6) is zero.

Proof. From lemma 22, the absence of third order interference is equivalent to

$${{P}_{12\cdots M}}x=\sum_{i\lt j}{{P}_{ij}}x-(M-2)\sum_{i}{{P}_{i}}x\qquad (7)$$

for all $x\in A$. However, since ${{P}_{ij}}={{P}_{ij}}{{P}_{12\cdots M}}$ and ${{P}_{i}}={{P}_{i}}{{P}_{12\cdots M}}$, this is equivalent to (7) holding for all $x\in {\rm im}{{P}_{12\cdots M}}=:F$. Since the pure states in F span F, this is equivalent to

$$\omega =\sum_{i\lt j}{{P}_{ij}}\omega -(M-2)\sum_{i}{{P}_{i}}\omega $$

for all pure states $\omega \in F$. By proposition 25 and its consequence (6), ${{P}_{ij}}\omega ={{P}_{i}}\omega +{{P}_{j}}\omega +\omega _{ij}^{c}$, where $\omega _{ij}^{c}$ is the component of ω in $F_{ij}^{c}$. So absence of third-order interference is equivalent to

$$\omega =\sum_{i\lt j}\left( {{P}_{i}}\omega +{{P}_{j}}\omega +\omega _{ij}^{c} \right)-(M-2)\sum_{i}{{P}_{i}}\omega $$

for all pure states $\omega \in F$. Noting that ${{\sum }_{i\lt j}}({{P}_{i}}+{{P}_{j}})$ contains, for each fixed value of k, $M-1$ occurrences of $P_k$, this becomes:

$$\omega =\sum_{i}{{P}_{i}}\omega +\sum_{i\lt j}\omega _{ij}^{c}.$$

In other words, ${{\omega }^{(3)}}=0$ in ${{F}^{(3)}}$ in (6).

Definition 27. The impurity $I(\omega )$ of any unnormalized state $\omega \geqslant 0$ is defined as:

$$I(\omega ):=u{{(\omega )}^{2}}-\parallel \omega {{\parallel }^{2}}.$$

For normalized states $\omega \in \Omega $, we have $u(\omega )=1$, and $\parallel \omega \parallel \leqslant 1$, with equality if and only if ω is a pure state. Extending this to the unnormalized states by multiplication $\omega \mapsto \lambda \omega $ with $\lambda \geqslant 0$ shows that $I(\omega )\geqslant 0$ for all $\omega \geqslant 0$, with equality if and only if ω is ray-extremal.
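In the quantum case the impurity is straightforward to compute. The sketch below assumes the self-dualizing inner product $\langle \rho ,\sigma \rangle ={\rm tr}(\rho \sigma )$ on Hermitian matrices, so that $u(\rho )={\rm tr}\,\rho $ and $\parallel \rho {{\parallel }^{2}}={\rm tr}({{\rho }^{2}})$, and uses the form $I(\omega )=u{{(\omega )}^{2}}-\parallel \omega {{\parallel }^{2}}$ implied by the relation $\parallel {{P}_{ij}}\omega {{\parallel }^{2}}=u{{({{P}_{ij}}\omega )}^{2}}-I({{P}_{ij}}\omega )$ used later in this section. The helper names are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def impurity(rho):
    """I(rho) = u(rho)^2 - ||rho||^2 with u = trace and ||rho||^2 = tr(rho^2)."""
    return trace(rho).real ** 2 - trace(matmul(rho, rho)).real

def scale(factor, rho):
    return [[factor * v for v in row] for row in rho]

pure = [[0.5, 0.5], [0.5, 0.5]]    # projector onto (|0> + |1>)/sqrt(2)
mixed = [[0.5, 0.0], [0.0, 0.5]]   # maximally mixed qubit

impurity_pure = impurity(pure)
impurity_mixed = impurity(mixed)
impurity_scaled_pure = impurity(scale(3.0, pure))
impurity_scaled_mixed = impurity(scale(3.0, mixed))
```

The pure state and its unnormalized multiples have zero impurity, and rescaling a state by $\lambda $ multiplies its impurity by ${{\lambda }^{2}}$.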

Proposition 28. Let $P_i$ with $i\in \{1,\ldots ,M\}$ be an M-slit mask on a system satisfying postulates 1 and 2. Then for any state $\omega \in F:={\rm im}{{P}_{12...M}}$ (not necessarily pure or normalized)

$$\parallel {{\omega }^{(3)}}{{\parallel }^{2}}=\sum_{i\lt j}I({{P}_{ij}}\omega )-(M-2)\sum_{i}I({{P}_{i}}\omega )-I(\omega ),\qquad (8)$$

where ${{\omega }^{(3)}}$ is the component of ω in ${{F}^{(3)}}$ in (6).

While we use this equation directly in what follows, its significance is underlined by noting its immediate corollary: that if ω and each of the ${{P}_{i}}\omega $ are pure, and there is no third-order interference, then (by the nonnegativity of impurity) each of the ${{P}_{ij}}\omega $ is also pure. In other words: in the absence of third-order interference, if the $P_i$ are each purity-preserving, so also are the $P_{ij}$.

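Proof of proposition 28. First we expand $\omega \in F$ via (6):

$$\omega =\sum_{i}{{P}_{i}}\omega +\sum_{i\lt j}\omega _{ij}^{c}+{{\omega }^{(3)}}.$$

Taking squared norms and using orthogonality of the decomposition, we get

$$\parallel \omega {{\parallel }^{2}}=\sum_{i}\parallel {{P}_{i}}\omega {{\parallel }^{2}}+\sum_{i\lt j}\parallel \omega _{ij}^{c}{{\parallel }^{2}}+\parallel {{\omega }^{(3)}}{{\parallel }^{2}}.\qquad (9)$$

In order to get results about the purity of ${{P}_{i}}\omega $ and ${{P}_{ij}}\omega $, we use ${{P}_{ij}}\omega ={{P}_{i}}\omega +{{P}_{j}}\omega +\omega _{ij}^{c}$ to eliminate $\omega _{ij}^{c}$ by substituting $||\omega _{ij}^{c}|{{|}^{2}}=||{{P}_{ij}}\omega |{{|}^{2}}-||{{P}_{i}}\omega |{{|}^{2}}-||{{P}_{j}}\omega |{{|}^{2}}$ in (9), obtaining:

$$\parallel \omega {{\parallel }^{2}}=\sum_{i}\parallel {{P}_{i}}\omega {{\parallel }^{2}}+\sum_{i\lt j}\parallel {{P}_{ij}}\omega {{\parallel }^{2}}-\sum_{i\lt j}\left( \parallel {{P}_{i}}\omega {{\parallel }^{2}}+\parallel {{P}_{j}}\omega {{\parallel }^{2}} \right)+\parallel {{\omega }^{(3)}}{{\parallel }^{2}}.$$

Since a given k appears (as i or j) in $M-1$ of the pairs $i\lt j$, and the last sum in the above expression has a $||{{P}_{k}}\omega |{{|}^{2}}$ for each such appearance, this becomes

$$\parallel \omega {{\parallel }^{2}}=\sum_{i\lt j}\parallel {{P}_{ij}}\omega {{\parallel }^{2}}-(M-2)\sum_{i}\parallel {{P}_{i}}\omega {{\parallel }^{2}}+\parallel {{\omega }^{(3)}}{{\parallel }^{2}}.$$

Note that $\parallel {{P}_{ij}}\omega {{\parallel }^{2}}=u{{({{P}_{ij}}\omega )}^{2}}-I({{P}_{ij}}\omega )$, and likewise for ${{P}_{i}}\omega $, so

$$\parallel \omega {{\parallel }^{2}}=\sum_{i\lt j}\left[ u{{({{P}_{ij}}\omega )}^{2}}-I({{P}_{ij}}\omega ) \right]-(M-2)\sum_{i}\left[ u{{({{P}_{i}}\omega )}^{2}}-I({{P}_{i}}\omega ) \right]+\parallel {{\omega }^{(3)}}{{\parallel }^{2}}.\qquad (10)$$

Using ${{u}_{ij}}={{u}_{i}}+{{u}_{j}}$, together with $u({{P}_{ij}}\omega )={{u}_{ij}}(\omega )$ and $u({{P}_{i}}\omega )={{u}_{i}}(\omega )$,

$$\sum_{i\lt j}u{{({{P}_{ij}}\omega )}^{2}}=\sum_{i\lt j}\left( {{u}_{i}}{{(\omega )}^{2}}+{{u}_{j}}{{(\omega )}^{2}} \right)+2\sum_{i\lt j}{{u}_{i}}(\omega ){{u}_{j}}(\omega ).$$

Again using the fact that a given i appears in $M-1$ of the pairs $i\lt j$, and writing ${{\sum }_{i}}{{\sum }_{j\ne i}}$ in place of $2{{\sum }_{i\lt j}}$, this becomes:

$$\sum_{i\lt j}u{{({{P}_{ij}}\omega )}^{2}}=(M-1)\sum_{i}{{u}_{i}}{{(\omega )}^{2}}+\sum_{i}{{u}_{i}}(\omega )\sum_{j\ne i}{{u}_{j}}(\omega ).$$

Now, since ${{u}_{F}}(\omega )=\langle {{P}_{F}}u,\omega \rangle =\langle u,{{P}_{F}}\omega \rangle =u(\omega )$ and ${{\sum }_{j\ne i}}{{u}_{j}}(\omega )={{u}_{F}}(\omega )-{{u}_{i}}(\omega )$, we get

$$\sum_{i\lt j}u{{({{P}_{ij}}\omega )}^{2}}=(M-2)\sum_{i}{{u}_{i}}{{(\omega )}^{2}}+u{{(\omega )}^{2}}.$$

Substituting this into (10) and rearranging gives (8).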
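The identity established in this proof can be tested numerically in quantum theory, where the third-order component vanishes and the relation reduces to $I(\omega )={{\sum }_{i\lt j}}I({{P}_{ij}}\omega )-(M-2){{\sum }_{i}}I({{P}_{i}}\omega )$. The sketch below is an illustration under stated assumptions: a 4-level system with three slits, where the first slit is the two-dimensional face spanned by levels 0 and 1 (so that ${{P}_{1}}\omega $ can be mixed and the $(M-2)$ term is actually exercised), the inner product ${\rm tr}(\rho \sigma )$, and a hand-picked mixed state. All names are illustrative.

```python
import itertools

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def impurity(rho):
    """I(rho) = (tr rho)^2 - tr(rho^2)."""
    return trace(rho).real ** 2 - trace(matmul(rho, rho)).real

def mask(rho, levels):
    """Projection onto the face supported on `levels`: keep matching entries."""
    n = len(rho)
    return [[rho[a][b] if a in levels and b in levels else 0.0
             for b in range(n)] for a in range(n)]

# Three mutually orthogonal faces: slit 1 has rank 2, slits 2 and 3 have rank 1.
faces = [{0, 1}, {2}, {3}]
M = len(faces)

# Equal mixture of the uniform superposition and the maximally mixed state.
rho = [[0.25 if a == b else 0.125 for b in range(4)] for a in range(4)]

pair_sum = sum(impurity(mask(rho, faces[i] | faces[j]))
               for i, j in itertools.combinations(range(M), 2))
single_sum = sum(impurity(mask(rho, face)) for face in faces)

impurity_rho = impurity(rho)
impurity_slit1 = impurity(mask(rho, faces[0]))
# Squared norm of the third-order component according to the identity.
third_order_norm_sq = pair_sum - (M - 2) * single_sum - impurity_rho
```

For this state the rank-2 filter output is itself mixed, yet the combination of impurities on the right-hand side still balances exactly, as expected when there is no third-order interference.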

We will use this result several times in an inductive argument to establish that all filters are purity preserving.

Proposition 29. Let a system A satisfy postulates 1 and 2. Then it has no third-order interference if and only if all its filters are purity-preserving.

Proof. Suppose that all filters are purity-preserving. Then, if $P_i$, $i\in \{1,\ldots ,M\}$ is any M-slit mask and ω is a pure state in ${\rm im}{{P}_{12...M}}$, we have $I(\omega )=I({{P}_{i}}\omega )=I({{P}_{ij}}\omega )=0$, and so (8) implies that the component ${{\omega }^{(3)}}$ of ω in ${{F}^{(3)}}$ in (6) is zero. Then proposition 26 implies that there is no third-order interference.

To show the converse direction, note first that it follows from [19, proposition 7.28] in the context of postulates 1 and 2 that all filters are of the form $P_F$ for some face F; thus, we only have to show that these orthogonal projections are purity-preserving. Let N be the size of Aʼs largest frame. The proof that all filters are purity-preserving will be inductive on the rank of filters. The base case is rank-1 filters, which holds because a rank-1 filter projects the state onto the span of an extremal ray of ${{A}_{+}}$.

We now prove the induction step, which states that if for some fixed rank $k\leqslant N-1$ all filters are purity-preserving, then all filters of rank $k+1$ are purity preserving. Suppose filters of rank k are purity-preserving and consider any mask consisting of a rank-k filter $P_1$ and $N-k$ rank-1 filters ${{P}_{i}},i\in \{2,\ldots ,N-k+1\}$. Then for any pure state ω each ${{P}_{i}}\omega $ is pure. So with $\parallel {{\omega }^{(3)}}{{\parallel }^{2}}=0$ by the absence of third-order interference, and $I(\omega )=I({{P}_{i}}\omega )=0$, (8) becomes:

$$\sum_{i\lt j}I({{P}_{ij}}\omega )=0.$$

Since impurity is non-negative, each of the ${{P}_{ij}}\omega $ is pure too. So all the ${{P}_{i}}\vee {{P}_{j}}$, and in particular the rank-$(k+1)$ filters ${{P}_{1}}\vee {{P}_{i}}$, $i\in \{2,\ldots ,N-k+1\}$, are purity-preserving. Since every rank-$(k+1)$ filter on A has the form $P\vee Q$ for some rank-k P and some rank-1 Q orthogonal to P, all rank-$(k+1)$ filters on A are purity-preserving, and the induction step is established for $k\leqslant N-1$. Hence all filters of rank up to $N-1$ are purity-preserving.

In the context of assumptions (a) and (b) of theorem 16, postulate 3' is known [19, 45] to be equivalent to another postulate: that the lattice of exposed faces has the covering property. We say that an element F of a lattice covers another element G if G is below F and there is nothing between them. Hence an atom is an element that covers 0. By definition, a lattice has the covering property if for every element F and atom a, either $F\vee a=F$ or $F\vee a$ covers F.

In the context of postulates 1 and 2, the covering property can be formulated as follows: if F is any face of ${{A}_{+}}$, and ω a pure state, then the face G generated by both has rank $|G|\leqslant |F|+1$. Since we have shown that (a) and (b) of theorem 16 follow from postulates 1 and 2, the covering property can replace the absence of third order interference (or postulate 3$^{\prime }$).

6. Standard quantum theory from observability of energy

In standard quantum mechanics, we are used to treating the generator of time evolution as an observable: evolution of any closed quantum system with initial state ${{\rho }_{0}}$ is given by

$$\rho (t)={{{\rm e}}^{-{\rm i}Ht}}{{\rho }_{0}}\,{{{\rm e}}^{{\rm i}Ht}},$$

where $H={{H}^{\dagger }}$ is the systemʼs Hamiltonian. The right-hand side, as a one-parameter group acting on ρ, is generated by the superoperator $X:\rho \mapsto -i[H,\rho ]$, so that $\rho (t)={{{\rm e}}^{tX}}{{\rho }_{0}}$. We are used to associating the observable $E:\rho \mapsto {\rm tr}(H\rho )$ with this generator, and call it the 'expectation value of energy'.

It is an interesting question why such an association is possible—what is the operational relation between E and X? The following properties characterize this relation:

  • If X and $X^{\prime} $ are two different generators, then the corresponding observables satisfy $E^{\prime} \ne E$. That is, the observable determines the generator uniquely.
  • The observable E is a conserved quantity of the time evolution generated by X: $E(\rho (t))=E({{\rho }_{0}})$.
  • If time evolution is not trivial (i.e. $\rho (t)$ not constant), then E is also not a trivial observable: there are at least two states $\rho ,\sigma $ such that $E(\rho )\ne E(\sigma )$.
  • The map $X\mapsto E$ is linear—in particular, larger values of E correspond to 'faster' time evolution.

These properties allow us to define a notion of 'observability of energy' for arbitrary probabilistic theories, which will turn out to be a rather restrictive property.
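The conservation and non-triviality properties can be illustrated for a single qubit. The sketch below is a toy check under stated assumptions: the Hamiltonian is the Pauli matrix ${{\sigma }_{x}}$, for which ${{H}^{2}}={\bf 1}$ and hence ${{{\rm e}}^{-{\rm i}Ht}}=\cos (t)\,{\bf 1}-{\rm i}\sin (t)\,H$ in closed form; the initial state and the time $t=0.8$ are arbitrary choices, and the function names are illustrative.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

H = [[0, 1], [1, 0]]  # Pauli sigma_x, chosen so that H squared is the identity

def evolve(rho0, t):
    """rho(t) = U rho0 U^dagger with U = exp(-iHt) = cos(t) 1 - i sin(t) H."""
    c, s = math.cos(t), math.sin(t)
    U = [[complex(c, 0), complex(0, -s)],
         [complex(0, -s), complex(c, 0)]]
    return matmul(matmul(U, rho0), dagger(U))

def energy(rho):
    """E(rho) = tr(H rho)."""
    return trace(matmul(H, rho)).real

def generator(rho):
    """X(rho) = -i [H, rho]."""
    hr, rh = matmul(H, rho), matmul(rho, H)
    n = len(rho)
    return [[-1j * (hr[a][b] - rh[a][b]) for b in range(n)] for a in range(n)]

rho0 = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]
rho_t = evolve(rho0, 0.8)

energy_initial = energy(rho0)
energy_later = energy(rho_t)
energy_rate = energy(generator(rho0))               # E composed with X
population_change = abs(rho_t[0][0] - rho0[0][0])   # evidence of non-trivial evolution
```

The state changes under the evolution, while the energy expectation stays fixed and its rate of change $E\circ X$ vanishes, matching the conservation property listed above.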

Definition 30. Let A be any state space with a group of reversible transformations ${{\mathcal{G}}_{A}}$. An energy observable assignment is an injective linear map $\phi :{{\mathfrak{g}}_{A}}\to {{A}^{*}}$ such that the observable $\phi (X)$ is conserved under the time evolution generated by X, but not under all time evolutions unless $X=\phi (X)=0$. We say that 'energy is observable' on system A if ${{\mathfrak{g}}_{A}}\ne \{0\}$ and if there exists an energy observable assignment.

Our fourth postulate is thus

Postulate 4. Energy is observable on every system.

Writing the time evolution starting with initial state ${{\omega }_{0}}$ explicitly as

$$\omega (t)={{{\rm e}}^{tX}}{{\omega }_{0}},$$

a conserved quantity $E\in {{A}^{*}}$ is a linear functional with $E(\omega (t))=E({{\omega }_{0}})$. It is easy to check that this is equivalent to $E\;\circ \;X=0$, where '○' is for composition of linear maps. If E were equal to the order unit, i.e. $E={{u}_{A}}$, then $E(\omega (t))={{u}_{A}}(\omega (t))=1$ for all t and all time evolutions, since all elements of ${{\mathcal{G}}_{A}}$ preserve the normalization. Thus, definition 30 implies the conditions

Our notion is related to Alfsen and Shultzʼs notion of a 'dynamical correspondence' [19], except that they require an injection of observables into dynamical generators, rather than vice versa, and in addition to a conservation condition, impose a condition relating reversible transformations to general automorphisms of the cone of states which is formulated in the Jordan-algebraic setting. Our setting is more general, and we impose no such relation between the reversible transformations and cone automorphisms. Connes [23] used a notion of orientation related to dynamical correspondence to characterize the state spaces of von Neumann algebras (one of the infinite-dimensional generalizations of standard quantum systems) among those of JBW-algebras (one infinite-dimensional generalization of finite-dimensional formally real Jordan algebras). Other work making use of similar notions to characterize quantum and classical theory in different settings can be found in [24–27]. References [24, 25] and [26] all derive relations between energy and observables, and thence that the theory must essentially be standard quantum or classical, from considerations involving dynamics on composites, so our work is complementary to theirs in that we avoid assumptions about composite systems.

The identification of dynamical generators with conserved observables that exists in classical and quantum theories is central to many physical phenomena and arguments, providing motivation for our postulate. We mention in particular that standard formulations of the statistical mechanics underlying thermodynamics use a conserved energy observable in the definition of free energy.

Our goal is to show the following:

Theorem 31. Postulates 1, 2, 3, and 4 imply that the state space is an N-level state space of standard complex quantum theory, for some $N\in \mathbb{N}$, and all conjugations $\rho \mapsto U\rho {{U}^{\dagger }}$ with $U\in SU(N)$ are contained in the group of reversible transformations.

Proof. We show that complex quantum N-level state spaces are the only finite-dimensional irreducible formally real Jordan algebra state spaces that have observability of energy. This is enough due to theorem 23.

First, consider the d-dimensional ball state spaces ('spin factors')

$${{\Omega }_{d}}:=\{{{(1,r)}^{T}}\;\;|\;\;r\in {{\mathbb{R}}^{d}},\;\parallel r\parallel \leqslant 1\}.$$

(The qubit appears again in this class of systems, since the d = 3 case is the Bloch ball.) The Lie algebra is non-trivial only for $d\geqslant 2$. Consider the case that the group ${{\mathcal{G}}_{d}}$ of reversible transformations contains the full orthogonal group, such that ${{\mathfrak{g}}_{d}}=so(d)$. If postulate 4 holds, then there must be an injective linear map ϕ from the Lie algebra ${{\mathfrak{g}}_{d}}$ of ${{\mathcal{G}}_{d}}$ to ${{\mathbb{R}}^{d+1}}$. But ${\rm dim}(so(d))=d(d-1)/2$ which is larger than $d+1$ for $d\geqslant 4$, so no such map can exist for $d\geqslant 4$. If d = 2, we have

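$${{\mathfrak{g}}_{2}}=\left\{ \left( \begin{array}{ccc} 0 & 0 & 0 \\ 0 & 0 & a \\ 0 & -a & 0 \\ \end{array} \right)\;\;|\;\;a\in \mathbb{R} \right\},$$

and calling this matrix X, it is easy to see that $\phi (X)\;\circ \;X=0$ implies that $\phi (X)=c\cdot {{u}_{d}}$ for the normalization functional ${{u}_{d}}({{x}_{1}},{{x}_{2}},{{x}_{3}})={{x}_{1}}$. This contradicts the definition of an energy observable assignment.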
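The contrast between d = 2 and d = 3 can be made concrete by computing the space of functionals f with $f\;\circ \;X=0$, that is, the kernel of ${{X}^{T}}$. The sketch below is only an illustration: the explicit generators `X2` and `X3` (including their sign convention) and the helper names are assumptions, chosen so that X acts on vectors ${{(1,r)}^{T}}$ and leaves the normalization coordinate untouched.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def conserved_dimension(x):
    """Dimension of {f : f o X = 0}, i.e. of the kernel of X^T."""
    xt = [list(col) for col in zip(*x)]
    return len(x) - rank(xt)

# Assumed generator of SO(2) acting on (1, r1, r2); first coordinate is normalization.
X2 = [[0, 0, 0],
      [0, 0, -1],
      [0, 1, 0]]

# Assumed generator of rotations about the third axis, acting on (1, r1, r2, r3).
X3 = [[0, 0, 0, 0],
      [0, 0, -1, 0],
      [0, 1, 0, 0],
      [0, 0, 0, 0]]

dim_d2 = conserved_dimension(X2)  # only multiples of the normalization functional
dim_d3 = conserved_dimension(X3)  # normalization plus the component along the axis
print(dim_d2, dim_d3)
```

For d = 2 the conserved functionals form a one-dimensional space, which already contains the normalization functional, so nothing else is available for $\phi (X)$. For d = 3 the space is two-dimensional: the component of r along the rotation axis is conserved as well, which is the familiar qubit energy observable.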

If d is even or d = 7, there are compact connected subgroups of SO(d) that are transitive on the pure states of ${{\Omega }_{d}}$, and thus satisfy postulate 2 (see [49] for the list of groups; they have been classified in [39, 40]). As we show in appendix B, all of these cases except for one can be ruled out by dimension counting, exactly as the cases $d\geqslant 4$ above; the only case where this does not work is d = 4 with transformation group ${{\mathcal{G}}_{4}}=SU(2)$. But there, it can be shown that there are time evolutions which only have the normalization as their conserved observable, contradicting definition 30.

Now let A be the state space of the 3 × 3 octonionic matrices. Due to postulate 2 (in the special case of 1-frames), the group of reversible transformations ${{\mathcal{G}}_{A}}$ acts transitively on the pure state manifold, which is the Cayley plane ${{P}^{2}}(\mathbb{O})$, hence so does its connected component at the identity [50]. According to [22] and [21], the only compact connected Lie group which acts transitively and effectively on it is the exceptional Lie group ${{F}_{4}}$. But ${\rm dim}({{F}_{4}})=52\gt {\rm dim}(A)=27$, so there is no injective linear map from ${{\mathfrak{g}}_{A}}$ to ${{A}^{*}}$.

For $N\geqslant 3$, consider the N-level state space AN of quaternionic quantum mechanics, with any group of reversible transformations ${{\mathcal{G}}_{N}}$ satisfying postulate 2. Then ${\rm dim}({{A}_{N}})=2{{N}^{2}}-N$. The pure states define the quaternionic projective space ${{P}^{N-1}}(\mathbb{H})$, and so ${{\mathcal{G}}_{N}}$ must act transitively on it. According to [21], the only possibility is ${{\mathfrak{g}}_{N}}\supseteq sp(N)$, and ${\rm dim}(sp(N))=N(2N+1)$, which is larger than ${\rm dim}({{A}_{N}})$.

The only remaining cases are the N-level state spaces ${{A}_{N}}$ of real quantum mechanics for $N\geqslant 3$, which are more difficult to rule out because dimension counting does not work. First, it can be shown from the classification results of [22] that postulate 2 implies that the group of reversible transformations contains all maps of the form $\rho \mapsto O\rho {{O}^{T}}$ with $O\in SO(N)$; consequently, every map $X:\rho \mapsto [M,\rho ]$ with $M\in so(N)$ is a valid generator. An energy observable assignment ϕ maps these generators (respectively the matrices M) to observables (that is, symmetric matrices $\bar{M}$) such that $[\phi (X)](\rho )={\rm tr}(\bar{M}\rho )$; the conservation condition $\phi (X)\;\circ \;X=0$ becomes $[M,\bar{M}]=0$. However, as we show in appendix B by considering certain special generators X, all maps of this kind must have $\bar{M}={\bf 1}$ in their range, yielding the normalization functional, which contradicts the definition of an energy observable assignment.
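The dimension counts used in this proof can be tabulated in a few lines. The sketch below assumes the smallest Lie algebras compatible with the text, namely so(N), su(N) and sp(N), and uses the standard real dimensions of the spaces of self-adjoint N × N matrices; the dictionary and function names are illustrative only.

```python
# Real dimensions of the state-space vector spaces (self-adjoint N x N matrices).
STATE_SPACE_DIM = {
    "real": lambda n: n * (n + 1) // 2,
    "complex": lambda n: n * n,
    "quaternionic": lambda n: 2 * n * n - n,
}

# Dimensions of the Lie algebras so(N), su(N) and sp(N), respectively.
LIE_ALGEBRA_DIM = {
    "real": lambda n: n * (n - 1) // 2,
    "complex": lambda n: n * n - 1,
    "quaternionic": lambda n: n * (2 * n + 1),
}

def ruled_out_by_counting(field, n):
    """True if the Lie algebra is too large to inject linearly into A*."""
    return LIE_ALGEBRA_DIM[field](n) > STATE_SPACE_DIM[field](n)

# Three-level octonionic case: dim f4 = 52 versus dim A = 27.
octonionic_ruled_out = 52 > 27

summary = {field: [ruled_out_by_counting(field, n) for n in range(3, 7)]
           for field in STATE_SPACE_DIM}
print(summary, octonionic_ruled_out)
```

Counting alone excludes the quaternionic and octonionic cases but is silent about the real and complex ones, which is why the real case needs the separate commutator argument and the complex case survives.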

In the standard case of complex N-level quantum theory, it remains to show that the group of reversible transformations ${{\mathcal{G}}_{N}}$ contains all unitaries (it might also contain anti-unitaries; due to Wignerʼs theorem [28, 29], these are the only possibilities). Postulate 2 implies transitivity of the connected subgroup of ${{\mathcal{G}}_{N}}$ on the pure states, hence on the projective space ${{P}^{N-1}}(\mathbb{C})$; according to [21], for odd N, the only possibility is the projective action of SU(N); but if N is even, say $N=2n$, there is a second possibility, which is the projective action of Sp(N). But consider two N-frames $|{{e}_{1}}\rangle \langle {{e}_{1}}|,\ldots ,|{{e}_{N}}\rangle \langle {{e}_{N}}|$ and $|{{f}_{1}}\rangle \langle {{f}_{1}}|,\ldots ,|{{f}_{N}}\rangle \langle {{f}_{N}}|$, where ${{e}_{1}},\ldots ,{{e}_{N}}$ are defining vectors of the basis in which $J=\left( \begin{array}{cc} 0 & -{\bf 1} \\ {\bf 1} & 0 \\ \end{array} \right)$, such that a unitary U is in Sp(N) if and only if ${{U}^{T}}JU=J$. Moreover, suppose that ${{f}_{1}}={{e}_{1}}$. If postulate 2 is satisfied, there is $U\in Sp(N)$ such that $U|{{e}_{i}}\rangle \langle {{e}_{i}}|{{U}^{\dagger }}=|{{f}_{i}}\rangle \langle {{f}_{i}}|$ for all i, so $U{{e}_{1}}={{{\rm e}}^{{\rm i}\varphi }}{{e}_{1}}$ for some $\varphi \in \mathbb{R}$. Since $J{{e}_{1}}={{e}_{n+1}}$, it is easy to see that the symplectic constraint on U, together with ${{U}^{\dagger }}{{e}_{n+1}}=\overline{({{U}^{T}}{{e}_{n+1}})}$, implies that $U{{e}_{n+1}}={{{\rm e}}^{-{\rm i}\varphi }}{{e}_{n+1}}$, so $|{{f}_{n+1}}\rangle \langle {{f}_{n+1}}|=|{{e}_{n+1}}\rangle \langle {{e}_{n+1}}|$, which contradicts frame transitivity, i.e. postulate 2.

The fact that energy observability rules out classical systems in this theorem is a consequence of our finite-dimensional setting, for which classical reversible dynamics are a discrete group. The probabilistic representation of phase-space classical mechanics involves an infinite-dimensional space of Liouville distributions, and does, of course, have continuously parametrized reversible dynamics.

7. Discussion and conclusions

We have given four principles that we argue have, to various degrees, the virtues of conceptual clarity, important physical implications, intuitive appeal, and interesting experimental consequences. We have shown that, while they are formulated in the setting of an extremely broad class of probabilistically described systems, together they constrain the abstract structure of such a system to be that of the usual Hilbert space quantum theory over the complex field. Our demonstration was limited to finite dimension, a limitation which we believe to be primarily technical. This reconstruction of quantum theory differs interestingly from several previous ones in avoiding any postulates concerning the structure or even existence of composite systems.

Another desirable feature of our reconstruction is its stepwise structure, in which conceptually and often physically significant properties appear even as a consequence of the first postulate, and additional such properties appear at each step.

Postulates 1 and 2 together further have very strong consequences: they imply that all effects are allowed, that every face of the state space is the image of a filter, i.e., that the state space is projective, and also that it is self-dual. Filters allow one to verify that a state is in a claimed face of the state space without (if the claim is true) disturbing the state. They are likely to be important ingredients of both information-processing and thermodynamical protocols; possibilities which are under investigation. Filters can also be used to equip a system with operations destroying coherence between any set of mutually orthogonal faces. In other words, the existence of filters ensures the possibility of a process of decoherence similar to the one in quantum theory.

Self-duality is another strong property of state spaces that is independent of projectivity. Self-duality introduces a correspondence between atomic measurement outcomes and pure states that is exploited in quantum steering and teleportation, for example. It is also known to be linked, in some special contexts such as polygonal state spaces, to correlations satisfying the Tsirelʼson bound on violations of Bell locality [62].

The lattice of faces given postulates 1 and 2 is orthomodular, as is implied, indeed, by projectivity. This expresses a kind of 'local classicality', which one sees also in the topos-theoretic approach of e.g. [61], and also relates our work to the classic 'quantum logic' approach initiated by Birkhoff and von Neumann [38]. Postulate 2 imposes a high degree of symmetry on this lattice; it would be interesting to investigate lattices with such high symmetry using purely lattice-theoretic methods.

There is a close connection between postulate 2 and certain properties of the circuit model for quantum computation. In this model it is standard to start with an input n-level system in a particular state, as well as a number of other n-level systems which can without loss of generality be taken to be in the $|0\rangle $ state. Then we implement the circuit representing the computation we wish to carry out, and at the end we must measure a specific observable to determine the (probability of the) output of the computation. This last measurement step can be done without loss of generality by first reversibly transforming the (generally entangled) logical n-level system of interest into an individual physical n-level system, and then doing the desired measurement on this system alone. This transfer is possible because quantum theory satisfies postulate 2. Postulates 1 and 2 together can be understood as generalizing this idea by demanding that every state (not just pure ones) of a system can be transferred to any other system (with the same or larger number of distinguishable states) by a suitable reversible interaction, provided both are subsystems of a common larger system.

Our third postulate provides, in the context set by the first two postulates, a perhaps surprising link between the absence of irreducibly three-slit interference, currently under experimental scrutiny, and mathematical notions: the Jordan algebraic structure of quantum theory on the one hand, and the satisfaction of the covering law by its lattice of faces, on the other. In the context of our first two postulates, these are all equivalent. The known equivalence (even in the broader context of projective systems) of the latter two with the requirement that filters preserve purity is further food for thought. An interesting question is whether the equivalence of no higher-order interference with either of these two principles still holds in the broader projective context. Looking to operational consequences, perhaps the failure of purity preservation might give rise to an extra source of noise or irreversibility in information processing or thermodynamical protocols—though this might be circumvented if the protocols are designed so the states being filtered are 'compatible' with the filters.

Most interesting, perhaps, is the possibility that there exist families of systems satisfying our first two postulates but not the third: these would still have an extremely regular structure and likely support interesting information processing, but so far no examples are known. Should they be shown not to exist, we would then know that Jordan systems are singled out by postulates 1 and 2 alone.

The final step, narrowing things down from Jordan systems to complex quantum systems via energy observability, is not so surprising. Similar postulates have been used for this purpose by Connes and by Alfsen and Shultz. We require an injection of dynamical generators into the space of observables, each injected generator conserved by the dynamics it generates, whereas Alfsen and Shultz require the converse and also impose ancillary conditions. In contrast, our condition, though applied only to Jordan algebraic systems, is formulated in greater generality where the ancillary conditions do not make sense. It is likely that in the Jordan-algebraic setting, the ancillary conditions, as well as a bijection, are obtained automatically. Exploration of conditions of this type (either ours, or abstractions of Connes' or Alfsen and Shultzʼs) in a broader context is desirable. Indeed, as we have mentioned, others have explored similar principles, though some of these investigations (e.g. [24, 25]) have made use of composite systems which appear to us to be required to satisfy local tomography. In the context of our postulates 1 and 2, locally tomographic composites and the existence of stand-alone two-level systems would imply that the systems are standard quantum systems; indeed one reason for our interest in energy observability is as an alternative to local tomography.

The fact that energy observability rules out classical systems in this theorem is an artifact of our finite-dimensional setting, for which classical reversible dynamics are a discrete group. Since infinite-dimensional classical systems do have continuous one-parameter groups of reversible transformations, however, it is important to point out that there are numerous alternative assumptions which would allow us to rule out classical systems in the finite-dimensional case without assuming the existence of continuous reversible dynamics. Such alternatives are likely to retain their usefulness in infinite dimensions. For example, we could postulate the existence of a tradeoff between information gained in a measurement and disturbance to the measured state [70], or the existence of at least one state that has two distinct convex decompositions into pure states, or the existence of interference; the existence of nonclonable or nonbroadcastable sets of states [36, 37] might also work.

Although we are not aware of work using the set of postulates we use, several authors have used one or more related principles. In Wilceʼs characterization in [68], a symmetry principle reminiscent of our postulate 2 (but concerning test spaces rather than state spaces) was used, along with reversible transitivity on pure states (a special case of postulate 2). In his most recent reconstruction, Hardy [69] uses a postulate ('filters are non-flattening') which relies on a definition of filters that is equivalent to ours (at least in the context of our postulates 1 and 2), and which implies postulate 3' (that filters are purity-preserving). Niestegge has also used the absence of higher-order interference as one ingredient in deriving Jordan algebraic systems [12]. In [17] it was established that finite-dimensional Jordan systems do not have higher-order interference, a result also found by Niestegge in [12].

Dakić and Brukner [43] have used postulate 1 and the fact that all pure states are connected by reversible transformations to derive the ball shape of two-level state spaces (a fact that carries over to all two-level systems satisfying our postulates 1 and 2). In their reconstruction of quantum theory, Chiribella et al [42] have proven several lemmas that are close to some of ours (such as statements on positive projections, or a sum representation of projective units), but obtained them from different assumptions. We have already mentioned other work postulating connections between observables and dynamical generators. More work understanding the connections between the various approaches would likely be fruitful.

Besides providing an understanding of the Hilbert space structure of quantum theory from first principles, our reconstruction suggests a variety of open questions, such as the existence of systems with strong symmetry and classical decomposability, but also with higher-order interference. Furthermore, we think that the naturalness of our postulates allows us to make closer contact with other aspects of physics, a direction we consider important to pursue.

This is evident from the postulates themselves—postulate 3 considers a property that is under direct experimental investigation, and so solving the aforementioned open problem might provide concrete consistent models that can be tested against quantum theory in experiments. Postulate 4 relates the probabilistic structure to the existence of a notion of energy of the form physicists are used to. Furthermore, consequences of the postulates—such as projectivity—seem crucial for thermodynamic reasoning. In fact, weaker versions of postulates 1 and 2, in conjunction with local tomography, are enough to make sense of the general-probabilistic thermodynamics results in [73, 74].

In this sense, our result is part of a broader research program: analyze the structure of physics—that is, the way that the different parts of physics fit together—by rigorously assessing the consequences of changing some of its parts. One part of physics is quantum theory, and seeing how a more general probabilistic theory could still harmonize with thermodynamics or Hamiltonian mechanics is one of many ways to gain insights into the way our world works. Given the current quest for a theory that unifies quantum and gravitational physics, in a situation where conclusive experimental results are mostly absent, it seems particularly promising to rigorously analyze the logical and conceptual structure of what is known, hoping thereby to glimpse a path towards the unknown.

Acknowledgments

Some of this work was done while the authors were employed by or visiting the Perimeter Institute for Theoretical Physics, Waterloo, Ontario, Canada. Research at Perimeter Institute is supported in part by the Government of Canada through NSERC and by the Province of Ontario through MRI. Also, part of the work was done while HB was a Fellow of the Stellenbosch Institute for Advanced Studies at the Wallenberg Research Center at Stellenbosch University, in 2012. Furthermore, we would like to thank two anonymous referees for the thorough review of the manuscript and for helpful suggestions, and one in particular for pointing out a mistake in an earlier version of proposition 1 and for suggesting how to fix it.

Appendix A. Perfection and positive projections

In this section, we give a proof of the following proposition which is originally due to Iochum [30, 31].

Proposition 32. Let ${{A}_{+}}$ be a regular self-dual cone in A. ${{A}_{+}}$ is perfect if and only if each orthogonal (with respect to the self-dualizing inner product) projection PF onto the linear span F of a face ${{F}_{+}}$, is positive.

Proof. We write $F_{+}^{*}$ for the dual of ${{F}_{+}}$ in F, according to the restriction of the self-dualizing inner product for ${{A}_{+}}$; thus perfection means that $F_{+}^{*}={{F}_{+}}$ for every face.

We begin with 'only if'. Let P be the orthogonal projector onto F, $x\in {{A}_{+}}$, $y\in {{F}_{+}}$. Now $\langle y,Px\rangle =\langle {{P}^{*}}y,x\rangle $; since P is Hermitian this equals $\langle Py,x\rangle =\langle y,x\rangle $. The latter is non-negative because both y and x are in ${{A}_{+}}$, which is self-dual. So we have shown $\forall y\in {{F}_{+}}\;\langle y,Px\rangle \geqslant 0$, i.e. $Px\in F_{+}^{*}$. But by perfection $F_{+}^{*}={{F}_{+}}$. Thus $Px\in {{F}_{+}}$ for any $x\in {{A}_{+}}$, i.e. P is positive.

For 'if', we begin by observing that given positivity of P, $P{{A}_{+}}={{F}_{+}}$. This is because $Px=x$ for any $x\in F$, so $P{{F}_{+}}={{F}_{+}}$, whence $P{{A}_{+}}\supseteq {{F}_{+}}$; on the other hand $P{{A}_{+}}\subseteq {{F}_{+}}$ by positivity.

Note that ${{F}_{+}}\subseteq F_{+}^{*}$ as a consequence of self-duality of ${{A}_{+}}$: since everything in ${{F}_{+}}$ is in ${{A}_{+}}$, it must have non-negative inner product with everything in ${{A}_{+}}$, hence with everything in ${{F}_{+}}$, and since it is in addition in F, it is in $F_{+}^{*}$. Recall that $y\in F_{+}^{*}$ is defined as $y\in F$ and satisfying $\forall x\in {{F}_{+}}\;\langle y,x\rangle \geqslant 0$. Since $P{{A}_{+}}={{F}_{+}}$, the latter part of this condition is equivalent to $\forall z\in {{A}_{+}}\;\langle y,Pz\rangle \geqslant 0$. Again moving the projector to act on y, using its Hermiticity and that $y\in F$ so Py = y, this is equivalent to $\forall z\in {{A}_{+}}\;\langle y,z\rangle \geqslant 0$, i.e. $y\in A_{+}^{*}$. Since $A_{+}^{*}={{A}_{+}}$ and y was also assumed in F, $y\in {{F}_{+}}$, establishing that $F_{+}^{*}\subseteq {{F}_{+}}$. We have now shown $F_{+}^{*}={{F}_{+}}$, i.e. perfection.

Appendix B. Calculations for observability of energy

The goal of this section is to show the following:

Lemma 33. The possible state spaces satisfying postulates 1, 2 and 3 which have a non-trivial connected component ${{\mathcal{G}}_{0}}$ of their reversible transformation groups are the following:

  • The d-dimensional ball state spaces ${{\Omega }_{d}}:=\{{{(1,r)}^{T}}\;\;|\;\ r\in {{\mathbb{R}}^{d}},\;\parallel r\parallel \leqslant 1\}$ with $d\geqslant 2$, and either ${{\mathcal{G}}_{0}}=SO(d)$, or ${{\mathcal{G}}_{0}}=SU(d/2)$ if $d=4,6,8,\ldots $, or ${{\mathcal{G}}_{0}}=U(d/2)$ if $d=2,4,6,8,\ldots $, or ${{\mathcal{G}}_{0}}=Sp(d/4)$ if $d=8,12,16,\ldots $, or ${{\mathcal{G}}_{0}}=Sp(d/4)\times U(1)$ if $d=8,12,16,\ldots $, or ${{\mathcal{G}}_{0}}=Sp(d/4)\times SU(2)$ if $d=4,8,12,\ldots $, or ${{\mathcal{G}}_{0}}={{G}_{2}}$ if d = 7, or ${{\mathcal{G}}_{0}}=Spin(7)$ if d = 8, or ${{\mathcal{G}}_{0}}=Spin(9)$ if d = 16,
  • N-level real quantum theory with $N\geqslant 2$ and ${{\mathcal{G}}_{0}}=\{\rho \mapsto O\rho {{O}^{T}}\;\;|\;\;O\in SO(N)\}$,
  • N-level complex quantum theory with $N\geqslant 2$ and ${{\mathcal{G}}_{0}}=\{\rho \mapsto U\rho {{U}^{\dagger }}\;\;|\;\;U\in SU(N)\}$,
  • N-level quaternionic quantum theory with $N\geqslant 2$ and ${{\mathcal{G}}_{0}}\simeq Sp(N)/\{-{\bf 1},+{\bf 1}\}$ (see [21, 28]),
  • three-level octonionic quantum theory with ${{\mathcal{G}}_{0}}\simeq {{F}_{4}}$.

However, among those, only the complex quantum theory state spaces (including ${{\Omega }_{3}}$, the qubit) satisfy postulate 4, that is, observability of energy.

In complex quantum theory, the group of reversible transformations $\mathcal{G}$ can actually be larger: it may also contain the antiunitary transformations according to Wignerʼs theorem (but not more). Similarly, real quantum theory may also contain the conjugations with $O\in O(N)$ (which yields additional transformations for even N), but for quaternionic quantum theory with $N\geqslant 3$, we have $\mathcal{G}={{\mathcal{G}}_{0}}$ [60]. As pointed out in [28, 60], the case N = 2 is exceptional in the quaternionic case. Since the state space is in this case a five-dimensional unit ball, $\mathcal{G}$ may contain reflections in addition to the rotations $SO(5)\simeq Sp(2)/\{-{\bf 1},+{\bf 1}\}$. We do not know whether octonionic 3 × 3 quantum theory may contain additional elements in its transformation group, and we do not know the complete classification of possible compact transformation groups $\mathcal{G}\supset {{\mathcal{G}}_{0}}$ for the ball state spaces (except that obviously $\mathcal{G}\subset O(d)$).

Lemma 33 will be proven step by step. We start by showing that the only ball state space with transitive group of reversible transformations that has observability of energy is the qubit.

Lemma 34. For $d\geqslant 2$, consider the d-dimensional ball state space ${{\Omega }_{d}}:=\{{{(1,r)}^{T}}\;\;|\;\ r\in {{\mathbb{R}}^{d}},\;\parallel r\parallel \leqslant 1\}$, and let ${{\mathcal{G}}_{d}}$ be any compact group of reversible transformations that acts transitively on the pure states. Then energy is observable (in the sense of definition 30) if and only if d = 3.

Proof. If ${{\mathcal{G}}_{d}}$ acts transitively on the pure states, then so does its connected component at the identity [50]. According to [49], the list of groups is the following. Since the group action is locally effective [21], the dimensions of ${{\mathfrak{g}}_{d}}$ are just the dimensions of the corresponding groups.

  • For all $d\geqslant 2$: SO(d). We have shown in the main text that an energy observable assignment only exists if d = 3.
  • For $d=4,6,8,\ldots :$ $SU(d/2)$. We have ${\rm dim}(su(d/2))={{(d/2)}^{2}}-1$, and this is larger than $d+1$ if $d\geqslant 6$. Thus, no injective map $\phi :su(d/2)\to {{\mathbb{R}}^{d+1}}$ defining an energy observable assignment can exist. However, we have to treat d = 4 separately. In this case, the transformation group is (up to similarity)
    such that the Lie algebra is at least
    Let $X\in {{\mathfrak{g}}_{4}}$ be a generator corresponding to the choice of parameters a = 1 and $b=c=0$. If ϕ is any energy observable assignment, we can write the functional $\phi (X)$ as a vector $\varphi \in {{\mathbb{R}}^{5}}$ such that $[\phi (X)](y)=\langle \varphi ,y\rangle $ for all $y\in {{\mathbb{R}}^{5}}$, and the condition $\phi (X)\;\circ \;X=0$ translates into ${{X}^{T}}\varphi =0$. The kernel of ${{X}^{T}}$ is one-dimensional, with unique solution (up to some factor) of $\varphi =\lambda \cdot {{(1,0,0,0,0)}^{T}}$, $\lambda \in \mathbb{R}$. But this represents the normalization functional: $\langle \varphi ,y\rangle =\lambda \cdot {{u}_{4}}(y)$ for all y, so $\phi (X)=\lambda \cdot {{u}_{4}}$, contradicting the definition of an energy observable assignment.
  • For $d=2,4,6,8,\ldots $: $U(d/2)$. The case d = 2 is already covered in the main text; in all other cases, this representation contains the corresponding representation of $SU(d/2)$ as a subgroup, and this has already been treated.
  • For $d=8,12,16,\ldots $: $Sp(d/4)$. Dimension counting rules out these cases: we have ${\rm dim}(sp(d/4))=(d/4)(2\cdot d/4+1)$, and this is larger than $d+1$ for the relevant dimensions.
  • For $d=8,12,16,\ldots $: $Sp(d/4)\times U(1)$. This representation contains the representation of $Sp(d/4)$ as a subgroup; thus, it is ruled out by the previous case.
  • For $d=4,8,12,\ldots $: $Sp(d/4)\times SU(2)$. If $d\geqslant 8$, this too contains $Sp(d/4)$ as a subgroup. If d = 4 then the dimension of the group $Sp(1)\times SU(2)$ is $3+3=6$, which is larger than $d+1=5$.
  • For d = 7: the exceptional Lie group G2. Dimension counting again: ${\rm dim}{{g}_{2}}=14\gt 7+1$.
  • For d = 8: Spin(7). ${\rm dim}\,{\rm spin}(7)=7(7-1)/2=21\gt 8+1$.
  • For d = 16: Spin(9). ${\rm dim}\,{\rm spin}(9)=9(9-1)/2=36\gt 16+1$.

This proves the claim.
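The dimension counting in this proof can be reproduced mechanically. The following sketch enumerates the groups listed above for each ball dimension d and reports those whose Lie algebra is small enough to fit into ${{\mathbb{R}}^{d+1}}$, that is, the cases where counting alone is inconclusive. The function names are illustrative, and the group dimensions are the standard ones.

```python
def dim_so(n): return n * (n - 1) // 2
def dim_su(n): return n * n - 1
def dim_u(n): return n * n
def dim_sp(n): return n * (2 * n + 1)

def candidate_groups(d):
    """(name, dimension) of the transitive groups listed in the proof, for ball dimension d."""
    groups = [(f"SO({d})", dim_so(d))]
    if d % 2 == 0:
        if d >= 4:
            groups.append((f"SU({d // 2})", dim_su(d // 2)))
        groups.append((f"U({d // 2})", dim_u(d // 2)))
    if d % 4 == 0:
        if d >= 8:
            groups.append((f"Sp({d // 4})", dim_sp(d // 4)))
            groups.append((f"Sp({d // 4})xU(1)", dim_sp(d // 4) + 1))
        groups.append((f"Sp({d // 4})xSU(2)", dim_sp(d // 4) + 3))
    exceptional = {7: ("G2", 14), 8: ("Spin(7)", 21), 16: ("Spin(9)", 36)}
    if d in exceptional:
        groups.append(exceptional[d])
    return groups

def survivors(d):
    """Groups for which an injective linear map into R^(d+1) is not excluded by counting."""
    return [name for name, dim in candidate_groups(d) if dim <= d + 1]

not_excluded = {d: survivors(d) for d in range(2, 17) if survivors(d)}
print(not_excluded)
```

Within this range, only d = 2, d = 3 and the SU(2) and U(2) actions at d = 4 survive counting. These are the cases that the proof handles through the conservation condition $\phi (X)\;\circ \;X=0$ rather than through dimensions, with d = 3 being the qubit.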

As mentioned in the main text, it is more difficult to rule out N-level real quantum mechanics for $N\geqslant 3$. This needs a sequence of lemmas.

Lemma 35. Let $J=\left( \begin{array}{cc} 0 & -1 \\ 1 & 0 \\ \end{array} \right)\in {{\mathbb{R}}^{2\,\times \,2}}$, and let $S\in {{\mathbb{R}}^{2\,\times \,2}}$ such that $JS=\alpha SJ$ for some $\alpha \in \mathbb{R}$. Then $\alpha \in \{-1,+1\}$ or S = 0. Furthermore, if $S={{S}^{T}}$ and $\alpha =1$, then $S=c\cdot {\bf 1}$ for some $c\in \mathbb{R}$.

We omit the proof; it is a simple exercise in linear algebra.

Lemma 36. Consider any antisymmetric matrix of the form

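(all other entries zero). Let $S={{S}^{T}}\in {{\mathbb{R}}^{(2k)\times (2k)}}$ be any symmetric matrix that commutes with Y, i.e. $[Y,S]=0$. Then S is a diagonal matrix of the form

$$S={\rm diag}({{s}_{1}},{{s}_{1}},{{s}_{2}},{{s}_{2}},\ldots ,{{s}_{k}},{{s}_{k}}),\qquad {{s}_{i}}\in \mathbb{R}.$$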

Proof. Define the 2 × 2 block matrices ${{\Lambda }_{i}}\;:=\;\left( \begin{array}{ccccccccccccccc} 0 & {{\lambda }_{i}} \\ -{{\lambda }_{i}} & 0 \\ \end{array} \right)$, and divide S into 2 × 2 block matrices ${{S}_{i,j}}$:

Then the commutator is the symmetric matrix

If this is the zero matrix, then $0=[{{\Lambda }_{i}},{{S}_{i,i}}]=-{{\lambda }_{i}}[J,{{S}_{i,i}}]$ for all i. It follows from lemma 35 that there exists ${{s}_{i}}\in \mathbb{R}$ such that ${{S}_{i,i}}={{s}_{i}}\cdot {\bf 1}$. Similarly, for all $i\ne j$, we have ${{\Lambda }_{i}}{{S}_{i,j}}=-{{\lambda }_{i}}J{{S}_{i,j}}={{S}_{i,j}}{{\Lambda }_{j}}=-{{S}_{i,j}}J{{\lambda }_{j}}$, hence $J{{S}_{i,j}}=\alpha {{S}_{i,j}}J$ with $\alpha =({{\lambda }_{j}}/{{\lambda }_{i}})\notin \{-1,+1\}$. Thus, lemma 35 yields that ${{S}_{i,j}}=0$.

We show that an analogue of this remains true in odd dimensions:

Lemma 37. Consider any antisymmetric matrix of the form

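(all other entries zero). Let $S={{S}^{T}}\in {{\mathbb{R}}^{(2k+1)\times (2k+1)}}$ be any symmetric matrix that commutes with Y, i.e. $[Y,S]=0$. Then S is a diagonal matrix of the form

$$S={\rm diag}({{s}_{1}},{{s}_{1}},{{s}_{2}},{{s}_{2}},\ldots ,{{s}_{k}},{{s}_{k}},{{s}_{k+1}}),\qquad {{s}_{i}}\in \mathbb{R}.$$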

Proof. Divide Y and S into block matrices:

Then

If this is the zero matrix, then $[\bar{Y},{{S}_{1,1}}]=0$, and the diagonal form of ${{S}_{1,1}}$ with all entries repeated twice follows from lemma 36. Furthermore, ${{S}_{1,2}}\in {\rm ker} (\bar{Y})=\{0\}$. Finally, set ${{s}_{k+1}}:={{S}_{2,2}}$.
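Lemmas 36 and 37 can be checked on small examples by computing the dimension of the space of symmetric matrices commuting with Y. The sketch below assumes that Y is block diagonal with 2 × 2 blocks ${{\Lambda }_{i}}$ as in the proof of lemma 36 (plus one zero row and column in odd dimension), with non-zero ${{\lambda }_{i}}$ of pairwise different absolute value, as suggested by the condition ${{\lambda }_{j}}/{{\lambda }_{i}}\notin \{-1,+1\}$ used there. The concrete values and all function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def rank(matrix):
    """Rank via exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def commutator(a, b):
    """[A, B] = AB - BA for square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] - b[i][k] * a[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def block_antisymmetric(lambdas, odd):
    """Assumed form of Y: blocks [[0, l], [-l, 0]], plus a zero row/column if odd."""
    n = 2 * len(lambdas) + (1 if odd else 0)
    y = [[0] * n for _ in range(n)]
    for i, lam in enumerate(lambdas):
        y[2 * i][2 * i + 1] = lam
        y[2 * i + 1][2 * i] = -lam
    return y

def commuting_symmetric_dimension(y):
    """Dimension of {S = S^T : [Y, S] = 0}, as the nullity of S -> [Y, S]."""
    n = len(y)
    images = []
    for i, j in combinations_with_replacement(range(n), 2):
        e = [[0] * n for _ in range(n)]
        e[i][j] = e[j][i] = 1          # basis element of the symmetric matrices
        images.append([x for row in commutator(y, e) for x in row])
    return len(images) - rank(images)

even_case = commuting_symmetric_dimension(block_antisymmetric([1, 2], odd=False))
odd_case = commuting_symmetric_dimension(block_antisymmetric([1, 2], odd=True))
degenerate = commuting_symmetric_dimension(block_antisymmetric([1, 1], odd=False))
print(even_case, odd_case, degenerate)
```

With k = 2 distinct values the commuting symmetric matrices form a space of dimension k in the even case and k + 1 in the odd case, matching the diagonal forms with doubly repeated entries. When two of the $|{{\lambda }_{i}}|$ coincide the space becomes larger, which is why the condition on the ratios is needed.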

Before applying this, we need to show that real quantum mechanics is necessarily equipped with all reversible transformations (conjugations with orthogonal matrices) to comply with postulate 2:

Lemma 38. For $N\geqslant 3$, let ${{\Omega }_{N}}$ be the state space of N-level real quantum mechanics, and ${{\mathcal{G}}_{N}}$ be a group of reversible transformations on it such that postulate 2 is satisfied. Then ${{\mathcal{G}}_{N}}=\left\{ \rho \mapsto Q\rho {{Q}^{T}}\;\;|\;\;Q\in \mathcal{G} \right\}$, where either $\mathcal{G}=SO(N)$ or $\mathcal{G}=O(N)$. In particular, ${{\mathfrak{g}}_{N}}=\left\{ \rho \mapsto [M,\rho ]\;\;|\;\;M\in so(N) \right\}$.

Proof. Every $G\in {{\mathcal{G}}_{N}}$ is an automorphism of the cone of positive semidefinite symmetric real matrices, and thus of the form $\rho \mapsto Q\rho {{Q}^{T}}$ [53]; preservation of the trace implies that ${{Q}^{T}}Q={\bf 1}$, i.e. that Q is orthogonal. Define $\mathcal{G}$ as the set of all orthogonal Q such that the map $\rho \mapsto Q\rho {{Q}^{T}}$ is contained in ${{\mathcal{G}}_{N}}$. Clearly $\mathcal{G}$ is a subgroup of O(N); since ${{\mathcal{G}}_{N}}$ is topologically closed, so is $\mathcal{G}$.

Now we show that $\mathcal{G}$ contains all of SO(N). Let $a,b\in \mathbb{R}$ be irrational numbers such that their difference $a-b$ is also irrational. Define the unit vectors ${{e}_{i}}:=\;{{(0,\ldots ,{\underbrace{1}_{i}},0,\ldots ,0)}^{T}}$, and

Then the sets of vectors $\{{{v}_{1}},\ldots ,{{v}_{N}}\}$ and $\{{{w}_{1}},\ldots ,{{w}_{N}}\}$ are both orthonormal bases of ${{\mathbb{R}}^{N}}$, and so the sets of pure states $\{|{{v}_{1}}\rangle \langle {{v}_{1}}|,\ldots ,|{{v}_{N}}\rangle \langle {{v}_{N}}|\}$ and $\{|{{w}_{1}}\rangle \langle {{w}_{1}}|,\ldots ,|{{w}_{N}}\rangle \langle {{w}_{N}}|\}$ are both N-frames in N-level real quantum mechanics, and so is $\{|{{e}_{1}}\rangle \langle {{e}_{1}}|,\ldots ,|{{e}_{N}}\rangle \langle {{e}_{N}}|\}$. Thus, according to postulate 2, there are two orthogonal matrices $V,W\in \mathcal{G}$ such that

It follows that there are signs ${{\sigma }_{1}},\ldots ,{{\sigma }_{N}},{{\tau }_{1}},\ldots ,{{\tau }_{N}}\in \{-1,+1\}$ such that $V|{{e}_{i}}\rangle ={{\sigma }_{i}}|{{v}_{i}}\rangle $ and $W|{{e}_{i}}\rangle ={{\tau }_{i}}|{{w}_{i}}\rangle $. Hence

Now we consider two different cases. As the first case, suppose that ${{\sigma }_{1}}={{\sigma }_{2}}$ or ${{\tau }_{1}}={{\tau }_{2}}$. Then

As the second case, suppose that ${{\sigma }_{1}}\ne {{\sigma }_{2}}$ and ${{\tau }_{1}}\ne {{\tau }_{2}}$. Then $\sigma :={{\sigma }_{1}}=-{{\sigma }_{2}}$ and $\tau :={{\tau }_{1}}=-{{\tau }_{2}}$, and

In both cases, we have established the existence of a matrix in $\mathcal{G}$ that acts as $\left( \begin{array}{cc} {\rm cos} \theta & {\rm sin} \theta \\ -{\rm sin} \theta & {\rm cos} \theta \\ \end{array} \right)$ in the ${{e}_{1}}-{{e}_{2}}$-subspace, where θ is an irrational multiple of π. Since $\theta /\pi $ is irrational, the powers of this matrix are dense in $SO(2)$; since $\mathcal{G}$ is topologically closed, it therefore contains all of $SO(2)$ acting on this subspace. We can argue similarly for all other ${{e}_{i}}-{{e}_{j}}$-subspaces. The corresponding $SO(2)$ rotations in all these planes generate all special orthogonal matrices, hence $SO(N)\subseteq \mathcal{G}$. Since $SO(N)$ is a subgroup of index 2 in $O(N)$, it follows that either $\mathcal{G}=SO(N)$ or $\mathcal{G}=O(N)$.
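The density argument can be illustrated numerically. The following sketch uses only the Python standard library; the function name `approximate_rotation` and the chosen values of `theta`, `target`, and `eps` are assumptions made for illustration. It searches for a power n such that rotating n times by θ comes within a tolerance of a given target angle, using the fact that the n-th power of a rotation by θ is a rotation by nθ. Floating-point numbers are rational, so this only mimics the irrational case up to machine precision.

```python
import math

def approximate_rotation(theta, target, eps, max_power=1_000_000):
    """Smallest n <= max_power with n*theta within eps of target (mod 2*pi), else None."""
    two_pi = 2.0 * math.pi
    for n in range(1, max_power + 1):
        diff = (n * theta - target) % two_pi
        # Distance on the circle between n*theta and target.
        dist = min(diff, two_pi - diff)
        if dist < eps:
            return n
    return None

# Example (assumption): theta = sqrt(2) * pi is an irrational multiple of pi.
theta = math.sqrt(2.0) * math.pi
n = approximate_rotation(theta, target=1.0, eps=1e-3)
print(n)
```

Reducing `eps` increases the required power n but never rules out an approximation, which is the content of the density statement. Closure of $\mathcal{G}$ then supplies the limit points.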

Theorem 39. Energy is not observable on any N-level real quantum mechanics state space.

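Proof. The case N = 1 is trivial; N = 2 is shown in the main text, so let $N\geqslant 3$. First, consider the case that N is even. Let $H\subset so(N)$ be the subspace of block-diagonal matrices $\Lambda $ of the form given in lemma 36, with $2\times 2$ blocks ${{\Lambda }_{i}}=\left( \begin{array}{cc} 0 & {{\lambda }_{i}} \\ -{{\lambda }_{i}} & 0 \\ \end{array} \right)$ and arbitrary ${{\lambda }_{1}},\ldots ,{{\lambda }_{N/2}}\in \mathbb{R}$,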

and $h\subset {{\mathfrak{g}}_{N}}$ be the corresponding subspace of maps of the form $\rho \mapsto [\Lambda ,\rho ]$ with $\Lambda \in H$. Moreover, let $H^{\prime} $ be the set of all $\Lambda \in H$ where the corresponding ${{\lambda }_{i}}$ satisfy ${{\lambda }_{i}}\ne 0$ and ${{\lambda }_{i}}\ne \pm {{\lambda }_{j}}$ for $i\ne j$. Then $H^{\prime} $ is dense in H. Similarly, by $h^{\prime} $, denote the set of maps $\rho \mapsto [M,\rho ]$ with $M\in H^{\prime} $; then $h^{\prime} $ is dense in h.

Consider any energy observable assignment ϕ. Any matrix $M\in so(N)$ defines a generator $X\in {{\mathfrak{g}}_{N}}$ by $X(\rho )\;:=\;[M,\rho ]$ and vice versa. This generator is mapped by ϕ to some map $\rho \mapsto {\rm tr}(\bar{M}\rho )$, where $\bar{M}={{\bar{M}}^{T}}$. Denote the map $M\mapsto \bar{M}$ by $\bar{\phi }$, such that $[\phi (X)](\rho )={\rm tr}(\bar{\phi }(M)\rho )$ whenever $X(\rho )=[M,\rho ]$.

Then we have the equivalences

and so $\bar{\phi }(M)$ is a symmetric matrix that must commute with M. Suppose that $M\in H^{\prime} $; then lemma 36 shows that $\bar{\phi }(M)={\rm diag}({{s}_{1}},{{s}_{1}},{{s}_{2}},{{s}_{2}},\ldots ,{{s}_{N/2}},{{s}_{N/2}})$. Denote by $\mathcal{H}$ the linear space of all diagonal $(N\times N)$-matrices of that form. We have shown that $\bar{\phi }(H^{\prime} )\subset \mathcal{H}$. Since $H^{\prime} $ is dense in H and $\bar{\phi }$ is linear, this implies that $\bar{\phi }(H)\subset \mathcal{H}$. Since ${\rm dim}\;H={\rm dim}\;\mathcal{H}=N/2$, and since $\bar{\phi }$ is injective, this implies that $\bar{\phi }(H)=\mathcal{H}$. In particular, there is $0\ne M\in H$ such that $\bar{\phi }(M)={\bf 1}$, so the corresponding generator $X\in h$, given by $X(\rho )=[M,\rho ]$, is not the zero map, and yet $[\phi (X)](\rho )={\rm tr}(\rho )={{u}_{N}}(\rho )$, contradicting the definition of an energy observable assignment.
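The role of the commutation condition can be made concrete with a small numerical sketch. It assumes NumPy, and it assumes that conservation of the observable $\rho \mapsto {\rm tr}(S\rho )$ under the dynamics generated by $\rho \mapsto [M,\rho ]$ means ${\rm tr}(S[M,\rho ])=0$ for all states ρ; the function name `rate` and all numerical values are chosen for illustration. By cyclicity of the trace, ${\rm tr}(S[M,\rho ])={\rm tr}([S,M]\rho )$, so the rate vanishes for every ρ when S commutes with M.

```python
import numpy as np

N = 4

# Antisymmetric generator in block form with example values lambda = (1.0, 2.5).
M = np.zeros((N, N))
M[0, 1], M[1, 0] = 1.0, -1.0
M[2, 3], M[3, 2] = 2.5, -2.5

# A real pure state |v><v| with v = (e_1 + e_2) / sqrt(2).
v = np.array([1.0, 1.0, 0.0, 0.0]) / np.sqrt(2.0)
rho = np.outer(v, v)

def rate(S, M, rho):
    """tr(S [M, rho]): rate of change of tr(S rho) under rho -> [M, rho]."""
    return np.trace(S @ (M @ rho - rho @ M))

S_commuting = np.diag([0.3, 0.3, -1.2, -1.2])  # paired entries: commutes with M
S_generic = np.diag([0.3, 0.9, -1.2, -1.2])    # unpaired entries: does not commute
S_identity = np.eye(N)                          # the unit effect rho -> tr(rho)

print(rate(S_commuting, M, rho))  # vanishes
print(rate(S_generic, M, rho))    # does not vanish for this rho
print(rate(S_identity, M, rho))   # vanishes for every generator M
```

The last line illustrates why the identity matrix is the problematic value: the map $\rho \mapsto {\rm tr}(\rho )$ is conserved by every generator, so it carries no information about which M generated the dynamics.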

Now consider the case that N is odd, say, $N=2k+1$. Define the subspace H of antisymmetric matrices by

An argument similar to the even case, now using lemma 37, shows that $\bar{\phi }(M)$ is a diagonal matrix for every $M\in H$; the same conclusion holds if the subspace H is defined by placing the zero in the top-left corner instead of the bottom-right. But then, by linearity, the matrix

also has the property that $\bar{\phi }(M)$ is a diagonal matrix. If all ${{\lambda }_{i}}\ne 0$, then the only diagonal matrices S that commute with M are of the form $S={\rm diag}({{s}_{1}},{{s}_{1}},{{s}_{2}},{{s}_{2}},\ldots ,{{s}_{k-1}},{{s}_{k-1}},{{s}_{k}},{{s}_{k}},{{s}_{k}})$. Again, arguing analogously to the even case, the subspace of all matrices M of the given form (dropping the condition ${{\lambda }_{i}}\ne 0$) is mapped by $\bar{\phi }$ injectively into the subspace of all diagonal matrices S of that form. Since both are of dimension k, there is $M\ne 0$ such that $\bar{\phi }(M)={\bf 1}$, violating the definition of an energy observable assignment.

Footnotes

  • It is possible to imagine physical situations where there are further restrictions on which effects can occur together in an actual measurement; to model these situations, one would have to use an even more general mathematical framework. We are not considering such theories here.

  • It is equivalent to demand that ${{e}_{1}}+\ldots +{{e}_{n}}={{u}_{A}}$, because we can always redefine $e_{1}^{\prime }:={{e}_{1}},\ldots ,e_{n-1}^{\prime }:={{e}_{n-1}},e_{n}^{\prime }:={{u}_{A}}-\sum _{i=1}^{n-1}{{e}_{i}}$.

  • In finite dimensions, formal reality coincides with the notion of euclideanity, used in [11] and [19].

  • This condition is equivalent to base norm contractiveness, which is what Alfsen and Shultz use in their definition. In the appendix to [19], item A24, they define, for $\omega ,\sigma \in {{V}_{+}}$, V a base norm space, $\omega \;\bot \;\sigma $ by $||\omega -\sigma ||=||\omega ||+||\sigma ||$. A26 states that each $\rho \in V$ can be decomposed as a difference of two orthogonal positive components, i.e. there are $\omega ,\sigma \in {{V}_{+}}$ such that $\omega \;\bot \;\sigma $ and $\rho =\omega -\sigma $. From this we can see that base norm contractiveness ($||T\rho ||\leqslant ||\rho ||$) of a map T on ${{V}_{+}}$ implies contractiveness everywhere. Since $||\omega ||={{u}_{A}}(\omega )$ for all $\omega \in {{V}_{+}}$, we have equivalence of base norm contractiveness and normalization of filters.
