1 Introduction

With the pioneering work of Wigner [1] and of Bargmann and Wigner [2], the important role played by the two Casimir operators of the Poincaré and Lorentz groups became obvious, namely the four-momentum squared (equal to the mass squared for a massive particle) and the square of the Pauli-Lubański operator. They were identified on the basis of relativistic kinematics as the key to a classification of all possible states of particles with any spin. Yet, in the earlier derivation of his celebrated equation for a fermion of spin one half, Dirac [3] did not make use of these general insights but introduced what came to be known as the Dirac matrices, which accommodate spin one half together with the particle-antiparticle doublet for a charged fermion, thus establishing its relativistic quantum mechanics based on the Clifford matrix algebra. The Casimir operators nevertheless turned out to be relevant in the rather difficult effort to derive covariant and first-order (with respect to the derivatives) relativistic wave equations for massive charged particles of arbitrary spin. It was again Dirac himself [4] who opened this field as early as 1936.

Ever since, the problem has been addressed by many physicists with varying success (see the historical review up to 2012 by Esposito [5]). A rather mathematical treatment and comprehensive literature review (as of 1994) of this subject can be found in the book of Fushchich and Nikitin [6]. The work on this subject continues to this day. We do not intend to do justice here to the past (over the last thirty years or so) and current research activity in this field, but merely refer to some more recent papers by Simulik [7] (2015), by Kryuchkov, Lanfear and Suslov [8] (2016), and by Marsch [9] (2017), and the references therein. All these authors were guided in their calculations at the outset by the two Casimir operators of the Lorentz group. The two Casimir operators will also be our starting point.

Here we present a generalization of the Dirac equation by making use of the squared Pauli-Lubański operator, which apparently contains hidden symmetries. They are interpreted as the potential origin of U(1) hypercharge and SU(3) color-charge symmetries of fermions. Namely, the resulting four additional degrees of freedom in the extended Dirac equation are associated with two independent subspaces of dimension 1 related to the U(1) and 3 related to SU(3), i.e. to the leptons and quarks. These results are the new findings of our paper, which is motivated by the wish to better understand for spin-one-half fermions the origin of their internal symmetries in relativistic quantum physics from Lorentz invariance together with spinor helicity.

However, it will turn out below that this approach is consistent only for spin 1/2, and not for any larger spin value. Thus the concept of spin helicity can be defined in compliance with the standard fermion anticommutation rules only for spin 1/2, and it then results in an extended Dirac equation that involves a new type of expanded spinor field including one lepton and three quarks as the four basic states of isospin helicity. The corresponding eigenfunctions of the isospin helicity operator are presented in the Appendix for the values \(s=1/2,1,3/2\).

Throughout the paper we use the conventional units of QFT, with \(c = \hbar = 1\). As usual, the particle mass is denoted by m and its spin quantum number by s. The quantum mechanical covariant four-momentum operator is \(P_\mu = (E, -{\mathbf {p}})\), yielding \(P_\mu = {\mathrm {i}}\partial _\mu \) with the covariant space-time derivative given as \(\partial _\mu = (\partial /\partial t, \partial /\partial {\mathbf {x}})\). The first main ingredient is the relativistic dispersion relation \(E(p)=\sqrt{p^2 + m^2}\), and the second key ingredient is the intrinsic angular momentum or spin three-vector operator \({\mathbf {S}}\) of a particle, the basic properties of which are described in any textbook on quantum mechanics, e.g., the modern one by Weinberg [10]. We will make essential use of the general spin commutation relation, which can concisely be written as

$$\begin{aligned} {\mathbf {S}} \times {\mathbf {S}} = {\mathrm {i}} {\mathbf {S}}. \end{aligned}$$
(1)

We recall that \({\mathbf {S}}^2 = s(s+1)\mathsf{{1}_{2s+1}}\), i.e. the spin squared is proportional to the unit matrix of dimension \(2s+1\) in the standard irreducible matrix representation of any spin \({\mathbf {S}}\). For spin 1/2 we obtain \({\mathbf {S}} = \frac{1}{2}{\varvec{\sigma }}\), with the matrix three-vector \({\varvec{\sigma }}=(\sigma _{\mathrm {x}},\sigma _{\mathrm {y}},\sigma _{\mathrm {z}})\) having as components the three well-known Pauli matrices.
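Relation (1) and the value of \({\mathbf {S}}^2\) can be checked numerically for any spin. The following is a minimal sketch, assuming the standard ladder-operator construction of the spin matrices in the basis \(|s,m\rangle\); the helper names `spin_matrices` and `check_spin_algebra` are illustrative and do not appear in the paper.

```python
import numpy as np


def spin_matrices(s):
    """Return (Sx, Sy, Sz) for spin s in the basis |s, m>, m = s, s-1, ..., -s."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)                      # magnetic quantum numbers
    Sz = np.diag(m).astype(complex)
    Sp = np.zeros((dim, dim), dtype=complex)    # raising operator S_+
    for k in range(1, dim):
        # <m+1| S_+ |m> = sqrt(s(s+1) - m(m+1)), with m = m[k]
        Sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    Sm = Sp.conj().T                            # lowering operator S_-
    return (Sp + Sm) / 2, (Sp - Sm) / 2j, Sz


def check_spin_algebra(s):
    """Check S x S = i S, Eq. (1), and S^2 = s(s+1) 1 for spin s."""
    Sx, Sy, Sz = spin_matrices(s)
    dim = Sx.shape[0]
    # Components of the operator cross product S x S
    cross = (Sy @ Sz - Sz @ Sy, Sz @ Sx - Sx @ Sz, Sx @ Sy - Sy @ Sx)
    ok_comm = all(np.allclose(c, 1j * S) for c, S in zip(cross, (Sx, Sy, Sz)))
    ok_casimir = np.allclose(Sx @ Sx + Sy @ Sy + Sz @ Sz,
                             s * (s + 1) * np.eye(dim))
    return ok_comm and ok_casimir


for s in (0.5, 1.0, 1.5):
    print(f"s = {s}: spin algebra holds: {check_spin_algebra(s)}")
```

For \(s=1/2\) the construction reduces to \({\mathbf {S}} = \frac{1}{2}{\varvec{\sigma }}\), so twice the returned matrices are the Pauli matrices.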

2 The two Casimir operators of the Lorentz group, mass squared and Pauli-Lubański operator, and spin helicity

Let us at the outset state the first Casimir operator [2], which is the squared four momentum of a particle (equal to its mass squared if it is massive). We stay in Fourier space for the four-momentum operator \(P^\mu \) and obtain

$$\begin{aligned} p^\mu p_\mu = m^2. \end{aligned}$$
(2)

As is well known, this Casimir operator can also be written as the square of \(\gamma ^\mu p_\mu \), which when acting on a spinor field yields the famous Dirac equation [3]. It describes a fermion with spin of \(s = 1/2\) that is contained in the four Dirac \(4 \times 4\) gamma matrices [11, 12], which obey the Clifford algebra,

$$\begin{aligned} \gamma ^\mu \gamma ^\nu + \gamma ^\nu \gamma ^\mu = 2 g^{\mu \nu } \mathsf{{1}_4}, \end{aligned}$$
(3)

with the Minkowski metric \(g^{\mu \nu }\). Following Marsch [9], we can also introduce the spin explicitly into Eq. (2) by multiplying it by \({\mathbf {S}}^2 = s(s+1)\mathsf{{1}_{2\,s+1}}\) for any spin s. This makes it a matrix equation for the \(2s + 1\) multiplet of independent spin components, a procedure which at first sight seems trivial. Yet, on the basis of Eqs. (1) and (2) we thus obtain a new method to derive the Pauli-Lubański [13, 14] operator. For that purpose, we exploit with the help of (1) that for any pair of three-vectors (\({\mathbf {a}}\) and \(\mathbf{b} \) say) and any spin denoted by \(\mathbf{S} \) the following fundamental algebraic identity holds

$$\begin{aligned} ({\mathbf {S}} \cdot {\mathbf {a}})({\mathbf {S}} \cdot \mathbf{b} ) + ({\mathbf {S}} \times {\mathbf {a}}) \cdot ({\mathbf {S}} \times \mathbf{b} ) = {\mathbf {S}}^2({\mathbf {a}} \cdot \mathbf{b} ) + {\mathrm {i}}{\mathbf {S}} \cdot ({\mathbf {a}} \times \mathbf{b} ). \end{aligned}$$
(4)
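The steps behind this identity can be spelled out in index notation. The short derivation below is a sketch under the assumption that \({\mathbf {a}}\) and \(\mathbf{b}\) are ordinary vectors whose components commute with each other and with \({\mathbf {S}}\); it uses only the commutator \([S_i, S_j] = {\mathrm {i}}\epsilon _{ijk} S_k\), which is the component form of (1).

$$\begin{aligned} ({\mathbf {S}} \times {\mathbf {a}}) \cdot ({\mathbf {S}} \times \mathbf{b} )&= \epsilon _{ijk}\epsilon _{ilm}\, S_j S_l\, a_k b_m = (\delta _{jl}\delta _{km} - \delta _{jm}\delta _{kl})\, S_j S_l\, a_k b_m \\&= {\mathbf {S}}^2 ({\mathbf {a}} \cdot \mathbf{b} ) - ({\mathbf {S}} \cdot \mathbf{b} )({\mathbf {S}} \cdot {\mathbf {a}}), \\ ({\mathbf {S}} \cdot {\mathbf {a}})({\mathbf {S}} \cdot \mathbf{b} ) - ({\mathbf {S}} \cdot \mathbf{b} )({\mathbf {S}} \cdot {\mathbf {a}})&= a_i b_j\, [S_i, S_j] = {\mathrm {i}}\,\epsilon _{ijk}\, a_i b_j S_k = {\mathrm {i}}\,{\mathbf {S}} \cdot ({\mathbf {a}} \times \mathbf{b} ). \end{aligned}$$

Adding \(({\mathbf {S}} \cdot {\mathbf {a}})({\mathbf {S}} \cdot \mathbf{b} )\) to the first line and inserting the second line gives the stated identity.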

For \({\mathbf {a}}=\mathbf{b} ={\mathbf {p}}\), which commutes with itself, the involved vector cross-product on the right hand side vanishes, and thus we obtain

$$\begin{aligned} ({\mathbf {S}} \cdot {\mathbf {p}})({\mathbf {S}} \cdot \mathbf{p} ) + ({\mathbf {S}} \times {\mathbf {p}}) \cdot ({\mathbf {S}} \times {\mathbf {p}}) = {\mathbf {S}}^2 {\mathbf {p}}^2. \end{aligned}$$
(5)

Using this identity, we get by multiplication of (2) with \({\mathbf {S}}^2\), whereby the r.h.s of (2) then becomes \(m^2\,s(s+1)\mathsf{{1}_{2\,s+1}}\) accordingly, an equation involving the spin explicitly as

$$\begin{aligned} (E^2-{\mathbf {p}}^2){\mathbf {S}}^2&= {\mathbf {S}}^2 E^2 - (({\mathbf {S}} \cdot {\mathbf {p}})({\mathbf {S}} \cdot \mathbf{p} ) + ({\mathbf {S}} \times {\mathbf {p}}) \cdot ({\mathbf {S}} \times {\mathbf {p}})) \nonumber \\ &= ({\mathbf {S}} E + {\mathrm {i}}{\mathbf {S}} \times {\mathbf {p}})^2 - ({\mathbf {S}} \cdot {\mathbf {p}})^2 = - W^\mu W_\mu , \end{aligned}$$
(6)

which is a covariant expression for the negative square of the Pauli-Lubański operator \(W^\mu = ({\mathbf {S}} \cdot {\mathbf {p}}, {\mathbf {S}}E + {\mathrm {i}}{\mathbf {S}}\times {\mathbf {p}})\) in its four-vector form. To derive it we used the identity \({\mathbf {S}} \cdot ({\mathbf {S}} \times {\mathbf {a}}) + ({\mathbf {S}} \times {\mathbf {a}}) \cdot {\mathbf {S}} =0 \). Therefore, Eq. (6) is Lorentz invariant and nothing but the second Casimir operator of the Lorentz group. For spin \(s=1/2\), one finds that the squared cross-product term in (6) is equal to twice the squared scalar product term, which gives \((\mathbf {\varvec{\sigma }} \times {\mathbf {p}}) \cdot (\mathbf {\varvec{\sigma }} \times {\mathbf {p}}) = 2 (\mathbf {\varvec{\sigma }} \cdot {\mathbf {p}})^2\), a relation that is only valid for the Pauli matrices yet not for any other matrices of larger spins. When we exploit that relation we obtain from (6) the equation

$$\begin{aligned} E^2-({\varvec{\sigma }} \cdot {\mathbf {p}})^2 = m^2, \end{aligned}$$
(7)

from which the Dirac equation in the Weyl basis readily follows by operating with (7) on a Pauli bi-spinor, and by subsequent factorization of the resulting equation in terms of first-order operators.
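Equations (5) and (6) and the spin-1/2 relation between the cross-product and scalar-product terms lend themselves to a direct numerical check. The sketch below assumes a numerical (c-number) momentum vector and arbitrary test values for E and \({\mathbf {p}}\); the function names are illustrative only.

```python
import numpy as np


def spin_matrices(s):
    """Return (Sx, Sy, Sz) for spin s in the basis |s, m>, m = s, ..., -s."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    Sz = np.diag(m).astype(complex)
    Sp = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        Sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    Sm = Sp.conj().T
    return (Sp + Sm) / 2, (Sp - Sm) / 2j, Sz


def check_identities(s, E, p):
    """Return (Eq. (5) holds, Eq. (6) holds, spin-1/2 relation holds)."""
    S = spin_matrices(s)
    dim = S[0].shape[0]
    p = np.asarray(p, dtype=float)
    S_dot_p = sum(Si * pi for Si, pi in zip(S, p))
    # Components of S x p for a numerical vector p
    S_cross_p = (S[1] * p[2] - S[2] * p[1],
                 S[2] * p[0] - S[0] * p[2],
                 S[0] * p[1] - S[1] * p[0])
    cross_sq = sum(C @ C for C in S_cross_p)
    eq5 = np.allclose(S_dot_p @ S_dot_p + cross_sq,
                      s * (s + 1) * (p @ p) * np.eye(dim))
    # Spatial part of W^mu: S E + i S x p; time part: S . p
    W_vec = [Si * E + 1j * Ci for Si, Ci in zip(S, S_cross_p)]
    minus_W2 = sum(W @ W for W in W_vec) - S_dot_p @ S_dot_p
    eq6 = np.allclose(minus_W2, (E**2 - p @ p) * s * (s + 1) * np.eye(dim))
    # (S x p)^2 = 2 (S . p)^2 is expected to hold for s = 1/2 only
    half_only = np.allclose(cross_sq, 2 * S_dot_p @ S_dot_p)
    return eq5, eq6, half_only


for s in (0.5, 1.0, 1.5):
    print(s, check_identities(s, E=2.0, p=(0.3, -0.4, 1.2)))
```

The third flag is true for \(s=1/2\) and false for the larger spins tested, in line with the remark that this relation is special to the Pauli matrices.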

However, the general Eq. (6) should be valid for any spin. The key question then is whether it, while being a second-order (in E and \({\mathbf {p}}\)) algebraic equation in Fourier space, can also be factorized to the first order in a mathematically convenient and appealing form like that of the Dirac equation. The algebraic way to achieve that is to use the spin-1/2 version of the general Eq. (5), which yields \(({\varvec{\sigma }}\cdot {\mathbf {p}})^2 = {\mathbf {p}}^2 \mathsf{{1}_2}\) and has the consequence that a three-vector scalar product can be replaced by a matrix product. This possibility reflects that in normal coordinate space the Pauli matrices provide the basic metric property, which can be expressed as \(\sigma _i\sigma _j +\sigma _j\sigma _i = 2 \delta _{ij} \mathsf{{1}_2}\). If one considers the general spin \({\mathbf {S}}\) of a particle, its non-commutativity according to (1) yields more complicated results. First note that for the Pauli matrices we obtain from (4) the special result

$$\begin{aligned} ({\varvec{\sigma }} \cdot {\mathbf {a}})({\varvec{\sigma }} \cdot {\mathbf {a}}) = ({\mathbf {a}} \cdot {\mathbf {a}})\mathsf{{1}_2} + {\mathrm {i}} {\varvec{\sigma }} \cdot ({\mathbf {a}} \times {\mathbf {a}}). \end{aligned}$$
(8)

Applying this relation to any spin three-vector operator \({\mathbf {S}}\) one finds

$$\begin{aligned} \mathsf{{1}_2} {\mathbf {S}}^2 = ({\varvec{\sigma }} \cdot {\mathbf {S}})(({\varvec{\sigma }} \cdot {\mathbf {S}}) + \mathsf{{1}_{2(2s+1)}}) = \mathsf{{1}_{2(2s+1)}}(s(s+1)). \end{aligned}$$
(9)

Here the dot symbol means a scalar product of the vectors but also implies tensor multiplication of the representation matrices of the spin with the Pauli matrices. This important relation was first obtained in 1936 by Dirac [4] in his early attempt to derive a relativistic wave equation for any spin. At this point it seems natural to introduce the concept of spin helicity, \({\varvec{\sigma }} \cdot {\mathbf {S}}=\sigma _{\mathrm {x}} \otimes S_{\mathrm {x}} + \sigma _{\mathrm {y}} \otimes S_{\mathrm {y}} + \sigma _{\mathrm {z}} \otimes S_{\mathrm {z}}\), since that is the important quantity in Eq. (9). Making use of it we can thus rewrite and expand Eq. (6) to attain the new matrix form containing the kinetic helicity, \({\varvec{\sigma }} \cdot {\mathbf {p}}\), and the spin helicity \({\varvec{\sigma }} \cdot {\mathbf {S}}\). This modified equation reads

$$\begin{aligned} (\mathsf{{1}_2}E^2 - ({\varvec{\sigma }} \cdot {\mathbf {p}} )^2)({\varvec{\sigma }} \cdot {\mathbf {S}}) ({\varvec{\sigma }} \cdot {\mathbf {S}} + \mathsf{{1}_{2(2s+1)}}) = m^2 s(s + 1) \mathsf{{1}_{2(2s+1)}}. \end{aligned}$$
(10)

It has the key advantage that there are no three-vector scalar products of either \({\mathbf {p}}\) or \({\mathbf {S}}\) with itself any more, but instead matrix multiplications, yet in addition we had to introduce two new degrees of freedom associated with the Pauli matrices, enabling us to use their above mentioned metric property in coordinate space.
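The operator relation (9) and the resulting spectrum of the spin helicity can be verified numerically. The sketch below assumes the tensor ordering \(\sigma _i \otimes S_i\) as written for \({\varvec{\sigma }} \cdot {\mathbf {S}}\); the names `spin_helicity`, `satisfies_eq9` and `helicity_spectrum` are illustrative.

```python
import numpy as np

PAULI = (np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex))


def spin_matrices(s):
    """Return (Sx, Sy, Sz) for spin s in the basis |s, m>, m = s, ..., -s."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    Sz = np.diag(m).astype(complex)
    Sp = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        Sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    Sm = Sp.conj().T
    return (Sp + Sm) / 2, (Sp - Sm) / 2j, Sz


def spin_helicity(s):
    """sigma . S = sum_i sigma_i (x) S_i, a matrix of dimension 2(2s+1)."""
    return sum(np.kron(sig, Si) for sig, Si in zip(PAULI, spin_matrices(s)))


def satisfies_eq9(s):
    """Check (sigma.S)(sigma.S + 1) = s(s+1) 1, Eq. (9)."""
    H = spin_helicity(s)
    I = np.eye(H.shape[0])
    return np.allclose(H @ (H + I), s * (s + 1) * I)


def helicity_spectrum(s):
    """Map each distinct eigenvalue of sigma.S to its multiplicity."""
    evals = np.linalg.eigvalsh(spin_helicity(s))   # sigma.S is Hermitian
    values, counts = np.unique(np.round(evals, 10), return_counts=True)
    return dict(zip(values.tolist(), counts.tolist()))


for s in (0.5, 1.0, 1.5):
    print(s, satisfies_eq9(s), helicity_spectrum(s))
```

In these examples only the two eigenvalues s and \(-(s+1)\) occur, with multiplicities \(2s+2\) and \(2s\), respectively.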

Yet as a consequence of that approach, there arises a new algebraic problem here, because the Pauli matrices do not commute. Whereas the vectors \({\mathbf {p}}\) and \({\mathbf {S}}\) originally commute with each other, their helicities no longer do when written in the above terms exploiting the Pauli matrices. So, in order to retain commutativity, one needs two commuting representations of SU(2) instead of the single fundamental one. Fortunately, there already exist two quite adequate representations, namely the ones given by the Euclidean version of the Lorentz group, SO(4), which is isomorphic to \(SU(2) \times SU(2)\). A specific natural representation was introduced by Marsch and Narita [15] making use of the original Lorentz transformation in Minkowski space, which yields the vector representation of the Lorentz group. But here we will instead use the equivalent \(4 \times 4\) matrices obtained as tensor products:

$$\begin{aligned} {\varvec{\Sigma }}_{1} = ({\varvec{\sigma }} \otimes \mathsf{{1}_2}), \;\; {\varvec{\Sigma }}_{2} = (\mathsf{{1}_{2}} \otimes {\varvec{\sigma }}). \end{aligned}$$
(11)

These matrix vectors individually have the same algebraic properties as the Pauli matrices, and by construction commute with each other component-wise. Using the spin four-vector \(\sigma ^\mu =(\mathsf{{1}_2}, {\varvec{\sigma }})\), we can express the covariant energy-momentum four-vector \(p_\mu =(E,-{\mathbf {p}})\) as a bi-spinor in matrix form

$$\begin{aligned} \sigma ^\mu p_\mu = \left( \begin{array}{cc} E-p_{\mathrm {z}} &{} \quad -p_{\mathrm {x}} + {\mathrm {i}} p_{\mathrm {y}}\\ -p_{\mathrm {x}} - {\mathrm {i}} p_{\mathrm {y}} &{} \quad E + p_{\mathrm {z}} \end{array} \right) . \end{aligned}$$
(12)

It is straightforward to show that \(\det (\sigma ^\mu p_\mu ) = E^2-{\mathbf {p}}^2\), which is equal to \(m^2\) for a massive particle and Lorentz invariant according to (2). Similarly, we obtain with the definition \(\Sigma ^\mu _{1,2}=(\mathsf{{1}_4}, {\varvec{\Sigma }}_{1,2})\) that \(\det (\Sigma ^\mu _{1,2} p_\mu ) = (E^2-{\mathbf {p}}^2)^2\). Thus one can convert with the above kinetic helicity (12) from the four-vector representation to the \((\frac{1}{2}, \frac{1}{2})\) spinorial representation of the Lorentz group. This powerful procedure is referred to in modern quantum field theory as spinor-helicity formalism [12], used to evaluate effectively matrix elements in QCD perturbation theory.

The use of \({\varvec{\Sigma }}_{1,2}\) instead of \({\varvec{\sigma }}\) ensures that in Eq. (10) the helicity terms involving the quantum mechanical operators \({\mathbf {p}}\) and \({\mathbf {S}}\) do commute again, and guarantees rotational invariance of their moduli. Trivially, they of course both commute with the energy operator, as the related term involves only a unit matrix. From Eqs. (10) and (11) it is clear that we have to deal with two key quantities, namely the well-known kinetic helicity \({\varvec{\sigma }} \cdot {\mathbf {p}}\), which, when normalized to p, has the two values \(\pm 1\) according to Eq. (8), and the new spin helicity \({\varvec{\sigma }} \cdot {\mathbf {S}}\), which may have the two eigenvalues s and \(-(s+1)\) according to Dirac [4], a result that follows readily from the eigenvalue Eq. (9). See also Eq. (25) and the related discussion in the Appendix.
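The commutativity argument and the two determinant formulas can be illustrated numerically. The sketch below assumes the tensor ordering (factor of \({\varvec{\Sigma }}_1\)) \(\otimes \) (factor of \({\varvec{\Sigma }}_2\)) \(\otimes \) (spin space) and arbitrary test values for E and \({\mathbf {p}}\); the function names are illustrative.

```python
import numpy as np

PAULI = (np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex))


def spin_matrices(s):
    """Return (Sx, Sy, Sz) for spin s in the basis |s, m>, m = s, ..., -s."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    Sz = np.diag(m).astype(complex)
    Sp = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        Sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    Sm = Sp.conj().T
    return (Sp + Sm) / 2, (Sp - Sm) / 2j, Sz


def sigma_dot(p):
    """Kinetic helicity sigma . p as a 2x2 matrix."""
    return sum(pi * sig for pi, sig in zip(p, PAULI))


def commutator_norms(s, p):
    """Norm of [kinetic helicity, spin helicity] with one shared Pauli
    factor, and with the two commuting Sigma matrices of Eq. (11)."""
    S = spin_matrices(s)
    n = S[0].shape[0]
    H = sum(np.kron(sig, Si) for sig, Si in zip(PAULI, S))   # sigma . S
    K_shared = np.kron(sigma_dot(p), np.eye(n))   # same Pauli factor as H
    K1 = np.kron(sigma_dot(p), np.eye(2 * n))     # Sigma_1 . p
    H2 = np.kron(np.eye(2), H)                    # Sigma_2 . S
    shared = np.linalg.norm(K_shared @ H - H @ K_shared)
    separated = np.linalg.norm(K1 @ H2 - H2 @ K1)
    return shared, separated


def determinants(E, p):
    """Return E^2 - p^2, det(sigma^mu p_mu) and det(Sigma_1^mu p_mu)."""
    m2 = E**2 - np.dot(p, p)
    d2 = np.linalg.det(E * np.eye(2) - sigma_dot(p))
    d4 = np.linalg.det(E * np.eye(4) - np.kron(sigma_dot(p), np.eye(2)))
    return m2, d2.real, d4.real


print(commutator_norms(1.0, (0.3, -0.4, 1.2)))
print(determinants(2.0, (0.3, -0.4, 1.2)))
```

The first commutator norm is nonzero while the second vanishes to machine precision, and the two determinants reproduce \(E^2-{\mathbf {p}}^2\) and its square.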

3 First-order equation for a massive particle with any spin

Returning to the second-order algebraic Eq. (10) in the four-momentum, we can now make use of the commuting Sigma matrices, with \({\varvec{\Sigma }}_1\) being used for \({\mathbf {p}}\) and \({\varvec{\Sigma }}_2\) being used for \({\mathbf {S}}\). Thus we obtain

$$\begin{aligned} (\mathsf{{1}_4}E - {\varvec{\Sigma _1}} \cdot {\mathbf {p}} ) ({\varvec{\Sigma }}_2 \cdot {\mathbf {S}}) (\mathsf{{1}_4}E + {\varvec{\Sigma _1}} \cdot {\mathbf {p}} ) ({\varvec{\Sigma }}_2 \cdot {\mathbf {S}} + \mathsf{{1}_{4(2s+1)}}) = m^2 s(s + 1) \mathsf{{1}_{4(2s+1)}}. \end{aligned}$$
(13)

Here we exploited that the Sigma matrices commute, and the related terms can be placed in any position of the above equation. We get a second-order wave equation if we let now the involved operator act on a spinor wave function \(\phi \), which has the dimension \(d=4(2s+1)\), reflecting the dimension of the Sigma matrices and the \(2s+1\) components of the spin \({\mathbf {S}}\) in its irreducible matrix representation. Equation (13) can be decomposed into two first-order equations reading

$$\begin{aligned} (\mathsf{{1}_4}E - {\varvec{\Sigma _1}} \cdot {\mathbf {p}} ) ({\varvec{\Sigma }}_2 \cdot {\mathbf {S}}) \phi _0 &= ms \phi _1, \\ (\mathsf{{1}_4}E + {\varvec{\Sigma _1}} \cdot {\mathbf {p}} )({\varvec{\Sigma }}_2 \cdot {\mathbf {S}} + \mathsf{{1}_{4(2s+1)}}) \phi _1 &= m(s + 1) \phi _0. \end{aligned}$$
(14)

At first sight the separation into two equations as shown here seems arbitrary, e.g., one could select the terms with s or \(s+1\) on the right-hand side and associate them with either the upper or the lower of the two equations. Yet besides that arbitrariness (like that of the minus sign in front of the momentum term), the separation yields a set of coupled equations reminiscent of the standard Dirac equation in the Weyl basis. Indeed, it is intentionally constructed that way here, which is made possible by the concept of spin helicity.
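The factorization of Eq. (13) into the pair (14) can be checked with explicit matrices. The following sketch assumes an on-shell energy \(E=\sqrt{{\mathbf {p}}^2+m^2}\), a randomly chosen test spinor \(\phi _0\), and the tensor ordering (factor of \({\varvec{\Sigma }}_1\)) \(\otimes \) (factor of \({\varvec{\Sigma }}_2\)) \(\otimes \) (spin space); all function names are illustrative.

```python
import numpy as np

PAULI = (np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex))


def spin_matrices(s):
    """Return (Sx, Sy, Sz) for spin s in the basis |s, m>, m = s, ..., -s."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    Sz = np.diag(m).astype(complex)
    Sp = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        Sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    Sm = Sp.conj().T
    return (Sp + Sm) / 2, (Sp - Sm) / 2j, Sz


def first_order_operators(s, E, p):
    """Return the two first-order operators on the left of Eq. (14)."""
    S = spin_matrices(s)
    n = S[0].shape[0]
    H = sum(np.kron(sig, Si) for sig, Si in zip(PAULI, S))
    sigma_p = sum(pi * sig for pi, sig in zip(p, PAULI))
    K1 = np.kron(sigma_p, np.eye(2 * n))      # Sigma_1 . p
    H2 = np.kron(np.eye(2), H)                # Sigma_2 . S
    I = np.eye(4 * n)
    return (E * I - K1) @ H2, (E * I + K1) @ (H2 + I)


def coupled_pair_holds(s, m, p, seed=0):
    """Check Eq. (13) on shell, and that phi_1 defined by the first line
    of Eq. (14) satisfies the second line for a random phi_0."""
    p = np.asarray(p, dtype=float)
    E = np.sqrt(m**2 + p @ p)
    D_minus, D_plus = first_order_operators(s, E, p)
    dim = D_minus.shape[0]
    rng = np.random.default_rng(seed)
    phi0 = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    phi1 = D_minus @ phi0 / (m * s)           # first line of Eq. (14)
    product_ok = np.allclose(D_plus @ D_minus,
                             m**2 * s * (s + 1) * np.eye(dim))
    second_ok = np.allclose(D_plus @ phi1, m * (s + 1) * phi0)
    return product_ok and second_ok


for s in (0.5, 1.0, 1.5):
    print(s, coupled_pair_holds(s, m=1.0, p=(0.3, -0.4, 1.2)))
```

The check relies only on the commutativity of the two helicity terms and on Eq. (9), so it passes for every spin value tested; it does not address the Lorentz-covariance question raised later in the paper.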

As a convenient abbreviation we introduce for the spin operators the symbols

$$\begin{aligned} S_0 &= ({\varvec{\Sigma }}_2 \cdot {\mathbf {S}})/s, \\ S_1 &= ({\varvec{\Sigma }}_2 \cdot {\mathbf {S}} + \mathsf{{1}_{4(2s+1)}})/(s + 1). \end{aligned}$$
(15)

By their definitions and matrix multiplications they obey \(S_0S_1=S_1S_0=\mathsf{{1}_{4(2s+1)}}\), corresponding to Eq. (9). Consequently, we can rewrite Eq. (14) in the new \(2\times 2\)-matrix form

$$\begin{aligned} \left( \begin{array}{cc} 0 &{} (\mathsf{{1}_4}E + {\varvec{\Sigma _1}} \cdot {\mathbf {p}} ) S_1 \\ (\mathsf{{1}_4}E - {\varvec{\Sigma _1}} \cdot {\mathbf {p}} ) S_0 \quad &{} 0 \end{array} \right) \Psi = m \Psi , \end{aligned}$$
(16)

with the composed bi-spinor \(\Psi ^{\dag }=(\phi ^{\dag }_0, \phi ^{\dag }_1)\). This can be written in Fourier space in the concise standard Dirac form also as

$$\begin{aligned} \Gamma ^\mu p_\mu \Psi = m \Psi , \end{aligned}$$
(17)

where the new Gamma matrices that replace the Dirac gammas have the form \(\Gamma ^\mu =(\Gamma _0, {\varvec{\Gamma }})\), and \(\Gamma _5 = {\mathrm {i}} \Gamma _0\Gamma _{\mathrm {x}}\Gamma _{\mathrm {y}} \Gamma _{\mathrm {z}}\), which read explicitly as follows:

$$\begin{aligned} {\varvec{\Gamma }} = \left( \begin{array}{cc} 0 &{} \quad {\varvec{\Sigma _1}} S_1 \\ - {\varvec{\Sigma _1}} S_0 &{} \quad 0 \end{array} \right) , \;\; \Gamma _0 = \left( \begin{array}{cc} 0 &{} \quad S_1 \\ S_0 &{} \quad 0 \end{array} \right) , \;\; \Gamma _5 = \left( \begin{array}{cc} -\mathsf{{1}_{4(2s+1)}} &{} \quad 0 \\ 0 &{} \quad \mathsf{{1}_{4(2s+1)}} \end{array} \right) . \end{aligned}$$
(18)

These Gammas obey of course the Clifford algebra. Also the usual chiral projection operator based on \(\Gamma _5\) can be defined. We have \(\Gamma _0^2=\Gamma _5^2=\mathsf{{1}_{8(2s+1)}}\) and \(\Gamma _j^2=-\mathsf{{1}_{8(2s+1)}}\). Obviously, these Gamma matrices are close relatives of the standard Dirac gamma matrices in the Weyl basis, which are obtained by putting \(S_0=S_1=1\), replacing \({\varvec{\Sigma }}_1\) by \({\varvec{\sigma }}\) and by reducing the Gamma matrices dimensions to \(d=4\). The key difference is that the particle spin appears explicitly in the terms involving \({\mathbf {S}}\), whereas in the Dirac equation for a fermion of spin \(s=1/2\) it is represented by the Pauli matrices. Moreover, the above algebraic equation for any spin contains additional, spin-helicity related degrees of freedom, the interpretation of which is not so obvious.
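The algebraic properties claimed for the matrices in (18) can be confirmed numerically. The sketch below builds them from Eqs. (11), (15) and (18), assuming the block index of the bi-spinor \(\Psi \) as the outermost tensor factor; `gamma_matrices`, `clifford_ok` and `gamma5` are illustrative names.

```python
import numpy as np

PAULI = (np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex))


def spin_matrices(s):
    """Return (Sx, Sy, Sz) for spin s in the basis |s, m>, m = s, ..., -s."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    Sz = np.diag(m).astype(complex)
    Sp = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        Sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    Sm = Sp.conj().T
    return (Sp + Sm) / 2, (Sp - Sm) / 2j, Sz


def gamma_matrices(s):
    """Return [Gamma_0, Gamma_x, Gamma_y, Gamma_z] as defined in Eq. (18)."""
    S = spin_matrices(s)
    n = S[0].shape[0]
    H2 = np.kron(np.eye(2), sum(np.kron(sig, Si) for sig, Si in zip(PAULI, S)))
    I = np.eye(4 * n)
    S0 = H2 / s                     # Eq. (15)
    S1 = (H2 + I) / (s + 1)
    Z = np.zeros((4 * n, 4 * n), dtype=complex)
    gammas = [np.block([[Z, S1], [S0, Z]])]
    for sig in PAULI:
        Sig1 = np.kron(sig, np.eye(2 * n))          # component of Sigma_1
        gammas.append(np.block([[Z, Sig1 @ S1], [-Sig1 @ S0, Z]]))
    return gammas


def clifford_ok(s):
    """Check {Gamma^mu, Gamma^nu} = 2 g^{mu nu} 1 with g = diag(1,-1,-1,-1)."""
    G = gamma_matrices(s)
    metric = np.diag([1.0, -1.0, -1.0, -1.0])
    I = np.eye(G[0].shape[0])
    return all(
        np.allclose(G[mu] @ G[nu] + G[nu] @ G[mu], 2 * metric[mu, nu] * I)
        for mu in range(4) for nu in range(4)
    )


def gamma5(s):
    """Gamma_5 = i Gamma_0 Gamma_x Gamma_y Gamma_z."""
    G0, Gx, Gy, Gz = gamma_matrices(s)
    return 1j * G0 @ Gx @ Gy @ Gz


for s in (0.5, 1.0, 1.5):
    print(s, gamma_matrices(s)[0].shape, clifford_ok(s))
```

For each spin tested the matrices have dimension \(8(2s+1)\) and satisfy the Clifford algebra, and `gamma5` reproduces the diagonal form given in (18).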

However, we can somewhat alleviate this algebraic problem by using (11), and thus write our spin operators \(S_{0,1}\) in terms of the Pauli matrices defining the spin helicity H(s) and the two related operators \(H_0(s)\) and \(H_1(s)\) as follows

$$\begin{aligned} \begin{array}{c} S_0 = \mathsf{{1}_2} \otimes ({\varvec{\sigma }} \cdot {\mathbf {S}})/s = \mathsf{{1}_2} \otimes H(s)/s = \mathsf{{1}_2} \otimes H_0(s), \\ S_1 = \mathsf{{1}_2} \otimes (H(s) + \mathsf{{1}_{2(2s+1)}})/(s + 1) = \mathsf{{1}_2} \otimes H_1(s). \end{array} \end{aligned}$$
(19)

The spin variables obey Eq. (9). The resulting Gamma matrices (still involving tensor products) now read explicitly as follows

$$\begin{aligned} \begin{array}{c} {\varvec{\Gamma }} = \left( \begin{array}{cc} 0 &{} \quad {\varvec{\sigma }} \otimes H_1(s) \\ -\varvec{\sigma } \otimes H_0(s) &{} \quad 0 \end{array} \right) , \\ \Gamma _0 = \left( \begin{array}{cc} 0 &{} \quad \mathsf{{1}_2} \otimes H_1(s) \\ \mathsf{{1}_2} \otimes H_0(s) &{} \quad 0 \end{array} \right) . \end{array} \end{aligned}$$
(20)

These Gamma matrices guarantee the basic property of the Clifford algebra (3), and the resulting \(\Gamma _5\) is the same as before in (18). Thus one main goal of this paper has been reached: we have derived a first-order algebraic equation of the Dirac type, which accommodates the \(2s+1\) degrees of freedom of a massive particle with any spin quantum number s. Accordingly, we obtain a multicomponent spinor \(\Psi \) that has the dimension (number of independent components) \(d=8(2s+1)\), where two come obviously from the particle and antiparticle degrees of freedom, with each being associated with the usual \((2s+1)\) components of the spin \({\mathbf {S}}\).

Then an open question remains as to what the additional four degrees of freedom relate to, which in Eqs. (19) and (20) are represented in the tensor products by the terms associated with \(\varvec{\sigma }\) and \(\mathsf{{1}_2}\), if they are not interpreted as being related to the spin \({\mathbf {S}}\). These remaining four degrees actually stem from the Lorentz group (in its reducible representation after [15] used here), which has been exploited in the ansatz (11) to ensure that the commutator \([{\mathbf {p}}, {\mathbf {S}}]=0\) throughout the factorization process in (13) and (14) with respect to \(p^\mu =(E,{\mathbf {p}})\), and which finally led to the momentum and spin helicities appearing in the modified Dirac Eq. (17). So two degrees of freedom are related to the kinetic helicity and the remaining two can be interpreted as being due to two types of spin helicity stemming from the definitions as given in Eq. (19). This is a surprising new result which has its origin in the Eq. (9) going back to Dirac [4], and reflects the non-commutability of the three components of the spin three-vector \({\mathbf {S}}\) after the fundamental Eq. (1).

4 The spinorial Lorentz transformation restricts the validity of the generalized Dirac equation to spin-one-half fermions only

Is the extended Dirac equation based on the Gamma matrices quoted in Eqs. (18) and (20) really valid for any spin with \(s>1/2\)? To answer this question we must consider the spinorial generators of the Lorentz transformation for the spinor \(\Psi \), which are given by the Gamma matrices as follows

$$\begin{aligned} L^{\mu \,\nu } = \frac{{\mathrm {i}}}{4} \left[ \Gamma ^\mu , \Gamma ^\nu \right] . \end{aligned}$$
(21)

Thus the spatial components of \(L^{\mu \,\nu }\) define the associated spin operator \(\hat{{\mathbf {S}}}\) as follows

$$\begin{aligned} \hat{S}_{{\mathrm {x}}} = \frac{{\mathrm {i}}}{2} \Gamma _{\mathrm {y}} \Gamma _{\mathrm {z}}, \end{aligned}$$
(22)

whereby cyclic index permutation delivers the three vector components. By insertion of the Gammas we obtain

$$\begin{aligned} \hat{S}_{\mathrm {x}} = \frac{{\mathrm {i}}}{2} \left( \begin{array}{cc} 0 &{}\quad \Sigma _{1 \,{\mathrm {y}}} S_1 \\ - \Sigma _{1 \,{\mathrm {y}}} S_0 &{} \quad 0 \end{array} \right) \, \left( \begin{array}{cc} 0 &{}\quad \Sigma _{1 \,{\mathrm {z}}}S_1 \\ - \Sigma _{1 \,{\mathrm {z}}} S_0 &{}\quad 0 \end{array} \right) = \frac{1}{2} \left( \begin{array}{cc} \Sigma _{1 \,{\mathrm {x}}} S_1 S_0 &{} \quad 0 \\ 0 &{} \quad \Sigma _{1 \,{\mathrm {x}}} S_0 S_1 \end{array} \right) . \end{aligned}$$
(23)

Since \(S_0S_1=S_1S_0 = \mathsf{{1}}_{4(2s+1)}\), we obtain with the definition of the Sigmas in Eq. (11) for the spin three-vector of the extended Dirac field \(\Psi \) the final result

$$\begin{aligned} \hat{{\mathbf {S}}} = \frac{1}{2}\varvec{\sigma } \otimes \mathsf{{1}}_{8(2s+1)}, \end{aligned}$$
(24)

which obeys the spin commutation relations like the Pauli matrices, in which unity has been replaced by \(\mathsf{{1}}_{8(2s+1)}\). We recall the degrees of freedom contained in the extended Dirac equation. We have \(2(2s+1)\) associated with the spin helicity H(s), 2 additional degrees of freedom related to the Sigma matrices, and 2 degrees of freedom associated with the particle-antiparticle doublet. Of course, there still are the two degrees that are explicitly related to the fermion physical spin, which shows up clearly in the above definition of \(\hat{{\mathbf {S}}}\).
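The spin operator generated by the Gamma matrices can be evaluated numerically to see that it does not depend on s. The sketch below assumes the Gamma matrices of Eq. (18) with the block index of \(\Psi \) as the outermost tensor factor, so that in this ordering the expected result is block-diagonal with \(\frac{1}{2}{\varvec{\Sigma }}_1\) in each block; the function names are illustrative.

```python
import numpy as np

PAULI = (np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex))


def spin_matrices(s):
    """Return (Sx, Sy, Sz) for spin s in the basis |s, m>, m = s, ..., -s."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    Sz = np.diag(m).astype(complex)
    Sp = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        Sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    Sm = Sp.conj().T
    return (Sp + Sm) / 2, (Sp - Sm) / 2j, Sz


def gamma_matrices(s):
    """Return [Gamma_0, Gamma_x, Gamma_y, Gamma_z] as defined in Eq. (18)."""
    S = spin_matrices(s)
    n = S[0].shape[0]
    H2 = np.kron(np.eye(2), sum(np.kron(sig, Si) for sig, Si in zip(PAULI, S)))
    I = np.eye(4 * n)
    S0, S1 = H2 / s, (H2 + I) / (s + 1)
    Z = np.zeros((4 * n, 4 * n), dtype=complex)
    gammas = [np.block([[Z, S1], [S0, Z]])]
    for sig in PAULI:
        Sig1 = np.kron(sig, np.eye(2 * n))
        gammas.append(np.block([[Z, Sig1 @ S1], [-Sig1 @ S0, Z]]))
    return gammas


def spin_from_gammas(s):
    """Spatial Lorentz generators: S_x = (i/2) Gamma_y Gamma_z and cyclic."""
    _, Gx, Gy, Gz = gamma_matrices(s)
    return 0.5j * Gy @ Gz, 0.5j * Gz @ Gx, 0.5j * Gx @ Gy


def physical_spin_is_half(s):
    """Check that the generated spin is (1/2) Sigma_1 in each block,
    obeys [S_x, S_y] = i S_z, and has S^2 = (3/4) 1 for the given s."""
    n = int(round(2 * s)) + 1
    Sh = spin_from_gammas(s)
    expected = [np.kron(np.eye(2), np.kron(sig / 2, np.eye(2 * n)))
                for sig in PAULI]
    same = all(np.allclose(a, b) for a, b in zip(Sh, expected))
    algebra = np.allclose(Sh[0] @ Sh[1] - Sh[1] @ Sh[0], 1j * Sh[2])
    casimir = np.allclose(sum(A @ A for A in Sh), 0.75 * np.eye(8 * n))
    return same and algebra and casimir


for s in (0.5, 1.0, 1.5):
    print(s, physical_spin_is_half(s))
```

In every case tested the generated spin squares to \(\frac{3}{4}\) times the unit matrix, whatever value of s enters \(S_0\) and \(S_1\).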

As a consequence of the above consideration, a problem would arise if we claimed that the extended Dirac equation can be applied to any physical spin value \(s>1/2\), because (24) is just the trivially generalized spin of a fermion with \(s=1/2\)! So the conclusion is that this equation may be consistent for spin-one-half fermions, but cannot be applied to any higher-spin particle. For the spin-1/2 fermion, there is no problem because of the properties of the Pauli matrices. They yield by derivation from (6) the Eq. (7), from which the standard Dirac equation follows directly. But for any larger spin, the metric property, \(\sigma _i\sigma _j +\sigma _j\sigma _i = 2 \delta _{ij} \mathsf{{1}_2}\), does not apply!

Therefore the original spin vector of (1) should in this case perhaps not be identified with the physical spin but, more adequately, with isospin, i.e., an intrinsic spin without association to kinetic rotational degrees of freedom. Certainly, isospin cannot be identical with the physical spin that determines the magnetic moment of a spin-1/2 fermion coupled to an electromagnetic field.

However, for a spin-one-half fermion the extended Dirac equation is apparently valid and offers new insights into the inner nature of that particle. When making use of the spinor-helicity formalism in the squared Pauli-Lubański operator (6) to express the three-vectors as bi-spinors via the Pauli matrices, we are facing new “hidden symmetries”. For a spin-1/2 fermion they are in our interpretation U(1) and SU(3) coming from isospin helicity. Our Eqs. (19) and (20) thus describe for any given isospin a fermion field with physical spin one-half and with inner isospin symmetry.

Thus there is no consistency problem, but we are then led to conclude that the Wigner classification of a particle corresponds to isospin, an interpretation that fully complies with the Coleman-Mandula theorem. So, there does not exist an elementary quantum field obeying a linear wave equation with a physical spin higher than 1/2. The only one is the standard Dirac equation, which can be naturally extended to include any isospin via the spinor-helicity formalism.

5 Summary and conclusions

In this paper we generalized the Dirac equation by making use of the squared Pauli–Lubański operator as written in Eq. (6), which includes isospin explicitly and shows that the kinetic (related to \({\mathbf {p}}\)) and isospin (given by the three-vector operator \({\mathbf {S}}\)) degrees of freedom appear separately, but they can be connected via the spinor-helicity formalism employing tensor multiplication. As a consequence, new degrees of freedom associated with the isospin of a massive particle emerge, which are associated with isospin-helicity as defined in Eqs. (9) and (19). These degrees of freedom occur in addition to the four degrees of freedom of the standard Dirac equation, which are given by the physical spin doublet and particle-antiparticle pair. Each component of the isospin-helicity doublet has \(2(2s+1)\) degrees of freedom for an arbitrary isospin with quantum number s. Isospin-helicity appears naturally in the extended Dirac equation, and its matrix operator has the eigenvalues s and \(-(s+1)\), thus revealing hidden new symmetry in any general isospin field. To obtain this we had to employ the two Sigma matrices (11), which are required to ensure the commutativity of momentum and isospin in the squared Pauli–Lubański operator with respect to the involved four-vector \(p^\mu \).

Our analysis has put the isospin-helicity H(s) into focus. For isospin 1/2 the space on which the related \(4\times 4\) matrix (27) acts splits into two different eigenspaces, a one-dimensional space for the eigenvalue \(-(s+1)=-3/2\), and a three-dimensional space for the eigenvalue \(s=1/2\). Correspondingly, in our extended Dirac equation a standard Dirac spinor further splits into two sub-spinors owing to the two eigenvalues of the isospin-helicity. We suggest associating these two sub-spaces with the U(1) and SU(3) gauge symmetries of the electromagnetic and color charges of the leptons and quarks of the Standard Model (SM). The SU(2) isospin sector undergoing the Higgs mechanism in the SM is not considered here. The U(1) and SU(3) as effective gauge groups at low energy can be explained by the spinor-helicity formulation. These inner symmetries are not directly connected with Lorentz invariance. This is exactly what the squared Pauli–Lubański four-vector indicates when written in the form of the left-hand side of (6). Therefore, the resulting full spinor is 16-dimensional and can thus accommodate the upper or lower parts of the doublets of the first family of leptons and quarks, corresponding to the SU(16) spinor representation [16, 17] of the SO(10) group, which was discussed as a possible model of Grand Unified Theory (GUT) [11].

Apparently, the spin-1/2 three-vector \({\mathbf {S}}=\frac{1}{2} \varvec{\sigma }\) in terms of the Pauli matrices plays a twofold role. It first appears as the generator of the Lorentz group in its fundamental and simplest representation [11, 12], defining the rotation (\(\varvec{\sigma }\)) and left- and right-chiral boost (\(\pm {\mathrm {i}}\varvec{\sigma }\)) operators of the spinorial Lorentz transformation. Secondly, it appears here explicitly as an independent degree of freedom of a particle in relation to its intrinsic rotation. The spinor-helicity mechanism offers a new way to explain the physical origin of lepton and quark symmetries in compliance with the Coleman-Mandula theorem [18]. Moreover, Marsch and Narita [19] recently studied the mathematical connections of the Clifford algebra with the su(N)-Lie algebra, or in more physical terms the links between space-time symmetry and internal SU(N) gauge-symmetry for a fermion as described by the standard Dirac equation. For a massless particle their results can be applied to our Gamma matrices in (18), thus supporting the conclusion that isospin-helicity is closely connected with the basic internal gauge symmetries of the SM.