Introduction

In a medium where spatial inversion and time reversal symmetries are both broken, magnetic and electric fields couple in a way that is fundamentally different from the electromagnetic induction described by Maxwell’s equations in vacuum, thus leading to exotic electrodynamics. A topic of particular interest in magneto-electric coupling phenomena recently is axion electrodynamics. Axion electrodynamics is theoretically proposed to be realized in a class of topological materials called axion insulators1,2,3,4,5,6 for sufficiently slowly oscillating electromagnetic fields. In particular, the static axion magneto-electric coupling is quantized by a multiple of the fundamental constant e2/2h7,8,9, which originates from the half-quantized Hall conductance of the topological surface states. While such a quantized magneto-electric coupling has not yet been directly observed, experimental progress has been made to observe its consequences in the static limit10,11,12,13,14,15,16.

In contrast to the static limit, the emergent axion electrodynamics at generic optical frequencies remains largely unexplored theoretically. Formulating a theory of optical axion electrodynamics has to overcome the challenge that the axion coupling is ill-defined in periodic three-dimensional systems, making it hard to calculate and understand. The static axion angle can be calculated by the Chern-Simons integral of the non-abelian Berry connection8,17,18 at the cost of being gauge dependent. Despite its gauge dependence, the Chern-Simons integral is well-defined as an angular variable taking a value between 0 and 2π, because a gauge transformation changes its value only by a multiple of 2π. This is a manifestation that the static surface Hall conductance changes by a multiple of e2/h under deformations. However, at optical frequencies, a similar approach does not seem feasible. This is because the optical surface Hall conductivity is frequency dependent, and thus a generic surface deformation changes the surface Hall response by a non-quantized amount. Owing to this difficulty, theoretical understanding of the optical axion electrodynamics has remained elusive to date19.

In this paper, we make two important steps toward the complete formulation of the optical axion electrodynamics. First, we show that a proper definition of the optical magneto-electric coupling allows us to calculate the optical axion angle in a fully gauge-independent way in a system with finite thickness. Although the optical axion electrodynamics is a part of the linear-response optical magneto-electric effects, it is distinguished from the other contributions. Non-axionic magneto-electric effects are described within the well-established theory of gyrotropic birefringence and natural optical activity, which are based on the bulk current response19,20,21. On the other hand, the optical axion electrodynamics is a surface phenomenon and thus is not captured by those theories19. We therefore define the well-define optical axion angle by analyzing the surface response. Using the well-defined axion angle, we can understand the optical axion electrodynamics of thin films and also that of bulk crystals by increasing the thickness. Second, in systems with periodic boundary conditions along all three directions, we find the optical axion angle at high optical frequencies can be estimated by the optical layer Hall conductivity that we define in the main text. These findings present advantages in numerical calculations as well as conceptual advances.

Our development of a theory of optical axion electrodynamics opens the door to understanding novel axion-induced optical phenomena. Here we provide a concrete example. Magneto-optic effects make electromagnetic waves powerful probes of the magnetic structure in materials. Conventionally, both Kerr and Faraday effects are attributed to the optical Hall effect and thus commonly perceived as probes of net magnetization22,23. Although not well known, previous theoretical studies proposed that the magneto-electric effect can also lead to the Kerr effect, providing a novel way to probe fully compensated antiferromagnets whose symmetries strictly prohibit the Hall effect and net magnetization20,24,25,26,27,28. Still, however, there are two aspects that need further investigation. First, axion electrodynamic contribution to the Kerr effect has not been well understood. Second, since previous studies have focused on three-dimensional bulk systems, magneto-optic Kerr effects in quasi two-dimensional antiferromagnets remain elusive. We present theoretical analysis revealing the precise conditions for realizing the optical axion electrodynamic Kerr effect as well as quantitative numerical analysis allowed by our gauge-invariant formulas.

Our results apply broadly to both topological and non-topological media because optical magneto-electric effects, as non-quantized phenomena, are not sensitive to the topological nature of the ground state. Therefore, our work provides a theoretical basis for the detection and manipulation of antiferromagnetism in a large class of materials, thus having potential for wide applications to antiferromagnetic spintronics and the study of magnetic structure in quantum materials

Meanwhile, topological antiferromagnets need special attention as they are ideal platforms for optical axion electrodynamics. To understand this, we note two aspects. First, it is desired to have antiferromagnets having bulk symmetry that reverses an odd number of spacetime coordinates and is broken on the surfaces (e.g., spatial inversion symmetry), because such a symmetry suppresses bulk magneto-electric effects but not the axion magneto-electric effect. This condition is similar to the requirement of the quantization of the static axion angle but additionally requires that the quantizing-symmetry is broken at the surface to allow for a nonzero value. Second, spatially spreading of electronic quantum states is needed, in order to show a response distinguished from that of decoupled layered or Mott antiferromagnets. This condition is again satisfied in topological antiferromagnets. We thus apply our theory to a model of MnBi2Te4, which is the only experimentally realized axion topological antiferromagnet to date, and show that the Kerr effect in this material is significant.

Results

Optical magneto-electric coupling

Motivated by the equivalence between the surface Hall conductivity and the axion angle in the static limit, we study the surface current response to define the optical axion angle. We consider currents generated by electromagnetic multipole moments. Electromagnetic responses from multipole moments are smaller for higher order moments21: electric dipole Pi electric quadrupole Qij and magnetic dipole Mi higher orders. Here, we consider only up to the electric quadrupole-magnetic dipole order, giving the leading-order magneto-electric effect. The bulk current density is related to the multipole moments by: \({J}_{i}={\dot{P}}_{i}-\frac{1}{2}{\partial }_{j}{\dot{Q}}_{ij}+{\epsilon }_{ijk}{\partial }_{j}{M}_{k}\). While electric quadrupole and magnetic dipole moments do not generate macroscopic currents in macroscopically homogeneous lattice systems, they generate currents on the system boundary where the material property changes spatially. The induced multipole moments have the following form in the frequency domain21:

$${P}_{i}= \mathop{\sum}\limits_{j}({\chi }_{ij}-i{\chi }_{ij}^{{\prime} }){E}_{j}+\frac{1}{2}\mathop{\sum}\limits_{jk}({a}_{ijk}-i{a}_{ijk}^{{\prime} }){\nabla }_{k}{E}_{j}+\mathop{\sum}\limits_{j}({G}_{ij}-i{G}_{ij}^{{\prime} }){B}_{j},\\ {Q}_{ij}= \mathop{\sum}\limits_{k}({a}_{kij}+i{a}_{kij}^{{\prime} }){E}_{k},\\ {M}_{i}= \mathop{\sum}\limits_{j}({G}_{ji}+i{G}_{ji}^{{\prime} }){E}_{j},$$
(1)

where χij = χji and \({\chi }_{ij}^{{\prime} }=-{\chi }_{ji}^{{\prime} }\) are the electric susceptibility tensors, Gij and \({G}_{ij}^{{\prime} }\) are magneto-electric coupling, and \({a}_{ijk}^{{\prime} }={a}_{ikj}^{{\prime} }\) and aijk = aikj are electric quadrupolar susceptibility tensors. Here, we are interested in the magnetic magneto-electric effects described by Gij and \({a}_{ijk}^{{\prime} }\), which occur only when time reversal symmetry is broken while being compatible with spacetime inversion PT symmetry. Therefore, we assume PT symmetry to neglect the complications arising from the bulk Hall conductivity and natural optical activity, both of which are excluded in this symmetry setting, i.e., \({\chi }_{ij}^{{\prime} }=0\) and \({G}_{ij}^{{\prime} }={a}_{ijk}=0\) (Table 1). This assumption does not affect our key results (see Methods for discussions without PT symmetry). As we show below, magneto-electric effects occur in combination with electric quadrupole responses at optical frequencies, requiring the consideration of the combination of Gij and \({a}_{ijk}^{{\prime} }\).

Table 1 Symmetry properties of electromagnetic linear response functions

Let us consider the surface at z = 0 of a three-dimensional homogeneous material with the outward normal direction \(\hat{z}\) (Fig. 1). The surface current density \({{{{{{{{\bf{j}}}}}}}}}^{s}=\int\nolimits_{-{d}_{s}/2}^{{d}_{s}/2}dz{{{{{{{\bf{J}}}}}}}}\), where ds is the characteristic thickness of the interface where response functions change rapidly as functions of z, is related to the multipole moments through

$${j}_{i}^{s}(\omega )= \mathop{\sum}\limits_{j,k}\left[\delta {G}_{jk}(\omega )+\mathop{\sum}\limits_{l}\frac{\omega }{2}{\epsilon }_{kzl}\delta {a}_{jlz}^{{\prime} }(\omega )\right]{\epsilon }_{kzi}{E}_{j}(\omega ) \\ + \mathop{\sum}\limits_{j}\left[{\bar{\sigma }}_{ij}(\omega )-i{\bar{\sigma }}_{izj}(\omega )\right]\delta {E}_{j}(\omega ),$$
(2)

where δf = f( − ds/2) − f(ds/2) and \(\bar{f}=[f(-{d}_{s}/2)+f({d}_{s}/2)]/2\) are the difference and average of the material property f(z) across the interface, and σij and σijk = i(ϵjklTil + ϵiklTjl) + ωSijk are the bulk conductivity tensors defined by Ji = σijEj + ∑j,kσijkqjEk + O(q2) for light wave vector q, where \({T}_{ij}={G}_{ij}-\frac{1}{3}{\delta }_{ij}\mathop{\sum }\nolimits_{k=1}^{3}{G}_{kk}-\frac{i}{6}\omega \mathop{\sum }\nolimits_{k,l=1}^{3}{\epsilon }_{jkl}{a}_{kli}^{{\prime} }\), and \({S}_{ijk}=({a}_{ijk}^{{\prime} }+{a}_{jki}^{{\prime} }+{a}_{kij}^{{\prime} })/3\)19,21.

Fig. 1: Optical magneto-electric effect.
figure 1

Magneto-electric (ME) coupling leads to different electrodynamics within the bulk and on the surface. Inside the magneto-electric medium, two linearly polarized light propagate with different speeds because the wave equation is modified by origin-independent magneto-electric coupling Tij and electric quadrupole susceptibility Sijk (this effect is called gyrotropic birefringence20). On the surface, axion magneto-electric coupling θ(z) comes into play additionally, contributing to the surface Hall conductivity. This is most readily seen by writing the action for optical axion electrodynamics allowing for a spatially varying θ parameter, SOA = (e2/2πh)∫θ(x)EB. Integrating by parts in the presence of an interface along the z direction where the axion angle jumps at the interface δθ and identifying the coefficient of Ai with the surface current density we find \({{{{{{{\bf{J}}}}}}}}=-({e}^{2}/2\pi h)\delta \theta \hat{z}\times {{{{{{{\bf{E}}}}}}}}\). Thus an electric field parallel to the surface sets up a current in the orthogonal direction along the surface, indicative of a surface Hall effect. SOA does not modify the propagation of electromagnetic fields within the bulk medium where θ does not vary spatially.

We neglect the δE term because it is a bulk response whose contribution to the interface vanishes as ds → 0. The form of the surface current density in Eq. (2) suggests defining surface-sensitive magneto-electric coupling by

$${G}_{ix}^{(z)}(\omega )=\, {G}_{ix}(\omega )-\frac{1}{2}\omega {a}_{iyz}^{{\prime} }(\omega ),\\ {G}_{iy}^{(z)}(\omega )=\, {G}_{iy}(\omega )+\frac{1}{2}\omega {a}_{ixz}^{{\prime} }(\omega ),$$
(3)

where the superscript (z) explicitly shows that these quantities depend on the choice of the surface normal direction [(z) and (−z) are the same, though]. The zz component of this surface-sensitive magneto-electric coupling is defined from the response of the surface charge. Using \(\rho=-\nabla \cdot {{{{{{{\bf{P}}}}}}}}+\frac{1}{2}{\partial }_{i}{\partial }_{j}{Q}_{ij}+\ldots\), we obtain \({\rho }^{s}={G}_{zi}{B}_{i}+\frac{\omega }{2}({a}_{yzx}^{{\prime} }-{a}_{xzy}^{{\prime} }){B}_{z}+\frac{\omega }{2}{a}_{zzy}^{{\prime} }{B}_{x}-\frac{\omega }{2}{a}_{zzx}^{{\prime} }{B}_{y}+\ldots\) where the ellipsis includes the electric dipole term χziEi, the symmetric (∂jEk + ∂kEj) terms, and higher-order multipole terms. From this, we define

$${G}_{zz}^{(z)}(\omega )={G}_{zz}(\omega )+\frac{1}{2}\omega \left[{a}_{yxz}^{{\prime} }(\omega )-{a}_{xyz}^{{\prime} }(\omega )\right]$$
(4)

as well as \({G}_{zx}^{(z)}={G}_{zx}+\omega {a}_{zzy}^{{\prime} }/2\) and \({G}_{zy}^{(z)}={G}_{zy}-\omega {a}_{zzx}^{{\prime} }/2\).

While various components of the magneto-electric coupling are related to the surface optical conductivity, there is a unique component that manifests itself only at the surface: the axion angle. We define the optical axion angle by the trace part of \({G}_{ij}^{(z)}\).

$${\theta }^{(z)}(\omega )\equiv \pi \frac{2h}{{e}^{2}}\frac{1}{3}\mathop{\sum }\limits_{i=1}^{3}{G}_{ii}^{(z)}(\omega ).$$
(5)

To see that this only manifests itself at the surface rather than the bulk, we note that the bulk current response by the magneto-electric and electric quadrupole susceptibilities is determined only through Tij and Sijk. Because of this, previous studies focusing on bulk magneto-electric effect did not capture optical axion electrodynamics19,25. See Supplementary Note 1 for more details on the bulk response. The surface-sensitive magneto-electric coupling is fully determined by these bulk-response quantities and the axion angle:

$${G}_{xx}^{(z)}(\omega )= \, \frac{{e}^{2}}{2\pi h}{\theta }^{(z)}(\omega )+{T}_{xx}(\omega )-\frac{\omega }{2}{S}_{xyz}(\omega ),\\ {G}_{yy}^{(z)}(\omega )= \, \frac{{e}^{2}}{2\pi h}{\theta }^{(z)}(\omega )+{T}_{yy}(\omega )+\frac{\omega }{2}{S}_{xyz}(\omega ),\\ {G}_{zz}^{(z)}(\omega )= \, \frac{{e}^{2}}{2\pi h}{\theta }^{(z)}(\omega )+{T}_{zz}(\omega ),\\ {G}_{ij}^{(z)}(\omega )= \, {T}_{ij}(\omega )-\frac{\omega }{2}{S}_{iik}(\omega ){\epsilon }_{jik},\,{{\mbox{for}}}\,i \, \ne \, j\,{{\mbox{and}}}\,\,j \, \ne \, 3.$$
(6)

Since Tij is traceless, it does not contribute to the trace of \({G}_{ij}^{(z)}\). Note that \({G}_{ij}^{(z)}(\omega )\) transforms as a tensor under magnetic layer group actions but not under the full three-dimensional magnetic space group actions.

Magneto-electric coupling with open boundaries in one direction

Defining the combination \({G}_{ij}^{(z)}\) of Gij and \({a}_{ijk}^{{\prime} }\) has an advantage in practical calculations as well as in the formulation. As \({G}_{ij}^{(z)}\) characterizes the surface current response which is measurable, it admits a gauge-invariant form in three-dimensional lattice systems with open boundaries along one direction, or in quasi-two-dimensional systems. This nice property is absent in the diagonal magneto-electric coupling Gii, making it hard to calculate.

The bare magneto-electric coupling Gij = ∂Pi/∂Bj and \({a}_{ijk}^{{\prime} }=\partial {Q}_{jk}/\partial {E}_{i}\) at zero temperature have the following form according to linear response theory (see Methods).

$${G}_{ij}(\omega )= \, \frac{V}{\hslash }\mathop{\sum}\limits_{n,m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }{{{{{{{\rm{Re}}}}}}}}\langle n|{\hat{P}}_{i}|m\rangle \langle m|{\hat{M}}_{j}|n\rangle,\\ {a}_{ijk}^{{\prime} }(\omega )= \, -\frac{V}{\hslash }\mathop{\sum}\limits_{n,m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }\frac{{\omega }_{mn}}{\omega }{{{{{{{\rm{Im}}}}}}}}\langle n|{\hat{P}}_{i}|m\rangle \langle m|{\hat{Q}}_{jk}|n\rangle,$$
(7)

where V is the volume of the system, n, m are indices for energy eigenstates with eigenvalue ωn, fnm = fn − fm is the difference between the Fermi-Dirac distribution of the n and m states. \({\hat{P}}_{i}=-e{\hat{r}}^{i}/V\), \({\hat{M}}_{j}=-(e{\epsilon }_{jkl}{\hat{r}}^{k}{\hat{v}}^{l}/2+{\hat{m}}_{j}^{s})/V\), and \({\hat{Q}}_{jk}=-e{\hat{r}}^{j}{\hat{r}}^{k}/V\) are electric dipole, magnetic dipole, and electric quadrupole density operators, respectively, where \(\hat{{{{{{{{\bf{r}}}}}}}}}\) and \(\hat{{{{{{{{\bf{v}}}}}}}}}\) are the position and velocity operators of electrons, and \({\hat{m}}^{s}\) is the spin magnetic moment operator. Equation (7) can be calculated for molecular systems21,29, meaning finite systems with open boundary conditions along all directions, by using the real-space representation of \(\hat{{{{{{{{\bf{r}}}}}}}}}\) and the relation \(\hat{{{{{{{{\bf{v}}}}}}}}}=-i{\hslash }^{-1}[\hat{{{{{{{{\bf{r}}}}}}}}},\,\hat{H}]\), where \(\hat{H}\) is the Hamiltonian of the system.

In periodic systems, however, Gij and \({a}_{ijk}^{{\prime} }\) are not well defined separately because the position operator is not well-defined because the position is not uniquely defined. This manifests through the momentum-space representation of the position operator \(\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|\hat{r}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle=-{\delta }_{mn}i{\partial }_{{{{{{{{\bf{k}}}}}}}}}{\delta }_{{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }{{{{{{{\bf{k}}}}}}}}}+{\delta }_{{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }{{{{{{{\bf{k}}}}}}}}}\langle {u}_{m{{{{{{{\bf{k}}}}}}}}}|i{\partial }_{{{{{{{{\bf{k}}}}}}}}}|{u}_{n{{{{{{{\bf{k}}}}}}}}}\rangle\), whose diagonal matrix element is not well defined because of \({\partial }_{{{{{{{{\bf{k}}}}}}}}}{\delta }_{{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }{{{{{{{\bf{k}}}}}}}}}\). On the other hand, the diagonal matrix elements of the position operator do not appear in the response functions that have a well defined physical meaning in periodic systems. Tij and Sijk are such examples that characterizes the bulk current response19. Since G(z) characterizes the surface response, one can expect that it is well defined in quasi-two-dimensional periodic systems.

After doing some algebra that we relegate to Methods, we can write \({G}_{ii}^{(z)}\)s in two-dimensional momentum space as

$${G}_{xx}^{(z)}(\omega )= \, \frac{{e}^{2}}{\hslash V}\mathop{\sum}\limits_{n\ne m,{k}_{x},{k}_{y}}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }{{{{{{{\rm{Re}}}}}}}}\left[{r}_{nm}^{x}\langle {\psi }_{m({k}_{x},{k}_{y})}|-\frac{1}{2}\left({\hat{v}}^{y}{\hat{r}}^{z}+{\hat{r}}^{z}{\hat{v}}^{y}\right)+{\hat{m}}_{x}^{{{{{{{{\rm{s}}}}}}}}}|{\psi }_{n({k}_{x},{k}_{y})}\rangle \right],\\ {G}_{yy}^{(z)}(\omega )= \, \frac{{e}^{2}}{\hslash V}\mathop{\sum}\limits_{n\ne m,{k}_{x},{k}_{y}}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }{{{{{{{\rm{Re}}}}}}}}\left[{r}_{nm}^{y}\langle {\psi }_{m({k}_{x},{k}_{y})}|\frac{1}{2}\left({\hat{v}}^{x}{\hat{r}}^{z}+{\hat{r}}^{z}{\hat{v}}^{x}\right)+{\hat{m}}_{y}^{{{{{{{{\rm{s}}}}}}}}}|{\psi }_{n({k}_{x},{k}_{y})}\rangle \right],\\ {G}_{zz}^{(z)}(\omega )= \, \frac{{e}^{2}}{\hslash V}\mathop{\sum}\limits_{n\ne m,{k}_{x},{k}_{y}}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }{{{{{{{\rm{Re}}}}}}}}\left[\frac{1}{2}\mathop{\sum}\limits_{p;{E}_{p}\ne {E}_{m}}\left({r}_{nm}^{z}{r}_{mp}^{x}{v}_{pn}^{y}-{r}_{np}^{z}{r}_{pm}^{x}{v}_{mn}^{y}-(x\leftrightarrow y)\right)+{r}_{nm}^{z}{({m}_{z}^{{{{{{{{\rm{s}}}}}}}}})}_{mn}\right],$$
(8)

where \({r}_{mn}^{i}=\langle {\psi }_{m}|{\hat{r}}^{i}|{\psi }_{n}\rangle\) and \({v}_{mn}^{i}=\langle {\psi }_{m}|{\hat{v}}^{i}|{\psi }_{n}\rangle\) are matrix elements of the position and velocity operators, \(|{\psi }_{n({k}_{x},{k}_{y})}\rangle\) is the Bloch state, the subscripts n,m and p are band indices. While the position operator matrix element is not well defined in momentum space30, the diagonal matrix elements of \({\hat{r}}^{x}\) and \({\hat{r}}^{y}\) do not appear in Eq. (8), so that the surface-sensitive magneto-electric coupling is well defined in two-dimensional momentum space. Combining equations in Eqs. (5) and (8), we can obtain the optical axion angle. In simple tight-binding models where \({\hat{r}}^{z}\) commutes with \({\hat{v}}^{y}\), \({G}_{xx}^{(z)}(0)\) reduces to the expression of Gxx(0) derived by Liu and Wang31. To our knowledge, our formulas represent the first expressions for calculating the diagonal components of the optical magneto-electric coupling in crystalline (periodic) systems.

The form of \({G}_{xx}^{(z)}\) and \({G}_{yy}^{(z)}\) suggests that their orbital magneto-electric part may be interpreted as a real-space dipole of the Berry curvature. This idea works exactly for two-band systems in the limit of decoupled layer systems, where the matrix element part can be written as the product of the Berry curvature \({F}_{xy}=-2{{{{{{{\rm{Im}}}}}}}}[{r}_{12}^{x}{r}_{21}^{y}]\) and the z component of the position eigenvalues \({r}_{11}^{z}\) or \({r}_{22}^{z}\). We apply this idea below to the case where the z direction is also periodic.

Intra-cell magneto-electric coupling and optical layer Hall conductivity

In fully periodic lattice systems where the z direction is also periodic, it is hard to calculate the full orbital part of \({G}_{ii}^{(z)}(\omega )\) because \({\hat{r}}^{z}\) is not well defined in momentum space. Nevertheless, we can still define and calculate the magneto-electric coupling of the unit cell, which we call the intra-cell magneto-electric coupling. Note, this treatment will be necessarily approximate, in contrast to our previous discussion of the slab geometry, but provides an physical understanding for the results. For example, the use of the intra-cell magneto-electric coupling makes the relation concrete between the axion angle and the antiferromagnetism in inversion symmetric systems. Ultimately we must compare the results between this approach and the previous slab calculation as we do in another section below.

Let us begin by giving a physical intuition that a nonzero axion angle is natural in a fully compensated antiferromagnet32. To see this, recall the analogy between electric polarization in 1D and the axion theta angle θ in 3D. The former can be defined in a system free of net charge, by stacking alternate positive and negative charges. Similarly, if we stack alternate planes with Hall conductance \(\pm {g}_{xy}^{H}\) such that the net Hall conductance vanishes, the axion magneto-electric coupling becomes well defined and is the analog of electric polarization of the alternating planes. In fact, with an applied magnetic field perpendicular to the planes, the induced polarization from having alternating charges in the \(\pm {g}_{xy}^{H}\) layers, leads to a finite electric polarization which is readily calculated as \({g}_{xy}^{H}\delta {a}_{z}/{a}_{z}\), where δaz is spacing between alternating antiferromagnetic planes versus the vertical size of the unit cell az. The charge in each flux quantum area is gH(h/e). For an antiferromagnet where spacing between planes = 1/2az, this is just \(\theta=2\pi {g}_{xy}^{H}/2\).

To present a more detailed analysis, let us decompose the position operator into the intra-cell polarization and unit cell position parts by \({r}_{mn}^{z}={{\mathbb{A}}}_{mn}^{z}+{R}_{mn}^{z}\). Namely,

$$\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{\hat{r}}^{z}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle= \, {\delta }_{{{{{{{{\bf{k}}}}}}}},{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}\mathop{\sum}\limits_{\beta,\alpha }\langle {\psi }_{m{{{{{{{\bf{k}}}}}}}}}|{\psi }_{\beta {{{{{{{\bf{k}}}}}}}}}\rangle {{\mathbb{A}}}_{\beta \alpha }^{z}({{{{{{{\bf{k}}}}}}}})\langle {\psi }_{\alpha {{{{{{{\bf{k}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\ +\mathop{\sum}\limits_{\alpha,{{{{{{{\bf{R}}}}}}}}}\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle {R}^{z}\langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle$$
(9)

where \(|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle\) is the Wannier state with the collective index α for spin and orbital (cf. n and m are band indices), \(|{\psi }_{\alpha {{{{{{{\bf{k}}}}}}}}}\rangle={N}^{-1/2}\mathop{\sum}\nolimits_{{{{{{{{\bf{R}}}}}}}}}{e}^{i{{{{{{{\bf{k}}}}}}}}\cdot {{{{{{{\bf{R}}}}}}}}}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle\), and \({{\mathbb{A}}}_{\beta \alpha }^{z}({{{{{{{\bf{k}}}}}}}})=\mathop{\sum}\nolimits_{{{{{{{{\bf{R}}}}}}}}}\langle {w}_{\beta {{{{{{{\bf{0}}}}}}}}}|{\hat{r}}^{z}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle {e}^{i{{{{{{{\bf{k}}}}}}}}\cdot {{{{{{{\bf{R}}}}}}}}}\). The second term in Eq. (9) vanishes for n ≠ m as one can see by writing it as \(-i{\delta }_{mn}{\partial }_{{k}^{z}}{\delta }_{{{{{{{{\bf{k}}}}}}}},{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}\). This decomposition is independent of choosing the basis α within the unit cell, where each unit cell is labeled by a given lattice vector R. However, the decomposition depends on the choice of the unit cell, which we discuss below.

The intra-cell polarization term, the first term in Eq. (9), defines the magneto-electric coupling of the unit cell. This intra-cell magneto-electric coupling depends on the choice of the unit cell. In our case, choosing a unit cell corresponds to fixing the value of the Wannier centers \(\langle {w}_{\alpha {{{{{{{\bf{0}}}}}}}}}|{\hat{r}}^{z}|{w}_{\alpha {{{{{{{\bf{0}}}}}}}}}\rangle\), which is ambiguous by respective lattice translations of the Wannier states \(\left|{w}_{\alpha {{{{{{{\bf{0}}}}}}}}}\right\rangle\). A physical interpretation of this multi-valuedness is similar to that of electric polarization32: The magneto-electric coupling depends on how we open the boundary, and there exists a preferred choice of the unit cell for each boundary condition (Fig. 2(a)).

Fig. 2: Intra-cell and unit-cell-position contributions to optical magneto-electric coupling.
figure 2

a Preferred choices of the unit cell with open boundaries. Arrows represent the average moment of a layer, where we bipartition the degrees of freedom within the unit cell into the upper and lower parts, each of which defines what we call a layer. The left-hand-side and middle even-layer systems are related by spatial inversion symmetry and also by the translation of the blue-arrow layers by a lattice constant if the deformation of the electronic states at the surface is neglected. The right-hand-side odd-layer system is inversion symmetric. The inversion centers are shown as yellow stars. b Hall conductivity of the unit cell in the presence of surface states. The vertical displacement Rz of a unit cell times its net Hall conductivity contributes to diagonal magneto-electric coupling \({G}_{xx}^{(z)}\), \({G}_{yy}^{(z)}\), and \({G}_{zz}^{(z)}\).

The unit-cell position term has the form \({\sum }_{{R}_{z}}{R}^{z}{\sigma }_{xy}^{H}({R}_{z})\), where \({\sigma }_{xy}^{H}\equiv ({\sigma }_{xy}-{\sigma }_{yx})/2\) is the Hall conductivity, and \({\sigma }_{xy}^{H}({R}^{z})=-{e}^{2}{\hslash }^{-1}{V}^{-1}{\sum }_{n\ne m,{k}_{x},{k}_{y}}{f}_{nm}{\omega }_{mn}/{({\omega }_{mn}-\omega )}^{-1}{{{{{{{\rm{Im}}}}}}}}[{r}_{nm}^{x}{r}_{mn}^{y}{P}_{nn}^{{R}^{z}}]\), and \({\hat{P}}^{{R}^{z}}={\sum }_{\alpha,{R}_{x},\,{R}_{y}}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle \langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|\) is the projection to Rz. This term is also multi-valued in periodic systems because the unit cell position R is not uniquely defined in periodic systems. This part, however, does not contribute to the magneto-electric coupling when the Hall conductivity of the unit layer vanishes, which is the case in PT-symmetric systems. When the boundary is introduced, the Hall conductivity of the surface unit layer can be nonzero even when the bulk Hall conductivity vanishes (Fig. 2(b)). This change of the Rz term is the main source of the difference in the magneto-electric couplings between finite-size systems and periodic lattice systems. This effect is significant especially in the static limit of axion insulators, where the emergent Dirac surface states modify the low-frequency surface Hall conductivity in the order of e2/2h.

As we derive below, in inversion-symmetric even-layer antiferromagnets, the intra-cell contribution to the magneto-electric coupling is simplified to the optical layer Hall conductivity. Namely, the optical axion angle is

$$\frac{{e}^{2}}{2\pi h}{\theta }^{(z)}(\omega )\,\approx \,\frac{1}{2}{a}_{z}{\sigma }_{xy}^{{{{{{{{\rm{LH}}}}}}}}}(\omega )+\mathop{\sum}\limits_{{R}_{z}}{R}^{z}{\sigma }_{xy}^{H}({R}_{z})\;\; {{\mbox{for inversion-symmetric AFM}}},$$
(10)

where az is the vertical lattice constant, and σxy is a three-dimensional conductivity taking the unit of conductance (unit of e2/h) divided by the length. As for the definition of layers, we note that there are two inversion-invariant values of the vertical displacement z within az. We refer to the two quasi-two-dimensional bipartite regions centered at those two invariant z values as layers (Fig. 2(a)). The layer Hall conductivity is then defined as \({\sigma }_{xy}^{{{{{{{{\rm{LH}}}}}}}}}=({\sigma }_{xy}^{H,u}-{\sigma }_{xy}^{H,d})/2\) from the bulk optical Hall conductivity projected to the single layer l = u, d:

$${\sigma }_{xy}^{H,l}(\omega )=-\frac{{e}^{2}}{\hslash V}\mathop{\sum}\limits_{n\ne m,{k}_{x},{k}_{y}}\frac{{f}_{nm}{\omega }_{mn}}{{\omega }_{mn}-\omega }{{{{{{{\rm{Im}}}}}}}}\left[{r}_{nm}^{x}{r}_{m{n}^{{\prime} }}^{y}{P}_{{n}^{{\prime} }n}^{l}\right],$$
(11)

where \({P}_{{n}^{{\prime} }n}^{l}({{{{{{{\bf{k}}}}}}}})={\sum }_{\alpha,{{{{{{{\bf{R}}}}}}}};{r}_{\alpha,{{{{{{{\bf{R}}}}}}}}}^{z}\in l\,{{\mbox{layers}}}}\langle {\psi }_{n{{{{{{{\bf{k}}}}}}}}}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle \langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|{\psi }_{{n}^{{\prime} }{{{{{{{\bf{k}}}}}}}}}\rangle\), \(|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle\)s are inversion-symmetric Wannier states, and \({r}_{\alpha,{{{{{{{\bf{R}}}}}}}}}^{z}=\langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|{\hat{r}}^{z}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle\) is the Wannier center.

To obtain Eq. (10), note that the unit cell for even-layer systems is symmetric under the combination of inversion and a lattice translation of either the u layer by −az (or the l layer by + az) (Fig. 2(a)). If we focus on the intra-cell part (the \({\mathbb{A}}\) part), the combined symmetry gives a constraint \({\theta }_{{\mathbb{A}}}^{(z)}(\omega )=-{\theta }_{{\mathbb{A}}}^{(z)}(\omega )-{a}_{z}{\sigma }_{xy}^{H,d}(\omega )\), where the last term is due to the transformation property of the axion angle \(\delta {\theta }^{(z)}(\omega )={d}_{z}{\sigma }_{xy}^{H}(\omega )\) under the translation by a vector d. As we consider antiferromagnets with zero net Hall conductivity, such that \({\sigma }_{xy}^{H,d}=-{\sigma }_{xy}^{H,u}=-{\sigma }_{xy}^{{{{{{{{\rm{LH}}}}}}}}}\), we arrive at Eq. (10). Another useful way of understanding this result is to think of the electric polarization generated by applying a uniform magnetic field, and then the displacement of layers of one sign of the Hall effect obviously contributes to change in electric polarization, as we explained at the beginning of this section.

Magneto-optic effects in fully compensated antiferromagnets

The optical axion angle manifests directly through the magneto-optic Kerr effect. To gain some intuition for this, recall that the change in axion angle at the surface leads to a surface Hall effect \({\sigma }_{xy}^{s}=(\delta \theta /2\pi ){e}^{2}/h\). Clearly, such a surface Hall conductance will lead to a Kerr effect. We present a more systematic analysis below.

Let us consider the reflection at the single interface between two media 1 and 2 (Fig. 3). For simplicity, we assume that both media have MxT, C3z, and PT symmetries, which are shared by magneto-electric materials Cr2O3 and MnBi2Te4. We also assume normal incidence (i.e., light is incident along \(-\hat{z}\)). Then there is no birefringence because of the symmetry we require, so the propagation of light within each medium is then characterized by a single complex-valued refractive index nμ, where μ = 1, 2 labels the two media (See Eq. (6)).

Fig. 3: Reflections by PT-symmetric magneto-electric media.
figure 3

We assume PT symmetry to exclude natural optical activity, for simplicity. a Single interface. The light propagation within each medium is determined by the refractive index n and origin-independent magneto-electric coupling Tij and electric quadrupole susceptibility Sijk tensors. We consider the case where no birefringence occurs such that the light propagation is described by a unique refractive index in each medium. b Double interfaces. Infinite reflections occur between the top and bottom surfaces of medium 2. Light obtains the complex phase ϕ = kd = n2ωd/c while propagating the distance d in medium 2.

The electric field in medium 1 consists of incident and reflected fields Ei and Er while that in medium 2 is the transmitted field Et:

$${{{{{{{{\bf{E}}}}}}}}}_{1}= \, {{{{{{{{\bf{E}}}}}}}}}^{i}+{{{{{{{{\bf{E}}}}}}}}}^{r}\equiv (1+r){{{{{{{{\bf{E}}}}}}}}}^{i},\\ {{{{{{{{\bf{E}}}}}}}}}_{2}= \, {{{{{{{{\bf{E}}}}}}}}}^{t}\equiv t{{{{{{{{\bf{E}}}}}}}}}^{i},$$
(12)

where r and t = 1 + r are 2 × 2 Jones matrices for reflection and transmission, respectively. E1 = E2 at the interface because electric fields are continuous by Faraday’s law, because we consider normal incidence such that electric fields are parallel to the interface. The contribution from the surface conductivity is encoded in the boundary condition of the magnetic field at the interface.

$${{{{{{{{\bf{B}}}}}}}}}^{t}={{{{{{{{\bf{B}}}}}}}}}^{i}+{{{{{{{{\bf{B}}}}}}}}}^{r}+{\mu }_{0}\hat{z}\times {{{{{{{{\bf{j}}}}}}}}}^{s}$$
(13)

Here, we consider only the surface currents induced from the bulk, i.e., \(\hat{z}\times {{{{{{{{\bf{j}}}}}}}}}^{s}={\sigma }_{xy}^{s}{{{{{{{\bf{E}}}}}}}}\), where \({\sigma }_{xy}^{s}={G}_{xx}^{(z),\mu=2}-{G}_{xx}^{(z),\mu=1}\). By solving the boundary condition equations, we obtain

$${r}_{xx}= \, \frac{({n}_{1}-{n}_{2})({n}_{1}+{n}_{2})-{({\mu }_{0}c{\sigma }_{xy}^{s})}^{2}}{{({\mu }_{0}c{\sigma }_{xy}^{s})}^{2}+{({n}_{1}+{n}_{2})}^{2}},\\ {r}_{xy}= \, -\frac{2{n}_{1}({\mu }_{0}c{\sigma }_{xy}^{s})}{{({\mu }_{0}c{\sigma }_{xy}^{s})}^{2}+{({n}_{1}+{n}_{2})}^{2}}.$$
(14)

Magneto-electric coupling appears in the reflective Jones matrix through the surface conductivity, which is a manifestation that the Kerr effects here are surface phenomena.

The complex Kerr angle is defined by \({\phi }_{K}={\varphi }_{K}+i{\eta }_{K}={\tan }^{-1}({r}_{xy}/{r}_{xx})\). Its real part measures the rotation of the light polarization plane while its imaginary part measures the circular dichroism, i.e., the intensity imbalance between the reflected left and right circularly polarized light. The Kerr angle is \(O({\mu }_{0}c{\sigma }_{xy}^{s})\) in general, but it can be much enhanced when n1 ≈ n2 because of the suppression of rxx.

When the sample thickness is much larger than the wavelength of light, it is enough to suppose that reflection occurs at a single interface. However, when thickness d is comparable to or less than wavelength λ, which is the case particularly relevant for thin films, we need to consider the response from the whole sample including top and bottom surfaces (cf. When the photon energy is smaller than the band gap, double interfaces can be relevant even for dλ for an insulating medium14 because then light incident on the top reaches the bottom without attenuation.).

To study this case, we consider three media with refractive index nμ, where μ = 1, 2, 3, as shown in Fig. 3b. We again assume MxT, C3z, and PT symmetries for each media. The Jones reflection matrix of the sample for light incident from medium 1 is then the infinite sum of multiple reflections.

$$r={r}_{T}+{e}^{2i\phi }(1+{r}_{T}^{{\prime} }){r}_{B}{\left(1-{e}^{2i\phi }{r}_{T}^{{\prime} }{r}_{B}\right)}^{-1}(1+{r}_{T}),$$
(15)

where we use tT,B = 1 + rT,B and \({t}_{T}^{{\prime} }=1+{r}_{T}^{{\prime} }\), and ϕ = n2ωd/c is the complex-valued phase obtained by the one-way propagation through the sample thickness d. Here, the subscripts T and B indicate the top and bottom of the sample, and the prime indicates the process where the light is incident to the top surface from below (we follow the notation in ref. 13). See Methods for the expression of Jones matrices rT, \({r}_{T}^{{\prime} }\), and rB. Similarly, for transmission,

$$t=(1+{r}_{B}){\left(1-{e}^{2i\phi }{r}_{T}^{{\prime} }{r}_{B}\right)}^{-1}{e}^{i\phi }(1+{r}_{T}).$$
(16)

Note that t ≠ 1 + r when ϕ ≠ 0.

Considering the case where only medium 2 is magneto-electric, we set \({G}_{xx}^{(z),1}={G}_{xx}^{(z),3}=0\) and \({G}_{xx}^{(z),2}\equiv {G}_{xx}^{(z)}\,\ne \,0\). We first consider δnϕ, where δn ≡ n3 − n1. The Kerr and Faraday rotation angles are then given by

$$\tan {\phi }_{K}= \frac{2{n}_{1}{\mu }_{0}c{G}_{xx}^{(z)}}{{n}_{2}^{2}-{n}_{1}^{2}+{({\mu }_{0}c{G}_{xx}^{(z)})}^{2}}+O\left(\delta n\right),\\ \tan {\phi }_{F}= -\frac{{\mu }_{0}c{G}_{xx}^{(z)}\sin \phi }{\left({n}_{1}^{2}+{n}_{2}^{2}+{({\mu }_{0}c{G}_{xx}^{(z)})}^{2}\right)\sin \phi+2i{n}_{1}{n}_{2}\cos \phi }\delta n+O\left(\delta {n}^{2}\right),$$
(17)

where \(\tan {\phi }_{F}={t}_{xy}/{t}_{xx}\). A nonzero Faraday rotation requires δn ≠ 0 because PT symmetry needs to be broken33. On the other hand, the Kerr rotation can be nonzero with δn = 0 because it is compatible with PT symmetry (Table 1). The Kerr rotation is independent of ϕ in the leading order of δn when δn is sufficiently small as if the response comes from the top surface only, while it actually comes from multiple reflections between the top and bottom surfaces. In this limit, the way the Kerr effect goes away is highly nontrivial as the thickness goes to zero (i.e., ϕ → 0 with ϕδn). The Kerr angle stays constant, but the amplitudes of the reflected signals ultimately vanish. Therefore, a finite Kerr effect can be observed from a thin film when optical equipment is highly sensitive.

However, a nontrivial ϕ dependence appears when ϕ is the smallest parameter (δnϕ), where we have

$$\tan {\phi }_{K}= \, \frac{4i{n}_{1}{n}_{3}{\mu }_{0}c{G}_{xx}^{(z)}}{{n}_{2}({n}_{1}^{2}-{n}_{3}^{2})}\phi+O\left({\phi }^{2}\right),\\ \tan {\phi }_{F}= \, -\frac{i({n}_{1}-{n}_{3}){\mu }_{0}c{G}_{xx}^{(z)}}{{n}_{2}({n}_{1}+{n}_{3})}\phi+O\left({\phi }^{2}\right),$$
(18)

which show that both Kerr and Faraday rotation vanish for ϕ = 0. Therefore, a nonzero ϕ is necessary for the Kerr effect as well as the Faraday effect in fully compensated antiferromagnets. Note that, while the Kerr angle in Eq. (17) is independent of ϕ, it applies only when ϕδn.

The crossover between two regimes respectively described by Eqs. (17) and (18) occurs when ϕ ~ δn. The corresponding wavelength scale

$$\lambda \sim {\lambda }^{*}\equiv 2\pi \left|\frac{{n}_{2}}{\delta n}\right|d$$
(19)

is much larger than the sample thickness d when δn 1. This is remarkable because one might naively expect that, because the response from the spatially separated top and bottom layers are not resolved when λd, the Kerr effect is vanishingly small in that regime and scales linearly with d/λ. However, our analysis shows that a much stricter λλ* is required for the suppression of the Kerr angle. To explain how this works, we note again that the Kerr angle and the amplitude of the Kerr rotated signal can behave differently. While the amplitude of the Kerr rotated signal ( rxy) is indeed suppressed for λd, the amplitude of the non-rotated signal (rxx) is also suppressed in the same limit. Their suppression at large wavelengths compensate each other to keep the ratio ϕK = rxy/rxx as long as λ is smaller than a larger length scale λ*, above which the Kerr angle ultimately gets suppressed.

In thin film axion insulators, Eq. (18) is typically more relevant at photon energies below the surface gap. The surface gap of experimentally realized axion insulators are about 50 meV16,34,35. For example, if we take n2 = 5 and d = 1 nm, we obtain a very small value ϕ ≤ 1.27 × 10−3 at ω ≤ 50 meV, which is typically smaller than δn. On the other hand, in the infrared and visible regime where the photon energy is in the order of eV, equation (17) can become relevant. We demonstrate this in the following section.

Model calculations

Let us apply our theory to study the optical axion electrodynamics in MnBi2Te4. MnBi2Te4 is the only stoichiometric compound that experimentally realizes the intrinsic antiferromagnetic axion insulator34,35, which has now become an attractive platform for studying axion magneto-electric effects10,11,12,36,37,38,39. As it is a layered antiferromagnet, its few-layer behavior and layer number dependence is also of interest31,40.

Here we calculate its magneto-optic properties based on the low-energy model in refs. 36,41. The goal of our calculations here is to cement the validity of our new theory by providing a concrete model example as well as to understand qualitative features (e.g., the dominance of the axion contribution and the significant Kerr and negligible Faraday effects) of the magneto-optic response in MnBi2Te4. Our model is expected to quantitatively capture the low-energy properties of the material. On the other hand, at photon energies much larger than the band gap, a precise quantitative calculation of the magneto-optical spectrum will require a model, like the full ab-initio model, that captures all the significant optical transitions involving those states neglected in our model. With this in mind, we consider a nearest-neighbor tight-binding Hamiltonian on a layer-stacked triangle lattice

$$\hat{H}=\mathop{\sum}\limits_{i,\alpha,\beta }{\hat{c}}_{i\alpha }^{{{{\dagger}}} }{({h}_{0})}_{\alpha \beta }{\hat{c}}_{i\beta }-\mathop{\sum}\limits_{\left\langle i,j\right\rangle,\alpha,\beta }{\hat{c}}_{i\alpha }^{{{{\dagger}}} }{t}_{\alpha \beta }^{ij}{\hat{c}}_{j\beta },$$
(20)

where i, j are the site indices, \(\left\langle i,j\right\rangle\) means that the summation is over nearest neighbors, and α, β = 1, …, 4 run over two spin and two orbital degrees of freedom at each site.

As the non-magnetic state has space group \(R\bar{3}m\) (No. 166), we impose time reversal T = isyK, inversion P = τz, and threefold \({C}_{3z}=\exp \left(-i\pi {s}_{z}/3\right)\) and twofold C2x = − isx rotational symmetries, where si and τi are Pauli matrices for spin and orbital, respectively. The onsite Hamiltonian satisfying all the symmetries of the nonmagnetic state is h0 = e0 + e5τz. Along the z direction, the nearest-neighbor hopping matrices are \({T}_{4}\equiv {t}^{j+{{{{{{{{\bf{a}}}}}}}}}_{4},j}={t}_{0}^{z}+i{t}_{3}^{z}{s}_{z}{\tau }_{x}+{t}_{5}^{z}{\tau }_{z}\), where a4 = (0, 0, az), az is the out-of-plane lattice parameter. For the in-plane directions, the hopping matrices are \({t}^{j+{{{{{{{{\bf{a}}}}}}}}}_{1},j}={t}_{0}+i{t}_{1}{s}_{x}{\tau }_{x}+i{t}_{4}{\tau }_{y}+{t}_{5}{\tau }_{z}={({t}^{j,j+{{{{{{{{\bf{a}}}}}}}}}_{1}})}^{{{{\dagger}}} }\), \({t}^{j+{{{{{{{{\bf{a}}}}}}}}}_{2},j}={C}_{3z}{T}_{1}{C}_{3z}^{-1}\), and \({t}^{j+{{{{{{{{\bf{a}}}}}}}}}_{3},j}={C}_{3z}{T}_{2}{C}_{3z}^{-1}\), where a1 = (a, 0, 0), a2 = C3za1, a3 = C3za2, a is the in-plane lattice parameter (see Methods for further details).

We consider the effect of the layer-alternating (i.e., A-type) antiferromagnetism by adding (−1)n−1mσz to h0, where n is the layer index. While this term breaks time reversal symmetry, inversion symmetry remains the symmetry of the lattice system. However, finite even-layer systems break inversion symmetry and have non-zero axion angle according to Eq. (10).

Figure 4a shows the orbital part of the axion angle calculated with the tight-binding parameters of MnBi2Te4 derived in ref. 36 (We make a momentum-dependent overall energy shift to obtain an insulating filling at half filling as we describe in Methods. The band structure and electric susceptibility are shown in Supplementary Figs. 1 and 2.). At high energies above 1 eV, the axion angle is well approximated by the optical layer Hall conductivity for any number of layers. However, the axion angle deviates significantly from the optical layer Hall conductivity as the photon energy gets lower below 1 eV. The deviation at the low energy increases with the number of layers because the surface massive Dirac fermion is then more localized at the surfaces and increases the second term in Eq. (10), making the static axion angle reach the quantized value θ = π.

Fig. 4: Optical magneto-electric coupling in the low-energy tight-binding model of MnBi2Te4.
figure 4

a and b Real and imaginary part of the orbital magneto-electric axion angle. Bulk LH indicates the optical layer Hall conductivity calculated with periodic boundary conditions. Bulk LH approaches the exact axion magneto-electric coupling as photon energy increases. Orbital magneto-electric Txx and the spin magneto-electric coupling. Spin parts contribute to axion angle and Txx through \({\theta }^{(z),s}=(2{G}_{xx}^{s}+{G}_{zz}^{s})/3\) and \({T}_{xx}^{s}=({G}_{xx}^{s}-{G}_{zz}^{s})/3\). c Spectra for two layers. These are three orders of magnitudes smaller than the orbital axion magneto-electric coupling. d Layer-number dependence at photon energy ω = 2 eV. Solid and dashed lines are curves fitted with a/Nl + b form, where a and b are fitting parameters. All optical response functions are calculated with γ = 10 meV to broaden the resonance through ω → ω + iγ.

Txx and spin magneto-electric coupling are much smaller than the axion angle, as shown in Fig. 4b,c, respectively. This is consistent with our expectation that these origin-independent magneto-electric couplings are strongly suppressed in systems with local inversion symmetry. Furthermore, they decrease inversely with the number of layers Nl31, because only O(1/Nl) portion of layers near the top and bottom generates a nontrivial response. This contrasts to the case with a finite Txx for Nl → , where the response is coming from O(Nl) layers.

As Txx is relatively small, optical axion electrodynamics dominates the Kerr and Faraday effects in this system. Figure 5 shows the Kerr and Faraday rotation angles calculated with the magneto-electric coupling. We consider the case where the model system (medium 2) is encapsulated by medium 1 and medium 3, having frequency-independent refractive indices n1 = 2.2 and n3 = 2.4, respectively, corresponding to those of the hexagonal Boron nitride and diamond at photon energy around 1 eV42,43. The calculated Kerr rotation angle φK is about 0.02 at photon energies larger than 1 eV, which is about one order of magnitude smaller than φK 1 in typical ferromagnets although our antiferromagnetic system has zero net magnetic moment. The Kerr angle in real MnBi2Te4 can even be much enhanced because of the contributions from higher-energy bands that we do not include here. As we consider n1 ≠ n3, the Faraday rotation is nonzero because the cancellation between the top and bottom surfaces is incomplete. The Faraday effect is two orders of magnitude weaker than the Kerr effect.

Fig. 5: Kerr and Faraday rotation angles in the low-energy tight-binding model of MnBi2Te4.
figure 5

a Complex Kerr rotation angle \({\phi }_{K}={\tan }^{-1}({r}_{xy}/{r}_{xx})\). b Complex Faraday rotation angle \({\phi }_{F}={\tan }^{-1}({t}_{xy}/{t}_{xx})\). The real and imaginary parts of the complex angles are shown as solid and dashed lines, respectively. We consider the effect of both top and bottom surfaces by using Eqs. (15) and (16) with the refractive indices n1 = 2.2 and n3 = 2.4 for the capsulating media. The complex refractive index of the model itself is obtained from the electric dipole susceptibility through \({n}_{2}(\omega )=\sqrt{1+{\chi }_{xx}(\omega )/{\epsilon }_{0}}\). For LH, we calculate the axion angle from the layer Hall conductivity with periodic boundary conditions and use the thickness of 50 layers for ϕ = n2ωd/c. All optical response functions are calculated with γ = 10 meV to broaden the resonance through ω → ω + iγ.

Discussion

Our theory of optical axion electrodynamics fills a crucial missing piece in the macroscopic theory of magneto-optic effects in antiferromagnets developed mostly by Graham and Raab21,25,26,27. They used only origin-independent Tij and Sijk to ensure physically meaningful results such as the consistency with the reciprocal relations. However, our approach shows that we can include the origin-dependent axion magneto-electric coupling in the theory, and it is precisely the axion angle that controls the Kerr effect in antiferromagnets with local inversion symmetry. As we show by using the low-energy tight-binding model of MnBi2Te4, the omission of the axion electrodynamics can underestimate the Kerr effect by orders of magnitudes. In general, the same suppression of Tij and Sijk is expected in systems with bulk symmetries that reverses an odd number of spacetime coordinates. As long as those symmetries are broken at the surfaces, the axion magneto-electric coupling is not much affected by the symmetries, such that axion electrodynamics dominates the response.

In fact, the trace part of the magneto-electric coupling was included in the study by Hornreich and Shtrikman20 prior to refs. 21,25,26,27, However, their estimate of the effect was four orders of magnitudes smaller than the value experimentally observed subsequently44. This inconsistency lead to the appearance of theories based on different approaches24,25,26,27, all of which do not include the axion electrodynamics. Our introduction of the surface-sensitive magneto-electric coupling and the study of double interfaces allow for precise quantitative understanding of the Kerr effect including the axion electrodynamics, especially for thin films.

While we focus on the Kerr effect due to macroscopic magneto-electric coupling for fully compensated antiferromagnets, there is another microscopic mechanism based on the spatial modulation (phase change as well as the attenuation) of the electric field \(E(z)={E}_{0}{e}^{i{k}_{z}z}\)24, where kz = nω/c is complex valued. However, the microscopic mechanism produces only minor effects. To see this, let ϕK,0 be the Kerr rotation angle by a single layer with net magnetization in A-type antiferromagnet. Considering the reflection of each layer as well as the complex phase rotation during the propagation, one obtains the Kerr angle \({\phi }_{K}={\phi }_{K,0}(1-{e}^{2i{k}_{z}{a}_{z}}+{e}^{4i{k}_{z}{a}_{z}}-{e}^{6i{k}_{z}{a}_{z}}+\ldots )/(1+{e}^{2i{k}_{z}{a}_{z}}+{e}^{4i{k}_{z}{a}_{z}}+{e}^{6i{k}_{z}{a}_{z}}+\ldots )=-i{k}_{z}{a}_{z}{\phi }_{K,0}+O({({k}_{z}{a}_{z})}^{2})\)for an even number layers24, where az is the layer spacing. This Kerr angle is smaller than the axion-induced ϕK ≈ ϕK,0 because azλ for the wavelength down to the UV regime.

In the static limit, both Faraday and Kerr effects are often considered as manifestations of the axion electrodynamics8,13,14. It is because the systems under consideration have finite net Hall conductivity. The same sign of the Hall conductivity is induced on the top and bottom surfaces of a Z2 topological insulator by either external magnetic fields8,13 or coupling to ferromagnets14. The main focus of those studies is the manifestation of the half-quantized surface Hall conductivity, rather than the magneto-electric response of antiferromagnets.

An open question we leave for future studies is to formulate a quantum geometric theory of optical axion electrodynamics in periodic systems, generalizing the Chern-Simons integral in the static limit. A drawback of calculating intra-cell optical magneto-electric coupling through Eq. (9) (or calculating the layer Hall conductivity) is that it does not capture the topological magneto-electric effect because we drop the second term in Eq. (10). A unified formula that captures both intra-cell optical magneto-electric coupling and topological magneto-electric effect is desired, and it is likely to require extending the quantum geometric formulation for electric dipole moments45 to magnetic dipole and electric quadrupole moments and defining a proper optical Chern-Simons integral. However, this may not be feasible, in which case we are forced to work with quasi-two-dimensional systems.

Finally, we note that the optical axion angle we define should be distinguished from the dymanical axion fields46. In our optical axion electrodynamics based on linear response theory, the effective action SOA ∫ dωdxθ(ω, x)E(ω, x) B(ω, x) for non-absorptive media describes the propagation of the electromagnetic fields modified by elastic scattering by the medium. The optical axion angle is a ground-state property, which is non-dynamical. On the other hand, the dynamical axion field interacts spacetime-locally with the electromagnetic fields through Sdynamic ∫ dtdxθ(t, x)E(t, x) B(t, x). This describes Raman scattering where the energy or momentum of the incoming electromagnetic field is tranfered to the dynamical axion field. The interplay between the two distinct phenomena is an interesting research direction.

Methods

Generalization to include natural optical activity

Electromagnetic multipole moments

Let us consider electric dipole Pi, electric quadrupole Qij, and magnetic dipole Mi moment densities induced by electric and magnetic fields21:

$${P}_{i}= \, {P}_{i}^{0}+{\chi }_{ij}{E}_{j}+\frac{1}{2}{a}_{ijk}{\nabla }_{k}{E}_{j}+{G}_{ij}{B}_{j}\\ + \,{\omega }^{-1}\left[{\chi }_{ij}^{{\prime} }{\dot{E}}_{j}+\frac{1}{2}{a}_{ijk}^{{\prime} }{\nabla }_{k}{\dot{E}}_{j}+{G}_{ij}^{{\prime} }{\dot{B}}_{j}\right]\ldots \\ {Q}_{ij}= \, {Q}_{ij}^{0}+{{\mathfrak{a}}}_{ijk}{E}_{k}+{\omega }^{-1}{{\mathfrak{a}}}_{ijk}^{{\prime} }{\dot{E}}_{k}\ldots \\ {M}_{i}= \, {M}_{i}^{0}+{{\mathfrak{G}}}_{ij}{E}_{j}+{\omega }^{-1}{{\mathfrak{G}}}_{ij}^{{\prime} }{\dot{E}}_{j}\ldots$$
(21)

for monochromatic electromagnetic fields in time domain, where \({P}_{i}^{0}\), \({Q}_{ij}^{0}\), and \({M}_{i}^{0}\) are permanent multipole moments, \({{\mathfrak{a}}}_{ijk}={a}_{kij}\), \({{\mathfrak{a}}}_{ijk}^{{\prime} }=-{a}_{kij}^{{\prime} }\), \({{\mathfrak{G}}}_{ij}={G}_{ji}\), \({{\mathfrak{G}}}_{ij}^{{\prime} }=-{G}_{ji}^{{\prime} }\), The ellipsis indicates electric-octupole/magnetic-quadrupole or higher-order multipole contributions that we neglect here. Here, the primed susceptibility tensors transform oppositely under time reversal compared to the non-primed ones. For example, while χij is even under time reversal, \({\chi }_{ij}^{{\prime} }\) is odd under time reversal.

Electromagnetic multipole moment densities are defined by21

$${\hat{P}}_{i}= \, -e{\hat{r}}^{i}/V,\\ {\hat{Q}}_{ij}= \, -e{\hat{r}}^{i}{\hat{r}}^{j}/V,\\ {\hat{M}}_{i}= \, \frac{1}{4}{\epsilon }_{ijk}\left({\hat{r}}^{j}{\hat{J}}_{k}^{{{{{{{{\rm{orb}}}}}}}}}-{\hat{J}}_{j}^{{{{{{{{\rm{orb}}}}}}}}}{\hat{r}}^{k}\right)+{\hat{M}}_{i}^{{{{{{{{\rm{spin}}}}}}}}}\\= \, {\hat{M}}_{i}^{{{{{{{{\rm{orb}}}}}}}}}+{\hat{M}}_{i}^{{{{{{{{\rm{spin}}}}}}}}},$$
(22)

where we split the orbital magnetic and spin parts, which respectively originates from the minimal coupling  →  + ieA and the explicit dependence on B independent of the minimal coupling.

$${\hat{J}}_{i}^{{{{{{{{\rm{orb}}}}}}}}} =\frac{1}{V}\frac{\partial \hat{H}}{\partial {A}_{i}}{\left|\right.}_{{{{{{{{\bf{B}}}}}}}}\;{{\mbox{fixed}}}},\\ {\hat{M}}_{i}^{{{{{{{{\rm{spin}}}}}}}}} =-\frac{1}{V}\frac{\partial \hat{H}}{\partial {B}_{i}}{\left|\right.}_{{{{{{{{\bf{A}}}}}}}}\;{{\mbox{fixed}}}}.$$
(23)

Surface-sensitive magneto-electric coupling

By generalizing the procedure in the main text to include both natural optical activity and gyrotropic birefringence, we define

$${\tilde{G}}_{ix}^{(z)}(\omega )= \, {\tilde{G}}_{ix}(\omega )-\frac{i}{2}\omega {\tilde{a}}_{iyz}(\omega ),\\ {\tilde{G}}_{iy}^{(z)}(\omega )= \, {\tilde{G}}_{iy}(\omega )+\frac{i}{2}\omega {\tilde{a}}_{ixz}(\omega ),\\ {\tilde{G}}_{zz}^{(z)}(\omega )= \, {\tilde{G}}_{zz}(\omega )-\frac{i}{2}\omega \left[{\tilde{{\mathfrak{a}}}}_{zxy}(\omega )-{\tilde{{\mathfrak{a}}}}_{zyx}(\omega )\right]$$
(24)

from the surface response

$${j}_{x}^{s}= \, \mathop{\sum }\limits_{i=1}^{3}\left({\tilde{{\mathfrak{G}}}}_{yi}-\frac{i}{2}\omega {\tilde{{\mathfrak{a}}}}_{xzi}\right){E}_{i}+\ldots,\\ {j}_{y}^{s}= \, -\mathop{\sum }\limits_{i=1}^{3}\left({\tilde{{\mathfrak{G}}}}_{xi}+\frac{i}{2}\omega {\tilde{{\mathfrak{a}}}}_{yzi}\right){E}_{i}+\ldots,\\ {\rho }^{s}= \,{\tilde{G}}_{zi}{B}_{i}-\frac{i}{2}\omega ({\tilde{{\mathfrak{a}}}}_{zxy}-{\tilde{{\mathfrak{a}}}}_{zyx}){B}_{z}-\frac{i}{2}\omega {\tilde{{\mathfrak{a}}}}_{yzz}{B}_{x}+\frac{i}{2}\omega {\tilde{{\mathfrak{a}}}}_{xzz}{B}_{y}+\ldots,$$
(25)

where \({\tilde{G}}_{ij}={G}_{ij}-i{G}_{ij}^{{\prime} }\), \({\tilde{{\mathfrak{G}}}}_{ij}={G}_{ji}+i{G}_{ji}^{{\prime} }\), \({\tilde{a}}_{ijk}={a}_{ijk}-i{a}_{ijk}^{{\prime} }\), and \({\tilde{{\mathfrak{a}}}}_{ijk}={a}_{kij}+i{a}_{kij}^{{\prime} }\). We can also define

$$\begin{array}{ll}{\tilde{G}}_{zi}^{(z,2)}(\omega )&={\tilde{G}}_{zi}(\omega )-\frac{i}{2}\omega \mathop{\sum}\limits_{k}{\tilde{{\mathfrak{a}}}}_{kzz}(\omega ){\epsilon }_{kzi}\,{{\mbox{for}}}\,i=x,y,\end{array}$$
(26)

which is distinguished from \({\tilde{G}}_{zi}^{(z)}(\omega )\). However, we focus on \({\tilde{G}}_{zi}^{(z)}(\omega )\) because we are interested in current generation rather than the charge fluctuation on the surface. We decompose \({\tilde{G}}_{ij}^{(z)}\) into

$${\tilde{G}}_{xx}^{(z)}(\omega )= \, \frac{{e}^{2}}{2\pi h}{\theta }^{(z)}(\omega )+{\tilde{T}}_{xx}(\omega )-\frac{\omega }{2}\left[{S}_{xyz}(\omega )+i{{{\Gamma }}}_{xyz}(\omega )\right],\\ {\tilde{G}}_{yy}^{(z)}(\omega )= \, \frac{{e}^{2}}{2\pi h}{\theta }^{(z)}(\omega )+{\tilde{T}}_{yy}(\omega )+\frac{\omega }{2}\left[{S}_{xyz}(\omega )+i{{{\Gamma }}}_{xyz}(\omega )\right],\\ {\tilde{G}}_{zz}^{(z)}(\omega )= \, \frac{{e}^{2}}{2\pi h}{\theta }^{(z)}(\omega )+{\tilde{T}}_{zz}(\omega ),\\ {\tilde{G}}_{i\ne j}^{(z)}(\omega )= \, {\tilde{T}}_{ij}(\omega )+\frac{\omega }{2}{\epsilon }_{zij}\left[{S}_{iiz}(\omega )+i{{{\Gamma }}}_{iiz}(\omega )\right],\,{{\mbox{for}}}\,i,\, j\in 1,2,\\ {\tilde{G}}_{zi}^{(z)}(\omega )= \, {\tilde{T}}_{zj}(\omega )+\frac{\omega }{2}\mathop{\sum }\limits_{k=1}^{3}{\epsilon }_{kzi}\left[{S}_{zzk}(\omega )-i{a}_{kzz}\right],\,{{\mbox{for}}}\,i\in 1,2,$$
(27)

where

$${{{\Gamma }}}_{ijk}= \, {a}_{ijk}+{a}_{jik}-{a}_{kij},\\ {\tilde{T}}_{ij}= \, {G}_{ij}-\frac{1}{3}{\delta }_{ij}\mathop{\sum }\limits_{k=1}^{3}{G}_{kk}-\frac{i}{6}\omega \mathop{\sum }\limits_{k,l=1}^{3}{\epsilon }_{jkl}{a}_{kli}^{{\prime} }-i\left({G}_{ij}^{{\prime} }-\mathop{\sum }\limits_{k,l=1}^{3}{\epsilon }_{jkl}\frac{1}{2}\omega {a}_{kli}\right),$$
(28)

and \({\theta }^{(z)}(\omega )=2\pi h{e}^{-2}\mathop{\sum }\nolimits_{i=1}^{3}{G}_{ii}^{(z)}(\omega )/3\) and \({S}_{ijk}(\omega )=\frac{1}{3}[{a}_{ijk}^{{\prime} }(\omega )+{a}_{jki}^{{\prime} }(\omega )+{a}_{kij}^{{\prime} }(\omega )]\) are the same as in the main text.

\({\tilde{G}}_{ij}^{(z)}(\omega )\) have nontrivial dependence under the change of the spatial origin by d = (dx, dy, dz) because

$$\delta {\theta }^{(z)}(\omega )= \, -{\sigma }_{xy}^{H}(\omega ){d}_{z},\\ \delta \left(-\frac{i}{2}\omega {{{\Gamma }}}_{ijz}(\omega )\right)= \, -{\sigma }_{ij}^{{{{{{{{\rm{sym}}}}}}}}}(\omega ){d}_{z},\\ \delta \left(-\frac{i}{2}\omega {a}_{kzz}(\omega )\right)= \, -{\sigma }_{kz}^{{{{{{{{\rm{sym}}}}}}}}}(\omega ){d}_{z},$$
(29)

where \({\tilde{\sigma }}_{ij}(\omega )=-i\omega {\tilde{\chi }}_{ij}(\omega )\) is the complex-valued optical conductivity tensor, and \({\sigma }_{ij}^{H}(\omega )=[{\tilde{\sigma }}_{ij}(\omega )-{\tilde{\sigma }}_{ji}(\omega )]/2=-\omega {\chi }_{ij}^{{\prime} }(\omega )\) is its anti-symmetric part.

The origin dependence shows the ambiguity of defining the surface degrees of freedom. Let us recall that we define the surface current density \({j}^{s}=\int\nolimits_{-{d}_{s}/2}^{{d}_{s}/2}dz{j}^{i}(z)\) over the region − ds/2 ≤ z ≤ ds/2. While we can define a physically meaningful value of ds based on the surface property of a system, this value is still not completely uniquely defined (e.g., if ds is the surface thickness, 1.01ds also makes sense as a surface thickness). Because of the ambiguity of ds, the amount of the bulk-conductivity contribution to js can also vary. For example, let us consider the xy component of the surface conductivity. It has a bulk-conductivity term as well as the magnetic dipole and electric quadrupole contributions. Let us suppose that the surface is defined as the interface between medium 1 (at z < 0) and medium 2 at (z > 0).

$${\sigma }_{xy}^{s}(\omega )= \, {\tilde{{\mathfrak{G}}}}_{yy}^{(z)}(-{d}_{s}/2)-{\tilde{{\mathfrak{G}}}}_{yy}^{(z)}({d}_{s}/2)+\int\nolimits_{-{d}_{s}/2}^{{d}_{s}/2}dz{\sigma }_{xy}(\omega,z)\\= \, {\tilde{{\mathfrak{G}}}}_{yy}^{(z)}(-{d}_{s}/2)-{\tilde{{\mathfrak{G}}}}_{yy}^{(z)}({d}_{s}/2)+{\sigma }_{xy,1}(\omega )\frac{{d}_{s}}{2}+{\sigma }_{xy,2}(\omega )\frac{{d}_{s}}{2}\\= \, {\tilde{{\mathfrak{G}}}}_{yy,1}^{(z)}+\delta {\tilde{{\mathfrak{G}}}}_{yy,1}^{(z)}-{\tilde{{\mathfrak{G}}}}_{yy,2}^{(z)}-\delta {\tilde{{\mathfrak{G}}}}_{yy,2}^{(z)},$$
(30)

where

$$\begin{array}{l}\delta {\tilde{{\mathfrak{G}}}}_{yy}^{(z)}=-{\sigma }_{xy}(\omega ){d}_{z}\end{array}$$
(31)

is the change of \({\tilde{{\mathfrak{G}}}}_{yy}^{(z)}\) by the shifting of the origin by d = (0, 0, dz). This shows that, while we can define \({\sigma }_{xy}^{s}(\omega )\) as the difference of \({\tilde{{\mathfrak{G}}}}_{yy}^{(z)}\) between two media for any value of ds, we have to shift the spatial origin by − ds/2 and ds/2 for medium 1 and medium 2, respectively.

Quantum mechanical expressions of the magneto-electric coupling

Linear response theory

The susceptibility tensor for the linear response of an operator \(\hat{A}\) to the external field FB

$$A(t)={\left\langle \hat{A}(t)\right\rangle }_{{F}_{B}=0}+\int\,{dt}^{{\prime} }{\tilde{\chi }}_{AB}(t-{t}^{{\prime} }){F}_{B}({t}^{{\prime} })$$
(32)

is given by

$${\tilde{\chi }}_{AB}(t-{t}^{{\prime} })=-\frac{i}{\hslash }{{\Theta }}(t-{t}^{{\prime} })\langle [\hat{A}(t),\hat{B}({t}^{{\prime} })]\rangle,$$
(33)

where \(\hat{B}=\partial \hat{H}(F)/\partial {F}_{B}{|}_{{F}_{B}=0}\) is conjugate to the external field. Here, we take the Heisenberg picture \(\hat{O}(t)={e}^{i\hat{H}t}\hat{O}{e}^{-i\hat{H}t}\).

In the frequency domain, the susceptibility tensor can be written as

$${\tilde{\chi }}_{AB}(\omega ) =\, \int\nolimits_{-\infty }^{\infty }d{t}^{{\prime} }{e}^{i\omega (t-{t}^{{\prime} })}{\tilde{\chi }}_{AB}(t-{t}^{{\prime} })\\ =\, -\frac{1}{\hslash }\mathop{\sum}\limits_{n,m}{f}_{nm}\frac{{\omega }_{mn}}{\omega }\frac{\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle }{{\omega }_{mn}-\omega }+({{{{{{{\rm{FS\ terms}}}}}}}}),$$
(34)

where \(|n\rangle\) is the energy eigenstate of the unperturbed Hamiltonian with energy En = ωn, ωmn = ωm − ωn, and fnm = fn − fm is the difference between the Fermi-Dirac distribution function f of the \(|n\rangle\) and \(|m\rangle\) states, and the FS terms originate from the Fermi surface, i.e., they have momentum-space derivatives acting on the Fermi-Dirac distribution function (∂kfn) in the momentum space representation. The derivation goes as follows. At zero temperature, we have

$${\tilde{\chi }}_{AB}(\omega+i{{\Gamma }})= \, \int\nolimits_{-\infty }^{\infty }d{t}^{{\prime} }{e}^{i(\omega+i{{\Gamma }})(t-{t}^{{\prime} })}{\tilde{\chi }}_{AB}(t-{t}^{{\prime} })\\= \, -\frac{i}{\hslash }\int\nolimits_{-\infty }^{\infty }d{t}^{{\prime} }{e}^{i\omega (t-{t}^{{\prime} })}{{\Theta }}(t-{t}^{{\prime} })\langle [\hat{A}(t),\hat{B}({t}^{{\prime} })]\rangle \\= \, -\frac{i}{\hslash }\mathop{\sum}\limits_{n,m}\int\nolimits_{-\infty }^{t}d{t}^{{\prime} }{e}^{i(\omega+i{{\Gamma }})(t-{t}^{{\prime} })}{f}_{n}(\langle n|\hat{A}(t)|m\rangle \langle m|\hat{B}({t}^{{\prime} })|n\rangle -\langle n|\hat{B}({t}^{{\prime} })|m\rangle \langle m|\hat{A}(t)|n\rangle )\\= \, -\frac{i}{\hslash }\mathop{\sum}\limits_{n,m}\int\nolimits_{-\infty }^{t}d{t}^{{\prime} }{e}^{i(\omega+i{{\Gamma }})(t-{t}^{{\prime} })}{f}_{n}({e}^{-i{\omega }_{mn}(t-{t}^{{\prime} })}\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle -{e}^{i{\omega }_{mn}(t-{t}^{{\prime} })}\langle n|\hat{B}|m\rangle \langle m|\hat{A}|n\rangle )\\= \, -\frac{i}{\hslash }\mathop{\sum}\limits_{n,m}{f}_{n}\bigg(\frac{1}{-i(\omega+i{{\Gamma }}-{\omega }_{mn})}\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle -\frac{1}{-i(\omega+i{{\Gamma }}+{\omega }_{mn})}\langle n|\hat{B}|m\rangle \langle m|\hat{A}|n\rangle \bigg)\\= \, -\frac{1}{\hslash }\mathop{\sum}\limits_{n,m}\frac{{f}_{nm}\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle }{{\omega }_{mn}-(\omega+i{{\Gamma }})}-\frac{1}{\hslash }\mathop{\sum}\limits_{n,m}\bigg({f}_{n}\frac{\langle n|\hat{B}|m\rangle \langle m|\hat{A}|n\rangle }{\omega+i{{\Gamma }}+{\omega }_{mn}}-{f}_{m}\frac{\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle }{\omega+i{{\Gamma }}+{\omega }_{nm}}\bigg)\\= \, -\frac{1}{\hslash }\mathop{\sum}\limits_{n,m}\frac{{f}_{nm}\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle }{{\omega }_{mn}-(\omega+i{{\Gamma }})}-\frac{1}{\hslash }\frac{1}{\omega+i{{\Gamma }}}\mathop{\sum}\limits_{n}{f}_{n}\langle n|[{\hat{B}}_{n},{\hat{A}}_{n}]|n\rangle \\= \, -\frac{1}{\hslash }\mathop{\sum}\limits_{n,m}{f}_{nm}\frac{{\omega }_{mn}}{\omega }\frac{\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle }{{\omega }_{mn}-(\omega+i{{\Gamma }})}.$$
(35)

where we introduce a finite relaxation rate Γ for convergence of time integral, \({\hat{O}}_{n}={\hat{P}}_{n}\hat{O}{\hat{P}}_{n}\) with \({\hat{P}}_{n}=\left|n\right\rangle \left\langle n\right|\) is the projection of \(\hat{O}\) to states \(|n\rangle\), and \({\sum }_{n}{f}_{n}\langle n|[{\hat{B}}_{n},{\hat{A}}_{n}]|n\rangle={\sum }_{n\ne m}{f}_{nm}{A}_{nm}{B}_{mn}\).

It is often convenient to separate the real and imaginary parts of \(\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle\) as follows.

$${\tilde{\chi }}_{AB}(\omega )= \, -\frac{1}{2\hslash }\mathop{\sum}\limits_{n,m}{f}_{nm}\frac{1}{\omega }\left[\frac{{\omega }_{mn}}{{\omega }_{mn}-\omega }-\frac{{\omega }_{nm}}{{\omega }_{nm}-\omega }\right]{{{{{{{\rm{Re}}}}}}}}\bigg[\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle \bigg]\\ -\frac{i}{2\hslash }\mathop{\sum}\limits_{n,m}{f}_{nm}\frac{1}{\omega }\left[\frac{{\omega }_{mn}}{{\omega }_{mn}-\omega }+\frac{{\omega }_{nm}}{{\omega }_{nm}-\omega }\right]{{{{{{{\rm{Im}}}}}}}}\bigg[\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle \bigg]\\= \, -\frac{1}{\hslash }\mathop{\sum}\limits_{n,m}{f}_{nm}\left(\frac{{\omega }_{mn}}{{\omega }_{mn}^{2}-{\omega }^{2}}{{{{{{{\rm{Re}}}}}}}}\left[\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle \right]+i\frac{{\omega }_{mn}^{2}/\omega }{{\omega }_{mn}^{2}-{\omega }^{2}}{{{{{{{\rm{Im}}}}}}}}\bigg[\langle n|\hat{A}|m\rangle \langle m|\hat{B}|n\rangle \bigg]\right)\\= \, {\chi }_{AB}(\omega )-i{\chi }_{AB}^{{\prime} }(\omega ),$$
(36)

where we assume \(\hat{A}\) and \(\hat{B}\) are Hermitian operators.

Let us take electric susceptibility \({\tilde{\chi }}_{ij}\) as an example, which is defined by

$$\begin{array}{l}{P}_{i}(t)={\left\langle {\hat{P}}_{i}\right\rangle }_{E=0}+\int\,{dt}^{{\prime} }{\tilde{\chi }}_{ij}(t-{t}^{{\prime} }){E}_{j}({t}^{{\prime} }).\end{array}$$
(37)

Then, \(\hat{A}={\hat{P}}_{i}\) is the electric polarization density and \(\hat{B}=-{\hat{P}}_{j}V\) is the polarization, so we have

$${\tilde{\chi }}_{ij}(\omega )= \, \frac{V}{\hslash }\mathop{\sum}\limits_{n,m}{f}_{nm}\frac{{\omega }_{mn}}{\omega ({\omega }_{mn}-\omega )}\langle n|{\hat{P}}_{i}|m\rangle \langle m|{\hat{P}}_{j}|n\rangle \\= \, \frac{2V}{\hslash }\mathop{\sum}\limits_{n,m}{f}_{n}\frac{{\omega }_{mn}}{{\omega }_{mn}^{2}-{\omega }^{2}}{{{{{{{\rm{Re}}}}}}}}\left[\langle n|{\hat{P}}_{i}|m\rangle \langle m|{\hat{P}}_{j}|n\rangle \right] \\ +\,i\frac{2V}{\hslash }\mathop{\sum}\limits_{n,m}{f}_{n}\frac{{\omega }_{mn}^{2}/\omega }{{\omega }_{mn}^{2}-{\omega }^{2}}{{{{{{{\rm{Im}}}}}}}}\bigg[\langle n|{\hat{P}}_{i}|m\rangle \langle m|{\hat{P}}_{j}|n\rangle \bigg]\\= \, {\chi }_{ij}(\omega )-i{\chi }_{ij}^{{\prime} }(\omega )$$
(38)

Similarly, other susceptibility tensors are given by

$${\tilde{a}}_{ijk}(\omega ) =\frac{V}{\hslash }\mathop{\sum}\limits_{n,m}{f}_{nm}\frac{{\omega }_{mn}}{\omega ({\omega }_{mn}-\omega )}\langle n|{\hat{P}}_{i}|m\rangle \langle m|{\hat{Q}}_{jk}|n\rangle={a}_{ijk}-i{a}_{ijk}^{{\prime} },\\ {\tilde{{\mathfrak{a}}}}_{ijk}(\omega ) =\, \frac{V}{\hslash }\mathop{\sum}\limits_{n,m}{f}_{nm}\frac{{\omega }_{mn}}{\omega ({\omega }_{mn}-\omega )}\langle n|{\hat{Q}}_{ij}|m\rangle \langle m|{\hat{P}}_{k}|n\rangle={a}_{kij}+i{a}_{kij}^{{\prime} },\\ {\tilde{G}}_{ij}(\omega ) =\, \frac{V}{\hslash }\mathop{\sum}\limits_{n,m}{f}_{nm}\frac{{\omega }_{mn}}{\omega ({\omega }_{mn}-\omega )}\langle n|{\hat{P}}_{i}|m\rangle \langle m|{\hat{M}}_{j}|n\rangle={G}_{ij}-i{G}_{ij}^{{\prime} },\\ {\tilde{{\mathfrak{G}}}}_{ij}(\omega ) =\, \frac{V}{\hslash }\mathop{\sum}\limits_{n,m}{f}_{nm}\frac{{\omega }_{mn}}{\omega ({\omega }_{mn}-\omega )}\langle n|{\hat{M}}_{i}|m\rangle \langle m|{\hat{P}}_{j}|n\rangle={G}_{ji}+i{G}_{ji}^{{\prime} }.$$
(39)

Structure of the optical magneto-electric coupling

The bare magneto-electric coupling can be decomposed into three parts as follows.

$${G}_{ij}(\omega )= \, {G}_{ij}^{{{{{{{{\rm{K}}}}}}}}}(\omega )+{G}_{ij}^{{{{{{{{\rm{C}}}}}}}}}(\omega )+{G}_{ij}^{A}(\omega )\\= \, \frac{{e}^{2}}{\hslash V}{\epsilon }_{jkl}\mathop{\sum}\limits_{n\in {{{{{{{\rm{occ}}}}}}}},m\in {{{{{{{\rm{unocc}}}}}}}},{{{{{{{\bf{k}}}}}}}}}\frac{{\omega }_{mn}}{{\omega }_{mn}^{2}-{\omega }^{2}}{{{{{{{\rm{Re}}}}}}}}\left[\mathop{\sum}\limits_{{n}^{{\prime} }}{r}_{nm}^{i}{r}_{m{n}^{{\prime} }}^{k}{v}_{{n}^{{\prime} }n}^{l}+\mathop{\sum}\limits_{{m}^{{\prime} }}{r}_{nm}^{i}{v}_{m{m}^{{\prime} }}^{l}{r}_{{m}^{{\prime} }n}^{k}\right]\\ + \frac{{e}^{2}}{\hslash V}{\epsilon }_{jkl}\mathop{\sum}\limits_{n,{n}^{{\prime} }\in {{{{{{{\rm{occ}}}}}}}},m\in {{{{{{{\rm{unocc}}}}}}}},{{{{{{{\bf{k}}}}}}}}}\frac{{\omega }_{mn}^{2}}{{\omega }_{mn}^{2}-{\omega }^{2}}{{{{{{{\rm{Im}}}}}}}}\left[{r}_{nm}^{i}{r}_{m{n}^{{\prime} }}^{k}{r}_{{n}^{{\prime} }n}^{l}\right]\\ + \frac{{e}^{2}}{\hslash V}{\epsilon }_{jkl}\mathop{\sum}\limits_{n\in {{{{{{{\rm{occ}}}}}}}},m\in {{{{{{{\rm{unocc}}}}}}}},{{{{{{{\bf{k}}}}}}}}}\frac{{\omega }^{2}{\omega }_{mn}}{{({\omega }_{mn}^{2}-{\omega }^{2})}^{2}}{{{{{{{\rm{Re}}}}}}}}\left[{r}_{nm}^{i}{r}_{mn}^{k}\right]({v}_{n}^{l}+{v}_{m}^{l}).$$
(40)

The K-term and C-term correspond to the Kubo-like and Chern-Simons terms in ref. 19 of the static limit. The last A-term is nonzero only when ω ≠ 0. While the C-term was called the Chern-Simons term, it has additional terms, actually. In the static limit,

$${G}_{ij}^{{{{{{{{\rm{C}}}}}}}}}(0)={\delta }_{ij}\frac{{\theta }_{{{{{{{{\rm{CS}}}}}}}}}{e}^{2}}{2\pi h}+\frac{1}{3}{\epsilon }_{jkl}\mathop{\lim }\limits_{\omega \to 0}\left[\omega {a}_{kli}^{{\prime} }(\omega )\right]+\frac{{e}^{2}}{6\hslash }{\epsilon }_{jkl}{\int}_{{{{{{{{\bf{k}}}}}}}}}{\partial }_{k}{g}_{il},$$
(41)

Here, θCS is the axion angle given by the Chern-Simons integral in the three-dimensional Brillouin zone17,18

$${\theta }_{{{{{{{{\rm{CS}}}}}}}}}= \, -\frac{4{\pi }^{2}}{3V}{\epsilon }_{ijk}{{{{{{{\rm{Im}}}}}}}}{{{{{{{\rm{Tr}}}}}}}}\left[P{\hat{r}}^{i}P{\hat{r}}^{j}P{\hat{r}}^{k}\right]\\= \, -\frac{1}{4\pi }{\epsilon }_{ijk}{\int}_{{{{{{{{\rm{BZ}}}}}}}}}{d}^{3}k\left[\left({A}^{i}\frac{1}{2}{F}^{jk}+\frac{i}{3}{A}^{i}{A}^{j}{A}^{k}\right)\right.\\ -\left.\mathop{\sum}\limits_{n,{n}^{{\prime} }\in {{{{{{{\rm{occ}}}}}}}}}{\partial }_{j}{{{{{{{\rm{Re}}}}}}}}\left(\langle i{\partial }_{i}{\psi }_{n}|{\psi }_{{n}^{{\prime} }}\rangle {A}_{{n}^{{\prime} }n}^{k}\right)\right],$$
(42)

where P is the projection to the occupied states, and \({A}_{{n}^{{\prime} }n}^{k}=\langle {u}_{{n}^{{\prime} }}|i{\partial }_{k}|{u}_{n}\rangle\) is the non-abelian Berry connection for the occupied states. Since the last term vanishes in insulators with vanishing Chern number, only the Chern-Simons integral remains in the expression. The quadrupole term takes the form \(\mathop{\lim }\nolimits_{\omega \to 0}\left[\omega {a}_{kli}^{{\prime} }(\omega )\right]=-\frac{{e}^{2}}{\hslash V}{\epsilon }_{jkl}{{{{{{{\rm{Im}}}}}}}}{{{{{{{\rm{Tr}}}}}}}}[P{\hat{r}}^{k}Q{\hat{r}}^{l}{\hat{r}}^{i}]\). Lastly, \({g}_{ij}={\sum }_{n\in {{{{{{{\rm{occ}}}}}}}},m\in {{{{{{{\rm{unocc}}}}}}}}}{r}_{nm}^{i}{r}_{mn}^{j}\) is the quantum metric of the occupied states. The quantum metric term in Eq. (41) cancels the Fermi surface contribution from the quadrupole term (the quantum metric contribution in electric quadrupole responses was discussed in47,48), such that there is no Fermi surface contribution to the magneto-electric coupling Gij. In comparison, note that its time-reversal-symmetric counterpart \({G}_{ij}^{{\prime} }\) has a Fermi surface contribution, which gives rise to natural optical activity in metals, termed gyrotropic magnetic effect49,50. The quadrupole and quantum metric terms were missed in previous studies17,18, but they do not affect the axion angle.

Let us derive Eq. (42). A key equation in our derivation is

$$\mathop{\sum}\limits_{j,k}{\epsilon }_{ijk}\left\langle i{\partial }_{j}{\psi }_{m}|{\psi }_{p}\right\rangle \langle {\psi }_{p}|i{\partial }_{k}{\psi }_{n}\rangle=\mathop{\sum}\limits_{j,k}{\epsilon }_{ijk}\langle {\partial }_{j}{\psi }_{m}|{\partial }_{k}{\psi }_{n}\rangle=0,$$
(43)

which follows from the Wannier representation:

$$\mathop{\sum}\limits_{j,k}{\epsilon }_{ijk}\langle {\partial }_{j}{\psi }_{m}|{\partial }_{k}{\psi }_{n}\rangle= \, {\epsilon }_{ijk}\frac{1}{N}\mathop{\sum}\limits_{j,k,{{{{{{{\bf{R}}}}}}}},{{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}{R}_{j}{R}_{k}^{{\prime} }{e}^{i({{{{{{{\bf{k}}}}}}}}\cdot {{{{{{{\bf{R}}}}}}}}-{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }\cdot {{{{{{{{\bf{R}}}}}}}}}^{{\prime} })}\langle {w}_{m{{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}|{w}_{n{{{{{{{\bf{R}}}}}}}}}\rangle \\= \, {\delta }_{mn}\frac{1}{N}\mathop{\sum}\limits_{{{{{{{{\bf{R}}}}}}}}}{e}^{i{{{{{{{\bf{k}}}}}}}}\cdot ({{{{{{{\bf{R}}}}}}}}-{{{{{{{{\bf{R}}}}}}}}}^{{\prime} })}\mathop{\sum}\limits_{j,k}{\epsilon }_{ijk}{R}_{j}{R}_{k}\\= \, 0.$$
(44)

Using this identity, we can show that

$$\mathop{\sum}\limits_{i,j,k}{\epsilon }_{ijk}{{{{{{{\rm{TrIm}}}}}}}}\left[P{\hat{r}}^{i}P{\hat{r}}^{j}P{\hat{r}}^{k}\right]= \, -\mathop{\sum}\limits_{i,j,k}{\epsilon }_{ijk}{{{{{{{\rm{TrIm}}}}}}}}\left[P{\hat{r}}^{i}P{\hat{r}}^{j}Q{\hat{r}}^{k}\right]\\= \, \mathop{\sum}\limits_{i,j,k}{\epsilon }_{ijk}{{{{{{{\rm{Re}}}}}}}}\left[\left({A}_{n{n}^{{\prime} }}^{i}-\langle i{\partial }_{i}{\psi }_{n}|{\psi }_{{n}^{{\prime} }}\rangle \right)\frac{1}{2}{F}_{{n}^{{\prime} }n}^{jk}\right]\\= \, \mathop{\sum}\limits_{i,j,k}{\epsilon }_{ijk}{{{{{{{\rm{Re}}}}}}}}\left[{A}_{n{n}^{{\prime} }}^{i}\frac{1}{2}{F}_{{n}^{{\prime} }n}^{jk}-\langle i{\partial }_{i}{\psi }_{n}|{\psi }_{{n}^{{\prime} }}\rangle \left({\partial }_{j}{A}_{{n}^{{\prime} }n}^{k}-i{({A}^{j}{A}^{k})}_{{n}^{{\prime} }n}\right)\right]\\= \, \mathop{\sum}\limits_{i,j,k}{\epsilon }_{ijk}\left[{A}_{n{n}^{{\prime} }}^{i}\frac{1}{2}{F}_{{n}^{{\prime} }n}^{jk}+i\langle i{\partial }_{i}{\psi }_{n}|{\psi }_{{n}^{{\prime} }}\rangle {({A}^{j}{A}^{k})}_{{n}^{{\prime} }n}-{\partial }_{j}{{{{{{{\rm{Re}}}}}}}}\left(\langle i{\partial }_{i}{\psi }_{n}|{\psi }_{{n}^{{\prime} }}\rangle {A}_{{n}^{{\prime} }n}^{k}\right)\right]\\= \, \mathop{\sum}\limits_{i,j,k}{\epsilon }_{ijk}\left[{A}_{n{n}^{{\prime} }}^{i}\frac{1}{2}{F}_{{n}^{{\prime} }n}^{jk}-i\frac{1}{3}{{{{{{{\rm{Tr}}}}}}}}\left[P{\hat{r}}^{i}P{\hat{r}}^{j}P{\hat{r}}^{k}-{A}^{i}{A}^{j}{A}^{k}\right]-{\partial }_{j}{{{{{{{\rm{Re}}}}}}}}\left(\langle i{\partial }_{i}{\psi }_{n}|{\psi }_{{n}^{{\prime} }}\rangle {A}_{{n}^{{\prime} }n}^{k}\right)\right]\\= \, \frac{1}{3}\mathop{\sum}\limits_{i,j,k}{\epsilon }_{ijk}{{{{{{{\rm{TrIm}}}}}}}}\left[P{\hat{r}}^{i}P{\hat{r}}^{j}P{\hat{r}}^{k}\right]+\mathop{\sum}\limits_{i,j,k}{\epsilon }_{ijk}\left[{{{{{{{\rm{Tr}}}}}}}}\left({A}^{i}\frac{1}{2}{F}^{jk}+\frac{i}{3}{A}^{i}{A}^{j}{A}^{k}\right) \right.\\ -\left.{\partial }_{j}{{{{{{{\rm{Re}}}}}}}}\left(\langle i{\partial }_{i}{\psi }_{n}|{\psi }_{{n}^{{\prime} }}\rangle {A}_{{n}^{{\prime} }n}^{k}\right)\right].$$
(45)

It follows that

$$\frac{1}{3}{{{{{{{\rm{Tr}}}}}}}}G= \, \frac{{e}^{2}}{\hslash }\frac{1}{3}{\epsilon }_{ijk}{{{{{{{\rm{TrIm}}}}}}}}\left[P{\hat{r}}^{i}P{\hat{r}}^{j}P{\hat{r}}^{k}\right]\\= \, \frac{{e}^{2}}{\hslash }\frac{1}{2}{\epsilon }_{ijk}\left[{{{{{{{\rm{Tr}}}}}}}}\left({A}^{i}\frac{1}{2}{F}^{jk}+\frac{i}{3}{A}^{i}{A}^{j}{A}^{k}\right)-{\partial }_{j}{{{{{{{\rm{Re}}}}}}}}\left(\langle i{\partial }_{i}{\psi }_{n}|{\psi }_{{n}^{{\prime} }}\rangle {A}_{{n}^{{\prime} }n}^{k}\right)\right].$$
(46)

This is equivalent to Eq. (42). At finite ω, the trace part of the magneto-electric coupling is not represented as a Chern-Simons integral by the same approach.

Gauge-invariant expressions

Since \({\tilde{G}}_{ij}^{(z)}(\omega )\) and \({\tilde{G}}_{ij}^{(z,2)}(\omega )\) are origin dependent only along the z direction, they can be calculated gauge independently when the z direction has open boundaries. Here we derive such gauge-invariant expressions. We focus on the orbital magneto-electric coupling to simplify expressions. It is straightforward to include spin parts as the spin magnetic moment is well defined in momentum space. The expressions of Tij, \({T}_{ij}^{{\prime} }\) and Sijk are also given for completeness.

Let us begin with \({\tilde{G}}_{ix}^{(z)}(\omega )\).

$${\tilde{G}}_{ix}^{(z)}(\omega )= \, {\tilde{G}}_{ix}(\omega )-\frac{i}{2}\omega {\tilde{a}}_{iyz}(\omega )\\= \, \frac{V}{\hslash }\mathop{\sum}\limits_{n\ne m}{f}_{nm}\frac{\langle n|{\hat{P}}_{i}|m\rangle \langle m|{\hat{m}}_{x}|n\rangle -\frac{i}{2}{\omega }_{mn}\langle n|{\hat{P}}_{i}|m\rangle \langle m|{\hat{Q}}_{yz}|n\rangle }{{\omega }_{mn}-\omega }\\= \, \frac{{e}^{2}}{\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }\left({r}_{nm}^{i}\langle m|\frac{1}{2}\left({\hat{r}}^{y}{\hat{v}}^{z}-{\hat{r}}^{z}{\hat{v}}^{y}\right)|n\rangle -\frac{i}{2}{\omega }_{mn}{r}_{nm}^{i}\langle m|{\hat{r}}^{y}{\hat{r}}^{z}|n\rangle \right)\\= \, \frac{{e}^{2}}{\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }{r}_{nm}^{i}\langle m|\frac{1}{2}\left({\hat{r}}^{y}{\hat{v}}^{z}-{\hat{r}}^{z}{\hat{v}}^{y}-\frac{1}{i\hslash }[{\hat{r}}^{y}{\hat{r}}^{z},\hat{H}]\right)|n\rangle \\= \, \frac{{e}^{2}}{\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }{r}_{nm}^{i}\langle m|\frac{1}{2}\left(-{\hat{v}}^{y}{\hat{r}}^{z}-{\hat{r}}^{z}{\hat{v}}^{y}\right)|n\rangle .$$
(47)

Similarly,

$${\tilde{G}}_{iy}^{(z)}(\omega )={\tilde{G}}_{iy}(\omega )+\frac{i}{2}\omega {\tilde{a}}_{ixz}(\omega )=\frac{{e}^{2}}{\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }{r}_{nm}^{i}\langle m|\frac{1}{2}\left({\hat{v}}^{x}{\hat{r}}^{z}+{\hat{r}}^{z}{\hat{v}}^{x}\right)|n\rangle .$$
(48)

For the zz component, we obtain

$${\tilde{G}}_{zz}^{(z)}(\omega )= \, {\tilde{G}}_{zz}-\frac{i}{2}\omega \left[{\tilde{{\mathfrak{a}}}}_{zxy}(\omega )-{\tilde{{\mathfrak{a}}}}_{zyx}(\omega )\right]\\= \, \frac{{e}^{2}}{2\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }\mathop{\sum}\limits_{p;{E}_{p}\ne {E}_{m}}\left[{r}_{nm}^{z}{r}_{mp}^{x}{v}_{pn}^{y}-{r}_{np}^{z}{r}_{pm}^{x}{v}_{mn}^{y}-(x\leftrightarrow y)\right]$$
(49)

by using

$${\tilde{G}}_{zz}= \, \frac{V}{\hslash }\mathop{\sum}\limits_{n\ne m}{f}_{nm}\frac{\langle n|{\hat{P}}_{z}|m\rangle \langle m|{\hat{M}}_{z}|n\rangle }{{\omega }_{mn}-\omega }\\= \, \frac{{e}^{2}}{2\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }{r}_{nm}^{z}\left[\mathop{\sum}\limits_{p\ne m}({r}_{mp}^{x}{v}_{pn}^{y}-{r}_{mp}^{y}{v}_{pn}^{x})+i{\omega }_{mn}({r}_{mm}^{x}{r}_{mn}^{y}-{r}_{mm}^{y}{r}_{mn}^{x})\right]\\= \, \frac{{e}^{2}}{2\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }\left[{r}_{nm}^{z}\mathop{\sum}\limits_{p\ne m}({r}_{mp}^{x}{v}_{pn}^{y}-{r}_{mp}^{y}{v}_{pn}^{x})+i{\omega }_{mn}\left(\left[{({\hat{r}}^{z}{\hat{r}}^{x})}_{nm}-\mathop{\sum}\limits_{p\ne m}{r}_{np}^{z}{r}_{pm}^{x}\right]{r}_{mn}^{y}-(x\to y)\right)\right]\\= \, \frac{{e}^{2}}{2\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }\left[{r}_{nm}^{z}\mathop{\sum}\limits_{p;{E}_{p}\ne {E}_{m}}({r}_{mp}^{x}{v}_{pn}^{y}-{r}_{mp}^{y}{v}_{pn}^{x})-i{\omega }_{mn}\mathop{\sum}\limits_{p;{E}_{p}\ne {E}_{m}}\left[{r}_{np}^{z}{r}_{pm}^{x}-(x\leftrightarrow y)\right]{r}_{mn}^{y}\right]\\ + \,\frac{i}{2}\omega \left[{\tilde{{\mathfrak{a}}}}_{zxy}(\omega )-{\tilde{{\mathfrak{a}}}}_{zyx}(\omega )\right].$$
(50)

For the zx component, we can follow the strategy for the zz component

$${\tilde{G}}_{zx}= \, \frac{V}{\hslash }\mathop{\sum}\limits_{n\ne m}{f}_{nm}\frac{\langle n|{\hat{P}}_{z}|m\rangle \langle m|{\hat{M}}_{x}|n\rangle }{{\omega }_{mn}-\omega }\\= \, \frac{{e}^{2}}{2\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }{r}_{nm}^{z}\left[\mathop{\sum}\limits_{p\ne m}{r}_{mp}^{y}{v}_{pn}^{z}-\mathop{\sum}\limits_{p}{r}_{mp}^{z}{v}_{pn}^{y}+i{\omega }_{mn}{r}_{mm}^{y}{r}_{mn}^{z}\right]\\= \, \frac{{e}^{2}}{2\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }\left[{r}_{nm}^{z}\mathop{\sum}\limits_{p\ne m}{r}_{mp}^{y}{v}_{pn}^{z}-{r}_{nm}^{z}\mathop{\sum}\limits_{p}{r}_{mp}^{z}{v}_{pn}^{y}+i{\omega }_{mn}\left(\left[{({\hat{r}}^{z}{\hat{r}}^{y})}_{nm}-\mathop{\sum}\limits_{p\ne m}{r}_{np}^{z}{r}_{pm}^{y}\right]{r}_{mn}^{z}\right)\right]\\= \, \frac{{e}^{2}}{2\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }\left[{r}_{nm}^{z}\mathop{\sum}\limits_{p;{E}_{p}\ne {E}_{m}}{r}_{mp}^{y}{v}_{pn}^{z}-{r}_{nm}^{z}\mathop{\sum}\limits_{p}{r}_{mp}^{z}{v}_{pn}^{y}-i{\omega }_{mn}\mathop{\sum}\limits_{p;{E}_{p}\ne {E}_{m}}{r}_{np}^{z}{r}_{pm}^{y}{r}_{mn}^{z}\right]\\ +\,\frac{i}{2}\omega {\tilde{{\mathfrak{a}}}}_{zyz}(\omega )$$
(51)

to define

$${\tilde{G}}_{zx}^{(z,2)}= \, {\tilde{G}}_{zx}-\frac{i}{2}\omega {\tilde{{\mathfrak{a}}}}_{yzz}(\omega )\\= \, \frac{{e}^{2}}{2\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }\left[{r}_{nm}^{z}\left(\mathop{\sum}\limits_{p;{E}_{p}\ne {E}_{m}}{r}_{mp}^{y}{v}_{pn}^{z}-\mathop{\sum}\limits_{p}{r}_{mp}^{z}{v}_{pn}^{y}\right)-\mathop{\sum}\limits_{p;{E}_{p}\ne {E}_{m}}{r}_{np}^{z}{r}_{pm}^{y}{v}_{mn}^{z}\right].$$
(52)

Similarly, we define

$${\tilde{G}}_{zy}^{(z,2)}= {\tilde{G}}_{zy}+\frac{i}{2}\omega {\tilde{{\mathfrak{a}}}}_{xzz}(\omega )\\= -\frac{{e}^{2}}{2\hslash V}\mathop{\sum}\limits_{n\ne m}\frac{{f}_{nm}}{{\omega }_{mn}-\omega }\left[{r}_{nm}^{z}\left(\mathop{\sum}\limits_{p;{E}_{p}\ne {E}_{m}}{r}_{mp}^{x}{v}_{pn}^{z}-\mathop{\sum}\limits_{p}{r}_{mp}^{z}{v}_{pn}^{x}\right)-\mathop{\sum}\limits_{p;{E}_{p}\ne {E}_{m}}{r}_{np}^{z}{r}_{pm}^{x}{v}_{mn}^{z}\right].$$
(53)

Note that \({G}_{zi}^{(z)}={G}_{zi}^{(z,2)}\) while \({G}_{zi}^{{\prime} (z)}\,\ne \,{G}_{zi}^{{\prime} (z,2)}\) for i = x, y.

Fully origin-independent bulk response functions Tij, \({T}_{ij}^{{\prime} }\) and Sijk can be calculated from the bulk conductivity tensor defined by Ji = ∑j,kσijkqjEk19:

$${T}_{ij}(\omega )= \, \frac{1}{3i}{\epsilon }_{jkl}{\sigma }_{ikl},\\ {T}_{ij}^{{\prime} }(\omega )= \, \frac{1}{8i}{\epsilon }_{jkl}\left[2({\sigma }_{ikl}-{\sigma }_{kil})-({\sigma }_{kli}-{\sigma }_{lki})\right],\\ {S}_{ijk}(\omega )= \, -\frac{1}{6i}\left({\sigma }_{ijk}+{\sigma }_{jki}+{\sigma }_{kij}+{\sigma }_{jik}+{\sigma }_{ikj}+{\sigma }_{kji}\right),$$
(54)

where

$${\sigma }_{ikl}= \, \frac{i{e}^{2}}{\hslash \omega }\mathop{\sum}\limits_{n,m}{\int}_{{{{{{{{\bf{k}}}}}}}}}{f}_{nm}\left[\frac{{\omega }_{mn}}{{\omega }_{mn}-\omega }\left({({r}_{nm}^{j}{B}_{mn}^{ik})}^{*}+{r}_{nm}^{i}{B}_{mn}^{jk}\right)+\frac{{\omega }_{mn}^{2}}{2{({\omega }_{mn}-\omega )}^{2}}{r}_{nm}^{i}{r}_{mn}^{j}({v}_{mm}^{k}+{v}_{nn}^{k})\right]\\ -\,\frac{i{e}^{2}}{\hslash {\omega }^{2}}\mathop{\sum}\limits_{n,m}{\int}_{{{{{{{{\bf{k}}}}}}}}}{\partial }_{k}{f}_{n}{v}_{nn}^{i}{v}_{nn}^{j}$$
(55)

and

$${B}_{mn}^{kl}=\frac{1}{2}\left(\mathop{\sum}\limits_{p\ne m}{r}_{mp}^{k}{v}_{pn}^{l}+\mathop{\sum}\limits_{p\ne n}{v}_{mp}^{l}{r}_{pn}^{k}\right).$$
(56)

Decomposition of the position operator

Here we derive Eq. (9). Let us consider the matrix element of the position operator in the Bloch state basis. We transform it to the Wannier basis \(\left|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\right\rangle\) by

$$\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{\hat{r}}^{z}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle= \, \mathop{\sum}\limits_{\beta,{{{{{{{{\bf{R}}}}}}}}}^{{\prime} };\alpha,{{{{{{{\bf{R}}}}}}}}}\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{w}_{\beta {{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}\rangle \langle {w}_{\beta {{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}|{\hat{r}}^{z}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle \langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\= \, \mathop{\sum}\limits_{\beta,{{{{{{{{\bf{R}}}}}}}}}^{{\prime} };\alpha,{{{{{{{\bf{R}}}}}}}}}\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{w}_{\beta {{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}\rangle \langle {w}_{\beta {{{{{{{\bf{0}}}}}}}}}|{\hat{T}}_{{{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}^{{{{\dagger}}} }{\hat{r}}^{z}{\hat{T}}_{{{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}-{{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}\rangle \langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\= \, \mathop{\sum}\limits_{\beta,{{{{{{{{\bf{R}}}}}}}}}^{{\prime} };\alpha,{{{{{{{\bf{R}}}}}}}}}\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{w}_{\beta {{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}\rangle ({R}^{z}{\delta }_{\alpha \beta }{\delta }_{{{{{{{{\bf{R}}}}}}}},{{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}+\langle {w}_{\beta {{{{{{{\bf{0}}}}}}}}}|{\hat{r}}^{z}|{w}_{\alpha ({{{{{{{\bf{R}}}}}}}}-{{{{{{{{\bf{R}}}}}}}}}^{{\prime} })}\rangle )\langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\= \, \mathop{\sum}\limits_{\beta,{{{{{{{{\bf{R}}}}}}}}}^{{\prime} };\alpha,{{{{{{{\bf{R}}}}}}}}}\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{w}_{\beta {{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}\rangle \langle {w}_{\beta {{{{{{{\bf{0}}}}}}}}}|{\hat{r}}^{z}|{w}_{\alpha ({{{{{{{\bf{R}}}}}}}}-{{{{{{{{\bf{R}}}}}}}}}^{{\prime} })}\rangle \langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle+\mathop{\sum}\limits_{\alpha,{{{{{{{\bf{R}}}}}}}}}\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle {R}^{z}\langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle,$$
(57)

where we use \(|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle={\hat{T}}_{{{{{{{{\bf{R}}}}}}}}}|{w}_{\alpha {{{{{{{\bf{0}}}}}}}}}\rangle\) in the second line, where \({\hat{T}}_{{{{{{{{\bf{R}}}}}}}}}\) is the translation operator by R, and use \({\hat{T}}_{{{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}^{{{{\dagger}}} }{\hat{r}}^{z}{\hat{T}}_{{{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}={\hat{r}}^{z}+{{{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}^{z}\) and the orthonormality of the Wannier states \(\langle {w}_{\beta {{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle={\delta }_{\alpha \beta }{\delta }_{{{{{{{{\bf{R}}}}}}}},{{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}\) in the third line. To get Eq. (9) from Eq. (57), we define \({{\mathbb{A}}}_{\beta \alpha }^{z}({{{{{{{\bf{k}}}}}}}})=\mathop{\sum}\limits_{{{{{{{{\bf{R}}}}}}}}}\langle {w}_{\beta {{{{{{{\bf{0}}}}}}}}}|{\hat{r}}^{z}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle {e}^{i{{{{{{{\bf{k}}}}}}}}\cdot {{{{{{{\bf{R}}}}}}}}}\), such that

$$\langle {w}_{\beta {{{{{{{\bf{0}}}}}}}}}|{\hat{r}}^{z}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle=\frac{1}{N}\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}}{{\mathbb{A}}}_{\beta \alpha }^{z}({{{{{{{\bf{k}}}}}}}}){e}^{-i{{{{{{{\bf{k}}}}}}}}\cdot {{{{{{{\bf{R}}}}}}}}},$$
(58)

where N is the number of k points. Then, the first term in the last line of Eq. (57) becomes

$$ \mathop{\sum}\limits_{\beta,{{{{{{{{\bf{R}}}}}}}}}^{{\prime} };\alpha,{{{{{{{\bf{R}}}}}}}}} \langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{w}_{\beta {{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}\rangle \langle {w}_{\beta {{{{{{{\bf{0}}}}}}}}}|{\hat{r}}^{z}|{w}_{\alpha ({{{{{{{\bf{R}}}}}}}}-{{{{{{{{\bf{R}}}}}}}}}^{{\prime} })}\rangle \langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\ \quad=\mathop{\sum}\limits_{\beta,{{{{{{{{\bf{R}}}}}}}}}^{{\prime} };\alpha,{{{{{{{\bf{R}}}}}}}}}\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{w}_{\beta {{{{{{{\bf{0}}}}}}}}}\rangle \frac{1}{N}\mathop{\sum}\limits_{{{{{{{{{\bf{k}}}}}}}}}^{{\prime\prime} }}{{\mathbb{A}}}_{\beta \alpha }^{z}({{{{{{{{\bf{k}}}}}}}}}^{{\prime\prime} }){e}^{-i{{{{{{{{\bf{k}}}}}}}}}^{{\prime\prime} }\cdot ({{{{{{{\bf{R}}}}}}}}-{{{{{{{{\bf{R}}}}}}}}}^{{\prime} })}{e}^{i({{{{{{{\bf{k}}}}}}}}\cdot {{{{{{{\bf{R}}}}}}}}-{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }\cdot {{{{{{{{\bf{R}}}}}}}}}^{{\prime} })}\langle {w}_{\alpha {{{{{{{\bf{0}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\ \quad=\, \mathop{\sum}\limits_{\alpha,\beta,{{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{w}_{\beta {{{{{{{\bf{0}}}}}}}}}\rangle {{\mathbb{A}}}_{\beta \alpha }^{z}({{{{{{{\bf{k}}}}}}}}){e}^{i{{{{{{{\bf{k}}}}}}}}\cdot {{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}{e}^{-i{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }\cdot {{{{{{{{\bf{R}}}}}}}}}^{{\prime} }}\langle {w}_{\alpha {{{{{{{\bf{0}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\ \quad=\, {\delta }_{{{{{{{{\bf{k}}}}}}}},{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}N\mathop{\sum}\limits_{\alpha,\beta }\langle {\psi }_{m{{{{{{{\bf{k}}}}}}}}}|{w}_{\beta {{{{{{{\bf{0}}}}}}}}}\rangle {{\mathbb{A}}}_{\beta \alpha }^{z}({{{{{{{\bf{k}}}}}}}})\langle {w}_{\alpha {{{{{{{\bf{0}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\ \quad=\, {\delta }_{{{{{{{{\bf{k}}}}}}}},{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}\mathop{\sum}\limits_{\alpha,\beta }\langle {\psi }_{m{{{{{{{\bf{k}}}}}}}}}|{\psi }_{\beta {{{{{{{\bf{k}}}}}}}}}\rangle {{\mathbb{A}}}_{\beta \alpha }^{z}({{{{{{{\bf{k}}}}}}}})\langle {\psi }_{\alpha {{{{{{{\bf{k}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle,$$
(59)

and we arrive at Eq. (9).

$$\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{\hat{r}}^{z}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle= \, {\delta }_{{{{{{{{\bf{k}}}}}}}},{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}\mathop{\sum}\limits_{\beta,\alpha }\langle {\psi }_{m{{{{{{{\bf{k}}}}}}}}}|{\psi }_{\beta {{{{{{{\bf{k}}}}}}}}}\rangle {{\mathbb{A}}}_{\beta \alpha }^{z}({{{{{{{\bf{k}}}}}}}})\langle {\psi }_{\alpha {{{{{{{\bf{k}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\ +\mathop{\sum}\limits_{\alpha,{{{{{{{\bf{R}}}}}}}}}\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle {R}^{z}\langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle .$$
(60)

The second term is nonzero only when n = m because

$$\mathop{\sum}\limits_{\alpha,{{{{{{{\bf{R}}}}}}}}}\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{w}_{\alpha {{{{{{{\bf{R}}}}}}}}}\rangle {R}^{z}\langle {w}_{\alpha {{{{{{{\bf{R}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle= \, \frac{1}{N}\mathop{\sum}\limits_{\alpha,{{{{{{{\bf{R}}}}}}}},{{{{{{{{\bf{k}}}}}}}}}_{1},{{{{{{{{\bf{k}}}}}}}}}_{2}}{e}^{i({{{{{{{{\bf{k}}}}}}}}}_{2}-{{{{{{{{\bf{k}}}}}}}}}_{1})\cdot {{{{{{{\bf{R}}}}}}}}}\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{\psi }_{\alpha {{{{{{{{\bf{k}}}}}}}}}_{1}}\rangle {R}^{z}\langle {\psi }_{\alpha {{{{{{{{\bf{k}}}}}}}}}_{2}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\= \, \frac{1}{N}\mathop{\sum}\limits_{{{{{{{{\bf{R}}}}}}}}}{e}^{i({{{{{{{\bf{k}}}}}}}}-{{{{{{{{\bf{k}}}}}}}}}^{{\prime} })\cdot {{{{{{{\bf{R}}}}}}}}}{R}^{z}\mathop{\sum}\limits_{\alpha }\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{\psi }_{\alpha {{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}\rangle \langle {\psi }_{\alpha {{{{{{{\bf{k}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\= \, -i{\partial }_{{k}^{z}}{\delta }_{{{{{{{{\bf{k}}}}}}}},{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}\mathop{\sum}\limits_{\alpha }\langle {\psi }_{m{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}|{\psi }_{\alpha {{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}\rangle \langle {\psi }_{\alpha {{{{{{{\bf{k}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\rangle \\= \, -i{\delta }_{mn}{\partial }_{{k}^{z}}{\delta }_{{{{{{{{\bf{k}}}}}}}},{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }},$$
(61)

where we use that \({\partial }_{{k}^{z}}{\delta }_{{{{{{{{\bf{k}}}}}}}},{{{{{{{{\bf{k}}}}}}}}}^{{\prime} }}\,\ne \,0\) requires \({{{{{{{{\bf{k}}}}}}}}}^{{\prime} }\to {{{{{{{\bf{k}}}}}}}}\), and \({\sum }_{\alpha }\left\langle {\psi }_{m{{{{{{{\bf{k}}}}}}}}}|{\psi }_{\alpha {{{{{{{\bf{k}}}}}}}}}\right\rangle \left\langle {\psi }_{\alpha {{{{{{{\bf{k}}}}}}}}}|{\psi }_{n{{{{{{{\bf{k}}}}}}}}}\right\rangle={\delta }_{mn}\).

Macroscopic electrodynamics in the medium

Electric displacement D and magnetic field H satisfying Maxwell’s equations

$$\nabla \cdot {{{{{{{\bf{D}}}}}}}} ={\rho }^{f},\\ \nabla \times {{{{{{{\bf{H}}}}}}}} ={{{{{{{{\bf{J}}}}}}}}}^{f}+\dot{{{{{{{{\bf{D}}}}}}}}}$$
(62)

are defined by

$$\begin{array}{l}\left(\begin{array}{l}{{{{{{{\bf{D}}}}}}}}\\ {{{{{{{\bf{H}}}}}}}}\end{array}\right)=\left(\begin{array}{ll}\tilde{A}&\tilde{T}\\ \tilde{U}&\tilde{X}\end{array}\right)\left(\begin{array}{l}{{{{{{{\bf{E}}}}}}}}\\ {{{{{{{\bf{B}}}}}}}}\end{array}\right),\end{array}$$
(63)

where \(\tilde{F}=F-iF^{\prime}\) for F = A, T, U, X, and

$${A}_{ij} ={\epsilon }_{0}{\delta }_{ij}+{\chi }_{ij}+\frac{1}{3}({a}_{ijk}^{{\prime} }+{a}_{jki}^{{\prime} }+{a}_{kij}^{{\prime} }){k}_{k},\\ {A}_{ij}^{{\prime} } ={\chi }_{ij}^{{\prime} },\\ {T}_{ij} ={G}_{ij}-\frac{1}{3}{\delta }_{ij}{G}_{kk}-\frac{1}{6}\omega {\epsilon }_{jkl}{a}_{kli}^{{\prime} },\\ {T}_{ij}^{{\prime} } ={G}_{ij}^{{\prime} }-\frac{1}{2}\omega {\epsilon }_{jkl}{a}_{kli},\\ {U}_{ij} =-{U}_{ji},\\ {U}_{ij}^{{\prime} } ={U}_{ji},\\ {X}_{ij} ={\mu }_{0}^{-1}{\delta }_{ij},\\ {X}_{ij}^{{\prime} } =\, 0$$
(64)

up to electric quadrupole-magnetic dipole order. While Maxwell’s equations do not uniquely specify the form of D and H, additional requirements from the reciprocity relations and spatial-origin independence gives Eq. (64) as shown in25. Here, the free charge and current ρf and Jf are boundary charge and current appearing due to the change of material properties across the interface that is not described by \({\tilde{T}}_{ij}\), \({\tilde{U}}_{ij}\), and \({S}_{ijk}=({a}_{ijk}^{{\prime} }+{a}_{jki}^{{\prime} }+{a}_{kij}^{{\prime} })/3\).

In PT-symmetric systems where \({G}_{ij}^{{\prime} }={a}_{ijk}=0\),

$${\rho }^{f}= \, -{B}_{j}{\partial }_{i}\left({G}_{ij}-{T}_{ij}-\frac{1}{2}{\epsilon }_{jkl}{a}_{kli}^{{\prime} }\right)+\frac{i}{2}({\partial }_{j}{E}_{k}+{\partial }_{k}{E}_{j}){\partial }_{i}({a}_{kij}^{{\prime} }-{S}_{kij}) \\ +\frac{i}{2}{E}_{k}{\partial }_{i}{\partial }_{j}({a}_{kij}^{{\prime} }-{S}_{kij}),\\ {J}_{i}^{f}= \, {E}_{j}{\partial }_{k}\left[{\epsilon }_{ikl}({G}_{jl}-{T}_{ji})-\frac{\omega }{2}({a}_{jik}^{{\prime} }-{S}_{jik})\right],$$
(65)

Wave equation

The wave equation up to electric quadrupole/magnetic dipole takes the following form21:

$$\left[{\delta }_{ij}+{\epsilon }_{0}^{-1}{\tilde{\chi }}_{ij}+i{\mu }_{0}nc\mathop{\sum}\limits_{k}{\kappa }_{k}{\tilde{\sigma }}_{ijk}+{n}^{2}({\kappa }_{i}{\kappa }_{j}-{\delta }_{ij})\right]{E}_{j}=0$$
(66)

where \({\tilde{\chi }}_{ij}={\chi }_{ij}-i{\chi }_{ij}^{{\prime} }\) satisfying χij = χji and \({\chi }_{ij}^{{\prime} }=-{\chi }_{ji}^{{\prime} }\), κi = ki/k is the propagation direction of light, n is the refractive index, \({\tilde{\sigma }}_{ijk}={\sigma }_{ijk}-i{\sigma }_{ijk}^{{\prime} }\) is the complex bulk conductivity coefficient defined by \({\tilde{\sigma }}_{ij}({{{{{{{\bf{q}}}}}}}})={\tilde{\sigma }}_{ij}(0)+{\tilde{\sigma }}_{ijk}{q}_{k}+\ldots\), and

$${\sigma }_{ijk}= \, i\left[{\epsilon }_{ikl}{G}_{jl}+{\epsilon }_{jkl}{G}_{il}-\frac{1}{2}\omega ({a}_{ijk}^{{\prime} }+{a}_{jik}^{{\prime} })\right]={\sigma }_{jik},\\ {\sigma }_{ijk}^{{\prime} }= \, i\left[-{\epsilon }_{ikl}{G}_{jl}^{{\prime} }+{\epsilon }_{jkl}{G}_{il}^{{\prime} }+\frac{1}{2}\omega ({a}_{ijk}-{a}_{jik})\right]=-{\sigma }_{jik}^{{\prime} }.$$
(67)

Let us assume C3z and C2x symmetries for simplicity. We further impose that the bulk Hall response is zero, i.e., \({\chi }_{ij}^{{\prime} }=0\), in order to focus on the magneto-electric and electric-quadrupole effects. For \({{{{{{{\boldsymbol{\kappa }}}}}}}}=\pm \hat{z}\), the wave equation is

$$\begin{array}{l}\left(\begin{array}{ll}1+{\epsilon }_{0}^{-1}{\chi }_{xx}-{n}^{2}&n{\kappa }_{z}{\mu }_{0}c{\sigma }_{xyz}^{{\prime} }\\ -n{\kappa }_{z}{\mu }_{0}c{\sigma }_{xyz}^{{\prime} }&1+{\epsilon }_{0}^{-1}{\chi }_{xx}-{n}^{2}\end{array}\right)\left(\begin{array}{l}{E}_{x}\\ {E}_{y}\end{array}\right)=0.\end{array}$$
(68)

The refractive index satisfying the wave equation is given by

$${n}_{\pm }^{{\kappa }_{z}}=\sqrt{1+{\epsilon }_{0}^{-1}{\chi }_{xx}+{(i{\mu }_{0}c{\sigma }_{xyz}^{{\prime} }/2)}^{2}}\mp i{\kappa }_{z}{\mu }_{0}c{\sigma }_{xyz}^{{\prime} }/2,$$
(69)

for circular polarization \(\hat{\pm }=\hat{x}+i\hat{y}\).

Reflection and transmission from a single interface

We consider the interface of medium 1 (z > 0) and medium 2 (z < 0) with the surface normal \(\hat{z}\). For normal incidence, in the circularly polarized basis,

$$\begin{array}{l}\left(\begin{array}{l}{H}_{+}\\ {H}_{-}\end{array}\right)=\frac{1}{{\mu }_{0}c}\left(\begin{array}{ll}-{\mu }_{0}c{\tilde{T}}_{xx}^{\mu }-i{n}_{\mu+}^{{\kappa }_{z}}{\kappa }_{z}&0\\ 0&-{\mu }_{0}c{\tilde{T}}_{xx}^{\mu }+i{n}_{\mu -}^{{\kappa }_{z}}{\kappa }_{z}\end{array}\right)\left(\begin{array}{l}{E}_{+}\\ {E}_{-}\end{array}\right)\end{array}$$
(70)

within the media μ = 1 or 2, where n± depends on the sign κz, and κz = − 1 for incident and transmitted light, while κz = 1 for reflected light. Here,

$${\tilde{T}}_{xx} =\frac{1}{3}({G}_{xx}-{G}_{zz})-\frac{1}{6}\omega ({a}_{yzx}^{{\prime} }-{a}_{zyx}^{{\prime} })-i\left[{G}_{xx}^{{\prime} }-\frac{1}{2}\omega ({a}_{yzx}-{a}_{zyx})\right]\\ =\frac{i}{3}{\sigma }_{zxy}-\frac{1}{2}{\sigma }_{xyz}^{{\prime} }.$$
(71)

As we consider light incident from medium 1 to medium 2, the electric field in medium 1 consists of incident and reflected fields while that in medium 2 is the transmitted field.

$${{{{{{{{\bf{E}}}}}}}}}_{1}= \, {{{{{{{{\bf{E}}}}}}}}}^{i}+{{{{{{{{\bf{E}}}}}}}}}^{r}\equiv (1+r){{{{{{{{\bf{E}}}}}}}}}^{i},\\ {{{{{{{{\bf{E}}}}}}}}}_{2}= \, {{{{{{{{\bf{E}}}}}}}}}^{t}\equiv t{{{{{{{{\bf{E}}}}}}}}}^{i},$$
(72)

where

$$\begin{array}{l}r=\left(\begin{array}{ll}{r}_{++ }&{r}_{++ }\\ {r}_{-+}&{r}_{--}\end{array}\right)=\left(\begin{array}{ll}{r}_{++ }&0\\ 0&{r}_{--}\end{array}\right),\quad t=1+r\end{array}$$
(73)

by C3z symmetry and the continuity of E at the interface.

The H field satisfies the boundary condition

$$\begin{array}{l}{{{{{{{{\bf{H}}}}}}}}}^{t}={{{{{{{{\bf{H}}}}}}}}}^{i}+{{{{{{{{\bf{H}}}}}}}}}^{r}+\hat{z}\times {{{{{{{{\bf{j}}}}}}}}}_{f},\end{array}$$
(74)

where \(\hat{z}\times {{{{{{{{\bf{j}}}}}}}}}_{f}=({e}^{2}/2\pi h)({\theta }_{2}^{(z)}-{\theta }_{1}^{(z)}){{{{{{{\bf{E}}}}}}}}\) is the surface current due to the axion magneto-electric coupling. In terms of B fields, the boundary condition has the form of Eq. (13):

$$\begin{array}{ll}{{{{{{{{\bf{B}}}}}}}}}^{t}&={{{{{{{{\bf{B}}}}}}}}}^{i}+{{{{{{{{\bf{B}}}}}}}}}^{r}+{\mu }_{0}\hat{z}\times {{{{{{{{\bf{j}}}}}}}}}_{s},\end{array}$$
(75)

where \({{{{{{{{\bf{j}}}}}}}}}_{s}={{{{{{{{\bf{j}}}}}}}}}_{f}+({\tilde{T}}_{xx}^{2}-{\tilde{T}}_{xx}^{1}){{{{{{{\bf{E}}}}}}}}\times \hat{z}\) is the total two-dimensional surface current density. By solving the boundary condition, we obtain

$${r}_{++ }= \, \frac{{n}_{1L}-{n}_{2L}-i{\mu }_{0}c{\sigma }_{xy}^{s}}{{n}_{1R}+{n}_{2L}+i{\mu }_{0}c{\sigma }_{xy}^{s}},\\ {r}_{--}= \, \frac{{n}_{1R}-{n}_{2R}+i{\mu }_{0}c{\sigma }_{xy}^{s}}{{n}_{1L}+{n}_{2R}-i{\mu }_{0}c{\sigma }_{xy}^{s}},\\ {r}_{+-}= \, {r}_{-+}=0,$$
(76)

where \({n}_{\mu L}={n}_{\mu+}^{-}={n}_{\mu -}^{+}\) is the refractive index for left circularly polarization, and \({n}_{\mu R}={n}_{\mu -}^{-}={n}_{\mu+}^{+}\) is the refractive index for the right circularly polarization, and

$${\sigma }_{xy}^{s}={\tilde{G}}_{xx}^{(z),2}-{\tilde{G}}_{xx}^{(z),1}={\tilde{T}}_{xx}^{\mu=2}-{\tilde{T}}_{xx}^{\mu=1}+{\sigma }_{xy}^{f}$$
(77)

is the two-dimensional surface conductivity, and \({\sigma }_{xy}^{f}=({e}^{2}/2\pi h)({\theta }_{2}^{(z)}-{\theta }_{1}^{(z)})\). From the expressions of r++ and r−− and Eq. (69), we obtain the Kerr angle

$${\varphi }_{K}={\tan }^{-1}\frac{{r}_{xy}}{{r}_{xx}}={\tan }^{-1}\left[\frac{-i({r}_{++ }-{r}_{--})}{{r}_{++ }+{r}_{--}}\right].$$
(78)

In non-magnetic systems where \({G}_{ij}={a}_{ijk}^{{\prime} }=0\), the Kerr angle vanishes systems because

$${r}_{++ }-{r}_{--}\propto i{\mu }_{0}c(\sigma _{xyz}^{\prime \mu=2}-\sigma _{xyz}^{\prime\mu=1})+{n}_{2R}-{n}_{2L}-({n}_{1R}-{n}_{1L})=0,$$
(79)

where we use that \({\tilde{T}}_{xx}^{\mu }=-\sigma _{xyz}^{\prime\mu }/2\) when σijk = 0. One may think of this cancellation as a compensation between bulk and surface responses. The refractive indices are responsible for circular birefringence in the bulk, while \(\delta {\sigma }_{xyz}^{{\prime} }\) is responsible for the surface current that leads to the jump of B field at the surface. Their effects cancel such that there is no net polar Kerr rotation (i.e., no Kerr rotation at normal incidence), compatible with the reciprocal relation imposed by time reversal symmetry33,51,52,53,54.

In contrast, a nonzero polar Kerr rotation PT-symmetric antiferromagnets occurs because no compensation occurs due to the absence of circular birefringence in the bulk. By imposing PT symmetry and breaking T symmetry, we get a nonzero Kerr angle from

$${r}_{xx}= \, \frac{1}{2}({r}_{++ }+{r}_{--})=\frac{({n}_{1}-{n}_{2})({n}_{1}+{n}_{2})-{({\mu }_{0}c{\sigma }_{xy}^{s})}^{2}}{{({\mu }_{0}c{\sigma }_{xy}^{s})}^{2}+{({n}_{1}+{n}_{2})}^{2}},\\ {r}_{xy}= \, \frac{1}{2i}({r}_{++ }-{r}_{--})=-\frac{2{n}_{1}({\mu }_{0}c{\sigma }_{xy}^{s})}{{({\mu }_{0}c{\sigma }_{xy}^{s})}^{2}+{({n}_{1}+{n}_{2})}^{2}}.$$
(80)

Note that our derivation using the H field make it manifest that the magneto-optic reflection has two different origins: bulk-propagation and surface effects, respectively contained in Txx and θ(z).

Reflection and transmission from two interfaces

Now we consider three media with (nμ, mμ), where μ = 1, 2, 3, as shown in Fig. 3b. We consider the limit where the wavelength of light λ is much larger than the sample thickness d and neglect the variation of the electric field between the top and bottom within the sample.

The Jones reflection matrix of the sample for light incident from medium 1 is then

$$r= \, {r}_{T}+{t}_{T}^{{\prime} }{e}^{i\phi }{r}_{B}{e}^{i\phi }{t}_{T}+{t}_{T}^{{\prime} }{e}^{i\phi }{r}_{B}({e}^{i\phi }{r}_{T}^{{\prime} }{e}^{i\phi }{r}_{B}){e}^{i\phi }{t}_{T}+{t}_{T}^{{\prime} }{e}^{i\phi }{r}_{B}{({e}^{i\phi }{r}_{T}^{{\prime} }{e}^{i\phi }{r}_{B})}^{2}{e}^{i\phi }{t}_{T}+\ldots \\= \, {r}_{T}+{e}^{2i\phi }{t}_{T}^{{\prime} }{r}_{B}{\left(1-{e}^{2i\phi }{r}_{T}^{{\prime} }{r}_{B}\right)}^{-1}{t}_{T}\\= \, {r}_{T}+{e}^{2i\phi }(1+{r}_{T}^{{\prime} }){r}_{B}{\left(1-{e}^{2i\phi }{r}_{T}^{{\prime} }{r}_{B}\right)}^{-1}(1+{r}_{T}),$$
(81)

where we used tT,B = 1 + rT,B and \({t}_{T}^{{\prime} }=1+{r}_{T}^{{\prime} }\), and

$$\phi=\frac{{n}_{2}\omega d}{c}$$
(82)

is the complex-valued phase obtained by the propagation across the sample. Here, the subscript for r and t indicates the top and bottom of the sample, and the prime indicates the process where the light propagation is reversed (we follow the notation in ref. 13).

For transmission, the Jones matrix is

$$t= \, {t}_{B}{e}^{i\phi }{t}_{T}+{t}_{B}({e}^{i\phi }{r}_{T}^{{\prime} }{e}^{i\phi }{r}_{B}){e}^{i\phi }{t}_{T}+{t}_{B}({e}^{i\phi }{r}_{T}^{{\prime} }{e}^{i\phi }{r}_{B})({e}^{i\phi }{r}_{T}^{{\prime} }{e}^{i\phi }{r}_{B}){t}_{T}+\ldots \\= \, {t}_{B}{\left(1-{e}^{2i\phi }{r}_{T}^{{\prime} }{r}_{B}\right)}^{-1}{e}^{i\phi }{t}_{T}\\= \, (1+{r}_{B}){\left(1-{e}^{2i\phi }{r}_{T}^{{\prime} }{r}_{B}\right)}^{-1}{e}^{i\phi }(1+{r}_{T}).$$
(83)

The expressions above show that both r and t can be obtained obtained from the reflective Jones matrices at the top and bottom interfaces. We suppose that each media has PT, C3z-, and MxT symmetries, which is the setup in main text. Then we have

$${({r}_{T})}_{xx}= \, \frac{({n}_{1}-{n}_{2})({n}_{1}+{n}_{2})-{({\mu }_{0}c{\sigma }_{xy}^{T})}^{2}}{{({\mu }_{0}c{\sigma }_{xy}^{T})}^{2}+{({n}_{1}+{n}_{2})}^{2}},\\ {({r}_{T})}_{xy}= \, -\frac{2{n}_{1}({\mu }_{0}c{\sigma }_{xy}^{T})}{{({\mu }_{0}c{\sigma }_{xy}^{s})}^{2}+{({n}_{1}+{n}_{2})}^{2}},\\ {({r}_{T}^{{\prime} })}_{xx}= \, \frac{-({n}_{1}-{n}_{2})({n}_{1}+{n}_{2})-{({\mu }_{0}c{\sigma }_{xy}^{T})}^{2}}{{({\mu }_{0}c{\sigma }_{xy}^{T})}^{2}+{({n}_{1}+{n}_{2})}^{2}},\\ {({r}_{T}^{{\prime} })}_{xy}= \, -\frac{2{n}_{2}({\mu }_{0}c{\sigma }_{xy}^{T})}{{({\mu }_{0}c{\sigma }_{xy}^{T})}^{2}+{({n}_{1}+{n}_{2})}^{2}},\\ {({r}_{B})}_{xx}= \, \frac{({n}_{2}-{n}_{3})({n}_{2}+{n}_{3})-{({\mu }_{0}c{\sigma }_{xy}^{B})}^{2}}{{({\mu }_{0}c{\sigma }_{xy}^{B})}^{2}+{({n}_{2}+{n}_{3})}^{2}},\\ {({r}_{B})}_{xy}= \, \frac{2{n}_{2}({\mu }_{0}c{\sigma }_{xy}^{B})}{{({\mu }_{0}c{\sigma }_{xy}^{B})}^{2}+{({n}_{2}+{n}_{3})}^{2}},$$
(84)

where the superscript T or B for the surface Hall conductivity σxy indicates the top and bottom surfaces.

Kerr angle and Stokes parameters

The reflective circular dichroism and Kerr rotation angle can be defined with Stokes parameters si = 0,1,2,3 for reflected light by

$${{{{{{{\rm{RCD}}}}}}}} \equiv \, \frac{{s}_{3}}{{s}_{0}},\\ {\vartheta }_{K} \equiv \, \frac{1}{2}{\tan }^{-1}\frac{{s}_{2}}{{s}_{1}},$$
(85)

where Stokes parameters for reflected light are

$${s}_{0}= \, {I}_{+}^{r}+{I}_{-}^{r},\\ {s}_{1}= \, {I}_{x}^{r}-{I}_{y}^{r},\\ {s}_{2}= \, {I}_{x+y}^{r}-{I}_{x-y}^{r},\\ {s}_{3}= \, {I}_{+}^{r}-{I}_{-}^{r},$$
(86)

where \(\hat{\pm }=\hat{x}\pm i\hat{y}\). While this definition of the complex Kerr angle is different from the more popular ϕK = rxy/rxx = φK + iηK we use in the main text, it has the advantage that it can be obtained simply by measuring the intensity of the linearly polarized light. In our case where C3z symmetry is present, the reflective coefficients in circularly polarized basis are

$${r}_{++ }= \, \frac{1}{2}\left[{r}_{xx}+{r}_{yy}+i\big({r}_{xy}-{r}_{yx}\big)\right]={r}_{xx}+i{r}_{xy},\\ {r}_{--}= \, \frac{1}{2}\left[{r}_{xx}+{r}_{yy}-i\big({r}_{xy}-{r}_{yx}\big)\right]={r}_{xx}-i{r}_{xy},\\ {r}_{+-}= \, \frac{1}{2}\left[{r}_{xx}-{r}_{yy}-i\big({r}_{xy}+{r}_{yx}\big)\right]=0,\\ {r}_{-+}= \, \frac{1}{2}\left[{r}_{xx}-{r}_{yy}+i\big({r}_{xy}+{r}_{yx}\big)\right]=0.$$
(87)

It follows that \({{{{{{{\rm{RCD}}}}}}}}=2{{{{{{{\rm{Im}}}}}}}}[{r}_{xx}^{*}{r}_{xy}]/(|{r}_{xx}{|}^{2}+|{r}_{xy}{|}^{2})\simeq 2{{{{{{{\rm{Im}}}}}}}}[{r}_{xy}/{r}_{xx}]=2{\eta }_{K}\) and \({\vartheta }_{K}=\frac{1}{2}{\tan }^{-1}[2{{{{{{{\rm{Re}}}}}}}}({r}_{xx}{r}_{yx}^{*})/(|{r}_{xx}{|}^{2}-|{r}_{xy}{|}^{2})]\simeq {{{{{{{\rm{Re}}}}}}}}[{r}_{xy}/{r}_{xx}]={\varphi }_{K}\) for small rxy/rxx.

Tight-binding model of MnBi2Te4

We begin with the lattice version of the three-dimensional low-energy model of MnBi2Te4 at Γ = (0, 0, 0) presented in ref. 36. We consider the four basis states.

$$\begin{array}{l}\left|P{1}_{z}^{+},\uparrow \right\rangle,\quad \left|P{2}_{z}^{-},\uparrow \right\rangle,\quad \left|P{1}_{z}^{+},\downarrow \right\rangle,\quad \left|P{2}_{z}^{-},\downarrow \right\rangle,\end{array}$$
(88)

where P1 and P2 states originates from the p orbitals in Bi and Te, respectively, the sign ± indicates the inversion parities, and the arrows indicate the spin-z direction. The symmetry operators in the nonmagnetic state are time reversal T, inversion P, threefold rotation C3z, and twofold rotation C2x.

$$\begin{array}{l}T=i{s}_{y}K,\quad P={\tau }_{z},\quad {C}_{3z}=\exp \left[-i\frac{\pi }{3}{s}_{z}\right],\quad {C}_{2x}=-i{s}_{x}.\end{array}$$
(89)

Because of PT = isyτzK symmetry, only the following five Gamma matrices are allowed in the Hamiltonian in addition to the overall energy shift proportional to the identity matrix Γ0.

$${{{\Gamma }}}_{1}= \, {s}_{x}{\tau }_{x}(+,-),\\ {{{\Gamma }}}_{2}= \, {s}_{y}{\tau }_{x} (-,-),\\ {{{\Gamma }}}_{3}= \, {s}_{z}{\tau }_{x}(-,-),\\ {{{\Gamma }}}_{4}= \, {s}_{0}{\tau }_{y} (+,-),\\ {{{\Gamma }}}_{5}= \, {s}_{0}{\tau }_{z} (+,+).$$
(90)

Here, the pair signs show the commutation (+) and anticommutation (−) relations with C2x and P in order, i.e., \(({\epsilon }_{{C}_{2x}},{\epsilon }_{P})\) where MΓi = ϵMΓiM for i = 1, …, 5.

The low-energy effective Hamiltonian up to second order in k has the form

$${h}_{{{{{{{{\rm{eff}}}}}}}}}= \, \epsilon ({{{{{{{\bf{k}}}}}}}}){s}_{0}{\tau }_{0}+{A}_{1}{k}_{z}{s}_{z}{\tau }_{x}+{A}_{2}({k}_{x}{s}_{x}+{k}_{y}{s}_{y}){\tau }_{x}+M({{{{{{{\bf{k}}}}}}}}){\tau }_{z}\\= \, \epsilon ({{{{{{{\bf{k}}}}}}}}){{{\Gamma }}}_{0}+{A}_{1}{k}_{z}{{{\Gamma }}}_{3}+{A}_{2}({k}_{x}{{{\Gamma }}}_{1}+{k}_{y}{{{\Gamma }}}_{2})+M({{{{{{{\bf{k}}}}}}}}){{{\Gamma }}}_{5},$$
(91)

where

$$\epsilon ({{{{{{{\bf{k}}}}}}}})= \, {C}_{0}+{C}_{1}{k}_{z}^{2}+{C}_{2}({k}_{x}^{2}+{k}_{y}^{2}),\\ M({{{{{{{\bf{k}}}}}}}})= \, {M}_{0}+{M}_{1}{k}_{z}^{2}+{M}_{2}({k}_{x}^{2}+{k}_{y}^{2}).$$
(92)

Let us find a tight-binding Hamiltonian on the two-dimensional triangular lattice that leads to the above low-energy Hamiltonian. We suppose that each lattice site has four degrees of freedom given by Eq. (88).

$$\begin{array}{l}{\hat{H}}_{{{{{{{{\rm{TB}}}}}}}}}=\mathop{\sum}\limits_{i,\alpha,\beta }{\hat{c}}_{i\alpha }^{{{{\dagger}}} }{({h}_{0})}_{\alpha \beta }{\hat{c}}_{i\beta }-\mathop{\sum}\limits_{\left\langle i,j\right\rangle,\alpha,\beta }{\hat{c}}_{i\alpha }^{{{{\dagger}}} }{t}_{\alpha \beta }^{ij}{\hat{c}}_{j\beta }.\end{array}$$
(93)

Here, the onsite Hamiltonian satisfying all the symmetries of the nonmagnetic state is

$$\begin{array}{l}{h}_{0}={e}_{0}{{{\Gamma }}}_{0}+{e}_{5}{{{\Gamma }}}_{5}.\end{array}$$
(94)

Along the z direction, the nearest-neighbor hopping matrices are

$$\begin{array}{l}{T}_{4}\equiv {t}^{j+{{{{{{{{\bf{a}}}}}}}}}_{4},j}={t}_{0}^{z}{{{\Gamma }}}_{0}+i{t}_{3}^{z}{{{\Gamma }}}_{3}+{t}_{5}^{z}{{{\Gamma }}}_{5},\end{array}$$
(95)

where a4 = (0, 0, az), az is the inter-layer lattice parameter. the form of T4 is constrained by the following symmetry conditions

$$T{T}_{4}{T}^{-1}= \, {T}_{4},\\ {C}_{3z}{T}_{4}{C}_{3z}^{-1}= \, {T}_{4},\\ {C}_{2x}{T}_{4}{C}_{2x}^{-1}= \, {T}_{4}^{{{{\dagger}}} },\\ P{T}_{4}{P}^{-1}= \, {T}_{4}^{{{{\dagger}}} },$$
(96)

For the in-plane directions, the hopping matrices are

$${T}_{1} \equiv \, {t}^{j+{{{{{{{{\bf{a}}}}}}}}}_{1},j}={t}_{0}{{{\Gamma }}}_{0}+i{t}_{1}{{{\Gamma }}}_{1}+i{t}_{4}{{{\Gamma }}}_{4}+{t}_{5}{{{\Gamma }}}_{5}={({t}^{j,j+{{{{{{{{\bf{a}}}}}}}}}_{1}})}^{{{{\dagger}}} },\\ {T}_{2} \equiv \, {t}^{j+{{{{{{{{\bf{a}}}}}}}}}_{2},j}={C}_{3z}{T}_{1}{C}_{3z}^{-1}={t}_{0}{{{\Gamma }}}_{0}+i{t}_{1}\frac{-{{{\Gamma }}}_{1}+\sqrt{3}{{{\Gamma }}}_{2}}{2}+i{t}_{4}{{{\Gamma }}}_{4}+{t}_{5}{{{\Gamma }}}_{5}={({t}^{j,j+{{{{{{{{\bf{a}}}}}}}}}_{2}})}^{{{{\dagger}}} },\\ {T}_{3} \equiv \, {t}^{j+{{{{{{{{\bf{a}}}}}}}}}_{3},j}={C}_{3z}{T}_{2}{C}_{3z}^{-1}={t}_{0}{{{\Gamma }}}_{0}+i{t}_{1}\frac{-{{{\Gamma }}}_{1}-\sqrt{3}{{{\Gamma }}}_{2}}{2}+i{t}_{4}{{{\Gamma }}}_{4}+{t}_{5}{{{\Gamma }}}_{5}={({t}^{j,j+{{{{{{{{\bf{a}}}}}}}}}_{3}})}^{{{{\dagger}}} }$$
(97)

along the in-plane directions, where a1 = (a, 0, 0), a2 = C3za1, a3 = C3za2, a is the in-plane lattice parameter, and the form of T1 is constrained by the following symmetry conditions

$$T{T}_{1}{T}^{-1}= \, {T}_{1},\\ {C}_{2x}{T}_{1}{C}_{2x}^{-1}= \, {T}_{1},\\ P{T}_{1}{P}^{-1}= \, {T}_{1}^{{{{\dagger}}} },$$
(98)

and Tis are related by C3z because of C3z symmetry imposing, for example,

$${\hat{c}}_{j+{{{{{{{{\bf{a}}}}}}}}}_{2}\alpha }^{{{{\dagger}}} }{({T}_{2})}_{\alpha \beta }{\hat{c}}_{j\beta }= \, {\hat{C}}_{3z}{\hat{c}}_{j+{{{{{{{{\bf{a}}}}}}}}}_{1}\alpha }^{{{{\dagger}}} }{({T}_{1})}_{\alpha \beta }{\hat{c}}_{j\beta }{\hat{C}}_{3z}^{-1}\\= \, \left({\hat{C}}_{3z}{\hat{c}}_{j+{{{{{{{{\bf{a}}}}}}}}}_{1}\alpha }^{{{{\dagger}}} }{\hat{C}}_{3z}^{-1}\right){({T}_{1})}_{\alpha \beta }\left({\hat{C}}_{3z}{\hat{c}}_{j\beta }{\hat{C}}_{3z}^{-1}\right)\\= \, {\hat{c}}_{j+{{{{{{{{\bf{a}}}}}}}}}_{2}\gamma }^{{{{\dagger}}} }{({C}_{3z})}_{\gamma \alpha }{({T}_{1})}_{\alpha \beta }{\hat{c}}_{j\delta }{({C}_{3z}^{*})}_{\delta \beta }\\= \, {\hat{c}}_{j+{{{{{{{{\bf{a}}}}}}}}}_{2}\alpha }^{{{{\dagger}}} }{({C}_{3z}{T}_{1}{C}_{3z}^{-1})}_{\alpha \beta }{\hat{c}}_{j\beta }.$$
(99)

In momentum space, the tight-binding Hamiltonian becomes

$${h}_{{{{{{{{\rm{TB}}}}}}}}}= \, {h}_{0}-({T}_{1}{e}^{-i{k}_{1}a}+{T}_{2}{e}^{-i{k}_{2}a}+{T}_{3}{e}^{-i{k}_{3}a}+{T}_{4}{e}^{-i{k}_{4}{a}_{z}}+h.c.)\\= \, \left[{e}_{0}-2{t}_{0}(\cos {k}_{1}a+\cos {k}_{2}a+\cos {k}_{3}a)-2{t}_{0}^{z}\cos {k}_{4}{a}_{z}\right]{{{\Gamma }}}_{0}\\ -{t}_{1}(2\sin {k}_{1}a-\sin {k}_{2}a-\sin {k}_{3}a){{{\Gamma }}}_{1}-\sqrt{3}{t}_{1}(\sin {k}_{2}a \\ -\sin {k}_{3}a){{{\Gamma }}}_{2}-2{t}_{3}^{z}\sin {k}_{4}{a}_{z}{{{\Gamma }}}_{3}\\ -2{t}_{4}(\sin {k}_{1}a+\sin {k}_{2}a+\sin {k}_{3}a){{{\Gamma }}}_{4} \\ + \left[{e}_{5}-2{t}_{5}(\cos {k}_{1}a+\cos {k}_{2}a+\cos {k}_{3}a)-2{t}_{5}^{z}\cos {k}_{4}{a}_{z}\right]{{{\Gamma }}}_{5},$$
(100)

where

$${k}_{1}= {k}_{x},\\ {k}_{2}= \, \frac{1}{2}\left(-{k}_{x}+\sqrt{3}{k}_{y}\right),\\ {k}_{3}= \, \frac{1}{2}\left(-{k}_{x}-\sqrt{3}{k}_{y}\right),\\ {k}_{4}= \, {k}_{z}.$$
(101)

By expanding the tight-binding Hamiltonian up to second order in k, we obtain

$${h}_{{{{{{{{\rm{TB}}}}}}}}}= \left[{e}_{0}-6{t}_{0}-2{t}_{0}^{z}+\frac{3}{2}{t}_{0}{a}^{2}({k}_{x}^{2}+{k}_{y}^{2})+{t}_{0}^{z}{a}_{z}^{2}{k}_{z}^{2}\right]{{{\Gamma }}}_{0} \\ -3{t}_{1}a{k}_{x}{{{\Gamma }}}_{1}-3{t}_{1}a{k}_{y}{{{\Gamma }}}_{2}-2{t}_{3}^{z}a{k}_{z}{{{\Gamma }}}_{2}\\ + \left[{e}_{5}-6{t}_{5}-2{t}_{5}^{z}+\frac{3}{2}{t}_{5}{a}^{2}({k}_{x}^{2}+{k}_{y}^{2})+{t}_{5}^{z}{a}_{z}^{2}{k}_{z}^{2}\right]{{{\Gamma }}}_{5}+O({k}^{3}).$$
(102)

Comparing this with heff, we find

$${e}_{0}= \, {C}_{0}+2{C}_{1}/{a}_{z}^{2}+4{C}_{2}/{a}^{2},\\ {e}_{5}= \, {M}_{0}+2{M}_{1}/{a}_{z}^{2}+4{M}_{2}/{a}^{2},\\ {t}_{0}= \, \frac{2{C}_{2}}{3{a}^{2}},\\ {t}_{0}^{z}= \, \frac{{C}_{1}}{{a}_{z}^{2}},\\ {t}_{1}= \, -\frac{{A}_{2}}{3a},\\ {t}_{3}^{z}= \, -\frac{{A}_{1}}{2{a}_{z}},\\ {t}_{5}= \, \frac{2{M}_{2}}{3{a}^{2}},\\ {t}_{5}^{z}= \, \frac{{M}_{1}}{{a}_{z}^{2}}.$$
(103)

We use the parameters derived in ref. 36 with a modification, C2 = 0, we take in order to obtain an insulating filling:

$${C}_{0}= \, -0.0048\ {{{{{{{\rm{eV}}}}}}}},\\ {C}_{1}= \, 2.7232\ {{{{{{{\rm{eV}}}}}}}} \AA ^{2},\\ {C}_{2}= \, 0\ {{{{{{{\rm{eV}}}}}}}} \AA ^{2},\\ {M}_{0}= \, -0.1165\ {{{{{{{\rm{eV}}}}}}}},\\ {M}_{1}= \, 11.9048\ {{{{{{{\rm{eV}}}}}}}} \AA ^{2},\\ {M}_{2}= \, 9.4048\ {{{{{{{\rm{eV}}}}}}}} \AA ^{2},\\ {A}_{1}= \, 2.7023\ {{{{{{{\rm{eV}}}}}}}} \AA,\\ {A}_{2}= \, 3.1964\ {{{{{{{\rm{eV}}}}}}}} \AA,\\ a= \, 4.334\ \AA,\\ {a}_{z}= \, \frac{1}{3}c=\frac{1}{3}40.91\; \AA=13.64\ \AA .$$
(104)

The spin part is described by

$${h}_{{{{{{{{\rm{spin}}}}}}}}}=-{\mu }_{B}{{{{{{{\bf{B}}}}}}}}\cdot {{{{{{{\bf{s}}}}}}}}{\tau }_{0},$$
(105)

where the Bohr magneton is

$${\mu }_{B}=3.8099\frac{e}{\hslash }\ {{{{{{{\rm{eV}}}}}}}} \AA ^{2}.$$
(106)

We consider the antiferromagnetic state of a few-layer MnBi2Te4 with layer-alternating out-of-plane moments. The antiferromagnetic moment is described by

$${h}_{{{{{{{{\rm{AFM}}}}}}}}}=m{l}_{z}{s}_{z}{\tau }_{0},$$
(107)

where lz is a Pauli matrix in the sublattice (i.e., layer) space. We use m = 0.03 eV for our calculations. The full 8 × 8 tight-binding Hamiltonian in momentum space is then

$${h}_{{{{{{{{\rm{TB}}}}}}}}}= \, {h}_{{{{{{{{\rm{para}}}}}}}}}+{h}_{{{{{{{{\rm{AFM}}}}}}}}}\\= \, \left[{e}_{0}-2{t}_{0}(\cos {k}_{1}a+\cos {k}_{2}a+\cos {k}_{3}a)\right]{l}_{0}{{{\Gamma }}}_{0}-2{t}_{0}^{z}\cos {k}_{4}{a}_{z}{l}_{x}{{{\Gamma }}}_{0}\\ -{t}_{1}(2\sin {k}_{1}a-\sin {k}_{2}a-\sin {k}_{3}a){l}_{0}{{{\Gamma }}}_{1}-\sqrt{3}{t}_{1}(\sin {k}_{2}a\\ -\sin {k}_{3}a){l}_{0}{{{\Gamma }}}_{2}+2{t}_{3}^{z}\sin {k}_{4}{a}_{z}{l}_{y}{{{\Gamma }}}_{3} -2{t}_{4}(\sin {k}_{1}a+\sin {k}_{2}a+\sin {k}_{3}a){l}_{0}{{{\Gamma }}}_{4}\\ + \left[{e}_{5}-2{t}_{5}(\cos {k}_{1}a+\cos {k}_{2}a+\cos {k}_{3}a)\right]{{{\Gamma }}}_{5}-2{t}_{5}^{z}\cos {k}_{4}{a}_{z}{l}_{x}{{{\Gamma }}}_{5}+m{l}_{z}{{{\Gamma }}}_{12},$$
(108)

where Γ12 = Γ1Γ2/2i = szτ0.