1 Introduction

Among the three neutrinos in nature, the tau-neutrino is the least studied one. Although the existence of \(\nu _\tau \) had been established by the precise measurement of the Z boson invisible decay width, its direct detection (i.e., detection of \(\tau \) from the Charged Current (CC) interaction of \(\nu _\tau \)) was announced only in the early 21st century by the DONUT experiment at FermiLAB [1]. Indeed, the \(\nu _\tau \) data sample does not exceed \(\sim 21\) events, consisting of the 9 DONUT events [2], the 10 \(\nu _\tau \) events registered by OPERA long baseline experiment [3] and two high energy cosmic \(\nu _\tau \) candidate events by ICECUBE [4]. The main reason why registering \(\nu _\tau \) events is so difficult is that the produced \(\tau \) at low energies is too short-lived to lead to a discernible track. Moreover, the conventional sources for neutrinos such as nuclear beta processes, muon decay or pion and Kaon decay produce only neutrinos of the first or second generations. The \(\nu _\tau \) detected by OPERA comes from the oscillation of \(\nu _\mu \) produced at CERN SPS en route to the detector at the Gran Sasso underground lab in Italy. Most likely, the high energy cosmic \(\nu _\tau \) flux is also produced from the oscillation of \(\nu _\mu \) and \(\nu _e\) fluxes propagating from the source to the detector. At the DONUT experiment, however, the \(\nu _\tau \) flux was the decay product of heavy mesons.Footnote 1

The FASER\(\nu \) [9] and SND@LHC [10, 11] detectors during the run III of the LHC (2022-2024) will bring about a breakthrough in studying \(\nu _\tau \). FASER\(\nu \) and SND@LHC are dense detectors, designed to detect (and distinguish) all three kinds of neutrinos. These experiments can also be sensitive to a variety of new physics involving dark matter [12,13,14,15,16] or beyond SM interaction of \(\nu _\mu \) [11, 17,18,19,20,21,22,23,24,25].

In this paper, we explore three new scenarios that can lead to the overproduction of the \(\tau \) events at forward experiments, FASER\(\nu \) and SND@LHC. We build models for these scenarios based on adding new scalar doublets to the SM. We show how by imposing global U(1) flavor symmetries, the desired flavor structure of the Yukawa coupling can be obtained. As a bonus, these symmetries can explain the smallness of the first generation leptons and quarks. In each case, we show that how present experimental and observational constraints can be avoided and suggest strategies to test the accompanying prediction of the model by various experiments.

The scenarios are the following: (1) \(\pi ^+\rightarrow \mu ^+\nu _\tau \) with a branching ratio of \(\sim 10^{-3}\). We show that this process can be obtained by adding scalar doublets to the SM such that their charged components are mixed. Despite the stringent bounds from the processes such as \(\tau ^+\rightarrow \mu ^+ \pi ^0\), we show that within our model \(Br(\pi ^+\rightarrow \mu ^+\nu _\tau )\sim 10^{-3}\) can be achieved. (2) \(\pi ^+\rightarrow \mu ^+\bar{\nu }_\tau \) with again \(Br(\pi ^+\rightarrow \mu ^+\bar{\nu }_\tau )\sim 10^{-3}\). The model that we build to embed this scenario involves a singlet charged scalar with an asymmetric coupling to the second and third generation of left-handed leptons. Such a coupling has been proposed in [26] to explain the anomalies observed in the tau decay. (3) \(\tau \) production via \(\nu _e\) (\(\nu _\mu \)) scattering off the matter fields. In the model that we build for this scenario, \(\tau \) and \(\nu _e\) (\(\nu _\mu \)) have a Yukawa coupling with a new scalar doublet. We discuss the present bounds from the NOMAD data on the cross section of this process and then derive improvements that can be brought about by the upcoming FASER\(\nu \) and SND@LHC experiments.

Ref. [22] discusses the bounds to be derived from FASER\(\nu \) on the effective couplings that can lead to processes \(\pi ^+ \rightarrow \mu ^+ \nu _\tau \) and \(\nu _e+\mathrm{nucleus}\rightarrow \tau +X\). The bounds that we have found for FASER\(\nu \) are in good agreement with theirs. We proceed with deriving the shape of the spectrum of \(\tau \) for each scenario and comparing with the background \(\tau \) spectrum within the Standard Model. We show that studying the spectrum during the high luminosity phase of the LHC at FASER\(\nu \) 2 will dramatically increase the sensitivity to new physics. We also discuss the impact of the uncertainty in the prediction of the \(\nu _\tau \) flux within the Standard Model.

We show that the effects of \(\pi ^+ \rightarrow \mu ^+ \bar{\nu }_\tau \) and the \(\tau \) production by \(\nu _e\) can be also described in terms of Charged Current (CC) Non-Standard Interaction (NSI) and the modified coherent source and detector eigenstates. In other words, we build viable models for sizable CC-NSI with observable effects at long baseline experiments such as DUNE.

This paper is organized as follows. In Sects. 2.1, 2.2 and 2.3, we describe the models that give rise to the \(\tau \) excess at forward experiments as mentioned above. We outline the parameter ranges that lead to a sizable excess and discuss their predictions for the CMS and ATLAS, anomalous muon magnetic dipole moment and rare decays of the tau. In Sect. 2.4, we show how the effects predicted by these models can be described within the well-studied formalism of the Charged Current (CC) Non-Standard Interaction (NSI). We show that, thanks to a \(m_\pi ^2/(m_u+m_d)^2\) enhancement, we can obtain sizable CC NSI. We interpret the constraints on the CC NSI as bounds on our model. In Sect. 3, we derive the spectrum of \(\tau \) within each model and discuss how the spectrum can help to discriminate between the Standard Model (SM) background for the \(\tau \) events and the signals. In Sect. 4, we describe the relevant characteristics of the forward experiments of our interest and show that during the run III, FASER\(\nu \) can significantly reduce the uncertainty in the SM prediction for the number of the \(\tau \) events.

In Sect. 5, we discuss the signature of the models in the forward experiments and present our results for the upcoming SND@LHC and FASER\(\nu \) experiments as well as for the FASER\(\nu \) upgrade with higher statistics. A summary and discussion is given in Sect. 6.

2 The model(s)

In this section, we introduce the models for (i) \(\pi ^+\rightarrow \mu ^+ \nu _\tau \); (ii) \(\pi ^+\rightarrow \mu ^+ \bar{\nu }_\tau \) and (iii) the \(\tau \) production by \(\nu _e\) (\(\nu _\mu \)) scattering off the matter fields. In each case, we review the bounds on the parameter space of the model. As we shall discuss, in the lepton number violating model leading to \(\pi ^+\rightarrow \mu ^+ \bar{\nu }_\tau \), a Majorana neutrino mass of \(O(10^{-3})\) eV is induced at the two-loop level which is too tiny to violate the upper bounds on neutrino mass. To explain the neutrino mass, the models should be extended to include a mechanism such as seesaw.

We then show how the effects of these new models in the neutrino experiments can be described by the coherent \(|\nu ^d\rangle \) and \(|\nu ^s\rangle \) states that have extensively been used in the literature to describe the CC-NSI.

Table 1 The \(U_1(1)\times U_2(1)\) charges of the fields. The rest of the fields, including \(u_R\) and the Higgs are taken neutral under \(U_1(1)\times U_2(1)\)

2.1 A model for \(\pi ^+ \rightarrow \mu ^+ \nu _\tau \)

The \(\pi ^+ \rightarrow \mu ^+ \nu _\tau \) process is constrained by the precision measurement of the ratio \(Br(\pi \rightarrow e \nu )/Br(\pi \rightarrow \mu \nu )\) where \(\nu \) can be any neutral fermion with a mass below 1 MeV that appears as missing energy. Notice that the SM prediction for this ratio is free from the uncertainties in the pion decay constant. The measurement is compatible with the SM prediction to the level of \(2.4\times 10^{-3}\) [27], implying that \(Br(\pi ^+ \rightarrow e^+ \nu _\tau )<2.4\times 10^{-3}Br(\pi ^+ \rightarrow e^+ \nu _e)=2.8\times 10^{-7}\) and \(Br(\pi ^+ \rightarrow \mu ^+ \nu _\tau )<2.4\times 10^{-3}Br(\pi ^+ \rightarrow \mu ^+ \nu _\mu )=2.4\times 10^{-3}\).Footnote 2 Since the bound on \(Br(\pi ^+ \rightarrow e^+ \nu _\tau )\) is too strong to lead to an observable effect at FASER\(\nu \) and other similar experiments, we will only focus on \(\pi ^+ \rightarrow \mu ^+\nu _\tau \).

The effective four-Fermi coupling

$$\begin{aligned} G_{\nu \mu }\left( \bar{\mu }\frac{1-\gamma _5}{2}\nu _\tau \right) \left( \bar{d} \frac{1\pm \gamma _5}{2} u\right) \end{aligned}$$
(1)

leads to

$$\begin{aligned} \varGamma (\pi ^+ \rightarrow \mu ^+ \nu _\tau )=G_{\nu \mu }^2 \frac{m_\pi }{32 \pi } \frac{F_\pi ^2}{(m_u+m_d)^2} (m_\pi ^2-m_\mu ^2)^2. \end{aligned}$$
(2)

With \( G_{\nu \mu } \sim 4\times 10^{-8}~\mathrm{GeV}^{-2}\), \(Br (\pi ^+ \rightarrow \mu ^+ \nu _\tau )\sim 10^{-3}\). Notice that although the \(G_{\nu \mu }\) coupling is chirality-flipping, the angular momentum conservation and the fact that both interactions are short-ranged imply that the polarizations of the muons emitted in \(\pi ^+ \rightarrow \mu ^+ \nu _\tau \) and \(\pi ^+ \rightarrow \mu ^+ \nu _\mu \) are equal. As a result, the precise measurement of the muon polarization [28] does not constrain \(G_{\nu \mu }\).

To obtain the effective coupling in Eq. (1), we introduce two scalar doublets, \(\varPhi _1=(\phi _1^+ \ \phi _1^0)^T\) and \(\varPhi _2=(\phi _2^+ \ \phi _2^0)^T\) with the following Yukawa couplings with the doublets, \(L_\tau =(\nu _\tau \ \tau _L)^T\) and \(Q_1=(u_L \ d_L)^T\):

$$\begin{aligned} \lambda _d \bar{d} \varPhi _1^\dagger Q_1+\lambda _u \bar{u} \varPhi _1^T c Q_1+\lambda _\mu \bar{\mu }\varPhi _2^\dagger L_\tau +\mathrm{H.c.} , \end{aligned}$$
(3)

where c is an asymmetric matrix with \(c_{12}=-c_{21}=1\). If \(\varPhi _1\) is identified with \(\varPhi _2\) or if the neutral components of these two doublets are mixed, the effective LFV \( G_\pi (\bar{\mu }_R \tau _L)(\bar{u}\gamma _5 u -\bar{d}\gamma _5 d)\) and \({G_\eta } (\bar{\mu }_R \tau _L)(\bar{u}\gamma _5 u +\bar{d}\gamma _5 d)\) couplings can be obtained by integrating out the heavy states. \(G_\eta \) and \(G_\pi \) will be respectively proportional to \(\lambda _u+\lambda _d\) and \(\lambda _u-\lambda _d\). These effective couplings lead to \(\tau \rightarrow \mu \pi ^0\) and \(\tau \rightarrow \mu \eta ^0\) which are severely constrained [28] and set bounds: \( G_\pi < 5 \times 10^{-9} ~~\mathrm{GeV}^{-2} \) and \( G_\eta <4\times 10^{-10}~~\mathrm{GeV}^{-2}.\) To obtain \(Br(\pi ^+ \rightarrow \mu ^+ \nu _\tau )\sim 10^{-3}\), we therefore need \(G_{\nu \mu }\gg G_\pi , G_\eta \). This in turn implies \(\varPhi _1 \ne \varPhi _2\). Moreover, the mixing between the neutral components of \(\varPhi _1\) and \(\varPhi _2\) should be much smaller than that between their charged components.Footnote 3

To explain the flavor structure of the Yukawa couplings and to simplify the Lagrangian by removing unwanted terms, we impose an approximate \(U_1(1)\times U_2(1)\) global symmetry. The \(U_1(1)\times U_2(1)\) charges of the relevant fields are shown in Table 1. The rest of the SM fields are neutral under this new \(U_1(1)\times U_2(1)\).

With this assignment, \(\lambda _d\ne 0\) but \(\lambda _u=0\) so our analysis will be simplified. Notice that the Yukawa couplings of u and d to the SM Higgs breaks the \(U_1(1)\) symmetry so the smallness of the u and d masses can be explained as a bonus in this model. We can proceed with assigning unequal \(U_1(1)\times U_2(1)\) charges to \(e_R\) and \(L_e\) to also explain the lightness of the first generation of leptons but this is not the main goal of the present paper. The \(U_1(1)\times U_2(1)\) symmetry explains the flavor structure of the Yukawa couplings and forbids mixing terms between \(\varPhi _1\) and \(\varPhi _2\) such as \(\varPhi _1^\dagger \varPhi _2\), \(|H|^2\varPhi _1^\dagger \varPhi _2\) and \((H^\dagger \varPhi _1)(\varPhi _2^\dagger H)\). As a result, \(\phi _1^0\) and \(\phi _2^0\) will not be mixed, preventing \(\tau \rightarrow \mu \pi ^0\).

The mixing between \(\phi _1^+\) and \(\phi _2^+\), which is required to obtain \(\pi ^+\rightarrow \mu ^+ \nu _\tau \), breaks \(U_1(1)\times U_2(1)\). After electroweak symmetry breaking, we can obtain such a mixing between the charged components of \(\varPhi _1\) and \(\varPhi _2\) without mixing their neutral components via the following term

$$\begin{aligned} \lambda _{12}(H^T c\varPhi _1)(\varPhi _2^\dagger cH^*) . \end{aligned}$$
(4)

Notice that this term explicitly breaks the global \(U_1(1)\times U_2(1)\) to a single U(1) under which \(\varPhi _1\) and \(\varPhi _2\) have equal charges. The effective coupling \(G_{\nu \mu }\) can be written as

$$\begin{aligned} G_{\nu \mu }= & {} \frac{\lambda _\mu \lambda _d}{m_{\phi _1^+}^2} \frac{\lambda _{12}v^2/2}{m_{\phi _2^+}^2}\nonumber \\= & {} 4 \times 10^{-8} ~\mathrm{GeV}^{-2}\frac{\lambda _\mu }{0.3} \frac{\lambda _d}{0.3} \frac{\lambda _{12}}{0.12}\frac{(300~\mathrm{GeV})^2}{m_{\phi _1^+}^2}\frac{(300~\mathrm{GeV})^2}{m_{\phi _2^+}^2}.\nonumber \\ \end{aligned}$$
(5)

The \(\lambda _\mu \) coupling can also give a contribution to \((g-2)_\mu \) of \(\varDelta a_\mu \sim \lambda _\mu ^2 m_\mu ^2/(100 \pi ^2 m_{\varPhi _2}^2)\) [29,30,31,32]. In order to account for the \((g-2)_\mu \) anomaly [33,34,35,36,37,38] with \(m_{\varPhi _2}\sim 300\) GeV, \(\lambda _\mu \) should saturate the perturbativity bound: \(\lambda _\mu \sim 3\). In fact, this is a general feature of the models that explain the \((g-2)_\mu \) anomaly with new Yukawa coupling [39]. To maintain \(G_{\nu \mu }\sim 4 \times 10^{-8}\) GeV\(^{-2}\), we can decrease \(\lambda _{12}\) by one order of magnitude. The smallness of \(\lambda _{12}\) can be explained by \(U_1(1)\times U_2(1)\rightarrow U(1)\).

The components of \(\varPhi _2\) can be pair produced at the LHC via the electroweak interactions. They will subsequently decay as \(\phi _2^0 \rightarrow \mu ^+\tau ^-\) and \(\phi _2^+ \rightarrow \mu ^+\nu _\tau \). The components of \(\varPhi _1\) can also be pair produced via the electroweak interactions. Moreover, the \(\bar{d}+u\) and \(\bar{d}+d\) scatterings can respectively produce \(\phi _1^+\) and \(\phi _1^0\) in association with the gluon. The \(\varPhi _1\) components will subsequently decay into a pair of jets. Through the mixing between \(\varPhi _1\) and \(\varPhi _2\), the electroweak interaction can also produce \(\varPhi _1 \varPhi _2\) pairs. Moreover, the mixing can lead to the leptonic (hadronic) decay modes for \(\varPhi _1\) (\(\varPhi _2\)). These effects are however subdominant and further suppressed by \(O[(\lambda _{12} v^2/m_{\varPhi _{1,2}}^2)^2]\). The heavier component of \(\varPhi _1\) or \(\varPhi _2\) can also decay into the lighter one and the W boson. The splittings between the two components are however constrained by the oblique parameters [40]. The signature of pair production of the \(\varPhi _1\) as well as single \(\varPhi _1\) production in association of gluon(s) will be multijet signal which suffers from high background. To our best knowledge, \(\phi _1\) heavier than 200 GeV decaying into jets is still unconstrained by the LHC. However, it may be discovered during the high luminosity phase of the LHC. The signatures of the \(\phi _2^+ (\phi _2^0)^\dagger \), \( \phi _2^-\phi _2^0\), \( \phi _2^+\phi _2^-\) and \((\phi _2^0)^\dagger \phi _2^0\) are respectively \(\mu ^+ \nu _\tau \mu ^- \tau ^+\), \(\mu ^- \bar{\nu }_\tau \mu ^+ \tau ^-\), \(\mu ^+ \nu _\tau \mu ^- \bar{\nu }_\tau \) and \(\mu ^- \tau ^+ \mu ^+ \tau ^-\) where the invariant masses of the \(\tau \) and \(\mu \) pairs are equal to \(m_{\phi _2^0}\). To our best knowledge, neither a dedicated search for \(\phi _2^0\) with an arbitrary mass decaying into \(\mu ^+\tau ^-\) nor a search for \(\phi _2^+\) decaying into the muon plus missing energy has been carried out, yet.Footnote 4

Notice that \(\pi ^+ \rightarrow \mu ^+ \nu _\tau \) is enhanced by \(f_\pi ^2/(m_u +m_d)^2\) but the cross section of the \(\nu _\tau \) interaction on the nuclei via the new \(G_{\nu \mu }\) does not enjoy such as enhancement. Moreover, since there is a large background for the (\(\mu \)+jets) signal from the CC interaction of \(\nu _\mu \), we do not need to worry about the impact of \(G_{\nu \mu }\) on the detection.

In Sect. 5, we shall study the bounds from FASER\(\nu \) on \(Br(\pi ^+ \rightarrow \mu ^+ \nu _\tau )\). This scenario could lead to the tau production at the NOMAD detector, too. However, at NOMAD the energies of neutrinos from the pion decay are around 20 GeV so the momentum of the jets recoiling against the produced \(\tau \) would be too low to survive the cuts applied by the NOMAD collaboration to identify the \(\tau \) production [42]. At NOMAD, the neutrino flux with energies higher than 50 GeV was also produced but the production was dominated by the Kaon decay rather than the pion decay. As a result, the exotic decay \(K^+ \rightarrow \mu ^+ \nu _\tau \) can already strongly be constrained by NOMAD. We have therefore focused only on the exotic pion decay in this paper.

2.2 A model for \(\pi ^+ \rightarrow \bar{\nu }_\tau \mu ^+\) with a connection to observed anomalies in \(\tau \) decay

Ref. [26] proposes a model to address the \(2 \sigma \) discrepancy between the observation and the SM prediction in the \(\tau \rightarrow \mu \nu _\tau \bar{\nu }_\mu \) mode [43]. The model is based on the introduction of a new charged singlet \(\varPhi ^+\) heavier than 300 GeV and with an interaction of form

$$\begin{aligned} \mathscr {L}&=-\frac{\lambda _{23}}{2} L_{a,\mu }\varepsilon _{ab} L_{b \tau } \varPhi ^++\mathrm{H.c}\nonumber \\&= -\frac{\lambda _{23}}{2} (\nu _\mu ^T c\tau _L-\mu _L^T c \nu _\tau )\varPhi ^+ +\mathrm{H.c} \end{aligned}$$
(6)

From \(\mathrm{Br}(\tau \rightarrow \mu \nu \nu )/\mathrm{Br}(\tau (\mu )\rightarrow e \nu \nu )\), Ref. [26] finds

$$\begin{aligned} 0.052 \frac{m_{\varPhi ^+}}{300~\mathrm{GeV}}<\lambda _{23}<0.148 \frac{m_{\varPhi ^+}}{300~\mathrm{GeV}}. \end{aligned}$$
(7)

The \(\lambda _{23}\) coupling can also give rise to \((g-2)_\mu \) but considering the upper bound on \(\lambda _{23}\) shown in Eq. (7), the contribution will be too small to account for the observed deviation from the standard model prediction [29,30,31,32,33,34].

Table 2 The U(1) charges of the fields. The rest of the fields, including \(u_R\) and the Higgs are neutral under U(1)

Let us reintroduce \(\varPhi _1\) of Sect. 2.1 to this section with the U(1) charges as in Table 2. With this assignment, we can have a trilinear term as

$$\begin{aligned} A\varPhi ^- H^Tc\varPhi _1 \end{aligned}$$

which after electroweak symmetry breaking induces a mixing between \(\varPhi ^+\) and \(\phi _1^+\) given by

$$\begin{aligned} \sin 2 \theta =\frac{2A~ v/\sqrt{2}}{m_{\phi _1^+}^2-m_{\varPhi ^+}^2}. \end{aligned}$$
(8)

Integrating out the heavy fields, we shall have an effective coupling of form

$$\begin{aligned} G_{\bar{\nu }\mu } \left( \bar{d}\frac{1-\gamma _5}{2}u\right) (\nu _\mu ^T c\tau _L-\mu _L^T c \nu _\tau )+\mathrm{H.c.} \end{aligned}$$
(9)

where

$$\begin{aligned} G_{\bar{\nu }\mu }=\frac{\lambda _d\lambda _{23} }{2}\frac{A v/\sqrt{2}}{m_{\varPhi ^+}^2m_{\phi _1^+}^2 }. \end{aligned}$$

With this effective Lagrangian, a new decay mode \(\pi ^+ \rightarrow \bar{\nu }_\tau \mu ^+\) will open with a rate given by Eq. (2) but replacing \(G_{\nu \mu }\) with \(G_{\bar{\nu }\mu }\). Similarly to the decay via \(G_{\nu \mu }\), with \(G_{\bar{\nu }\mu }\sim 5 \times 10^{-8}\) GeV\(^{-2}\), Br\((\pi ^+ \rightarrow \bar{\nu }_\tau \mu ^+)\) can be as large as \(10^{-3}\).

The axial component of \(G_{\bar{\nu }\mu }\) can lead to

$$\begin{aligned} \varGamma (\tau ^- \rightarrow \bar{\nu }_\mu \pi ^-)\sim \frac{G_{\bar{\nu }\mu }^2}{4\pi }\frac{F_\pi ^2 m_\pi ^2}{(m_u+m_d)^2} m_\tau \end{aligned}$$
(10)

which is again enhanced by \( m_\pi ^2/(m_u+m_d)^2\). The corresponding branching ratio is \(\mathrm{Br}(\tau ^- \rightarrow \bar{\nu }_\mu \pi ^-)\sim 5 \times 10^{-6} [G_{\bar{\nu }\mu }/(5 \times 10^{-8}~ \mathrm{GeV}^{-2})]^2\) which is much smaller than the uncertainty in \(\mathrm{Br}(\tau \rightarrow \pi \nu )\) which is \(5\times 10^{-4}\) [28]. Within the SM, the branching ratio of \(\tau ^-\rightarrow \nu _\tau \pi ^0\pi ^-\) is even larger than that of the two body decay \(\tau ^-\rightarrow \nu _\tau \pi ^-\). The enhancement is due to the spin 1 \(\rho \) resonance from the vectorial part of the charged current, \(\tau ^- \rightarrow \nu _\tau \rho ^-\rightarrow \nu _\tau \pi ^0\pi ^-\) [44, 45]. In our model, since the mediator (\(\varPhi ^+\)) has zero spin, no \(\rho \) resonance occurs so we expect \(\tau ^- \rightarrow \bar{\nu }_\mu \pi ^0\pi ^-\) to be suppressed. Via the \(G_{\bar{\nu }\mu }\) interaction, \(\nu _\mu \) can produce \(\tau \) in the detector, too, but the cross section will be suppressed by \(\sim G_{\bar{\nu }\mu }^2/(8G_F^2)\sim 2\times 10^{-6}\) relative to the SM CC interaction of \(\nu _\mu \). This means the number of \(\tau \) events produced during the run III of the LHC by the \(\nu _\mu \) flux will be as small as O(0.01) and therefore negligible. Similarly, the bound on the \(\tau \) production at NOMAD [42] can be avoided.

In this model, \(\varPhi ^+\) and \(\varPhi ^-\) can be pair produced at the HL-LHC by electromagnetic interactions. They will then decay as \(\varPhi ^+ \rightarrow \mu ^+ \nu \) and \(\varPhi ^+ \rightarrow \tau ^+ \nu \) so the signals will be excess in the \(\mu ^+\mu ^-+\mathrm{missing ~energy}\), \(\tau ^+\tau ^-+\mathrm{missing ~energy}\), \(\tau ^- \mu ^++\mathrm{missing ~energy}\) and \(\tau ^+\mu ^-+\mathrm{missing ~energy}\) signals. The \(\varPhi _1\) pairs can also be produced at the LHC, decaying into jets as described in the previous subsection.

Since in this model, lepton number is violated, neutrinos can obtain a Majorana mass term. Notice that lepton number violation requires both the \(\lambda _{23}\) coupling and the mixing between \(\varPhi ^+\) and \(\varPhi ^+_1\). As a result, the neutrino mass generation in this model starts at two-loop level, inducing a mass of form \(\nu _\tau ^Tc\nu _\mu \) of order of

$$\begin{aligned} G_{\bar{\nu }\mu } G_F(m_\tau ^2-m_\mu ^2)\frac{(m_u+m_d)m_W^2}{(16\pi ^2)^2}\sim \mathrm{m}e\mathrm{V}. \end{aligned}$$

The \((m_\tau ^2-m_\mu ^2)\) comes from a cancellation between the two diagrams in which \(\mu \) and \(\tau \) propagate. The induced mass is too small to violate the current upper bounds on the neutrino mass.

2.3 Non-standard \(\tau \) production at the detector

In this section, we introduce a variation of the model introduced in Sect. 2.1 with the difference that \(\varPhi _2\) couples to \(\tau _R\) instead of \(\mu _R\) as follows

$$\begin{aligned} \lambda _e \bar{\tau }_R \varPhi _2^\dagger L_e+ \lambda _\mu \bar{\tau }_R \varPhi _2^\dagger L_\mu . \end{aligned}$$
(11)

If \(\lambda _e\) and \(\lambda _\mu \) are both nonzero, they can contribute to \(\mu \rightarrow e \gamma \) at one loop which is severely constrained by the experimental bounds. As a result, we assume that only one of \(\lambda _e\) and \(\lambda _\mu \) is nonzero. This pattern can be explained by the \(U_2(1)\) symmetry. For example, if we assign \(U_2(1)\) charges to \(\varPhi _2\) and leptons as shown in Table 3, we can simultaneously explain nonzero \(\lambda _e\), vanishing \(\lambda _\mu \) and the smallness of the electron mass.

Table 3 The \(U_2(1)\) charges of the fields. The rest of the fields, including the second generation leptons, \(\varPhi _1\), quarks and the Higgs are neutral under \(U_2(1)\)

Like the model in Sect. 2.1, we allow only the charged components to mix with each other. As a result, the severely constrained decay modes \(\tau ^- \rightarrow e^- \pi ^0\) or \(\tau ^- \rightarrow \mu ^- \pi ^0\) cannot be obtained at the tree level. However, we obtain

$$\begin{aligned} G_e (\bar{\tau }_R \nu _e)( \bar{u}_L d_R) \ \ \ \mathrm{or } \ \ \ G_\mu (\bar{\tau }_R \nu _\mu ) (\bar{u}_L d_R) \end{aligned}$$
(12)

where \( G_{e}=\lambda _d \lambda _e \lambda _{12}v^2/(2m_{\phi _1^+}^2m_{\phi _2^+}^2)\) and \( G_{\mu }=\lambda _d \lambda _\mu \lambda _{12}v^2/(2m_{\phi _1^+}^2m_{\phi _2^+}^2)\). These effective couplings respectively lead to \(\tau ^- \rightarrow \pi ^- +\nu _e\) and \(\tau ^- \rightarrow \pi ^- +\nu _\mu \). Similarly to Sect. 2.1 and the case of \(G_{\bar{\nu }\mu }\) in Eq. (10), the uncertainty on \(\tau ^+\rightarrow \pi ^+ \nu \) gives the constraint \(G_{e (\mu )}< 5 \times 10^{-7}\) GeV\(^{-2}\).Footnote 5 Saturating this constraint, we shall have \(\sigma (\nu _{e(\mu )}+\mathrm{nucleus}\rightarrow \tau +X)/\sigma (\nu _\mu +\mathrm{nucleus}\rightarrow \mu +X)\sim (G_{e(\mu )}/4G_F)^2\sim 10^{-4}\). In this model, regardless of the origin of the neutrinos (whether they come from the pion or Kaon decays), the electron or muon neutrinos with energies sufficiently larger than the tau mass can lead to the production of \(\tau \). As a result, the NOMAD experiment can constrain \(G_e\) and \(G_\mu \) (cf. the model in Sect. 2.1 which avoids the NOMAD constraints as explained.) The number of the \(\nu _\mu \) charged current events with an energy larger than 25 GeV observed at NOMAD was above \(2\times 10^5\) which is one order of magnitude larger than the anticipated number at FASER\(\nu \) during run III. The bound from NOMAD on \(G_\mu \) would therefore be of order of \(5\times 10^{-8}\) GeV\(^{-2}\) which is even stronger than the bound from \(\tau ^+ \rightarrow \pi ^+ \nu _\mu \). Such a strong bound on \(G_\mu \) makes observing a deviation from the SM prediction at FASER\(\nu \) hopeless so we shall not study the effects of \(G_\mu \) at FASER\(\nu \) any further. On the other hand, the number of the \(\nu _e\) events at NOMAD and FASER\(\nu \) are comparable so the bound on \(G_e\) may be improved by FASER\(\nu \). In Sect. 2.4, we will quantify the bound from NOMAD on \(G_e\). We shall study the bound that FASER\(\nu \) and its upgrades can set on \(\sigma (\nu _e +\mathrm{nucleus}\rightarrow \tau +X)\) in Sect. 6.

2.4 Connection to the charged current non-standard interaction formalism

There is a rich literature studying the Non-Standard Interaction (NSI) on neutrino oscillation experiments [46]. The effects of Charged Current NSI are often analyzed by introducing eigenstates of source and detector as follows

$$\begin{aligned} |\nu _\alpha ^s\rangle =|\nu _\alpha \rangle +\sum _{\gamma \in \{e,\mu ,\tau \}}\varepsilon ^s_{\alpha \gamma } |\nu _\gamma \rangle \end{aligned}$$
(13)

and

$$\begin{aligned} \langle \nu _\alpha ^d|=\langle \nu _\alpha |+\sum _{\gamma \in \{e,\mu ,\tau \}}\varepsilon ^d_{\gamma \alpha } \langle \nu _\gamma | \end{aligned}$$
(14)

where \(|\nu _\alpha ^s\rangle \) is the eigenstate produced in the source along with the charged lepton of flavor \(\alpha \) and \(|\nu _\alpha ^d\rangle \) is the eigenstate which can produce the charged lepton of flavor \(\alpha \) in the detector. Within the SM, \(|\nu _\alpha ^s\rangle =|\nu _\alpha ^d\rangle =|\nu _\alpha \rangle \). However, non-standard interaction can in principle induce nonzero \(\varepsilon _{\alpha \beta }^s\) and \(\varepsilon _{\alpha \beta }^d\). In recent years, a class of models have been developed based on a new light neutral U(1) gauge boson coupled to neutrinos and matter fields that induces a sizable neutral current NSI [47,48,49,50,51]. In case of CC NSI, the mediator has to be a charged particle so its mass must be heavier than a few 100 GeV to avoid direct production at the LEP and/or at the LHC. Since the relevant effective four-Fermi coupling is given by inverse of the square of the mediator mass, a strong lower bound on the mediator mass generally means small CC NSI. With this consideration, not many models are proposed to underly the CC NSI, despite the extensive efforts to study their phenomenological impact on the neutrino experiments. Indeed, \(G_{\nu \mu }\) obtained in Eq. (5) is quite suppressed \(G_{\nu \mu }\ll G_F\). Despite the smallness of \(G_{\nu \mu }\), thanks to the \(m_\pi /(m_u+m_d)\) enhancement in the amplitude of \(\pi ^+\rightarrow \mu ^+ \nu _\tau \) relative to that of the standard \(\pi ^+\rightarrow \mu ^+ \nu _\mu \), \(\mathrm{Br}(\pi ^+\rightarrow \mu ^+ \nu _\tau )\) can be still relatively large. Within the model introduced in Sect. 2.1, \(|\nu _\mu ^s\rangle \) can be written as

$$\begin{aligned} |\nu _\mu ^s\rangle =\frac{\mathscr {M}_1|\nu _\mu \rangle +\mathscr {M}_2|\nu _\tau \rangle }{\sqrt{|\mathscr {M}_1|^2+|\mathscr {M}_2|^2}}\simeq |\nu _\mu \rangle +\mathscr {M}_2/\mathscr {M}_1|\nu _\tau \rangle , \end{aligned}$$
(15)

where \(\mathscr {M}_1\) and \( \mathscr {M}_2\) are respectively the amplitudes of \(\pi ^+\rightarrow \mu ^+ \nu _\mu \) and \(\pi ^+\rightarrow \mu ^+ \nu _\tau \). Thus, in the model introduced in Sect. 2.1,

$$\begin{aligned} \varepsilon _{\mu \tau }^s=\frac{\mathscr {M}_2}{\mathscr {M}_1}. \end{aligned}$$

We can therefore write \(|\varepsilon _{\mu \tau }^s|^2={|\mathscr {M}_2|^2}/{|\mathscr {M}_1|^2}\simeq \mathrm{Br}(\pi ^+ \rightarrow \mu ^+ \nu _\tau )\). In this model, \( \varepsilon _{\mu \tau }^d \ll 1\). If the baseline of the experiment is short such that \(\varDelta m_{atm}^2 L/E_\nu \ll 1\), the number of the \(\mu \) events and the excess of the \(\tau \) events in the detector will respectively be given by \(\mathrm{Br}(\pi ^+ \rightarrow \mu ^+ \nu _\mu )\sigma _{SM}(\nu _\mu \rightarrow \mu )\) and \(\mathrm{Br}(\pi ^+ \rightarrow \mu ^+ \nu _\tau )\sigma _{SM}(\nu _\tau \rightarrow \tau )\). Thus, it is valid to analyze the FASER\(\nu \) results as well as the DUNE near detector data in terms of \(\mathrm{Br}(\pi ^+ \rightarrow \mu ^+ \nu _\tau )\) rather than studying the evolution of the coherent state in Eq. (15). However, for the long baseline experiments, it is necessary to study the evolution of the full coherent state in Eq. (15); otherwise, we will miss the effect of the interference terms given by

$$\begin{aligned} 2Re[U_{\mu i}U_{\mu i}^*U_{\mu j}^*U_{\tau j}(\varepsilon _{\mu \tau }^s)^* e^{i(m_i^2-m_j^2)L/(2E_\nu )}] \end{aligned}$$

in case of the \(\mu \) detection and

$$\begin{aligned} 2Re[U_{\tau i}U_{\mu i}^*U_{\tau j}^*U_{\tau j}(\varepsilon _{\mu \tau }^s)^* e^{i(m_i^2-m_j^2)L/(2E_\nu )}] \end{aligned}$$

in case of the \(\tau \) detection. Notice that both these interference terms are linear in \(\varepsilon _{\mu \tau }^s\) and therefore dominate over the effect of \(Br(\pi ^+ \rightarrow \mu ^+ \nu _\tau )=|\varepsilon _{\mu \tau }^s|^2\).

In case of \(\nu _{e (\mu )}+\mathrm{nucleus}\rightarrow \tau +X\) within the model introduced in Sect. 2.3, we should pay attention that the chirality of \(\tau \) produced via the new coupling is opposite to that produced by \(\nu _\tau \) in the SM. As a result, the interference term will be suppressed by \(m_\tau /E_\nu \) and we cannot therefore simply equate \(\varepsilon _{e(\mu )\tau }^d\) with \(\mathscr {M}( \nu _{e (\mu )}+\mathrm{nucleus}\rightarrow \tau +X)/\mathscr {M}(\nu _{\tau }+\mathrm{nucleus}\rightarrow \tau +X)\). In fact, the helicity of the final \(\tau \) has to be considered, too. For short baseline experiment such as FASER\(\nu \) or NOMAD for which \(\varDelta m_{atm}^2 L/(2 E_\nu )\ll 1\), such interference is not relevant and we can use the bounds on \(\varepsilon _{e(\mu )\tau }^d\) and on \([\sigma ( \nu _{e (\mu )}+\mathrm{nucleus}\rightarrow \tau +X)/\sigma (\nu _{\tau }+\mathrm{nucleus}\rightarrow \tau +X)]^{1/2} \simeq G_{e(\mu )}/(\sqrt{96} G_F)\), interchangeably. As discussed in Sect. 2.3, the NOMAD experiment can constrain this model. From the NOMAD data, Ref. [52] finds \(\varepsilon _{e \tau }^d<0.087\) which implies \(\sigma ( \nu _{e (\mu )}+\mathrm{nucleus}\rightarrow \tau +X)/\sigma (\nu _{\tau }+\mathrm{nucleus}\rightarrow \tau +\mathrm{nucleus})<0.0075\) and \(G_{e(\mu )}/G_F<0.85\) which is readily satisfied in the model described in Sect. 2.3.

The far detector of DUNE can also constrain \(\varepsilon ^d\) and \(\varepsilon ^s\) [53]. To study the effects at far detector of DUNE, the coherent states \(|\nu ^s\rangle \) and \(|\nu ^d\rangle \) have to be used. We could also define a coherent state of \((\mathscr {M}_1|\nu _\mu \rangle +\mathscr {M}_2|\bar{\nu }_\tau \rangle )/\sqrt{|\mathscr {M}_1|^2+ |\mathscr {M}_2|^2}\) to describe the effects of the model in Sect. 2.2 but since no interference between evolved \(|\nu _\mu \rangle \) and \(|\bar{\nu }_\tau \rangle \) takes place even for long baselines, there is no point in introducing such a coherent state.

3 Spectrum of \(\tau \) produced at forward experiments

In this section, we compute the spectrum of \(\tau \) produced via different types of interaction introduced in this paper and compare with the tau spectrum produced via the standard CC electroweak interactions.

The \(G_e\) coupling defined in Eq. (12) leads to

$$\begin{aligned}&\langle \left| \mathscr {M}[\nu _e(E_\nu )+d(x)\rightarrow \tau ^{-}(E_\tau ) +u]\right| ^2\rangle =\langle \left| \mathscr {M}[\nu _e(E_\nu )\right. \nonumber \\&\quad \left. +\bar{u}(x)\rightarrow \tau ^{-}(E_\tau ) +\bar{d}]\right| ^2\rangle =\nonumber \\&\langle \left| \mathscr {M}[\bar{\nu }_e(E_\nu )+\bar{d}(x)\rightarrow \tau ^{+}(E_\tau ) +\bar{u}]\right| ^2\rangle =\langle \left| \mathscr {M}[\bar{\nu }_e(E_\nu )\right. \nonumber \\&\quad \left. +{u}(x)\rightarrow \tau ^{+}(E_\tau ) +{d}]\right| ^2\rangle =\nonumber \\&\quad 2G_{e}^2(P_\tau \cdot P_\nu )(P_u \cdot P_d) =2G_{e}^2 x^2 m_N^2(E_\nu -E_\tau )^2, \end{aligned}$$
(16)

where \(m_\tau ^2/(2xm_N)<E_\tau <E_\nu \). Thus, the differential cross sections of all these four processes can be written as

$$\begin{aligned} \frac{d \sigma }{dE_\tau }&=\frac{1}{16 \pi }\frac{1}{E_\nu s}|\mathscr {M}|^2=\frac{G_{e}^2}{16 \pi }x m_N \left( 1- \frac{E_\tau }{E_\nu }\right) ^2 ~~\frac{m_\tau ^2}{2xm_N }\nonumber \\&<E_\tau <E_\nu . \end{aligned}$$
(17)

We can then write

$$\begin{aligned} \sigma (\nu _e+\mathrm{nucleus}\rightarrow \tau +X) = \frac{G_e^2}{48 \pi }m_NE_\nu \int _{x_{min}}^1\nonumber \\ x\left( 1-\frac{m_\tau ^2}{2xm_NE_\nu }\right) ^3 [F_d(x,t)+F_{\bar{u}}(x,t)]dx \end{aligned}$$
(18)
$$\begin{aligned} \sigma (\bar{\nu }_e+\mathrm{nucleus}\rightarrow \bar{\tau } +X) = \frac{G_e^2}{48 \pi }m_NE_\nu \int _{x_{min}}^1\nonumber \\ x\left( 1-\frac{m_\tau ^2}{2xm_NE_\nu }\right) ^3 [F_{\bar{d},t }(x,t)+F_{u}(x,t)]dx \end{aligned}$$
(19)

where \(F_q\) is the q-quark parton distribution function and

$$\begin{aligned} x_{min}=\frac{m_\tau ^2}{2m_N E_\nu } \ \ \ \ \mathrm{and } \ \ \ \ t=2xm_N(E_\tau -E_\nu ). \end{aligned}$$

The spectrum of \(\tau +\bar{\tau }\) produced by \(\nu _e +\mathrm{nucleus}\rightarrow \tau +X\) and \(\bar{\nu }_e +\mathrm{nucleus}\rightarrow \bar{\tau }+X\) can be written as

$$\begin{aligned} S_{e}(E_\tau )=\frac{ \int _{E_{\tau }} \int _{x_{min}}^1 [F_{\nu _e}(E_{\nu })(F_d+F_{\bar{u}})+F_{\bar{\nu }_e}(E_\nu )(F_{\bar{d}}+ F_{{u}})]\frac{d \sigma }{dE_\tau } ~ d E_{\nu } dx}{\int \int _{x_{min}}^1 \int _{m_\tau ^2/(2xm_N)}^{E_\nu } [F_{\nu _e}(E_\nu )(F_d+F_{\bar{u}})+F_{\bar{\nu }_e}(E_\nu ) (F_{\bar{d}}+ F_{{u}})]\frac{d \sigma }{dE_\tau } dE_\tau ~dx~ d E_{\nu } }, \end{aligned}$$
(20)

where \(d\sigma /dE_\tau \) is given in Eq. (17). The parton distribution functions are functions of both x and the Mandelstam variable, t.

For comparison the standard model cross sections are

$$\begin{aligned}&\frac{d \sigma (\nu _\tau +d \rightarrow \tau ^- +u)}{dE_\tau }= \frac{d \sigma (\bar{\nu }_\tau +\bar{d} \rightarrow \tau ^+ +\bar{u})}{dE_\tau }\nonumber \\&\quad =\frac{2m_N xG_F^2}{ \pi }\frac{m_W^4}{[2(E_\nu -E_\tau )x m_N +m_W^2]^2} \end{aligned}$$
(21)

and

$$\begin{aligned}&\frac{d \sigma (\bar{\nu }_\tau +u \rightarrow \tau ^+ +d)}{dE_\tau }= \frac{d \sigma ({\nu }_\tau +\bar{u} \rightarrow \tau ^- +\bar{d})}{dE_\tau } \nonumber \\&\quad =\frac{2m_N xG_F^2}{ \pi } \left( \frac{E_\tau }{E_\nu }\right) ^2 \frac{m_W^4}{[2(E_\nu -E_\tau )x m_N +m_W^2]^2} \end{aligned}$$
(22)

where

$$\begin{aligned} \frac{m_\tau ^2}{2xm_N}<E_\tau <E_\nu . \end{aligned}$$

Notice that while in Eq. (17), we have used the effective four-Fermi coupling, \(G_e\), in Eqs. (21, 22), we have used the full propagator for W. This is understandable as for \(x\sim 0.1\), \(2xm_NE_\nu {\mathop {\sim }\limits ^{<}} m_W^2\ll m_{\phi ^+_1}^2, m_{\phi ^+_2}^2\). In fact, we have found that neglecting \(2(E_\nu -E_\tau )xm_N\) in the denominator induces an error of 3% in the total number of events.

The total cross section of \(\nu _\tau \) and \(\bar{\nu }_\tau \) scattering off the nucleon within the SM can be written as

$$\begin{aligned} \frac{d\sigma ^{SM}_\nu }{dE_\tau }= \int _{x_{min}}^1 [F_d(x,t)\frac{d\sigma (\nu _\tau +d \rightarrow \tau ^- +u)}{dE_\tau }\\ +F_{\bar{u}}(x,t)\frac{d\sigma (\nu _\tau +\bar{u} \rightarrow \tau ^- +\bar{d})}{dE_\tau } ]d x \end{aligned}$$

and

$$\begin{aligned} \frac{d\sigma ^{SM}_{\bar{\nu }}}{dE_\tau }= \int _{x_{min}}^1 [F_{\bar{d}}(x,t)\frac{d\sigma (\bar{\nu }_\tau +\bar{d} \rightarrow \tau ^+ +\bar{u})}{dE_\tau }\\ +F_{{u}}(x,t)\frac{d\sigma ( \bar{\nu }_\tau +{u} \rightarrow \tau ^+ +{d})}{dE_\tau } ]d x . \end{aligned}$$

Finally we can write

$$\begin{aligned} \sigma ^{SM}_\nu = \int _{x_{min}}^1\int ^{E_\nu }_{m_\tau ^2/(2xm_N)} [F_d(x,t)\frac{d\sigma (\nu _\tau +d \rightarrow \tau ^- +u)}{dE_\tau }\\ +F_{\bar{u}}(x,t)\frac{d\sigma (\nu _\tau +\bar{u} \rightarrow \tau ^- +\bar{d})}{dE_\tau } ]dE_\tau d x \end{aligned}$$

and a similar formula for \(\sigma ^{SM}_{\bar{\nu }}\) replacing particles with antiparticles.

As discussed in Sect. 2.1, the effective \(G_{\nu \mu }\) coupling introduced in Eq. (1) can also lead to the \(\tau \) production via charged pion decay. The signal from \(\pi ^+ \rightarrow \mu ^+ \nu _\tau \) and \(\pi ^- \rightarrow \mu ^- \bar{\nu }_\tau \) will have the following form

$$\begin{aligned} S_{\nu \mu } (E_\tau )\equiv \frac{\int _{E_\tau } [F^\pi _{\nu _\mu }(E_\nu ) \frac{d\sigma ^{SM}_{{\nu }}}{dE_\tau }+F^\pi _{\bar{\nu }_\mu }(E_\nu ) \frac{d\sigma ^{SM}_{\bar{\nu }} }{dE_\tau }]dE_\nu }{\int [F^\pi _{\nu _\mu }(E_\nu ) \sigma ^{SM}_{{\nu }}+F^\pi _{\bar{\nu }_\mu }(E_\nu ) \sigma ^{SM}_{\bar{\nu }} ] dE_\nu }, \end{aligned}$$
(23)

where \(F^\pi _{\nu _\mu } (E_\nu )\) and \(F^\pi _{\bar{\nu }_\mu } (E_\nu )\) are the spectra of neutrinos from the pion decay (rather than the whole flux from pion and Kaon decay).

Let us now discuss the spectrum of the tau produced by lepton number and lepton flavor violating pion decay mode caused by the effective coupling \(G_{\bar{\nu }\mu }\) introduced in Eq. (9) of Sect. 2.3. The signal from \(\pi ^+ \rightarrow \mu ^+ \bar{\nu }_\tau \) and \(\pi ^- \rightarrow \mu ^- {\nu }_\tau \) will have a form given by Eq. (23), swapping \(F_{\nu _\mu }\) and \(F_{\bar{\nu }_\mu }\):

$$\begin{aligned} S_{\bar{\nu }\mu } (E_\tau )\equiv \frac{\int _{E_\tau } [F_{\bar{\nu }_\mu }^\pi (E_\nu ) \frac{d\sigma ^{SM}_{{\nu }}}{dE_\tau }+F_{{\nu }_\mu }^\pi (E_\nu ) \frac{d\sigma ^{SM}_{\bar{\nu }} }{dE_\tau }]dE_\nu }{\int [F^\pi _{\bar{\nu }_\mu }(E_\nu ) \sigma ^{SM}_{{\nu }}+F^\pi _{{\nu }_\mu }(E_\nu ) \sigma ^{SM}_{\bar{\nu }} ] dE_\nu }. \end{aligned}$$
(24)

Finally the \(\tau \) spectrum within the standard model will have the form

$$\begin{aligned} B=\frac{\int _{E_\tau } [F_{\nu _\tau }(E_\nu ) \frac{d\sigma ^{SM}_{{\nu }}}{dE_\tau }+ F_{\bar{\nu }_\tau }(E_\nu ) \frac{d\sigma ^{SM}_{\bar{\nu }} }{dE_\tau }]dE_\nu }{\int [F_{{\nu }_\tau }(E_\nu ) \sigma ^{SM}_{{\nu }}+F_{\bar{\nu }_\tau }(E_\nu ) \sigma ^{SM}_{\bar{\nu }} ] dE_\nu }. \end{aligned}$$
(25)

From Eqs. (18, 19, 21, 22), we observe that the cross sections of all the processes are suppressed by x for small values of x. As a result, the main contribution to the cross section comes from \(x\sim \mathrm{few}\times 10^{-2}-1\). Thus, \(Q^2=-t=2(E_\nu -E_\tau )xm_N\sim 100~\mathrm{GeV}^2\).

Fig. 1
figure 1

Spectra of \(\tau +\bar{\tau }\) produced at FASER\(\nu \) normalized to 1. The curves marked with \(S_e\), \(S_{\nu \mu }\) and \(S_{\bar{\nu }\mu }\) show the spectra of \(\tau +\bar{\tau }\) from new physics scenarios \(\nu _e+\mathrm{nucleus}\rightarrow \tau +X\), \(\pi ^+\rightarrow \mu ^+\nu _\tau \), \(\pi ^+\rightarrow \mu ^+\bar{\nu }_\tau \), respectively. The standard model background is marked with B. The input neutrino spectra that we insert to draw the \(\tau +\bar{\tau }\) spectrum (i.e., \(F_{\nu _\tau }\), \(F_{\bar{\nu }_\tau }\) \(F_{\nu _\mu }^\pi \) and \(F_{\bar{\nu }_\mu }^\pi \)) are described in the last paragraph of Sect. 4

The normalized spectra of \(\tau +\bar{\tau }\) from each scenario are shown in Fig. 1. To draw the curves, we have averaged the scattering cross section over the protons and neutrons composing Tungsten nucleus. As seen from the figure, the background from SM is significantly harder than new physics. This is mostly due to the fact that the background comes from \(F(\nu _\tau )\) and \(F(\bar{\nu }_\tau )\) which are harder than the spectra of other neutrino flavors; cf., Eq. (25) with Eqs. (20, 23, 24). The spectrum of background is quite distinct from \(S_{\nu \mu }\) and \(S_{\bar{\nu }\mu }\) so as we shall see in the next section, using the information on spectra will considerably boost the sensitivity to the new physics. However, the spectra \(S_{\nu \mu }\) and \(S_{\bar{\nu }\mu }\) are very close to each other and cannot be distinguished. This is due to the fact that \(F_{\nu _\mu }^\pi \) and \(F_{\bar{\nu }_\mu }^\pi \) are almost equal to each other; see Eqs. (23, 24). If an excess of \(\tau +\bar{\tau }\) is discovered, it will not be possible to distinguish if it comes from the lepton number conserving \(\pi ^+\rightarrow \mu ^+ \nu _\tau \) process or from the lepton number violating \(\pi ^+\rightarrow \mu ^+ \bar{\nu }_\tau \) process at FASER\(\nu \) by studying the energy spectrum of the events. One suggestion is to attune Q1-3 quadrapole and D1 dipole (located close to the interaction point) such that the transverse distribution of neutrinos emitted from \(\pi ^+\) and \(\pi ^-\) decays can be distinguished from one another.

The uncertainties in the predictions of the fluxes of \(\nu _\mu \), \(\bar{\nu }_\mu \), \(\nu _e\) and \(\bar{\nu }_e\) are relatively small but the predictions for \(\nu _\tau \) and \(\bar{\nu }_\tau \) suffer from large uncertainties. Ref. [11] shows that the different simulators predict the \(\nu _\tau \) flux which can differ from each other by more than 100%. To draw the \(\tau +\bar{\tau }\) spectrum, we have used neutrino fluxes from a simulator whose prediction is close to the median of the predictions of other simulators and is therefore recommended by Ref. [11]. More details are described in the end of Sect. 4. We shall show in Sect. 5 that the number of events from new physics at FASER\(\nu \) will be too low to reconstruct the spectra but, at FASER\(\nu \)2 with about 400 times more statistics, reconstructing the spectra of the events from new physics may become possible. By then, more dedicated simulations can reduce uncertainties in the neutrino flux predictions. Moreover, as we discuss in the next section, the data from FASER\(\nu \) during the run III of the LHC can itself determine which simulator for the neutrino fluxes is valid. Thus, before the start of high luminosity run of the LHC and FASER\(\nu \)2 data taking, the uncertainty in the standard model prediction for the neutrino fluxes can be significantly reduced.

4 Characteristics of FASER\(\nu \), SND@LHC and FASER\(\nu \)2

The FASER\(\nu \) and SND@LHC detectors are respectively located in the side tunnels TI12 and TI18, 480 m downstream the ATLAS Interaction Point (IP). FASER\(\nu \) is composed of 1000 emulsion layers interleaved with 1 mm tungsten plates [9]. The effective masses of FASER\(\nu \) [9] and SND@LHC [10] are respectively 1.2 ton and 800 kg and their sizes are \(25 ~\mathrm{cm}\times 25 ~\mathrm{cm}\times 1.3~\mathrm{m}\) and \(41.6 ~\mathrm{cm}\times 38.7 ~\mathrm{cm}\times 32~\mathrm{cm}\), respectively. Both detectors boast having excellent spatial and angular resolution in reconstructing the tracks of charged particles which will enable them to resolve the \(\nu _\tau \) CC events. The details of the FASER\(\nu \) and SND@LHC detectors are presented in [9] and in [10], respectively. An updated prediction for the fluxes at these detectors can be found in [11]. The upgrade of FASER\(\nu \) for the high luminosity LHC will have a size of \(40 ~\mathrm{cm}\times 40 ~\mathrm{cm}\times 8~\mathrm{m}\) and a mass of 20 tonnes [54]. The proposed location of this detector could be slightly off-axis at a distance of 620 meter or on-axis at a distance of 480 meter from the interaction point [54]. Ref. [11] has also predicted the neutrino within the SM at this detector which is assumed to be placed 620 m downstream from the ATLAS IP.Footnote 6

According to [41], the efficiency of FASER\(\nu \) in detecting 1-prong \(\tau \) decay is 75% and that of 3-prong decay is 15%. Considering that the branching ratios of 1-prong and 3-prong are respectively 85% and 15%, we take the average efficiency of 67% for the \(\nu _\tau \) detection at FASER\(\nu \). We take similar efficiency for the tau neutrino detection at SND@LHC. Considering table II of [9], throughout our analysis, unless it is stated otherwise, we take 15% uncertainty in the neutrino flux normalization.

It is shown in Ref. [9] that the resolution of the \(\nu _\mu \) energy measurement will be 30%. However, to our best knowledge, similar analysis has not been carried out for the energy resolution of \(\nu _\tau \) and \(\nu _e\) or for the detected \(\tau \). The tau particles at forward experiments will travel a distance of \(L_{tr}\sim 1~\mathrm{cm} (E_\tau /200~\mathrm{GeV})\) before decay. The direction of the momentum of \(\tau \) (or equivalently, the direction of the line connecting decay and production vertices) can be reconstructed with a precision of \(0.06 (1~\mathrm{cm}/L_{tr})\) mrad [4]. Due to the lepton flavor conservation, all decay modes of \(\tau \) contain \(\nu _\tau \) which appears as missing energy momentum. In hadronic decay modes of \(\tau \), which constitute 65% of the decays, only one neutrino is emitted. By measuring the energy-momentums of visible particles and reconstructing the direction of \(\tau \) momentum and using energy-momentum conservation, it will be therefore possible to reconstruct the \(\tau \) energy for the hadronic decay modes. For example, in case of \(\tau ^+\rightarrow \pi ^+ \nu _\tau \), the energy of the final pion and the angle that it makes with the direction of the tau momentum determine the energy of the \(\tau \). Thanks to the sub-milliradian angular resolution of FASER\(\nu \), the tiny angle [\(O(10^{-2})\)] between the directions of tau and pion momenta can be measured with remarkable accuracy so the energy resolution in determining the tau energy will mainly be limited by the energy resolution in measuring the energy of \(\pi ^+\). We take nominal value of 30% for the \(\tau \) energy reconstruction when necessary. As we shall see in the next session, at FASER\(\nu \) and SND@LHC during the run III of the LHC, the maximum signal events will be too small to justify binning the data so we shall only analyze the total number of predicted \(\tau \) events for the run III in studying the sensitivity for new physics. We have examined the dependence of the results on the energy resolution. As expected in the case of no binning, the results appear to be independent of the resolution of the energy measurement.

Table 4 The number of \(\tau \) events within the SM at FASER\(\nu \) using the prediction of three different simulators [11]. The relative \(\chi ^2\) minimized over normalization uncertainty of 15% is shown in the last column, assuming Pythia8 (Hard) as the true model
Table 5 Total expected number of \(\tau \) events at FASER\(\nu \) and SND during the run III and at FASER\(\nu \)2 during the high luminosity run of the LHC. To compare the values, we have saturated the bounds on the new physics, setting \(G_e\) equal to \(5 \times 10^{-7}\) GeV\(^{-2}\) and the branching ratios to \(2.4 \times 10^{-3}\). The last column shows the prediction for the SM background. The computation is made using the neutrino fluxes as described in the last paragraph of Sect. 4

As is well-known, the predictions of different simulators for the \(\nu _\tau \) and \(\bar{\nu }_\tau \) spectra at the forward experiments are significantly different. The prediction of simulator, J for the number of \(\tau \) events at the ith bin can be written as

$$\begin{aligned} {B}_i^J= & {} \varepsilon _\tau N_W \int _{E^i_{min}}^{E^i_{max}} \int _{m_\tau }\int _{E_\tau } \Bigl [ F_{\nu _\tau }^J(E_\nu ) \frac{d\sigma _{CC}}{dE_\tau }\nonumber \\&\times (\nu _\tau +\mathrm{nucleus}\rightarrow \tau +X) \nonumber \\&+ F_{\bar{\nu }_\tau }^J(E_\nu ) \frac{d\sigma _{CC}}{dE_\tau }(\bar{\nu }_\tau +\mathrm{nucleus}\rightarrow \tau ^+ +X) \Bigr ]\nonumber \\&\times f(E'_\tau ,E_\tau ) dE_\nu dE_\tau dE'_\tau , \end{aligned}$$
(26)

where \(F^J_{\nu _\tau }\) and \(F^J_{\nu _\tau }\) are respectively \(\nu _\tau \) and \(\bar{\nu }_\tau \) energy spectra predicted by different simulators in Ref. [11]. The superscript J determines the simulator. \(\varepsilon _\tau =0.67\) is the efficiency of the \(\nu _\tau \) detection at the detector. \(f(E_\tau ',E_\tau )\) is the energy resolution function which we take to be a Gaussian with a 30% width. \((E_{min}^i,E_{max}^i)\) determine the limits of the ith energy bin. \(N_W\) is the number of tungsten nuclei inside the detector, \(N_W=M_D/M_W\) where \(M_W=183 m_p\) and \(M_D=1.2\) ton for run III detector. In Table 4, we show our prediction for the number of events in different energy bins at FASER\(\nu \).

Taking the uncertainty on the flux normalization to be \(\sigma _\eta =15\%\), we have computed \(\chi ^2_{rel}\) as defined below for each model, J, and minimized over the pull parameter, f:

$$\begin{aligned} \chi ^2_{rel}=\sum _i \left[ \frac{[(1+f)B_i^{true}-N_i^J]^2}{B_i^{true}} \right] +\frac{f^2}{\sigma _\eta ^2}. \end{aligned}$$
(27)

In computing \(\chi _{rel}^2\) that is shown in last column of Table 4, we take \(B_i^{true}\) to be equal to the prediction of Pythia8 (Hard). Notice that since our bin sizes are large, the results should be robust against the value of the energy resolution. As seen in Table 4, the FASER\(\nu \) experiment can discriminate between different simulators predicting the \(\nu _\tau \) flux with high confidence level. We therefore assume a well-known shape of the fluxes predicted within the SM in making forecast for FASER\(\nu \)2. We shall however study the impact of the normalization uncertainty.

We may wonder whether in the presence of the \(\tau \) excess from new physics, FASER\(\nu \) can still discriminate between the simulators. As we shall see in Table 5, the maximum tau excess from new physics at FASER\(\nu \) will be \(\sim 5\). If the true flux is close to the DPMJET 3.2017 prediction (which predicts the highest \(\tau \) background of 59 events), addition of the excess will make it even more distinct from the rest of simulators. The excess of 5 events is too small to compensate for the predicted difference between DPMJET 3.2017 with 59 background events and Pythia8 with 25 background events. However, the background plus signal for SIBYLL 2.3c may come close to the Pythia8 prediction. We have computed \(\chi ^2_{rel}\) for SIBYLL 2.3c prediction for background plus events from \(Br(\pi ^+ \rightarrow \mu ^+ \nu _\tau )=2.4\times 10^{-3}\) using the binning described in Table 4. The result is \(\chi _{rel}^2=6.2\). That is even in the presence of maximum excess from new physics, FASER\(\nu \) can discriminate between the simulators at better than 99% C.L.

Hereafter in this study, for computing the SM background for the \(\tau +\bar{\tau }\) events, we take the predictions of Pythia8 (Hard) given in Ref. [11] for \(F_{\nu _\tau }\) and \(F_{\bar{\nu }_\tau }\). The Pythia8 (Hard) prediction is close to the median of the predictions of the DPMJET 3.2017 and SIBYLL 2.3c simulators. The other input spectra that we require for our computations are the fluxes of \(\nu _\mu \) and \(\bar{\nu }_\mu \) that are sourced by charged pion mesons, \(F_{\nu _\mu }^\pi \) and \(F_{\bar{\nu }_\mu }^\pi \). Fortunately, the differences between the predictions by different simulators for \(F_{\nu _\mu }^\pi \) and \(F_{\bar{\nu }_\mu }^\pi \) are negligible. For our computations, we use the average of the predictions given in Ref. [11].

5 Signatures of the models in forward experiments

In this section, we study how FASER\(\nu \) and SND@LHC during LHC run III can constrain the models introduced in Sect. 2. We compare the bounds with the results of [22] which has performed a similar analysis within the framework of effective field theory. We also compare our bounds with the existing bounds and the one to be derived by upcoming DUNE experiment [55]. We then show how FASER\(\nu \)2 can improve the existing bounds with or without reconstructing the energy spectrum of \(\tau \).

$$\begin{aligned} A. \pi ^+ \rightarrow \mu ^+ \nu _\tau \end{aligned}$$

Similarly to Eq. (26), we compute the number of signal events per bin as follows

$$\begin{aligned} \mathscr {N}_s^i= & {} \varepsilon _\tau N_W Br(\pi ^+ \rightarrow \mu ^+ \nu _\tau )\int _{E^i_{min}}^{E^i_{max}}\nonumber \\&\times \int _{m_\tau }\int _{E_\tau } \Bigl [ F_{\nu _\mu }^\pi (E_\nu ) \frac{d\sigma _{CC}}{dE_\tau }(\nu _\tau +\mathrm{nucleus}\rightarrow \tau +X) \nonumber \\&+ F_{\bar{\nu }_\mu }^\pi (E_\nu ) \frac{d\sigma _{CC}}{dE_\tau }(\bar{\nu }_\tau +\mathrm{nucleus}\rightarrow \tau ^+ +X) \Bigr ] \nonumber \\&\times f(E'_\tau ,E_\tau ) dE_\nu dE_\tau dE'_\tau \end{aligned}$$
(28)

To compute the SM background per bin, \(B_i\), we use Eq. (26). The total observed number in bin i is then \(N_i^{obs}=B_i +\mathscr {N}_s^i\). We define the \(\chi ^2\) as follows

$$\begin{aligned} \varDelta \chi ^2 =\sum _i \left[ \frac{\left( B_i (1+\eta ) -N_i^{obs}\right) ^2}{B_i} \right] + \frac{\eta ^2}{\sigma _\eta ^2} \end{aligned}$$
(29)

where \(\eta \) is the pull parameter that takes care of the uncertainty in normalization, mainly coming from the uncertainty in the cross section and the flux normalization, \(\sigma _\eta =15\%\).

Fig. 2
figure 2

\(\varDelta \chi _{min}^2\) versus \(Br(\pi ^+\rightarrow \mu ^+\nu _\tau )\) (solid curves) or versus \(Br(\pi ^+\rightarrow \mu ^+\bar{\nu }_\tau )\) (dash-dotted curves). Purple, blue and green curves respectively correspond to SND@LHC, FASER\(\nu \) and FASER\(\nu \)2 without binning. The red curve shows the \(\varDelta \chi _{min}^2\) for SND@LHC and FASER\(\nu \) combined. The curves marked with “FiB” and “CoB” show \(\varDelta \chi _{min}^2\) for FASER\(\nu \)2 with an energy resolution of 30% and with two different binning patterns corresponding to fine and coarse binning. More detail are described in the text. Drawing all these curves, except the black one(s), we have assumed a 15% uncertainty in the normalization of the flux and have minimized over the corresponding pull parameter. Drawing the black curve(s), we have assumed zero uncertainty in the flux normalization and fine binning. The vertical lines from left to right correspond to the expected DUNE bound [55], the present bound from the lepton flavor universality constraint [27] and the forecast by [22] for FASER\(\nu \), all at 90% C.L

The second column in Table 5 shows total \(\tau \) events (\(\sum _i \mathscr {N}_s^i\)) originated from \(\pi ^+\rightarrow \nu _\tau \mu ^+\) with a branching ratio saturating the present bound. As described in the end of Sect. 4, we take the median flux for \(\nu _\mu \) and \(\bar{\nu }_\mu \) for our computation. We have examined using the hardest and softest predictions for \(\nu _\mu \) and \(\bar{\nu }_\mu \) fluxes and have found a difference of about 5%, with total excess at FASER\(\nu \) varying between 4.56 to 5.04. Thus, the number of signal events at FASER\(\nu \) and SND@LHC during the run III of the LHC data cannot reach a statistical limit so for the purpose of searching for new physics, binning the data does not make sense. To compute \(\chi ^2\) for these experiments, we consider only one bin (i.e., the total events). However, at FASER\(\nu \)2 during the high luminosity run the statistics will be large enough to use binning.

The solid lines in Fig. 2 show the \(\chi ^2\) minimalized over \(\eta \) (the normalization uncertainty). For the solid curves, the horizontal axis is \(Br(\pi ^+ \rightarrow \mu ^+ \nu _\tau )\) which in terms of \(G_{\nu \mu }\) can be written as \(Br(\pi ^+ \rightarrow \mu ^+ \nu _\tau )=(G_{\nu \mu }^2/8G_F^2)[m_\pi ^4/(m_\mu ^2 (m_u+m_d)^2)]\). The upper horizontal axis shows the corresponding effective coupling. The purple, blue and red curves show \(\chi ^2\) minimized over the flux normalization uncertainty for SND@LHC, FASER\(\nu \) and their combination, respectively. Since the normalization uncertainty mainly originates from the production rate at the interaction point which is common for SND@LHC and FASER\(\nu \), we treat the uncertainty with a single pull parameter when combining the SND@LHC and FASER\(\nu \) predictions. That is for the joint analysis, we also use the definition of \(\varDelta \chi ^2\) in Eq. (29) with i running over SND@LHC and FASER\(\nu \) experiments. For FASER\(\nu \)2, we have used three different binning schemes: (1) no binning; (2) coarse binning with bins divided as \(E_\tau <50\) GeV, \(50~\mathrm{GeV}< E_\tau <100\) GeV, \(100~\mathrm{GeV}< E_\tau <500\) GeV, \(500~\mathrm{GeV}< E_\tau <1\) TeV and \(1~\mathrm{TeV}<E_\tau \); (3) fine binning with three bins at each energy decade. Drawing all these curves, we have taken an energy resolution of 30% in the \(E_\tau \) determination. Of course, for no binning case, the results do not depend on the energy resolution. Even for the coarse binning, \(\chi ^2\) is robust against varying the energy resolution. In all curves, except for the black one(s), we have taken the flux normalization uncertainty equal to 15%. Comparing the FASER\(\nu \)2 curves with each other, it is clear that binning the data (or in other words, using the spectral information) dramatically increases the sensitivity to the new physics signal from \(\pi ^+ \rightarrow \mu ^+ {\mathop {\nu }\limits ^{(-)}}_\tau \). Studying Fig. 1, this is understandable as the spectral shape of the background is considerably harder than the signal spectrum so the new physics signal cannot be hidden in the normalization uncertainty once the spectral uncertainty has been taken into account. Comparing black curve(s) with the rest, we observe that the uncertainty in the normalization significantly reduces the sensitivity.

Notice that for computing \(\chi ^2\), we have set the “observed” number of events per bin equal to the “average” background plus the “average” predicted signal for “true” Br\((\pi ^+\rightarrow \mu ^+{\mathop {\nu }\limits ^{(-)}}_\tau \)) rather than having real data. Thus, if we want to minimize \(\chi ^2\) over the only free parameter of the model which is Br\((\pi ^+\rightarrow \mu ^+{\mathop {\nu }\limits ^{(-)}}_\tau \)) we will invariably obtain zero. In fact, the \(\chi ^2\) that we are computing will have a \(\chi ^2\) distribution with only 1 (=number of pull parameters) degrees of freedom. As a results, regardless of the number of bins, the horizontal line at 2.7 represents 90% C.L. For example, we find that FASER\(\nu \)2 with coarse binning can constrain \(Br(\pi ^+ \rightarrow \mu ^+\nu _\tau )<4.6\times 10^{-4}\) at 90% C.L.

The vertical line at \(6\times 10^{-3}\) is the forecast at 90% for FASER\(\nu \) found by [22] which is in qualitative agreement with our results.Footnote 7 The vertical line at \(2.4 \times 10^{-3}\) shows the present bound from the flavor universality of \(\pi ^+\) decay [27]. Finally the forecast for DUNE near detector sensitivity at 90% is shown as a vertical line at \(8\times 10^{-5}\) [55]. As seen from the figure, FASER\(\nu \) and SND@LHC will not be able to improve the present bounds but FASER\(\nu \)2 will have a good prospect of improving the bound or find a signal. Reconstructing the energy spectrum of \(\tau \) with a moderate energy resolution can dramatically improve the reach for new physics.

We have also studied the robustness of the results against varying the flux prediction uncertainty. We have found that increasing the total flux uncertainty from 15 to 50%, the FASER\(\nu \)2 bound on \(Br(\pi ^+\rightarrow \mu ^+ \nu _\tau \)) at 90% C.L. shifts from \(1.8\times 10^{-3}\) to \(6\times 10^{-3}\). Moreover, we examine the possibility of 25% and 50% uncertainties in the flux shape by assigning different nuisance parameters for each bin. To be precise, we redefine \(\varDelta \chi ^2\) as

$$\begin{aligned} \varDelta \chi ^2=\sum _i \left[ \frac{(B_i(1+\eta _i)-N_i^{obs})^2}{B_i}+\frac{\eta _i^2}{\sigma _\eta ^2}\right] \end{aligned}$$

and we take \(\sigma _\eta \) equal to 0.25 and 0.5. \(\eta _i\) are the nuisance parameters for each bin over which we minimize \(\varDelta \chi ^2\). Understandably, by introducing the flux shape uncertainty, the sensitivity decreases. For example, the 90% C.L. bound by FASER\(\nu \)2 coarse binning at \(4.6 \times 10^{-4}\) shifts to \(2\times 10^{-3}\) with \(\sigma _\eta =0.25\) and to \(4\times 10^{-3}\) with \(\sigma _\eta =0.5\). However, considering the advances to be made in the flux prediction by the time that FASER\(\nu \)2 starts data taking, it seems to be safe to take uncertainties well below 50% as we have done in drawing Fig. 2.

The effects appearing for \(\pi ^+ \rightarrow \nu _\tau \mu ^+\) can be reinterpreted for a variety of other models, too. For example, let us consider a model that leads to the decay of \(\pi ^+\) to \(\mu ^+\) and a sterile neutrino \(\nu _s\): \(\pi ^+ \rightarrow \mu ^+ \nu _s\). As long as \(\nu _s\) is lighter than \(\sim 1\) MeV, the bound from flavor universality of the pion decay applies for this decay mode, too: \(Br(\pi ^+ \rightarrow \mu ^+ \nu _s)<2.4 \times 10^{-3}\). In principle, \(\nu _s\) can have a mixing as large as \(|U_{\tau 4}|\sim 0.3\) with \(\nu _\tau \). Such scenario is motivated by the two anomalous \(\nu _\tau \) events observed by ANITA [56, 57]. If the mass of \(\nu _s\) is much smaller than \(20~\mathrm{eV} (E_\nu /\mathrm{TeV})^{1/2}(480~\mathrm{m}/L)^{1/2}\), oscillation will not take place before reaching the FASER\(\nu \) detector so we would not have any excess due to \(\nu _s \rightarrow \nu _\tau \). With a sterile neutrino mass of \(\sim 20\) eV, there will be a \(\tau \) excess with oscillatory behavior with \(E_\nu \). For sterile neutrino mass much larger than 20 eV, the \(\nu _s\rightarrow \nu _e\) oscillation probability will average to \(2|U_{\tau 4}|^2(1-|U_{\tau 4}|^2)\) so there will be a \(\tau \) excess with similar spectrum as that from \(\pi ^+ \rightarrow \nu _\tau \mu ^+\). The discussion and results on \(\pi ^+ \rightarrow \nu _\tau \mu ^+\) also applies for this case, replacing \(Br(\pi ^+ \rightarrow \nu _\tau \mu ^+)\) with \(2|U_{\tau 4}|^2(1-|U_{\tau 4}|^2)Br(\pi ^+ \rightarrow \nu _s \mu ^+)\). Notice that in the scenario described in this paragraph, unlike the canonical \(3+1\) scheme, the sterile neutrino is produced by new physics (e.g., an intermediate new scalar) at pion decay rather than by oscillation. For a study of 3+1 scheme for the FASER\(\nu \) detector, see [58].

$$\begin{aligned} B. \pi ^+ \rightarrow \mu ^+ \bar{\nu }_\tau \end{aligned}$$

The number of signal events in this case is given by Eq. (28) by replacing \(Br(\pi ^+ \rightarrow \mu ^+ {\nu }_\tau )\) with \(Br(\pi ^+ \rightarrow \mu ^+ \bar{\nu }_\tau )\) and swapping \(F_{\bar{\nu }_\mu }^\pi (E_\nu )\) with \(F_{{\nu }_\mu }^\pi (E_\nu )\) as follows

$$\begin{aligned} {\mathscr {N}}_s^i= & {} \varepsilon _\tau N_W Br(\pi ^+ \rightarrow \mu ^+ \bar{\nu }_\tau )\int _{E^i_{min}}^{E^i_{max}} \int _{m_\tau }\int _{E_\tau }\nonumber \\&\times \left[ F_{\bar{\nu }_\mu }^\pi (E_\nu ) \frac{d\sigma _{CC}}{dE_\tau }(\nu _\tau +\mathrm{nucleus}\rightarrow \tau +X)\right. \nonumber \\&\left. + F_{{\nu }_\mu }^\pi (E_\nu ) \frac{d\sigma _{CC}}{dE_\tau }(\bar{\nu }_\tau +\mathrm{nucleus}\rightarrow \tau ^+ +X)\right] \nonumber \\&\times f(E'_\tau ,E_\tau ) dE_\nu dE_\tau dE'_\tau , \end{aligned}$$
(30)

The total number of signal events (summed over all bins) are shown in the third column of Table 5 for \(Br(\pi ^+ \rightarrow \mu ^+ \bar{\nu }_\tau )\) saturating the bound from universality measurement [27]. Similarly to the lepton number conserving case, the uncertainty in the signal prediction induced by the \(\nu _\mu \) and \(\bar{\nu }_\mu \) flux uncertainties is only 5%. The corresponding \(\chi ^2\) is also shown in Fig. 2 with dash-dotted line. The bound at \(2.4 \times 10^{-3}\) from the flavor universality of the pion decay applies for this model, too. Since \(F_{\nu _\mu }\simeq F_{\bar{\nu }_\mu }\), the curves corresponding to \(Br(\pi ^+ \rightarrow \mu ^+ \nu _\tau )\) and \(Br(\pi ^+ \rightarrow \mu ^+ \bar{\nu }_\tau )\) are very close to each other.

$$\begin{aligned} C. \nu _e +\mathrm{nucleus}\rightarrow \tau +X \end{aligned}$$

Let us now assess the effects of \(G_e\) and the \(\nu _e +\mathrm{nucleus}\rightarrow \tau +X\) process introduced in Sect. 2.3. The excess in \(\tau \) events in this case can be written as

$$\begin{aligned} {\mathscr {N}}_s^i= & {} \varepsilon _\tau N_W\int _{E^i_{min}}^{E^i_{max}} \int _{m_\tau }\int _{E_\tau } [ F_{{\nu }_e}(E_\nu ) \frac{d\sigma _{CC}}{dE_\tau }\nonumber \\&\times (\nu _e +\mathrm{nucleus}\rightarrow \tau +X) \nonumber \\&+ F_{\bar{\nu }_e}(E_\nu ) \frac{d\sigma _{CC}}{dE_\tau }(\bar{\nu }_e +\mathrm{nucleus}\rightarrow \tau ^+ +X)]\nonumber \\&\times f(E'_\tau ,E_\tau ) dE_\nu dE_\tau dE'_\tau , \end{aligned}$$
(31)

Equating \(G_e\) with the bound from \(\tau ^- \rightarrow \pi ^- \nu _e\) (i.e., setting \(G_e=5\times 10^{-7}\) GeV\(^{-2}\)), Table 5 shows the predicted number of events at FASER\(\nu \), SND@LHC and FASER\(\nu \)2. Even at FASER\(\nu \)2, the number of the signal events cannot exceed 10. The values displayed in Table 5 correspond to the median \(\nu _e\) and \(\bar{\nu }_e\) fluxes as described in the end of Sect. 4. Taking the hardest flux in [11] the signal prediction increases only by 25%. As a result, we confirm the conclusion of [22] that the planned forward experiments cannot improve the bounds on \(G_e\).

6 Summary and discussion

We have shown that the FASER\(\nu \) detector will provide a breakthrough on our understanding of the interactions of the third generation leptons. We have focused on three beyond SM LFV processes that can give rise to a \(\tau \) excess at the FASER\(\nu \) detector: (i) \(\pi ^+ \rightarrow \mu ^+ \nu _\tau \), (ii) \(\pi ^+ \rightarrow \mu ^+ \bar{\nu }_\tau \) and (iii) \(\nu _e +\mathrm{nucleus}\rightarrow \tau +X\). We have introduced three models based on adding new scalars charged under electroweak symmetry that give rise to these processes. We have shown that by imposing proper global U(1) flavor symmetries, the desired flavor patterns of the Yukawa couplings between the SM fermions can be explained. The same symmetry can also explain the smallness of the masses of the first generation fermions of the SM.

Our model for \(\pi ^+\rightarrow \mu ^+ \nu _\tau \) contains two new scalar doublets coupled respectively to the leptons and the quarks. In the presence of a mixing between the charged components of the two doublets, integrating out the heavy intermediate states we obtain a pseudoscalar–scalar four Fermi effective coupling, \(G_{\nu \mu }\), shown in Eq. (1) which gives rise to \(\pi ^+ \rightarrow \mu ^+ \nu _\tau \) with a rate enhanced by \(m_\pi ^2/(m_u+m_d)^2\). The bound on the deviation of \(\varGamma (\pi ^+ \rightarrow e^+ \nu )/\varGamma (\pi ^+ \rightarrow \mu ^+ \nu )\) from the standard model prediction [27] then implies \(G_{\nu \mu }<4 \times 10^{-8}~\mathrm{GeV}^{-2}\). The \(G_{\nu \mu }\) coupling can also lead to \(\nu _\tau +\mathrm{nucleus} \rightarrow \mu ^+ +X\) in the neutrino scattering experiments but without the \(m_\pi ^2/(m_u+m_d)^2\) enhancement. The bound on \(G_{\nu \mu }\) renders this effect negligible while having \(Br(\pi ^+\rightarrow \nu _\tau \mu ^+)\sim 10^{-3}\) which can lead to a sizable \(\tau \) excess in the forward experiments. We have shown that in terms of the formalism developed to study the effects of charged current non-standard interaction on neutrino experiments, we can write \(\varepsilon ^s_{\mu \tau }=[Br(\pi ^+ \rightarrow \nu _\tau \mu ^+)]^{1/2}\) so thanks to the \(m_\pi ^2/(m_u+m_d)^2\) enhancement, \(\varepsilon ^s_{\mu \tau }\) can be as large as \(O(\mathrm{few} \times 10^{-2})\). We have argued that bounds from NOMAD on the \(\tau \) production do not constrain this model because the energy of the neutrino flux produced by the pion (rather than the Kaon) decay in the NOMAD experiment is too low to lead to the \(\tau \) production. In our model, the bounds from \(\tau ^+ \rightarrow \mu ^+ \pi ^0\) can be satisfied because the neutral components of the new scalar doublets do not mix. We have pointed out that the model can also explain the observed \((g-2)_\mu \) deviation from the SM prediction.

By changing the flavor U(1) charge assignment to the leptons in the model described above, we obtain a model giving rise to the \(G_e\) effective coupling in Eq. (12) which leads to \(\nu _e +\mathrm{nucleus}\rightarrow \tau +X\). The \(G_e\) coupling yields LFV decay mode \(\tau ^+ \rightarrow \pi ^+\nu \) with a rate enhanced by \(m_\pi ^2/(m_u+m_d)^2\). The strong bound on this decay mode severely constrains \(G_e\). The model for the lepton number violating \(\pi ^+ \rightarrow \mu ^+ \bar{\nu }_\tau \) is based on introducing a scalar doublet coupled to the quarks plus a singlet charged scalar with an off-diagonal coupling to left-handed doublets of the second and third generations. Such a singlet is also motivated with a small anomaly observed in the \(\tau \) decay [26, 43]. We find that in the same range of the parameter space that can explain this deviation from the SM prediction, a sizable rate of \(\pi ^- \rightarrow \mu ^- \nu _\tau \) (leading to discernible tau excess in the forward experiments) can be obtained.

The bounds on the effective coupling discussed from the (lepton+ missing energy) signal at the LHC [22, 59] do not apply for our models because the new intermediate states whose integrating out lead to these effective couplings have mass around O(300 GeV) which is close to the center of mass energy of the partons scattering at the LHC. That is at the LHC, we cannot use the effective coupling formalism to describe the effects of new physics described in the present paper. We have briefly discussed the production of the new scalars and their potential signals at CMS and ATLAS [60, 61].

We have then studied the potential of forward experiments to test these models by looking for the \(\tau \) event excess. The bound forecasts for FASER\(\nu \) and SND@LHC by the present work is in agreement with those found in Ref. [22]. We have proceeded by studying the energy spectrum of the tau events and showed that since the energy spectrum of the signal is going to be considerably softer than the \(\tau \) event background, constructing the energy spectrum at FASER\(\nu \)2 can significantly improve the sensitivity to the new physics. We have discussed the possible resolution of the \(\tau \) energy reconstruction at FASER\(\nu \)2 and demonstrated the dependence of sensitivity to the new physics signal on the energy binning scheme.