1 Introduction

A variety of models beyond the Standard Model (SM) feature the existence of new massive particles with lifetimes that can be long, compared to the SM particles at the weak scale. These so-called long-lived particles (LLP) appear, for example, in Supersymmetry or extensions to the SM that predict right-handed neutrinos [1]. The study presented in this paper focuses on the search for decays of neutral LLPs using three production mechanisms: direct pair production (DPP), pair production from the decay of a SM-like Higgs boson with a mass of \(125 \,\text {GeV/}c^2 \) (HIG), and from charged current (CC) processes. Diagrams for each production mode are shown in Fig. 1. The production of LLPs from the decay of a SM-like Higgs boson has been studied in several searches conducted by the CMS, ATLAS and LHCb experiments, using LLP decays to light-flavour jets [2,3,4,5,6], b-quark jets [7] and light leptons [8, 9]. In this study the LLP can be a neutralino \(\tilde{\chi }^{1}_{0}\), in R-parity-violating supersymmetric models [10], or a right-handed neutrino N decaying to two charged leptons and a neutrino [11,12,13]. Searches for \(\text {LLP}\!\rightarrow {e ^\pm } {\mu ^\mp } {\nu } \)  decays have been performed by the ATLAS experiment in the context of Supersymmetry [14], and also with right-handed neutrinos [15].

The first direct \(\text {LLP}\!\rightarrow {e ^\pm } {\mu ^\mp } {\nu } \)  search at the LHCb experiment is presented in this paper. The LHCb detector probes the forward rapidity region that is only partially covered by the other LHC experiments, and triggers on particles with low transverse momenta, which allows the experiment to explore relatively small LLP masses. In the present study, displaced vertices consisting of an electron and a muon of opposite charges are searched for in \({p} {p} \) collisions at a centre-of-mass energy of \(\sqrt{s} = 13 \,\text {TeV} \), using a data sample corresponding to an integrated luminosity of \(5.38 \pm 0.11 \,\text {fb} ^{-1} \) collected with the LHCb detector in 2016–2018. The momentum of the neutrino in the final state can be partly reconstructed from the misalignment between the LLP flight direction and the momentum of the electron and muon system. The explored masses of the LLP (\(m_{\mathrm{LLP}}\)) range from 7 to \(50 \,\text {GeV/}c^2 \) and lifetimes (\(\tau _{\mathrm{LLP}}\)) range from 2 to \(50 \,\text {ps} \). This search enlarges the domain of searches for heavy LLPs at LHCb, which previously probed for displaced jets [4,5,6] or displaced dimuons [16,17,18].

Fig. 1
figure 1

Production modes of the LLP considered in this search. From left to right: direct pair production (DPP), decay of a SM-like Higgs with a mass of \(125 \,\text {GeV/}c^2 \) produced by gluon-gluon fusion (HIG) and production by charged current (CC)

2 Detector description

The LHCb detector [19, 20] is a single-arm forward spectrometer covering the pseudorapidity range \(2< \eta < 5\), designed for the study of particles containing \(b \) or \(c \) quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the \({p} {p} \) interaction region (VELO), a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about \(4 {\,\text {Tm}}\), and three stations of silicon-strip detectors and straw drift tubes, placed downstream of the magnet. The tracking system provides a measurement of momentum, \(p\), of charged particles with a relative uncertainty that varies from \(0.5\%\) at low momentum to \(1.0\%\) at \(200\,\text {GeV/}c \). The minimum distance of a track to a primary \({p} {p} \) collision vertex (PV), the impact parameter (IP), is measured with a resolution of \((15 + 29/p_{\mathrm {T}}) \,\upmu \text {m} \), where \(p_{\mathrm {T}}\) is the component of the momentum transverse to the beam axis, in \(\,\text {GeV/}c\). Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter (ECAL) and a hadronic calorimeter (HCAL). Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers.

The online event selection is performed by a trigger, which consists of a hardware stage based on information from the calorimeter and muon systems, followed by a software stage that carries out a full event reconstruction. During data taking an alignment and calibration of the detector is performed in near real-time and used in the software trigger [21]. Events from \({p} {p} \) collisions fulfilling the muon or electron trigger are studied. At the hardware level the muon trigger requires a muon track identified by matching hits in the muon stations, for the electron trigger a cluster in the ECAL with large transverse energy deposit is required. At the software level the muon trigger selects muons with a minimum \(p_{\mathrm {T}} \) of \(10 \,\text {GeV/}c \), the electron trigger selects electrons with a minimum \(p_{\mathrm {T}} \) of \(15 \,\text {GeV/}c \).

3 Simulation

Simulated samples of \(\text {LLP}\!\rightarrow {e ^\pm } {\mu ^\mp } {\nu } \)  events are used to design and optimise the signal selection and to estimate the detection efficiency, but also for the construction of the signal model. Parton-level events with LLPs are generated at leading order with MadGraph [22] using Universal FeynRules Outputs (UFO) [23] for long-lived particle searches following Ref. [1]. For the DPP and HIG mechanisms, the UFO for the minimal supersymmetric standard model with R-parity violation [10] is chosen, and in this framework the signal is represented by the lightest neutralino \(\tilde{\chi }^{1}_{0}\). For the CC production the UFO of the Left-Right Symmetric model [24,25,26] is used, and here the LLP is represented by a heavy neutrino produced from an on-shell \(W \) boson. For all three modes, the LLP is allowed to decay into an electron and a muon with opposite charges, and a neutrino. The decay of the LLP is performed through the MadSpin package [27]. The parton shower of the events is simulated with Pythia8 [28, 29] using a specific LHCb configuration [30] and using the CTEQ6 leading-order set of parton density functions [31]. The interaction of the particles with the detector and its response are implemented using the Geant4 toolkit [32, 33] as described in Ref. [34]. Signal events with \(m_{\mathrm{LLP}} = 7, 10, 15, 20, 30, 38\) and \(50 \,\text {GeV/}c^2 \) and \(\tau _{\mathrm{LLP}} = 2, 5, 10, 25\) and \(50 \,\text {ps} \) are generated.

Samples are also generated for background studies and cross checks, although the background estimate in this study is based on data. The most relevant background in this analysis is from \({b} {\overline{{b}}} \) events. Two distinct topologies are observed with the two leptons from the same jet or from two different jets, as discussed in Sect. 5. Events generated from \({gg /{q} {\overline{{q}}} \rightarrow {{b} {\overline{{b}}}}}\) processes with Pythia8, with at least one muon with \(p_{\mathrm {T}} > 10 \,\text {GeV/}c \) in the LHCb acceptance are simulated and required to satisfy the muon trigger criteria.

4 Signal selection

The \(\text {LLP}\!\rightarrow {e ^\pm } {\mu ^\mp } {\nu } \)  candidates are reconstructed from the combination of a muon and an electron candidate of opposite charges forming a good-quality vertex within the VELO detector. The following selection of the candidates is developed and optimised using the DPP samples for each pair of \(m_{\mathrm{LLP}}\) and \(\tau _{\mathrm{LLP}}\) values. This selection is also adopted for the study of the HIG and CC processes.

The muon and electron candidates are required to have \(p_{\mathrm {T}} > 1.6 \,\text {GeV/}c \) and \(p > 10 \,\text {GeV/}c \). The measured momentum of the electron candidates is corrected for the loss of energy due to bremsstrahlung [35]. The muon and electron need to form a good-quality vertex displaced from any PV, with a flight distance greater than 15 times its uncertainty. In addition, the lifetime of the candidate is required to be greater than \(0.5 \,\text {ps} \). For the estimate of the lifetime, the Lorentz boost is calculated from the dilepton momentum, \(p (e\mu )\), neglecting the contribution of the neutrino. The mass of the candidate is obtained from the dilepton system with a correction to account for not reconstructing the neutrino. The correction is inferred from the misalignment of the dilepton reconstructed momentum and the flight direction from the PV to the decay vertex. The corrected invariant mass is computed as \({m_{\mathrm{corr}} = \sqrt{m(e\mu )^2 + p (e\mu )^2 \sin ^2\theta } + p (e\mu ) \sin \theta }\) [36], where \(\theta \) is the angle formed by the dilepton momentum and the LLP flight direction. Candidates with \(m_{\mathrm{corr}} < 3.3 \,\text {GeV/}c^2 \) are discarded.

To suppress the heavy-flavour background the leptons are required to be isolated from other charged particles. The isolation variable is defined as \(I = (\vec {p} - \vec {p}_{\mathrm{cone}})_{\mathrm{T}} \; / \; (\vec {p} + \vec {p}_{\mathrm{cone}})_{\mathrm{T}}\), where \(\vec {p}\) is the momentum of the lepton candidate and \(\vec {p}_{\mathrm{cone}}\) is the sum of all the momenta of charged tracks, excluding the lepton candidates, within a distance \(\Delta R = \sqrt{\Delta \eta ^2 + \Delta \phi ^2}\) of 0.5 around the lepton, where \(\Delta \eta \) and \(\Delta \phi \) are the pseudorapidity and azimuthal angle differences between the lepton candidate and the charged tracks. The subscript T indicates the momentum component in the transverse plane. A value of \(I = 1\) denotes a fully isolated lepton. Candidates with \(I(\mu ) > 0\) and \(I(e) > 0.4\) are selected. Particle identification criteria are applied to the muon and the electron candidates. A tighter identification criterion on the electron is needed to reject the background due to misidentified pions or kaons. This criterion is optimised to preserve signal efficiency while maximising the rejection power over a data sample of same-sign candidates, \({e} ^{\pm }{\mu } ^{\pm }\), used as background proxy. The signal selection is also applied on the same-sign candidates. Figure 2 compares distributions of observables for data and simulated \({b} {\overline{{b}}} \) candidates, and examples of signals with different \(m_{\mathrm{LLP}}\) and \(\tau _{\mathrm{LLP}}\) values, which survive the selection presented above. Figure 2a, b show the candidates \(m_{\mathrm{corr}}\) and flight distance distributions. These observables are used in the fit to determine the presence of signal, as explained in Sect. 5. Figure 2c, d show the transverse momentum distributions of the muon and electron, respectively. These muon and electron \(p_{\mathrm {T}}\) distributions show the effect of the \(p_{\mathrm {T}}\) threshold in the muon and the electron triggers. In Fig. 2e, f the distributions of the isolation variable, I, are displayed for the muon and electron, respectively. The leptons from the signal are expected to be more isolated than the ones from the \({b} {\overline{{b}}} \) background.

Fig. 2
figure 2

Distributions in data (dashed black histogram) compared to simulated \({{b} {\overline{{b}}}} \rightarrow {e ^\pm } {\mu ^\mp } X\) (green filled histogram), showing, a \(m_{\mathrm{corr}}\), b the LLP flight distance, c the transverse momentum of the muon, d the transverse momentum of the electron, e the isolation of the muon, and f the isolation of the electron. LLP signal distributions are also shown (coloured histograms) for different \(m_{\mathrm{LLP}}\) and \(\tau _{\mathrm{LLP}}\) values, where the LLP is produced through the DPP mechanism. The distributions from simulation are normalised to the number of candidates in data. There are no simulated \({b} {\overline{{b}}} \) candidates for \(p_{\mathrm {T}} (\mu ) < 10 \,\text {GeV/}c^2 \) due to a \(p_{\mathrm {T}}\) requirement at the generation. For the same reason there is a lack of simulated \({b} {\overline{{b}}} \) candidates for \(p_{\mathrm {T}} (e) > 15 \,\text {GeV/}c^2 \) as candidates are required to pass the muon or electron trigger

A boosted decision tree (BDT) classifier [37, 38] is used to further purify the \(\text {LLP}\!\rightarrow {e ^\pm } {\mu ^\mp } {\nu } \)   candidate sample. The BDT is trained using 70k signal decays from a combination of DPP samples, and background candidates drawn from the same-sign sample. The full signal sample contains 2000 candidates for each set of (\(m_{\mathrm{LLP}}\), \(\tau _{\mathrm{LLP}}\)) parameters. Using all simulated signal samples for the training phase allows to obtain a uniform BDT response across the (\(m_{\mathrm{LLP}}\), \(\tau _{\mathrm{LLP}}\)) space. Furthermore, the uniformity is enforced by using a special cost function described in Ref. [39]. This cost function has the objective to provide the best classification between the signal and the background, while keeping the BDT response uniform on \(m_{\mathrm{LLP}}\) and \(\tau _{\mathrm{LLP}}\). The BDT input observables are: the muon \(p_{\mathrm {T}}\); the maximum between the momentum of the two leptons; the two isolation variables; the angle between the muon momentum in the \(e\mu \) rest frame and the \(e\mu \) momentum; the ratio of the energy deposited by the muon in the calorimeters and its momentum; the ratio of the energy deposited by the electron in the HCAL and its momentum; the distance of closest approach between the two lepton tracks; the \(\chi ^2\) of the LLP decay vertex; the difference between the muon and electron impact parameters divided by the LLP impact parameter; the impact parameter \(\chi ^2\) of the leptons, \(\chi ^2_{\mathrm{IP}} (l)\), divided by \(\chi ^2_{\mathrm{IP}} (\text {LLP})\). For a given particle, the impact parameter \(\chi ^2\) is defined as the difference between the \(\chi ^2\) of the PV reconstructed with and without that particle. The BDT response, shown in Fig. 3, is uniformly distributed between 0 and 1 for the signal, while peaking at zero for the background. Candidates with a BDT value below 0.1 are rejected, leaving 61116 signal candidates. The observed BDT distribution is consistent with a \({b} {\overline{{b}}} \) composition of the background. Using the \({b} {\overline{{b}}} \) cross-section at \(13 \,\text {TeV} \) measured by LHCb, \(144 \pm 1 \pm 21 \,\upmu \text {b} \) [40], \((60 \pm 14) \times 10^3\) \({{b} {\overline{{b}}}} \rightarrow {e ^\pm } {\mu ^\mp } X\) candidates are predicted after selection, consistent with the observed total yield.

Fig. 3
figure 3

Distribution of the BDT response in data (dashed black histogram) compared to simulated \({{b} {\overline{{b}}}} \rightarrow {e ^\pm } {\mu ^\mp } X\) (green filled histogram) and LLP signal samples (coloured histograms) for different a \(m_{\mathrm{LLP}}\) and b \(\tau _{\mathrm{LLP}}\) values, where the LLP is produced through the DPP mechanism. The distributions from simulation are normalised to the number of candidates in data

5 Determination of the signal yield

The signal yield is determined from a simultaneous extended maximum likelihood fit to the LLP corrected mass \(m_{\mathrm{corr}}\) and flight distance distributions selected into two BDT intervals (0.1, 0.5] and (0.5, 1.0]. The study of the simulated \({{b} {\overline{{b}}}} \rightarrow {e ^\pm } {\mu ^\mp } X\) background indicates the presence of two components that depend on whether the two leptons belong to the same heavy-flavour jet or two different jets. The two components have different \(m_{\mathrm{corr}}\) and flight distance distributions, and can be separated by the distance \(\Delta R \) between the two leptons. When leptons originate from the same heavy-flavour jet, they have relatively small \(\Delta R \), selected with \(\Delta R < 1\), while \(\Delta R \ge 1\) selects the complementary component. The background probability density functions of the \(m_{\mathrm{corr}}\) and flight distance needed in the global fit are inferred from the same-sign data. This choice has been validated by a comparison of the distributions of \(m_{\mathrm{corr}}\) and the flight distance in simulated \({{b} {\overline{{b}}}} \rightarrow {e ^\pm } {\mu ^\mp } X\) and \({{b} {\overline{{b}}}} \rightarrow {e ^\pm } {\mu ^\pm } X\) candidates.

When \(\Delta R < 1\), the background \(m_{\mathrm{corr}}\) values are mostly found below \(6 \,\text {GeV/}c^2 \). This component is modelled using a sum of a Gaussian and a Crystal Ball function [41]. The fraction between the two distributions is fixed to the value obtained in the fit to the same-sign data. The parameters describing the tail are free in each BDT bin. Other parameters are free but common to all the BDT bins. For the \(\Delta R \ge 1\) region \(m_{\mathrm{corr}}\) is mostly above \(10 \,\text {GeV/}c^2 \). This region is modelled using a Johnson S\(_{U}\) distribution [42] with shape parameters free in each BDT bin. To model the signal \(m_{\mathrm{corr}}\) distribution a sum of a modified Gaussian distribution, where the left tail is exponential and the right tail a power law, and another Gaussian distribution is used. The parameters of the model are fixed to the values obtained from the fits to the simulated samples, for each (\(m_{\mathrm{LLP}}\), \(\tau _{\mathrm{LLP}}\)) hypothesis. The same signal \(m_{\mathrm{corr}}\) models are used for each BDT bin and production mechanism.

The background candidates with \(\Delta R < 1\) have long flight distances, above \(10 \,\text {mm} \). The opposite is true for \(\Delta R \ge 1\). The two components are modelled using a Johnson S\(_{U}\) distribution, with all parameters kept free. In the \(\Delta R < 1\) region the parameters of the model are not shared across the BDT bins, while they are shared when \(\Delta R \ge 1\). A kernel density estimation algorithm is used to estimate the probability density function of the flight distance distribution in simulated signal for each BDT bin. The same signal flight distance model for a given (\(m_{\mathrm{LLP}}\), \(\tau _{\mathrm{LLP}}\)) hypothesis is used for each production mechanism.

In the final fit the fractions of signal yield in each BDT interval are constrained by Gaussian functions to the values and uncertainties that are estimated in the simulation. In order to explore a larger set of \(m_{\mathrm{LLP}}\) values than the simulated set, signal templates for the \(m_{\mathrm{corr}}\) and flight distance distributions are interpolated from the simulated distributions using a moment morphing algorithm [43]. Distributions of \(m_{\mathrm{corr}}\) and the flight distance in two BDT regions are shown in Fig. 4, with an example of a fit result for a signal with \(m_{\mathrm{LLP}} = 47 \,\text {GeV/}c^2 \) and \(\tau _{\mathrm{LLP}} = 50 \,\text {ps} \) overlaid. For each \(m_{\mathrm{LLP}}\) and \(\tau _{\mathrm{LLP}}\) hypothesis the fitted yields are consistent with no signal present.

Fig. 4
figure 4

Distributions of \(m_{\mathrm{corr}}\) (top) and the flight distance (bottom) of two BDT intervals (left and right), where a simultaneous fit result for a LLP signal with \(m_{\mathrm{LLP}} = 47 \,\text {GeV/}c^2 \) and \(\tau _{\mathrm{LLP}} = 50 \,\text {ps} \) is overlaid; the fitted signal yield in this example is \(14 \pm 14\)

6 Signal efficiencies and systematic uncertainties

The determination of the signal detection efficiency relies on simulation. Systematic effects are identified from differences between data and simulation. Regarding the electron, samples of \({{J /\psi }} \rightarrow {e ^+e ^-} \) and \({Z} \rightarrow {e ^+e ^-} \) decays are considered, and \({{J /\psi }} \rightarrow {\mu ^+\mu ^-} \), \(\varUpsilon \rightarrow \mu ^{+} \mu ^{-}\) and \({Z} \rightarrow {\mu ^+\mu ^-} \) decays are used for the muon. Samples of \({{b} {\overline{{b}}}} \rightarrow {e ^\pm } {\mu ^\pm } X \) candidates are used to compare distributions of the reconstructed dilepton system such as the corrected mass and the flight distance. Systematic uncertainties on the signal efficiency have been evaluated. They are summarised in Table 1 and discussed in more details below. Also reported in the table are the uncertainties on the integrated luminosity, evaluated to be \(2\%\) [44], on the signal fraction in each BDT bin, and on the signal yield associated with the fit procedure, discussed at the end of this section.

Table 1 Contributions to the relative systematic uncertainties in \(\%\). The contributions are grouped in three categories, the integrated luminosity, the detection efficiency and the signal yield, separated by horizontal lines. The detection efficiency is affected by the parton luminosity model and depends upon the production process, with a maximum uncertainty of \(6.1\%\) for the gluon-gluon fusion process HIG
Fig. 5
figure 5

Total detection efficiency for LLP produced through the DPP mechanism as a function of \(m_{\mathrm{LLP}}\) (central line) and its uncertainty (coloured band), obtained for different values of \(\tau _{\mathrm{LLP}}\)

To account for the mismodelling in the simulation used to compute the signal efficiency, a bias for each variable used in the selection is determined by comparing simulated and experimental distributions of \(Z \) and \({b} {\overline{{b}}} \) candidates. The correlations between the selection variables are computed using the signal samples. The effect of imperfect simulation is subsequently estimated by recomputing several times the signal efficiency after changing the selection requirements on the variables by factors drawn from a multivariate normal distribution, with biases and correlations between the variables as input. The standard deviation of the distribution of efficiencies is found in the range 4.9 to \(7.3\%\), depending on the signal mass, lifetime and production mechanism, which is taken as a contribution to the systematic uncertainty. In a similar way, systematic uncertainties ranging from 0.5 to \(2.4\%\) are assigned to the identification of the two leptons.

The systematic uncertainty due to the imprecision in the simulated signal sample used to train the BDT classifier is estimated by applying the classifier on modified signal distributions: each input variable is multiplied by a scale factor drawn from a multivariate normal distribution built with the variable biases and correlations, also inferred from the control samples. The standard deviation of the efficiency distribution is used as systematic uncertainty, ranging from 0.6 to \(1.0\%\) for the BDT \(> 0.1\) requirement, and from 3.3 to \(4.0\%\) on the signal fraction in the BDT bins.

The contribution to the systematic uncertainty from the statistical precision of the simulated signal samples is in the range 1.1–3.0\(\%\).

The theoretical uncertainties are dominated by the limited knowledge of the partonic luminosity. This contribution is estimated following the procedure explained in Ref. [45] and varies from \(1.1\%\) up to \(6.1\%\). The minimum systematic contribution is found for the DPP and CC processes while the maximum contribution is found for the gluon-gluon fusion process HIG.

Finally, the total systematic uncertainty is obtained as the sum in quadrature of all contributions, where the different components of the detection efficiency are assumed to be fully correlated. In order to uniformly cover the full \(m_{\mathrm{LLP}}\) range, a third-order polynomial is fitted to the signal detection efficiency as function of \(m_{\mathrm{LLP}}\) for each simulated \(\tau _{\mathrm{LLP}}\) value. A second order polynomial is also fitted to the efficiency. The difference between the two efficiencies is assigned as systematic uncertainty, a contribution that is always less than \(4 \%\). The interpolated signal efficiency for LLPs produced through the DPP mechanism is shown in Fig. 5, accounting for the geometrical acceptance. The criteria on the vertex displacement favour large lifetimes; however, above \(10 \,\text {ps} \) the probability that the LLP decays outside the VELO increases, leading to a loss of efficiency. The selection efficiency increases with \(m_{\mathrm{LLP}}\), however, this effect is counteracted by the loss of lepton candidates outside the spectrometer acceptance, which is more likely for heavier LLPs. Therefore the signal efficiencies are highest for masses between 20 and \(30 \,\text {GeV/}c^2 \) and lifetimes between 5 and \(10 \,\text {ps} \). The DPP mechanism has the highest detection efficiency. On average, the detection efficiency for the HIG (CC) mechanism is \(20\%\) (\(60\%\)) lower than the DPP mechanism.

Fig. 6
figure 6

a Expected (open circles and dotted line) and observed (filled circles and solid line) upper limits of the cross-section as a function of \(m_{\mathrm{LLP}}\) for \(\tau _{\mathrm{LLP}} = 10 \,\text {ps} \), for LLPs produced through the DPP mechanism. The green and yellow bands indicate the quantiles of the expected upper limit corresponding to \(\pm 1\sigma \) and \(\pm 2\sigma \) for a Gaussian distribution. b Observed limits on the cross-section as a function of \(\tau _{\mathrm{LLP}}\) for different \(m_{\mathrm{LLP}}\) values for LLPs produced through the DPP mechanism

Fig. 7
figure 7

Observed upper limits on the production cross-sections times branching fraction for a \(m_{\mathrm{LLP}} = 7 \,\text {GeV/}c^2 \) and b \(m_{\mathrm{LLP}} = 29.8 \,\text {GeV/}c^2 \) as function of \(\tau _{\mathrm{LLP}}\) for the DPP, HIG and CC production mechanisms

The choice of templates for the corrected mass and flight distance can affect the result of the fit. The uncertainty due to the signal model accounts for imperfect simulation of the scale and resolution of the \(m_{\mathrm{corr}}\) and flight distance, and that of the finite size of the simulated signal samples used to produce the probability density functions. Uncertainties of \(0.2\%\) on the \(m_{\mathrm{corr}}\) scale and \(1.6\%\) on the \(m_{\mathrm{corr}}\) resolution are estimated from the comparison between data and \({b} {\overline{{b}}} \) simulated candidates. For the flight distance a scale uncertainty of \(1.2 \%\) and a resolution uncertainty of \(1.1 \%\) are estimated. The propagation of uncertainties is performed using pseudoexperiments generated from the background model fitted to the same-sign data. Ten signal data points are drawn from modified signal \(m_{\mathrm{corr}}\) and flight distance distributions, modified by smearing or rescaling, and added to each pseudoexperiment. The fitted signal yield is compared to the result with ten signal data points drawn from a non-modified signal. Changing the \(m_{\mathrm{corr}}\) scale leads to a relative change on the signal yield from 0.1 to \(1.2\%\), and 0.1 to \(0.8\%\) for the flight distance, depending on the signal hypothesis. A relative variation of the signal yield from 0.1 to \(8.1\%\) is observed from an additional smearing of the signal \(m_{\mathrm{corr}}\) distribution, 0.1 to \(0.8\%\) for the flight distance. The effect of the limited sample size used to construct the signal model is addressed by replacing the parameter values of the signal model by values drawn from Gaussian distributions. For each parameter the mean of the Gaussian distribution is equal to its fitted value, and the standard deviation is equal to its uncertainty. A relative variation of the signal yield due to the limited sample size is found to be between 0.1 and \(1.7\%\). A total systematic uncertainty 0.7–8.1\(\%\) is accounted for the signal yield.

All the systematic uncertainties related to the integrated luminosity, the signal efficiency and the signal yield are included as nuisance parameters in the determination of the cross-section upper limits.

7 Results

The results of the simultaneous fits to the LLP corrected mass and flight distance distributions in the two BDT intervals (0.1, 0.5] and (0.5, 1.0], are found to be compatible with the background-only hypothesis for all signal hypotheses considered. Upper limits at 95\(\%\) confidence level (CL) on the production cross-sections times branching fraction are computed for each production mechanism,

$$\begin{aligned} \sigma _{\mathrm{DPP}}&= \sigma (q\bar{q} \rightarrow \tilde{\chi }^{0}_{1}\tilde{\chi }^{0}_{1}) \times \mathcal {B}(\tilde{\chi }^{0}_{1} \rightarrow e^{\pm }\mu ^{\mp }\nu ), \\ \sigma _{\mathrm{HIG}}&= \sigma (gg \rightarrow h) \times \mathcal {B}(h \rightarrow \tilde{\chi }^{0}_{1} \tilde{\chi }^{0}_{1}) \times \mathcal {B}(\tilde{\chi }^{0}_{1} \rightarrow e^{\pm }\mu ^{\mp }\nu ), \text { and}\\ \sigma _{\mathrm{CC}}&= \sigma ({W} \rightarrow l N) \times \mathcal {B}(N \rightarrow e^{\pm }\mu ^{\mp }\nu ), \end{aligned}$$

for each pair of \(m_{\mathrm{LLP}}\) and \(\tau _{\mathrm{LLP}}\) values using the CLs approach [46]. Upper limits for selected \(m_{\mathrm{LLP}}\) and \(\tau _{\mathrm{LLP}}\) values are shown in Fig. 67 and 8. Figure 6a gives examples of observed upper limits on \(\sigma _{\mathrm{DPP}}\), along with the range of limits expected for the background-only hypothesis, as a function of \(m_{\mathrm{LLP}}\) for \(\tau _{\mathrm{LLP}} = 10 \,\text {ps} \). Figure 6b shows the observed upper limits on \(\sigma _{\mathrm{DPP}}\) as a function of \(\tau _{\mathrm{LLP}}\), for a selection of \(m_{\mathrm{LLP}}\) values that shows the range of limit values. The best observed limits on \(\sigma _{\mathrm{DPP}}\) are of the order of \(0.06 \,\text {pb} \) for a mass of \(29.8 \,\text {GeV/}c^2 \). A comparison of observed upper limits on \(\sigma _{\mathrm{DPP}}\), \(\sigma _{\mathrm{HIG}}\) and \(\sigma _{\mathrm{CC}}\) as a function of \(\tau _{\mathrm{LLP}}\) for the lowest mass studied, \(m_{\mathrm{LLP}} = 7\), and \(29.8 \,\text {GeV/}c^2 \) is shown in Fig. 7. The best and worst limits are obtained for the DPP and CC mechanisms, respectively. The differences between the sensitivities for each production mechanism are principally due to detection efficiency. The limits obtained by the ATLAS experiment on the squark-antisquark production cross-section [14], where the squark has a mass of 700 or \(1600\,\text {GeV/}c^2 \) and decays to \({q} \, (\tilde{\chi }^{0}_{1} \rightarrow ee\nu / e\mu \nu / \mu \mu \nu )\), have values from 1 to \(10 \,\text {fb} \) for \(m(\tilde{\chi }^{0}_{1}) = 50 \,\text {GeV/}c^2 \) in the lifetime range studied. These results are complementary to the results obtained by the ATLAS experiment, extend to lower mass and lifetime regions and explore different LLP production mechanisms.

Finally, the limits on \(\sigma _{\mathrm{HIG}}\) are compared to the value of the SM Higgs boson production cross-section from gluon-gluon fusion of \(48.6 \pm 3.5 \,\text {pb} \) [47], which is illustrated in Fig. 8. These limits are placed on \((\sigma / \sigma ^{SM}_{gg \rightarrow H}) \times \mathcal {B}({H ^0} \rightarrow \tilde{\chi }^{0}_{1} \tilde{\chi }^{0}_{1})\), assuming \(\mathcal {B}(\tilde{\chi }^{0}_{1} \rightarrow e^{\pm }\mu ^{\mp }\nu ) = 1\), as a function of \(\tau _{\mathrm{LLP}}\) for a selection of \(m_{\mathrm{LLP}}\) values. Under this assumption the limits on \(\mathcal {B}({H ^0} \rightarrow \tilde{\chi }^{0}_{1} \tilde{\chi }^{0}_{1})\) have a minimum of \(\sim 0.15 \%\). Decays of \(\text {LLP} \rightarrow {\mu ^+\mu ^-} \), produced in pairs from SM Higgs bosons, were searched by the CMS experiment [8]. Assuming \(\mathcal {B}(\text {LLP} \rightarrow {\mu ^+\mu ^-}) = 1\), the limits on \(\mathcal {B}({H ^0} \rightarrow \text {LLP} \, \text {LLP})\) for \(m_{\mathrm{LLP}} = 50 \,\text {GeV/}c^2 \) are the best for lifetimes between \(1\,\text {ps} \) and \(10 \,\text {ns} \) with a minimum of \(0.05 \%\) [48], which is approximately 3 times lower than the minimum limits on \(\mathcal {B}({H ^0} \rightarrow \tilde{\chi }^{0}_{1} \tilde{\chi }^{0}_{1})\) presented in this paper.

Fig. 8
figure 8

Observed limits on the \((\sigma / \sigma ^{SM}_{gg \rightarrow H}) \times \mathcal {B}({H ^0} \rightarrow \tilde{\chi }^{0}_{1} \tilde{\chi }^{0}_{1})\), assuming \(\mathcal {B}(\tilde{\chi }^{0}_{1} \rightarrow e^{\pm }\mu ^{\mp }\nu ) = 1\) as a function of \(\tau _{\mathrm{LLP}}\) for different \(m_{\mathrm{LLP}}\) values. The value of the gluon-gluon fusion production cross-section used is \(48.6 \pm 3.5 \,\text {pb} \) [47]

8 Conclusion

A search for decays of long-lived massive particles, in the \({e ^\pm } {\mu ^\mp } {\nu } \) final state, is performed using \({p} {p} \) collisions at \(\sqrt{s} =13 \,\text {TeV} \) recorded with the LHCb detector, for a total integrated luminosity of \(5.38 \pm 0.11 \,\text {fb} ^{-1} \). The search covers LLP masses from 7 to \(50 \,\text {GeV/}c^2 \), lifetimes from 2 to \(50 \,\text {ps} \) and considers three production mechanisms: the direct pair production from the interaction of quarks, the pair production from the decay of a SM-like Higgs boson with a mass of \(125 \,\text {GeV/}c^2 \), and the charged current production from an on-shell \(W \) boson with an additional lepton.

Fully simulated signal events are used to define the signal selection criteria and the signal detection efficiency. The background is dominated by \({{b} {\overline{{b}}}} \) candidates. A BDT, taking as input properties of the leptons and displaced vertex of the LLP, is used to purify the signal from the heavy hadron background. The signal yield is determined by a simultaneous fit of the LLP corrected mass and flight distance, using signal templates derived from simulation. All the results of the fits are compatible with the absence of signal, and upper limits on the cross-section times branching fraction for each production mechanism are computed. The best upper limits are achieved for the pair production, from interaction of quarks or the decay of a SM-like Higgs boson, for lifetimes below \(10 \,\text {ps} \) and masses above \(10 \,\text {GeV/}c^2 \), and are of the order of \(0.1 \,\text {pb} \).