1 Introduction

Supersymmetry (SUSY) [1,2,3,4,5,6] is a theoretical extension of the Standard Model (SM) which fundamentally relates fermions and bosons. It is an alluring theoretical possibility given its potential to solve the hierarchy problem [7,8,9,10]. An ad hoc conserved quantity, R-parity [11], is often introduced in SUSY models to avoid rapid proton decay, rendering the lightest supersymmetric particle (LSP) stable and therefore a potential dark-matter candidate [12, 13]. There is no fundamental theoretical reason to impose strict R-parity conservation, and R-parity-violating (RPV) SUSY models are well motivated, with fewer experimental constraints than many R-parity-conserving (RPC) models [14, 15], and allow for more natural supersymmetric mass spectra.

This article presents a search for pair production of supersymmetric particles with subsequent RPV decays in a final state with at least one isolated lepton (electron or muon), at least eight to fifteen jets (depending on the jet transverse momentum threshold), several of which may contain b-flavoured hadrons (b-jets), and with no requirement on the missing transverse momentum. Such a final state is commonly predicted in RPV models with either baryon-number-violating [16, 17] or lepton-number-violating couplings [18]. Events are assigned to one of two categories according to their lepton content. The first category contains events with two leptons with the same electric charge (\(2\ell ^\text {sc} \)), while the second category contains all other events and is dominated by single-lepton events (\(1\ell \)). Electrons and muons from \(\tau \)-lepton decays are also considered. A multi-bin fit in each lepton category to the two-dimensional space of jet multiplicity and b-jet multiplicity is used to constrain parameters of benchmark RPV simplified signal models [19,20,21]. A third variable, based on a machine-learning discriminant, is introduced to improve the sensitivity of the search when only testing for the electroweak production of supersymmetric particles. This search has potential sensitivity to a large number of beyond the Standard Model (BSM) physics models, and model-independent limits on the possible contribution of BSM physics to several single-bin signal regions are shown.

The dominant Standard Model background in the \(1\ell \) category arises from top-quark pair production and W/Z + jets production, with at least one lepton produced in the vector-boson decay. In the \(2\ell ^\text {sc} \) category, the production of a top-quark pair in association with a W boson (\(t\bar{t}W\)), or with a misidentified lepton, constitutes the main background. The theoretical modelling of these backgrounds at high jet multiplicity suffers from large uncertainties, so they are estimated from the data by extrapolating the b-jet multiplicity distribution extracted at moderate jet multiplicities to the high jet multiplicities of the search region.

This analysis is an update to a previous ATLAS search [22] for new phenomena in a final state with a lepton and high jet multiplicity, which was performed with \(36\hbox { fb}^{-1}\) of \(\sqrt{s} = 13\hbox { TeV}\) proton–proton collision data. It improves upon the previous result owing to the larger luminosity, the dedicated categorization and analysis of events with two leptons with the same electric charge, and the introduction of multivariate discriminants. This search represents the first LHC result to obtain sensitivity to electroweak production of SUSY particles promptly decaying to quarks, as predicted in baryon-number-violating RPV models. Previous searches targeting similar RPV SUSY models have been carried out by the ATLAS and CMS collaborations [23,24,25,26,27,28,29,30,31,32,33]. The result is also sensitive to SM four-top-quark production, and a validation of the background estimation methods is performed by fitting the normalization of the four-top-quark process relative to the Standard Model value. Previous searches for four-top-quark production were carried out by the ATLAS [34] and CMS [35] collaborations.

Fig. 1
figure 1

Examples of signal diagrams for the simplified RPV models considered in this article. Cases where both of the gluinos (or the stops) decay in the same way are also considered, and pair production is also considered for the higgsino LSP type. For simplicity particles and anti-particles are shown using the same symbols, omitting the anti-particle notation

2 Signal models

Simulated signal events from five SUSY benchmark simplified models (representative production diagrams shown in Fig. 1) are used to guide the analysis selections and to estimate the expected signal yields for different signal-mass hypotheses used to interpret the analysis results. In all models considered, the RPV couplings and the SUSY particle masses are chosen to ensure prompt decays of the SUSY particles. The supersymmetric particle content of the models is the partner of the SM gluon (gluino), the partner of the right-handed top quark (stop), and one or more electroweakinos. The electroweakinos are massive fermions resulting from the mixing between the partners of SM electroweak and Higgs bosons.Footnote 1 Three different possibilities for the electroweakino composition are tested: pure bino, pure wino or pure higgsino. In all cases the lightest neutralino ( ) is the LSP. When considering a wino (higgsino) LSP, the corresponding chargino (and second neutralino ) is assumed to be effectively mass degenerate with the LSP, as predicted by theory [36, 37], and share the same composition as the LSP. All the electroweakinos that are present under the hypothesis of a given composition are considered in both the production and decay processes. All other electroweakinos are assumed to be decoupled and not considered in the model. The gluino and stop branching ratios, as well as the electroweakino production cross-section, are determined by the nature of the electroweakino. Table 1 summarizes the gluino and stop branching ratios, and shows example cross-sections for direct electroweakino production [38,39,40,41,42], for each electroweakino type. In each case, the electroweakinos decay through a non-zero RPV coupling large enough to ensure prompt decays for the particle masses considered, and small enough to avoid more complex decay patterns involving mixtures of both RPC and RPV decays that are not considered here. Within this scenario the analysis results are independent of the value of the coupling.

Table 1 Stop and gluino branching ratios, as well as cross-sections for direct electroweakino production, as a function of the LSP type. For a pure bino/wino/higgsino LSP, the electroweakino states considered are \({\tilde{\chi }}^0_{1}/{\tilde{\chi }}^0_{1}{\tilde{\chi }}^\pm _{1}\)/\({\tilde{\chi }}^0_{1}{\tilde{\chi }}^\pm _{1}{\tilde{\chi }}^0_{2}\), respectively. When relevant, decays to and are merged as they are assumed to be mass degenerate and both decay in the same way. The production cross-sections are given for an electroweakino mass of

Four of the simplified models are inspired by a common natural RPV SUSY model assuming the minimal flavour violation hypothesis [16, 17]. The coupling \(\lambda _{323}''\) is chosen, as it is predicted to be dominant.Footnote 2 With the chosen model parameters, the electroweakinos decay as \({\tilde{\chi }}^0_{1/2} \rightarrow t b s\) and \({\tilde{\chi }}^\pm _1 \rightarrow b b s\), with a branching ratio of 100%. At the lowest order in perturbation theory, signal events in these models contain four, six, or eight b-jets in the final state, depending on the production mode. In the first model, gluino production is considered, with decays to heavy-flavour quarks and the electroweakino, which in turn decays via the RPV coupling. The stop, with a mass assumed to be above the gluino mass, is not considered in the model. A signal diagram for this model is shown in Fig. 1a. The second model considers gluino pair production, with each gluino decaying into a top quark and a stop, as shown in Fig. 1b. In this model the RPV coupling is assumed to be large, so that the stop decays via an RPV mode into an s-quark and a b-quark. The absence of RPC decays of the stop render the electroweakino mass irrelevant in this model. The third scenario considered involves stop pair production with the stop decaying into an electroweakino and a top or bottom quark, while the gluino is set to a very high mass and not considered in the model. An example signal diagram is shown in Fig. 1c. In the fourth model, only electroweakino production is considered, with the stop and gluino assigned a very high mass and not considered in the model. Figure 1d considers the production of . In the case of a higgsino LSP, a similar diagram produces , and a further diagram producing is enabled, as shown in Fig. 1e. Production of contributes only to the \(1\ell \) category, while the production of contributes to both the \(1\ell \) and \(2\ell ^\text {sc} \) categories. Production of is not considered because the decays produce a final state with no leptons. Production of vanishes and is not considered. Therefore electroweak production in the pure bino scenario is not considered.

The fifth and last simplified RPV model considers gluino pair production, where each gluino decays into two first- or second-generation quarks (\(q \equiv (u,d,s,c)\)) and a , which is the LSP. The decays into two additional first- or second-generation quarks and a charged lepton or a neutrino ( or , labelled as ). The decay proceeds via a \(\lambda '\) RPV coupling, where each RPV decay can produce any of the four first- or second-generation leptons (\(e^{\pm },\mu ^{\pm },\nu _e,\nu _\mu \)) with equal probability. An example signal diagram is shown in Fig. 1f. Signal decays from this model yield a final state with high jet multiplicity and zero b-jets.

3 ATLAS detector

The ATLAS experiment [43] at the LHC is a multipurpose particle detector with a forward–backward symmetric cylindrical geometry and nearly \(4\pi \) coverage in solid angle.Footnote 3 It consists of an inner tracking detector (ID) surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic (EM) and hadron calorimeters, and a muon spectrometer (MS). The inner tracking detector covers the pseudorapidity range \(|\eta | < 2.5\). It consists of silicon pixel, silicon microstrip and transition radiation tracking detectors; the innermost layer is 33 mm from the beamline [44, 45]. Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic energy measurements with high granularity. A steel/scintillator-tile hadron calorimeter covers the central pseudorapidity range (\(|\eta | < 1.7\)). The endcap and forward regions are instrumented with LAr calorimeters for both the EM and hadronic energy measurements up to \(|\eta | = 4.9\). The muon spectrometer surrounds the calorimeters and is based on three large superconducting air-core toroidal magnets with eight coils each. The field integral of the toroids ranges between 2.0 and 6.0 T\(\cdot \)m across most of the detector. The muon spectrometer includes a system of precision tracking chambers and fast detectors for triggering. A two-level trigger system is used to select events [46]. The first-level trigger is implemented in hardware and uses a subset of the detector information to keep the accepted rate below approximately 100 kHz. This is followed by a software-based trigger that reduces the accepted event rate to approximately 1 kHz on average depending on the data-taking conditions. An extensive software suite [47] is used for real and simulated data reconstruction and analysis, for operation and in the trigger and data acquisition systems of the experiment.

4 Monte Carlo event simulation

Signal and background events produced in proton–proton collisions were simulated with various Monte Carlo (MC) generators. The simulated events are used in the optimization of event selection criteria, in the neural network training, to estimate systematic uncertainties, to validate the background estimation procedure employed for the dominant background sources, and to predict yields for the subdominant background contributions and for possible signals. The signal and background events were passed through the \(\textsc {Geant4} \) [48] simulation of the ATLAS detector [49] and reconstructed using the same algorithms as are used for the data.

The generation of the simulated event samples includes the effect of multiple proton–proton interactions per bunch crossing, as well as the impact on the detector response due to interactions from bunch crossings before or after the one containing the hard interaction. The effect of multiple interactions in the same and neighbouring bunch crossings (pile-up) is modelled by overlaying the hard-scattering event with simulated inelastic proton–proton events generated by Pythia 8.186 [50] using the NNPDF2.3lo set of parton distribution functions (PDF) [51] and the A3 set of tuned parameters [52]. The MC events are weighted to reproduce the distribution of the average number of interactions per bunch crossing (\(\left<\mu \right>\)) observed in the data. The EvtGen [53] program was used to simulate properties of the b- and c- flavoured hadron decays.

The signal event samples were generated using the MadGraph5_aMC@NLO [54] generator interfaced to Pythia 8 for the modelling of the parton showering, hadronization, and underlying event. The matrix element (ME) calculation was performed at tree level and includes the emission of up to two additional partons. The signal samples were processed through a fast simulation of the ATLAS detector [49, 55]. Gluino and stop signal cross-sections are calculated to approximate next-to-next-to-leading order in the strong coupling constant, adding the resummation of soft gluon emission at next-to-next-to-leading-logarithm accuracy (approximate NNLO+NNLL) [56,57,58,59,60,61,62,63,64,65,66]. The cross-sections for electroweakino production are calculated to next-to-leading order in the strong coupling constant, adding the resummation of soft gluon emission at next-to-leading-logarithm accuracy (NLO+NLL) [38,39,40,41,42].

Table 2 Simulated background event samples: the corresponding event generator, parton-shower modelling, cross-section normalization, PDF set and underlying-event parameter tune are shown. The samples marked with (*) are alternative samples used to validate the background estimation method or to assess systematic uncertainties in the modelling. The abbreviation MG5_aMC is used to label the MadGraph5_aMC@NLO generator. Samples produced with Sherpa use the default set of tuned parameters of the generator

The production of \(t\bar{t}\), \(t\bar{t}H\), and single-top events was modelled at NLO using the PowhegBox [72,73,74,75, 93] generator. Additional \(t\bar{t}\) samples were generated with MadGraph5_aMC@NLO interfaced with Pythia 8, and with PowhegBox interfaced with Herwig 7 [83, 84], for modelling comparisons and evaluation of systematic uncertainties.

The production of \(t\bar{t}V\) (V = W, Z) events was modelled using the Sherpa generator. The ME was calculated for up to one additional parton at NLO and up to two partons at LO using the Comix [94] and OpenLoops libraries [95, 96], and merged with the Sherpa parton shower using the MEPS@NLO prescription [97,98,99,100,101]. Alternative \(t\bar{t}V \) samples produced with the MadGraph5_aMC@NLO generator at NLO were used to evaluate systematic uncertainties associated with the modelling of additional QCD radiation.

The production of \(t\bar{t}t\bar{t}\), \(t \bar{t} t\), \(tWZ\), \(tZ\), \(t\bar{t}WW\), and \(t\bar{t}WZ\) events was modelled using the MadGraph5_aMC@NLO generator at NLO, and interfaced with Pythia 8. An alternative \(t\bar{t}t\bar{t}\) sample showered with Herwig 7, was used to evaluate systematic uncertainties related to the choice of parton-shower model.

The production of an electroweak gauge boson or virtual photon in association with jets (V+jets) was simulated with the Sherpa generator using NLO matrix elements for up to two partons, and LO matrix elements for up to four partons. Alternative V+jets samples used to validate the analysis methods were simulated with MadGraph5_aMC@NLO using LO-accurate MEs with up to four final-state partons.

Event samples with diboson (VV) and triboson (VVV) final states were simulated with the Sherpa generator, including off-shell effects and Higgs boson contributions, where appropriate. The VV processes were simulated using matrix elements at NLO accuracy in QCD for up to one additional parton and at LO accuracy for up to three additional parton emissions. The production of triboson (VVV) events was simulated with the Sherpa generator using factorized gauge-boson decays.

A summary of the background samples used, together with the event generator configurations, can be found in Table 2.

5 Event reconstruction and object identification

Proton–proton collision data recorded by the ATLAS detector between 2015 and 2018 are used to perform the analysis. In this period, the LHC delivered colliding beams with a peak instantaneous luminosity up to \(L=2.1 \times 10^{34}\hbox { cm}^{-2}\hbox {s}^{-1}\), achieved in 2018, and an average number of pp interactions per bunch crossing of 33.7. After applying beam, detector, and data-quality criteria the total integrated luminosity of the dataset is \(139\hbox { fb}^{-1}\) [102]. The uncertainty in the combined 2015–2018 integrated luminosity is 1.7% [103], obtained using the LUCID-2 detector [104] for the primary luminosity measurements.

Proton–proton interaction vertices are reconstructed from charged-particle tracks with \(p_{T}>500\,\hbox {MeV}\) [105, 106] in the ID. The presence of at least one such vertex with a minimum of two associated tracks is required, and the vertex with the largest sum of \(p_{\text {T}}^{2}\) of associated tracks is chosen as the primary vertex.

Jet candidates are reconstructed up to \(|\eta | =4.9\) using the anti-\(k_t\) algorithm [107, 108] with radius parameter \(R=0.4\). It uses particle-flow objects as inputs, combining tracking and calorimetric information as detailed in Ref. [109]. The jets are calibrated using the methodology described in Ref. [110]. Any event that contains jets induced by calorimeter noise or non-collision background, according to criteria similar to those described in Ref. [111], is removed. Jets up to \(p_{T}={60}\,\hbox {GeV}\) containing a large energy contribution from pile-up interactions are suppressed with the jet-vertex tagging (JVT) algorithm that uses tracking and primary vertex information to determine if a given jet originates from the primary vertex [112]. Jets with \(p_{T}> 20\,\hbox {GeV}\) and \(|\eta | <2.5\) are defined as signal jets, and used further in the analysis.

Signal jets containing b-flavoured hadrons are identified with the DL1r \(b\text {-tagging}\) algorithm [113, 114] with an average identification efficiency of 70% in simulated \(t{{\bar{t}}}\) events. The rejection factor is measured to be approximately 300 for jets initiated by light quarks and gluons and approximately 9 for jets initiated by charm quarks [113].

Electron candidates are reconstructed as tracks in the ID matched to energy clusters in the EM calorimeter, within \(|\eta | < 2.47\) [115]. The analysis considers only candidate electrons with \(p_{\mathrm{T}} > 10\,\hbox {GeV}\) and not in the transition region between the barrel and endcap calorimeters (\(1.37< |\eta | < 1.52\)). The electron identification is based on a multivariate likelihood-based discriminant that uses the shower shapes in the EM calorimeter and the associated track properties measured in the ID. The electron candidates must satisfy the ‘Medium’ identification criteria described in Ref. [115], while the signal electrons must satisfy the ‘Tight’ identification for better rejection of non-prompt or misidentified electrons. The electron identification efficiency varies with increasing \(p_{\mathrm{T}}\) in \(Z \rightarrow ee\) events, from 65% at \(p_{T} = 10\,\hbox {GeV}\) to 88% at \(100\,\hbox {GeV}\) for the Tight operating point, and from 75% at \(20\,\hbox {GeV}\) to 94% at \(100\,\hbox {GeV}\) for the Medium operating point.

For candidate and signal electrons, the longitudinal impact parameter of the electron track, \(z_0\), is required to satisfy \(|z_{0}\sin \theta | < 0.5\,\hbox {mm}\), where \(\theta \) is the polar angle of the track. For signal electrons, the transverse impact parameter divided by its uncertainty, \(|d_{0}|/\sigma (d_{0})\), is required to be at most five.

For all signal electrons there must be no association with a vertex from a reconstructed photon conversion [115] in the detector material. To further reduce the photon conversion background, additional requirements are applied to the signal electrons [34]: (i) the candidate must not have a reconstructed displaced vertex with a conversion radius \(r > 20\,\hbox {mm}\) whose reconstruction uses the track associated with the electron, (ii) the invariant mass of the system formed by the track associated with the electron and the closest track (in \(\Delta R\)) at the primary vertex or a conversion vertex is required to be larger than \(100\,\hbox {MeV}\). This photon conversion veto has an average efficiency of 99% for prompt electrons while providing a rejection factor of 4 for electrons from photon conversion.

In the \(2\ell ^\text {sc} \) category, signal electrons with wrongly reconstructed charge (charge-flip) are suppressed using a boosted decision tree (BDT) discriminant exploiting additional tracks in the vicinity of the electron and track-to-cluster matching variables [115]. A rejection factor of around 9 for electrons with a wrong charge assignment is achieved, while selecting properly measured electrons with an efficiency of 98%, in simulated \(Z \rightarrow ee\) events selected with the Tight identification and isolation operating points [115].

Muon candidates are reconstructed in the region \(|\eta | <2.5\) from MS tracks matching ID tracks. Only muons with \(p_{T}>10\,\hbox {GeV}\) satisfying the ‘Medium’ quality requirements defined in Ref. [116] are considered. The muon reconstruction efficiency is approximately 98% in simulated \(Z \rightarrow \mu \mu \) events. The same longitudinal impact parameter selection as for candidate and signal electrons is applied, while \(|d_{0}|/\sigma (d_{0})\) is required to be at most three.

For signal electrons and muons the identification criteria are complemented by an isolation requirement, which is based on the energy in a cone around the lepton candidate calculated using either reconstructed tracks or energy clusters. Non-prompt electrons and muons from the decays of b- and c-flavoured hadrons are further rejected using a BDT discriminant based on isolation and secondary vertex information, referred to as the non-prompt-lepton veto [117]. The efficiency of the combined isolation and non-prompt-lepton veto is on average above 80% for prompt leptons with \(p_{T}>30\,\hbox {GeV}\) in simulated diboson events. Finally, all signal leptons are required to have \(p_{T}>15\,\hbox {GeV}\).

A sequential overlap removal procedure to resolve ambiguities between candidate jets and candidate leptons is carried out before the signal selection as follows. First, candidate electrons sharing their track with a muon candidate are removed. Furthermore, any non-b-jet candidate lying within an angular distance \(\Delta R = 0.2\) of a candidate electron is discarded; non-b-jets within \(\Delta R = 0.4\) of candidate muons are removed if the number of tracks associated with the jet is less than three. Finally, any lepton candidate remaining within a distance \(\Delta R = \min \{0.4, 0.04 + 10\,\hbox {GeV}/p_{T}(\ell )\}\) of any surviving jet candidate is discarded since they likely arise from decays of b- or c-flavoured hadrons.

Similar to the electron candidates, the photon candidates are reconstructed from calorimeter energy clusters and identified using the ‘Tight’ criteria [115]. They are required to be in the region \(|\eta | <2.37\) and have \(p_T > 145\,\hbox {GeV}\). Signal photons must satisfy the ‘Tight’ calorimeter-based isolation requirements [115], and are used to validate the background estimation technique detailed in Sect. 7.1.

The missing transverse momentum, with magnitude \(E_{\text {T}}^{\text {miss}}\), is defined as the negative vector sum of the transverse momenta of all identified objects (muon, electron and jet candidates) and an additional soft term [118, 119]. The soft term is added to recover the contributions from other low-\(p_{\mathrm{T}}\) objects, and is constructed from all tracks that are matched to the primary vertex but are not associated with any other object. A dedicated overlap removal procedure, based on removing duplicated energy contributions, is applied. The \(E_{\text {T}}^{\text {miss}}\) variable is used to define control regions enriched in certain types of background, as discussed in Sect. 7.4, and as input for the multivariate discriminant.

6 Event selection and analysis strategy

Two complementary analysis strategies are defined, namely the ‘jet counting analysis’ and the electroweak analysis, labelled ‘EWK analysis’. While the first approach is designed to be very generic and offers a large variety of signal interpretations for strong production models, the second approach is specifically tailored to reach sensitivity for electroweakino production. In both analyses, events are assigned to one of two categories according to their lepton content, and further categorized into regions based on jet multiplicity and b-jet multiplicity. This categorization provides a set of regions that are sensitive to decays from all the possible signal models considered in this search, amplifying the ability of the search to discriminate signal from background. The EWK analysis is an extension of the jet counting analysis, where a third variable, a neural network (NN) discriminant, is introduced in some of the jet and b-jet multiplicity regions in order to improve the separation of signal from background. Additional kinematic selections are also applied at the preselection level, tailored to the electroweakino signals.

Events were selected for read-out using single-lepton triggers that require the electron or muon to satisfy identification criteria similar to those used in the offline reconstruction and isolation requirements [120, 121]. For the analysis selection, at least four jets and at least one signal electron or muon, matched to the trigger lepton, are required in the event. The highest-\(p_{\mathrm{T}}\) lepton in the event has to pass the signal requirements and satisfy \(p_{\mathrm{T}}>27\,\hbox {GeV}\), in order to be above the trigger threshold. Two disjoint event categories are defined according to the lepton content: a same-charge dilepton selection (\(2\ell ^\text {sc} \)), and all other events with at least one lepton (\(1\ell \)). Events are placed in the \(2\ell ^\text {sc} \) category if they contain exactly two signal leptons with same electric charge, and no additional candidate leptons. In order to reduce backgrounds containing a Z boson decaying into electrons, where one electron has its charge misidentified, events with two electrons have to satisfy a \(|m_{ee} - m_Z| > 10\,\hbox {GeV}\) requirement. Events with at least three signal leptons, with one same-flavour pair satisfying \(|m_{\ell \ell } - m_Z| < 10\,\hbox {GeV}\) and exactly zero b-jets, are included as a separate subregion of the \(2\ell ^\text {sc} \) category, in order to be used for background estimation. All events passing further selections, but which do not enter the \(2\ell ^\text {sc} \) category, are assigned to the \(1\ell \) category, including events containing more than one candidate lepton. The regions with seven or fewer jets and zero b-jets in the \(1\ell \) category are further divided into three subregions. The first subregion is defined by selecting events with two same-flavour candidate leptons fulfilling an invariant-mass requirement, \(|m_{\ell \ell } - m_Z| < 10\,\hbox {GeV}\). The remaining events are divided into two subregions according to the electric charge of the highest-\(p_{\mathrm{T}}\) lepton. This division in subregions provides additional information that allows more accurate constraints to be placed on the W+jets and Z+jets backgrounds.

The jet counting analysis is carried out with five jet-\(p_{\mathrm{T}}\) thresholds to provide sensitivity to a broad range of possible signals. These thresholds are applied to all jets in the event and are at \(p_{\mathrm{T}} = 20\), 40, 60, 80, and \(100~\hbox {GeV}\). The optimal jet \(p_{\mathrm{T}}\) threshold depends on the model and particle masses being tested. The jet multiplicity is binned from a minimum of four jets to a maximum number (\(N_{\text {last}}\)) that depends on the \(p_{\mathrm{T}}\) threshold and the lepton category. In the \(1\ell \) category the last bin corresponds to 15 or more jets for the 20 GeV threshold, and 12, 11, 10, and 8 or more jets for the other thresholds in increasing order. In the \(2\ell ^\text {sc} \) category it corresponds to 10, 8, 7, 7, and 6 or more jets respectively. The highest jet-multiplicity bin for each \(p_{\mathrm{T}}\) threshold is inclusive of larger jet multiplicities. For each jet multiplicity bin, there are five exclusive bins in the b-jet multiplicity (four exclusive bins from zero to three b-jets, with an additional inclusive four-or-more bin). The regions defined in the jet counting analysis are summarised in Table 3. For a given jet \(p_{\mathrm{T}}\) threshold all regions are orthogonal and are analysed simultaneously. However, regions defined for different jet \(p_{\mathrm{T}}\) thresholds can overlap. The number of bins used in the search ranges from 110 when considering the \(20~\hbox {GeV}\) jet threshold, including the different subregions with zero b-jets, to 51 bins when considering the \(100~\hbox {GeV}\) jet threshold. In this article, the notation \(N^{\text {process}}_{j,b}\) is used to denote the number of events predicted by the background fit model, with j jets and b b-jets for a given process, e.g. \(N^{t\bar{t} \text {+jets}}_{j,b}\) for \(t\bar{t} \)+jets events. The quantity \(N^{\text {process}}_{j}\), referred to as a jet slice, is the number of events with j jets for the considered physics process, and it is inclusive in the number of b-jets.

Table 3 Summary of regions considered in the jet counting analysis. The notation Nb is used to indicate a requirement on the b-jet multiplicity. The highest jet multiplicity considered (\(N_{\text {last}}\)) depends on the jet \(p_{\mathrm{T}}\) threshold and the lepton category. In the \(1\ell \) category it corresponds to 15, 12, 11, 10, and 8 jets for the different jet \(p_{\mathrm{T}}\) thresholds in increasing order. In the \(2\ell ^\text {sc} \) category it corresponds to 10, 8, 7, 7, and 6 jets respectively

In order to improve the sensitivity of the search to the model with electroweakino production, the EWK analysis is introduced, which extends the jet counting analysis at the 20 GeV jet-\(p_{\mathrm{T}}\) threshold. In the \(1\ell \) category only, a separate NN discriminant is trained in each jet slice with eight or fewer jets, to discriminate the higgsino signal from the \(t\bar{t}\) background. The full distribution of the NN output, binned in four even-width bins with approximately equal signal fraction, is fitted in each of the regions with at least one b-jet. The NN training is performed with the constraint that the NN output distribution of the \(t\bar{t}\) background be invariant with respect to the b-jet multiplicity. This property is later exploited in order to estimate the background from data, as described in Sect. 7. The invariance of the NN output with respect to the b-jet multiplicity is achieved with distance-correlation training [122, 123]. The NNs are trained on a mixture of higgsino samples as the signal, and \(t\bar{t}\) as the only background.

The NN discriminant is constructed from a combination of low-level and high-level inputs. The low-level variables considered are the jet and leading-lepton momenta, the individual pseudo-continuous b-jet score [113] of all jets, and the \(E_{\text {T}}^{\text {miss}}\) magnitude and direction. The high-level inputs correspond to the jet and b-jet multiplicity of the event, minimum distance between the leading lepton and any jet, scalar \(p_{\mathrm{T}}\) sum of all jets (\(H_{\text {T}}\)), scalar \(p_{\mathrm{T}}\) sum of all b-jets, \(m^{\text {jets}}\) (defined below), invariant mass of the three-jet system with highest system \(p_{\mathrm{T}}\), and invariant mass of the 3j + \(\ell \) + \({\mathbf {p}}^{\text {miss}}\) system with highest system \(p_{\mathrm{T}}\) (assuming that the z-component of the missing momentum \({\mathbf {p}}^{\text {miss}}\) is zero). The two invariant mass variables attempt to reconstruct and decays respectively. The \(m^{\text {jets}}\) variable is defined as follows. All the jets in the event are split into two groups, where both groups have to contain at least one jet. All possible combinations are tested, including those where the number of jets in each group is very different. For each grouping, the higher of the masses of the two groups is selected, and then the minimum across all possible groupings is taken. The \(m^{\text {jets}}\) distribution has an endpoint for signal events at that is reached in events where all partons were reconstructed. For most events, the value is lower, since the lepton and \(E_{\text {T}}^{\text {miss}}\) components are ignored. Backgrounds, however, do not show such an endpoint. In addition, the shape of this variable has only a weak dependence on the number of b-jets, which helps the NN to achieve separation while not introducing sensitivity to the b-jet multiplicity.

The inputs are connected to a single output node via two fully connected hidden layers of 100 neurons. The NNs are trained using PyTorch [124] and the Adam optimizer [125]. Events in the training dataset are sampled according to the inverse of the b-jet fraction (defined as the fraction of events in a given b-jet bin with respect to the total number of events in the jet slice) in order to flatten the b-jet spectrum. In order to achieve invariance of the NN output with respect to the \(b\text {-jet}\) multiplicity, the loss function of the training contains a term that penalizes a high distance correlation between the output and the b-jet multiplicity [123]. A hyperparameter \(\lambda \) controls the weight of this penalty term, with a value \(\lambda =15\) which was optimized to achieve the highest sensitivity to the signal, accounting for both the separation and the systematic uncertainties derived from non-invariance of the NN, as described in Sect. 8. The invariance and separation achieved is shown in Fig. 2. After training, the variables ranked highest in importance (using the integrated gradients method described in Ref. [126]) are \(H_{\text {T}}\), the individual pseudo-continuous b-jet score of all jets, number of b-jets, invariant mass of the 3j + \(\ell \) + \({\mathbf {p}}^{\text {miss}}\) system with highest system \(p_{\mathrm{T}}\), and \(m^{\text {jets}}\). The b-jet multiplicity is highly ranked despite the NN output being independent of it since it can be used to offset the effect from variables that are correlated with the number of b-jets such as \(H_{\text {T}}\).

Fig. 2
figure 2

Output distribution of the NN discriminant in the six-jet slice, evaluated over a signal sample and \(t\bar{t}\) background split into the different b-jet regions. The bottom panel shows the ratio of the NN output distribution for \(t\bar{t}\) background in each b-jet region to the distribution in an inclusive region with at least one b-jet. The NN output distribution is invariant with respect to the number of b-jets in the selection, with differences per bin below 4%

In the \(2\ell ^\text {sc} \) category, signal events are produced via the leptonic decays of two top quarks. However, the dominant backgrounds contain only one leptonic top decay, while the second lepton is a misidentified or non-prompt lepton, or originates from a W boson that is not produced in a top decay (\(t\bar{t}W\)). This property is exploited by introducing an additional requirement of \(m^{\ell j} < 155\,\hbox {GeV}\), where the observable \(m^{\ell j}\) is defined as \({m^{\ell j} = \min _{a,b} \left\{ \max \left( m(\ell _0,\text {jet}_a), m(\ell _1,\text {jet}_b)\right) \right\} }\), with \(\text {jet}_a \ne \text {jet}_b\), for all possible permutations of \(\text {jet}_a\) and \(\text {jet}_b\) taken from the four leading jets. No b-tag information is used in the selection of the jets, to avoid differences in the variable across the different b-tag regions. The signal has an endpoint in this variable at \({m^{\ell j} = \sqrt{m_{\text {top}}^2 - m_{W}^2} \approx 152\,\hbox {GeV}}\), while background events tend to have larger values.

Table 4 The discovery signal regions used in jet counting and NN analyses, in the search for a generic BSM signal. For every jet \(p_{\mathrm{T}}\) threshold, four signal regions are defined in the jet counting analysis, leading to a total of 20 discovery signal regions in the jet counting analysis. Two additional discovery signal regions are defined in the EWK analysis

In order to probe a specific BSM model, all the regions in both lepton categories are simultaneously fit to data to constrain the model, in what is called a model-dependent fit. Separate fits are performed for each analysis and jet \(p_{\mathrm{T}}\) threshold, and the configuration providing the best expected sensitivity is used to probe the model. In the search for a generic BSM signal, dedicated discovery signal regions (SRs) are defined which could be populated by a signal, and where the SM contribution is expected to be small. The background in these SRs is estimated from a fit excluding the SR being tested, in what is called a model-independent fit. The discovery SR definitions used in the jet counting analysis are shown in Table 4.

Two additional discovery SRs are defined targeting a possible electroweakino signal making use of the EWK analysis. The first SR is defined in the \(1\ell \) category, with exactly six jets with \(p_{\mathrm{T}} > 20\,\hbox {GeV}\), at least four b-jets, and a selection on the NN discriminant. The NN selection corresponds to a signal efficiency of 25% on the higgsino model with a mass of 300 GeV and a background rejection of 40 for the \(t\bar{t}\) background, which corresponds to bin four in Fig. 2. The second SR is defined in the \(2\ell ^\text {sc} \) category, with exactly six jets with \(p_{\text {T}} > 20\,\hbox {GeV}\), and at least three b-jets. The discovery SR definitions used in the EWK analysis are also shown in Table 4.

The dominant background processes are \(t{\bar{t}}\)+jets and W/Z+jets in the \(1\ell \) category, and \(t\bar{t}W\), \(t\bar{t}\) with a misidentified lepton, and diboson production in the \(2\ell ^\text {sc} \) category. The estimation of the dominant backgrounds is carried out using a combined fit to the jet and b-jet multiplicity bins described above. For these backgrounds, the normalization per jet slice is derived using parameterized extrapolations from lower jet multiplicities. The b-jet multiplicity shape per jet slice is taken from MC simulation for the W/Z+jets and diboson backgrounds, whereas for background processes involving top quarks it is predicted from the data using a parameterized extrapolation based on observables at medium jet multiplicities. A separate likelihood fit is carried out for each jet \(p_{\mathrm{T}}\) threshold, with the fit parameters of the background model determined separately in each fit. The assumptions used in the parameterization are validated using data and MC simulation.

7 Background estimation

The dominant background in the \(1\ell \) category arises from W/Z+jets production in the zero b-jet regions, and top-quark pair production in the regions with at least one b-jet. In the \(2\ell ^\text {sc} \) category the dominant background in the zero b-jet regions originates from diboson production with fully leptonic decays, in particular WZ where one lepton from the Z boson decay is lost. In the regions with at least one b-jet the main backgrounds are the associated production of a top-quark pair and a W boson, dileptonic \(t\bar{t}\) where an electron has its charge misidentified, and semileptonic \(t\bar{t}\) with a jet misidentified as a lepton, or with a non-prompt lepton. These three background components are merged and labelled as \(t\bar{t}X^\text {sc}\), and estimated simultaneously.

The theoretical modelling of all these backgrounds at high jet multiplicity suffers from large uncertainties, so they are estimated from the data by extrapolating the jet and b-jet multiplicity distributions extracted at moderate jet multiplicities to the high jet multiplicities of the search regions.

7.1 Jet multiplicity prediction

A data-driven approach is used to estimate the contribution of the main backgrounds in each jet multiplicity slice. The estimate of the normalization relies on assuming a functional form to describe the evolution of the number of background events for process X as a function of the jet multiplicity, \(r^X(j) \equiv N^{X}_{j+1}/N^{X}_{j}\).

Above a certain number of jets, r(j) is assumed to be constant, implying a fixed probability of additional jet radiation, referred to as ‘staircase scaling’ [127,128,129,130]. This behaviour has been observed in W/Z+jets by the ATLAS [131, 132] and CMS [133] collaborations. For lower jet multiplicities, a different scaling is expected with \(r(j) = k / (j + 1)\) where k is a constant, referred to as ‘Poisson scaling’ [130]. The transition point between these scaling behaviours depends on the jet kinematic selections.

For the kinematic phase space relevant for this search, a combination of the two scalings is found to describe the data in dedicated validation regions (described later in this section), as well as in simulated MC samples with an integrated luminosity much larger than that of the data. This combined scaling is parameterized as

$$\begin{aligned} r^X(j) = c^X_0 + c^X_1 / (j + c^X_2), \end{aligned}$$

where \(c^X_0\), \(c^X_1\) and \(c^X_2\) are process-dependent constants that are extracted from the data. The \(c^X_2\) parameter is introduced to take into account the ambiguity in the counting of jets originating from the decay products of the process X and the additional jets. The parameter is fixed to \(c^X_2=1\) in the estimation of W+jets, Z+jets, and fully leptonic diboson events, as there is no ambiguity in the counting of jets for these processes. However, \(c^X_2\) is a free parameter in the estimation of backgrounds containing top quarks, where the jet counting ambiguity remains.

Studies using simulated event samples, both at generator level and after event reconstruction, demonstrate that the flexibility of this parameterization is also able to absorb reconstruction effects related to the decrease in event reconstruction efficiency with increasing jet multiplicity, which are mainly due to the lepton–jet overlap and lepton isolation requirements.

The number of background events from process X in the j jets slice is then parameterized as follows:

$$\begin{aligned} N^{X}_{j} = N^{X}_{4} \cdot \prod \limits _{j'=4}^{j'=j-1} r^X(j') , \end{aligned}$$

where \(N^{X}_{4}\) is a free parameter for the absolute normalization in four-jet events. Since the last jet-multiplicity bin used in the analysis is inclusive in the number of jets, the model is used to predict this by iterating to higher jet multiplicities and summing the contribution for each jet multiplicity above the maximum used in the analysis. The four parameters per process, i.e. \(N^{X}_{4}\), \(c^X_0\), \(c^X_1\), and \(c^X_2\) (if not fixed to one), are allowed to float in the fit, and are therefore extracted from the data along with the other background contributions. Studies in data and MC simulation indicate that the \(c^X_0\) and \(c^X_1\) parameters for W+jets and Z+jets are statistically compatible, and are therefore combined into common parameters \(c^\text {W/Z+jets}_0\) and \(c^\text {W/Z+jets}_1\). The normalization parameters \(N^{\text {W+jets}}_{4}\) and \(N^{\text {Z+jets}}_{4}\) are kept independent. The \(c^X_i\) parameters are independent among the rest of the backgrounds, including \(t\bar{t}\) in the \(1\ell \) category and \(t\bar{t}X^\text {sc}\) in the \(2\ell ^\text {sc} \) category.

The jet-scaling assumption is validated in data, using \(\gamma \)+jets and dileptonic \(t\bar{t}\) events. The \(\gamma \)+jets events are selected using a high-\(p_{\mathrm{T}}\) photon trigger, and a high-\(p_{\mathrm{T}}\) signal photon is required in the event selection. The dileptonic \(t\bar{t}\) data sample is selected by requiring an electron candidate and a muon candidate in the event, with at least two jets of which at least one is a b-jet, and the small background predicted by MC simulation is subtracted. The possible signal contamination in this sample is negligible as the selection is inclusive over the number of b-jets. In this sample, the scaling behaviour can be tested for up to 13 jets, which corresponds to 15 jets for a semileptonic \(t\bar{t} \)+jets sample. Simulated W+jets, Z+jets, semileptonic \(t\bar{t}\) (both the nominal sample and the alternative samples described in Sect. 4), and \(t\bar{t}X^\text {sc}\) samples are also found to be consistent with the jets-scaling assumption.

Fig. 3
figure 3

The ratio of the number of events with \((j+1)\) jets to the number with j jets in various event samples (details in the legend), used to validate the jet-scaling parameterization. In the MC samples of W/Z/WZ+jets the vector bosons are forced to decay to leptons. Each panel shows the ratio for data or MC simulation with the fitted parameterization overlaid as a dashed line. The uncertainties shown are statistical

Figure 3 shows the r(j) ratio for various processes used to validate the jet-scaling parameterization. Each panel shows the r(j) ratio for data or MC simulation with the fitted parameterization overlaid as a line.

7.2 Prediction of b-jet multiplicity

The number of background events from process X in a given jet and b-jet multiplicity region can be expressed as follows:

$$\begin{aligned} N^{X}_{j,b} = f^X_{j,b} \cdot N^{X}_{j} \end{aligned}$$

where \(f^X_{j,b}\) is the fraction of events from process X with b number of b-jets in the j jet slice, and satisfies \(\sum _{b=0}^4 f^X_{j,b}=1\). A data-driven model is used to estimate the b-jet fraction in background processes containing top quarks. The basic concept of this model is based on the extraction of an initial template of the b-jet fraction distribution in events with four jets and the parameterization of the evolution of this template to higher jet multiplicities. Each jet slice is constrained in the fit as discussed later in this section. The b-jet fractions for W+jets, Z+jets and diboson backgrounds are taken from MC simulation.

The extrapolation of the b-jet multiplicity distribution to higher jet multiplicities starts from the assumption that the difference between the b-jet multiplicity distribution in events with j and \(j+1\) jets arises mainly from the production of additional jets, and can be described by a fixed probability that the additional jet is a b-jet. Given the small mis-tag rate, this probability is dominated by the probability that the additional jet is a heavy-flavour jet which is b-tagged. In order to account for acceptance effects due to the different kinematics in events with high jet multiplicity, the probability of further b-tagged jets entering the acceptance is also taken into account. The extrapolation to one additional jet can be parameterized as:

$$\begin{aligned} \begin{aligned} f_{(j+1),b}&= f_{j,b} \cdot x_0 + f_{j, (b-1)} \cdot x_1 + f_{j, (b-2)} \cdot x_2, \end{aligned} \end{aligned}$$
(1)

where the parameters \(x_i\) describe the probability of one additional jet to be either not b-tagged (\(x_0\)), b-tagged (\(x_1\)), or b-tagged and causing a second jet to be b-tagged (\(x_2\)). The latter is dominated by cases where the extra jet influences the event kinematics such that a second b-jet, previously not b-tagged, becomes b-tagged. This parameter aims to model the increase in b-tagging efficiency with jet \(p_{\mathrm{T}}\). Given that the \(x_i\) parameters describe probabilities, the sum \(\sum _{i} x_i\) is normalized to unity. Terms with a negative number of b-tagged jets \((f_{j,b<0})\) are set to zero. Subsequent application of this parameterization produces a b-jet template for arbitrarily high jet multiplicities.

Studies based on MC simulated events with sample sizes corresponding to equivalent luminosities much larger than the collected dataset, as well as studies using fully efficient generator-level b-tagging, indicate the necessity to add a fit parameter that allows for correlated production of two b-jets as may be expected with b-jet production from gluon splitting. This is implemented by changing the evolution described in Eq. (1) such that any term with \(x_1 \cdot x_1\) is replaced by \(x_1 \cdot x_1 \cdot \rho _{11}\), where \(\rho _{11}\) describes the correlated production of two b-jets. The value of \(\rho _{11}\) is a free parameter and is determined in the fit.

The initial b-jet multiplicity template is extracted from data events with four jets after subtracting all non-\(t\bar{t} \) background processes, and is denoted by \(f_{4,b}\) and scaled by the absolute normalization in order to obtain the model in the four-jet bin:

$$\begin{aligned} N^{t\bar{t} \text {+jets}}_{4,b} = N^{t\bar{t} \text {+jets}}_{4} \cdot f_{4,b}, \end{aligned}$$

where the sum of \(f_{4,b}\) over the four b-jet bins is normalized to unity. The zero b-jet component of the initial \(t\bar{t} \) template, exhibits an anti-correlation with the absolute W+jets normalization, which is extracted in the same region. The division into subregions separated in leading-lepton charge, detailed in Sect. 6, provides a handle to extract the absolute W+jets normalization, due to the charge asymmetry in \(W^\pm \) production. The remaining anti-correlation does not affect the total background estimate. For these regions, the \(t\bar{t} \)+jets process is assumed to be charge symmetric and the model is simply split into two halves for these bins.

The model described above is based on the assumption that any change in the b-jet multiplicity distribution is due to additional jet radiation with a certain probability to lead to b-jets. There is, however, also a small increase in the acceptance for b-jets produced in the decay of the \(t\bar{t}\) system, when increasing the jet multiplicity, due to the higher jet momentum on average. The effect amounts to up to 5% in the one- and two-b-jet bins for high jet multiplicities, and is taken into account using a correction to the initial template extracted from simulated \(t\bar{t} \) events.

The parameters that model the production of additional b-jets (\(x_i\), \(\rho _{11}\)) are correlated in the \(1\ell \) and \(2\ell ^\text {sc} \) categories. The initial b-jet multiplicity parameters (\(f_{4,b}\)), and the acceptance correction to the initial template are independent in each lepton category.

Fig. 4
figure 4

Observed data and the corresponding background estimation in regions with one electron, one muon, and four jets (left) or five jets (right). All uncertainties, which may be correlated across the bins, are included in the error bands (shaded regions). The shape of the NN template for the \(t\bar{t}\) background is required to be identical in all b-jet regions. Good agreement between the data and estimated background confirms the invariance of the NN output with respect to the b-jet region

7.3 Neural network template prediction

The NN is only introduced in regions with at least one b-jet, where the dominant background is \(t\bar{t}\) production. The NN output distribution is obtained from MC simulation for all the subdominant backgrounds. A data-driven method is developed in order to predict the NN output distribution for \(t\bar{t}\) background events, making use of the invariance of the NN output with respect to the number of b-jets in the event. The \(t\bar{t}\) background in a given bin of the NN output is parameterized as:

$$\begin{aligned} N^{t\bar{t} \text {+jets}}_{j,b,i} = n_{j,i} \cdot N_{j,b} \end{aligned}$$

where \(n_{j,i}\) is the fraction of \(t\bar{t}\) events in bin i of the NN output in the j jet slice, and is independent of the b-jet region. The fractions in each jet slice are free parameters and are fitted simultaneously to all b-jet regions, constrained by the sum \(\sum _{i} n^i_{j}\) being normalized to unity. Given the large statistical power of the one- and two-b-jet regions, the fitted NN templates are determined in these regions and not biased by a possible signal entering the high b-jet regions.

This method relies on the invariance of the NN output with respect to the number of b-jets. This property is validated in large samples of MC simulated \(t\bar{t}\) events, including the alternative samples described in Sect. 4. The invariance is also confirmed in data by using a pure sample of dileptonic \(t\bar{t}\) events, with a selection requiring one electron, one muon, and at least one b-jet. The signal contamination in this dataset is at most 2% in the last NN bin of the 4 b-tag region. The only process that is not negligible given this selection is production, which has a lower cross-section than the process, that contributes most to the 1-lepton selection. Figure 4 shows that good agreement between data and estimated background is observed in this region, confirming in data the invariance of the NN output with respect to the b-jet region.

7.4 Fake and non-prompt lepton background

The contribution from events with a fake or non-prompt (FNP) lepton (such as hadrons misidentified as leptons, leptons originating from the decay of heavy-flavour hadrons, and electrons from photon conversions), constitutes a minor but non-negligible background.

The multi-jet background in the \(1\ell \) category is estimated from the data with a matrix method similar to that described in Ref. [134]. In this method, two types of lepton identification criteria are defined: ‘tight’, corresponding to the default signal lepton criteria described in Sect. 5, and ‘loose’, corresponding to candidate leptons after overlap removal. The matrix method relates the numbers of observed events in which a loose lepton candidate does or does not satisfy the tight selection criteria. The probability for loose prompt leptons to satisfy the tight selection criteria is obtained using a \(Z\rightarrow \ell \ell \) data sample and is modelled as a function of the lepton \(p_{\text {T}} \). The probability for loose FNP leptons to satisfy the tight selection criteria is determined from a data control region enriched in non-prompt leptons that requires a loose lepton, multiple jets, low \(E_{\text {T}}^{\text {miss}}\) [135, 136], and low transverse mass.Footnote 4 This data sample is recorded with prescaled lepton triggers without an isolation requirement. The efficiencies are measured as a function of lepton candidate \(p_{\mathrm{T}}\) after subtracting the contribution from prompt-lepton processes, and are assumed to be independent of the jet multiplicity.Footnote 5

In the \(2\ell ^\text {sc} \) category, the background from FNP leptons in association with a top-quark pair is estimated as part of the \(t\bar{t}X^\text {sc}\) background as described above. Other contributions from FNP leptons constitute less than 10% of the total background in the zero-b-jet bin, with a negligible contribution in the b-jet regions, and are taken from MC simulation. The estimation is validated in a dedicated validation region requiring zero b-jets, two same-flavour leptons satisfying \(|m_{\ell \ell } - m_Z| < 10\,\hbox {GeV}\), and an additional candidate lepton failing the signal requirement. This region is dominated by Z boson events containing a FNP lepton, and is used to verify the modelling of FNP leptons in the MC simulation.

7.5 Minor backgrounds

The minor background contributions from single-top production, \(t\bar{t}H\), and SM four-top-quark production are estimated using MC simulation. In the \(1\ell \) category the diboson and \(t\bar{t}V\) backgrounds are also estimated from MC simulation, while the estimates are both data-driven (\(t\bar{t}V\) as part of the \(t\bar{t}X^\text {sc}\) background) in the \(2\ell ^\text {sc} \) category. In all but the highest jet slices considered, the sum of these backgrounds contributes no more than 10% of the SM expectation in any of the b-jet bins; for the highest jet slices this can rise to 35%.

7.6 Fit configuration and validation

Two different fit configurations are used in the search. When testing a specific BSM model the model-dependent fit set-up is used, where for each jet \(p_{\mathrm{T}}\) threshold all the regions in both lepton categories are simultaneously fit to data to constrain the model. The expected signal contribution in all bins is taken into account, including bins with low jet or b-jet multiplicity. The jet counting analysis is used to constrain the models with strong production and the EWK analysis for the models with electroweakino production. Separate fits are performed for each jet \(p_{\mathrm{T}}\) threshold, and the threshold with best expected sensitivity for the given signal mass point is used to set limits. In the search for a generic BSM signal in a particular SR the model-independent fit set-up is used, where for each jet \(p_{\mathrm{T}}\) threshold the simultaneous fit includes all regions except the SR being tested. Possible signal leakage to the control regions can produce a bias in the background estimation, leading to conservative limits. Such limits have hence been obtained assuming negligible signal contributions in regions outside of the SR. Signal processes with final states that the search is targeting generally have negligible leakage into these regions, as is the case for the benchmark models considered.

The number of freely floating parameters in the background model is 26 in the jet counting analysis, and 41 in the EWK analysis. The different parameters for each background are summarized in Table 5. The number of fitted bins varies between 51 and 110 in the jet counting analysis, depending on the jet \(p_{\mathrm{T}}\) threshold used, and is 170 in the EWK analysis, leading to an over-constrained system in all cases.

The fit set-up was extensively tested using MC simulated events, and was demonstrated to give excellent agreement on a background-only dataset, and a negligible bias in the fitted signal yields. This level of agreement is seen both in the cases where the background-only distributions are fit, and when a signal is injected into the fitted data.

Table 5 Summary of all the free floating parameters in the background model. There are a total of 26 such parameters in the jet counting analysis and 41 in the EWK analysis

8 Systematic uncertainties

The dominant backgrounds are estimated from the data without the use of MC simulation, and therefore the main systematic uncertainties related to the estimation of these backgrounds arise from the assumptions made in the construction of the parameterized model. Uncertainties related to the theoretical modelling of the specific processes and due to the modelling of the detector response in simulated events are only relevant for the minor backgrounds, which are taken from MC simulation, and for the estimates of the signal yields after selections.

For the W/Z+jets background estimation, the uncertainty related to the assumed jet scaling is taken from studies of this behaviour in W+jets and Z+jets MC simulation, as well as in \(\gamma \)+jets data control regions chosen to be kinematically similar to the search selection. Deviations from the assumed scaling behaviour are assigned as a systematic uncertainty, uncorrelated in each jet slice. In the case of no deviation the statistical precision of the validation is assigned. The uncertainty in the lower jet multiplicities is at the percent level for all jet \(p_{\mathrm{T}}\) thresholds, and up to 50% in the highest jet multiplicities, driven by the statistical precision of the method. The uncertainty related to the parameterization of the jet multiplicity of the \(t\bar{t}\) background is determined with the same strategy, and is derived from MC simulation closure tests (including alternative MC generators), as well as dileptonic \(t\bar{t}\) control regions in data. No evidence of a deviation from the assumed scaling behaviour is seen, and the statistical precision of the closure in data is used as an uncertainty.

The expected uncertainty of the charge asymmetry in W+jets production is \(3\%\)\(5\%\) from PDF variations [137], but in the seven-jet region the uncertainty is dominated by the limited number of MC events (up to 10% for the 80 GeV jet-\(p_{\mathrm{T}}\) threshold). The uncertainty in the shape of the b-jet multiplicity distribution in W+jets, Z+jets, and diboson events is derived by comparing different MC generator configurations (e.g. varying the renormalization and factorization scale and the parton-shower model parameters). It is seen to grow as a function of jet multiplicity and is about 50% for events with five jets, after which the MC statistical uncertainty becomes very large. A conservative uncertainty of 50% per additional heavy-flavour quark that is generated is assigned to the fractional contribution from V(V)+b and V(V)+c events, uncorrelated among the three backgrounds. This uncertainty has a negligible impact on the final result as the background from W/Z boson or diboson production with additional heavy-flavour jets is small compared to that from top-quark pair production. In addition, the uncertainties related to the b-tagging efficiency and mis-tag rate are taken into account in the uncertainty in the W/Z+jets b-jet template.

The b-jet fraction estimation method exhibits good closure in studies based on MC simulated events with sample sizes corresponding to integrated luminosities much larger than that of the collected dataset, as well as studies using fully efficient generator-level b-tagging, so no systematic uncertainty related to these studies is assigned. A small uncertainty related to the acceptance correction for the initial b-jet multiplicity template is derived by varying the MC generator configuration for the \(t{\bar{t}}\) sample used to estimate the correction. This leads to a 3% uncertainty in the correction, and has no significant effect on the final uncertainty.

The prediction of the NN template in \(t\bar{t}\) events relies on the invariance of the NN output with respect to the number of b-jets in the event. This assumption is tested in MC simulation (both the nominal sample and the alternative samples described in Sect. 4), and seen to hold within 5% in the five-jet and six-jet slices, where the best signal-to-background ratio is expected, and within 10% in the rest. The largest deviation from the b-jet-inclusive template that is seen per bin across b-jet regions is assigned as an uncorrelated uncertainty in each bin and ranges from 1% to 10%.

The dominant uncertainty in the multi-jet background estimate arises from the number of data events in the control regions. An uncertainty in the subtraction of electroweak backgrounds from these control regions is estimated at 20% of the expected yield of these background processes. Additional uncertainties are assessed to cover the possible dependencies of the prompt and FNP lepton efficiencies [134] on variables other than lepton \(p_{\mathrm{T}}\) (for example the dependence on the number of jets in the event). The total uncertainty in the multi-jet background yields is about 50%.

Fig. 5
figure 5

The observed data event yields and the corresponding estimates for the backgrounds in the different b-jet multiplicity bins for the 20 GeV jet-\(p_{\mathrm{T}}\) threshold regions defined for the EWK analysis in the \(1\ell \) category. The background shown is estimated by including all bins in the fit. All uncertainties, which may be correlated across the bins, are included in the error bands (shaded regions). The expected signal distribution for the higgsino LSP m( ) = 300 GeV hypothesis normalized to 20 times its expected cross-section is also overlaid

Fig. 6
figure 6

The observed data event yields and the corresponding estimates for the backgrounds in the different b-jet multiplicity bins for the 20 GeV jet-\(p_{\mathrm{T}}\) threshold regions defined for the EWK analysis in the \(2\ell ^\text {sc} \) category. The background shown is estimated by including all bins in the fit. All uncertainties, which may be correlated across the bins, are included in the error bands (shaded regions). The expected signal distribution for the higgsino LSP m( ) = 300 GeV hypothesis normalized to 20 times its expected cross-section is also overlaid

Fig. 7
figure 7

The observed pulls in all the regions considered in the EWK analysis, in the \(1\ell \) category (left) and the \(2\ell ^\text {sc} \) category (right). The pull is defined as the difference between the observed number of events and the total expected number of events determined by the fit divided by the total uncertainty. The total uncertainty is the sum in quadrature of the statistical error of the observed data and the uncertainty in the predicted background

Fig. 8
figure 8

The observed data event yields and the corresponding estimates for the backgrounds in the different last jet-multiplicity bins defined for the \(1\ell \) (left) and \(2\ell ^\text {sc} \) (right) categories. The background shown is estimated by including all bins in the fit. All uncertainties, which may be correlated across the bins, are included in the error bands (shaded regions). Hypothetical contributions from representative RPV SUSY scenarios are displayed as dashed and dotted lines

The uncertainty in the expected yields of the minor backgrounds includes theoretical uncertainties in the cross-sections and in the modelling of the kinematics by the MC generator, as well as experimental uncertainties related to the modelling of the detector response in the simulation. The uncertainties assigned to cover the theoretical estimate of these backgrounds in the relevant regions are 50% for diboson in the \(1\ell \) category and single top-quark production, and 30% for \(t\bar{t} V/H\) production. An additional uncertainty of 50% is assigned to the contribution from \(t\bar{t}V +b\) and \(t\bar{t}V +c\) events. These uncertainties are conservative estimates based on the impact seen from renormalization and factorization scale variations, PDF variations, and comparisons with samples with an alternative parton-shower model.

Uncertainties in the modelling of \(t\bar{t}t\bar{t}\) are assigned from renormalization and factorization scale variations, as well as from a comparison with simulated samples with an alternative parton-shower model. An uncertainty of 100% is assigned to the cross-section to cover the difference between the predicted and measured cross-sections [34]. Using instead the central value and uncertainty from the ATLAS measurement leads to a 1% decrease in expected sensitivity.

The final uncertainty in the background estimate in the SRs is dominated by the uncertainty in the fitted model parameter values, which stems from the statistical uncertainty of the data events in the different jet slices. Systematic uncertainties do not contribute significantly in the jet counting analysis, and cause only a 1% loss in sensitivity. The leading systematic uncertainty in the EWK analysis is related to the NN invariance assumption and causes a 30% loss in sensitivity, while other systematic uncertainties are subdominant.

The uncertainties assigned to the expected signal yields for the SUSY benchmark processes considered include the experimental uncertainties related to the detector modelling, which are dominated by the modelling of the jet energy scale, and the b-tagging efficiencies and mis-tagging rates. For example, for a signal model with four b-quarks, the b-tagging uncertainties are \(\approx \) 10%, and the jet-related uncertainties are typically \(\approx \) 5%. The uncertainty in the signal cross-sections used is discussed in Sect. 4. The uncertainty in the signal yields related to the modelling of additional jet radiation is studied by varying the factorization, renormalization, and jet-matching scales as well as the parton-shower tune in the simulation. The corresponding uncertainty is small for most of the signal parameter space, but is as large as 30% for very light or very heavy LSPs, where the contribution from additional jet radiation is relevant. The difference between fast and full simulation is evaluated for selected signal points. The jet multiplicity, b-jet multiplicity, and NN output distributions are found to be statistically compatible, so no additional uncertainty is considered due to the usage of fast simulation.

9 Results

Results are provided both as model-independent limits on the contribution from BSM physics to the dedicated SRs, and in the context of the five SUSY benchmark models discussed in Sect. 2. As described in Sect. 7.6, two different fit configurations are used for the two sets of results. In both cases, the profile likelihood-ratio test [138] is used to establish 95% confidence intervals using the \(\hbox {CL}_\text {s}\) prescription [139]. The parameter of interest is the signal strength, defined as the cross-section of the hypothetical contribution from physics beyond the SM in units of the cross-section of the benchmark model.

The b-jet multiplicity distributions are shown in Figs. 5 and 6 for the EWK analysis defined with a 20 GeV jet-\(p_{\mathrm{T}}\) threshold, for the \(1\ell \) and \(2\ell ^\text {sc} \) categories respectively. Figure 7 summarizes the observed pulls in all analysis regions, defined as the difference between the observed number of events and the total expected number of events determined by the fit divided by the total uncertainty. The total uncertainty is the sum in quadrature of the statistical error of the observed data and the uncertainty in the predicted background. The pulls follow a gaussian distribution centered at zero. Figure 8 shows the b-jet multiplicity distribution for the last jet-multiplicity bin defined for each of the jet \(p_{\mathrm{T}}\) thresholds, in both the \(1\ell \) and \(2\ell ^\text {sc} \) categories, which contains the discovery SRs at zero b-jet and high b-jet multiplicity. The likelihood fit is configured using the model-dependent configuration where all bins are input to the fit, and fixing the signal-strength parameter to zero.

Table 6 Data event yields compared with the expected contributions from relevant background sources, in the discovery signal regions defined for the \(1\ell \) category. The \(p_0\)-value, and corresponding significance (Z), as well as the observed and expected 95% CL model-independent upper limits on the product of cross-section, acceptance and efficiency (in ab) are also shown (\(\sigma ^\text {excl}\)). In SRs with a deficit of data compared to the background prediction the \(p_0\)-value is capped at 0.5. The parameters of the background model are determined in a fit to a reduced set of bins, corresponding to the model-independent fit discussed in the text
Table 7 Data event yields compared with the expected contributions from relevant background sources, in the discovery signal regions defined for the \(2\ell ^\text {sc} \) category. The \(p_0\)-value, and corresponding significance (Z), as well as the observed and expected 95% CL model-independent upper limits on the product of cross-section, acceptance and efficiency (in ab) are also shown. In SRs with a deficit of data compared to the background prediction the \(p_0\)-value is capped at 0.5. The parameters of the background model are determined in a fit to a reduced set of bins, corresponding to the model-independent fit discussed in the text

9.1 Model-independent results

The observed data event yields and the corresponding estimates for the backgrounds in the discovery SRs defined for the \(1\ell \) and \(2\ell ^\text {sc} \) categories are shown in Tables 6 and 7. For each SR a fit is performed to predict the background using the model-independent set-up, which excludes the SR under consideration. In addition, the discovery \(p_0\)-values and corresponding gaussian significance (Z) are shown, which measure the compatibility of the observed data with the background-only (zero signal strength) hypothesis relative to fluctuations of the background. Larger values indicate greater relative compatibility. No significant excess of data over the expected event yields is observed in any of the SRs. The two largest excesses are observed in the 60 GeV, \(\ge 11\) jets, \(\ge 3\) b-jets SR defined for the \(1\ell \) category and in the 60 GeV, \(\ge 7\) jets, \(\ge 3\) b-jets SR defined for the \(2\ell ^\text {sc} \) category, and both correspond to a significance of 1.3 standard deviations. Upper limits on the product of cross-section, acceptance, and efficiency are set at 95% CL, ranging from 22 to 200 ab, depending on the SR.

Fig. 9
figure 9

Observed and expected exclusion contours for the RPV models with strong production. The results are shown for a \(\tilde{g} \rightarrow q \bar{q} \tilde{\chi }_1^0 \rightarrow q \bar{q} q \bar{q} \ell /\nu \), b \(\tilde{g} \rightarrow \bar{t} \tilde{t} \rightarrow \bar{t} \bar{b} \bar{s}\), c \({\tilde{g} \rightarrow t \bar{t} \tilde{\chi }_{1,2}^0 \rightarrow t \bar{t} tbs}\) and d stop pair production. The contours of the band around the expected limit are the \(\pm 1 \sigma \) variations, including all uncertainties. The dotted lines around the observed limit illustrate the change in the observed limit as the nominal signal cross-section is scaled up and down by the theoretical uncertainty. All limits are computed at 95% CL. The diagonal line indicates the kinematic limit for the decays in each specified scenario. The limits on direct electroweakino production obtained with the EWK analysis are displayed as horizontal hatched bands in c and d. When relevant, the limit on the stop mass from Refs. [23, 27] is also shown. A small range in stop mass between 460 and 470 GeV is not excluded by the search for \({\tilde{t}} \rightarrow bs\) [27]

Fig. 10
figure 10

Observed and expected exclusion contours for the RPV models with electroweakino production models with a higgsino and b wino LSP hypotheses. The yellow and green contours of the band around the expected limit are the \(\pm 1 \sigma \) and \(\pm 2 \sigma \) variations including all uncertainties, respectively. The theoretical prediction is also shown, with the uncertainties in the prediction shown as a coloured band. The production of is not considered as it decays to a final state with no leptons

9.2 Model-dependent results

For each signal model probed, the fit is configured using the model-dependent configuration. No significant excess is observed in any of the model-dependent fits. Figure 9 shows the observed and expected exclusion limits for the strong production signal models featuring gluino and stop pair production, as a function of the gluino mass or stop mass. Gluino masses up to 2.4 TeV are excluded for high LSP masses, and up to 2 TeV for low LSP masses. Stop masses are excluded up to 1–1.3 TeV, depending on the LSP mass. The sensitivity decreases for low LSP masses due to the high boost of the LSP, resulting in close-by decay products that lead to reconstruction and isolation inefficiencies. The best sensitivity is achieved for a bino LSP, while the wino LSP exhibits the worst sensitivity due to the lower number of top quarks in the final state. The exclusion limits for the models with gluino or stop production are stronger than in the previous version of the analysis documented in Ref. [22], thanks to the larger dataset and the inclusion of the same-sign lepton category.

Figure 10 shows exclusion limits in the electroweakino pair-production model, versus the LSP mass. The limits for pure higgsino and wino LSPs are shown separately, taking into account the processes discussed in Sect. 2. The wino signal can only contribute to the \(1\ell \) category, via production, while the higgsino signal is also present in the \(2\ell ^\text {sc} \) category through production. This leads to differences in the observed limits between the two models. Depending on the LSP hypothesis, LSP masses between 200 (197) GeV and 320 (365) GeV are excluded for a higgsino (wino) LSP.

The analysis is also sensitive to SM \(t\bar{t}t\bar{t}\) production, which produces a final state similar to the targeted signals. An additional model-dependent fit is performed where the \(t\bar{t}t\bar{t}\) normalization is a free parameter. The fitted normalization of the four-top process relative to the Standard Model value is \({\mu _{t\bar{t}t\bar{t}} = 2.0^{+0.9}_{-0.7}}\). Modelling uncertainties due to scale variations and parton-shower variation are taken into account, as is a 20% cross-section uncertainty for the reference SM prediction of \(\sigma _{t\bar{t}t\bar{t}}\) [92]. This is in agreement with the measured value in Ref. [34] of \({\mu _{t\bar{t}t\bar{t}} = 2.0^{+0.8}_{-0.6}}\). Both analyses are based on the same dataset and have overlapping selections, but have completely different background estimation methods. The best sensitivity is obtained with the 40 GeV jet-\(p_{\mathrm{T}}\) threshold. Compatible results, albeit with larger uncertainties, are obtained with the 20 GeV and 60 GeV jet-\(p_{\mathrm{T}}\) thresholds. A fit with two independent signal strengths in each lepton category yields consistent values in both categories.

10 Conclusion

A search for RPV supersymmetry events with at least one isolated lepton (electron or muon) and high jet multiplicity is presented. In order to improve the sensitivity of the search, events with two leptons with the same electric charge, and events with at least one lepton are analysed separately. The selection also relies on the number of b-jets in the event. In order to ensure the highest sensitivity to the electroweak production models, a neural-network-based analysis was introduced. Data-based techniques are used to estimate the dominant backgrounds from \(t{\bar{t}}\)+jets, W/Z+jets, diboson, and \(t\bar{t}W \) production. The analysis is performed with proton–proton collision data at \(\sqrt{s}=13~\hbox {TeV}\) collected from 2015 to 2018 with the ATLAS detector at the LHC, corresponding to an integrated luminosity of 139 fb\(^{-1}\).

With no significant excess over the Standard Model expectation observed, results are interpreted in the framework of simplified models featuring gluino, stop, or electroweakino pair production in RPV supersymmetry scenarios. In a benchmark model with , gluino masses up to 2.38 TeV are excluded at 95% confidence level. Top squarks with masses up to 1.36 TeV are excluded in a model with direct stop production and RPV decays of the LSP. In both models, three hypotheses for the LSP are tested: pure bino, pure wino, and pure higgsino. In a model with \(\tilde{g} \rightarrow {{\bar{t}}\tilde{t}}\) and \(\tilde{t}\rightarrow {{\bar{b}}} {{\bar{s}}}\), gluino masses up to 1.83 TeV are excluded, whereas in a model with , gluino masses up to 2.25 TeV are excluded. Direct electroweak production of electroweakinos is also tested, and higgsino (wino) masses between 200 (197) GeV and 320 (365) GeV are excluded.

These results improve upon the previously existing LHC limits for the gluino and stop production models considered, owing to the larger luminosity, the dedicated categorization and analysis of events with two leptons with the same electric charge, and the introduction of multivariate discriminants. The results for the electroweak production model also improve upon the limits on hadronic RPV decays of electroweakinos from LEP [140,141,142]. Model-independent limits are also set on the contribution of new phenomena to the signal-region yields.