1 Introduction

The standard model (SM) of particle physics describes subatomic phenomena with outstanding precision. However, the SM cannot address several open questions such as the hierarchy problem [1, 2] and the absence of a suitable particle candidate for dark matter (DM) [3, 4]. Supersymmetry (SUSY) [5,6,7,8,9,10,11,12] is a well-known extension of the SM that can resolve both of these problems by introducing a relation between bosons and fermions. For each known particle, it assigns a new SUSY partner that differs by a half unit of spin. SUSY provides a natural solution to the gauge hierarchy problem provided that the SUSY partners of the top quark, gluon, and Higgs boson are not too massive. While difficult to quantify precisely, values of the top squark mass up to the 1\(\,\text {Te}\text {V}\) range are favored [1, 13,14,15]. The lightest SUSY particle (LSP), which is potentially massive, may be a viable DM candidate if it is stable and electrically neutral.

This paper presents the combination of previously published searches [16,17,18] for the pair production of SUSY top quark partners in final states with zero, one, or two charged leptons, in events from proton–proton (\({\text {p}}{\text {p}}\)) collisions at a center-of-mass energy (\(\sqrt{s}\)) of 13\(\,\text {Te}\text {V}\) at the CERN LHC, corresponding to an integrated luminosity of 137\(\,\text {fb}^{-1}\); these searches are referred to here as the inclusive analyses. The paper also includes a new analysis targeting the parameter space where the mass difference between the top squark and the neutralino is close to the top quark mass; its results are combined with those of the previously published studies. All analyses are performed with the data set collected in 2016–2018 (Run 2) by CMS.

The inclusive searches are interpreted in terms of top squark pair production with two different subsequent decays, as described in the simplified model context [19,20,21]. Two decay chains are considered, both of which lead to a signature with a neutralino (\(\tilde{\upchi }^{0}_1\)), which is the LSP, a \({\text {W }}\)boson, and a bottom quark. These are the direct decay of the top squark to a top quark and a neutralino, and the decay of the top squark to a chargino (\(\tilde{\upchi }^{\pm }_1\)) and a bottom quark, where the chargino decays to a \({\text {W }}\)boson and a neutralino. Three simplified models are used for interpretation. In the first model, both top squarks decay according to the first decay chain; in the second model, both decay according to the second decay chain; in the third model, these two decays occur with equal probability. The mass of the chargino in the second model is chosen to be the arithmetic average of the top squark mass (\(m_{\tilde{\text {t}}}\)) and the LSP mass (\({m}_{\tilde{\upchi }^{0}_1}\)), while in the third model the mass splitting between the neutralino and chargino is assumed to be 5\(\,\text {Ge}\text {V}\). Typical diagrams are shown in Fig. 1. In previous analyses by the CMS Collaboration, top squark masses up to 1310\(\,\text {Ge}\text {V}\) have been excluded [16,17,18, 22,23,24,25,26,27,28,29]. Limits on the production of top squark pairs with masses up to 1260\(\,\text {Ge}\text {V}\) have been set by the ATLAS Collaboration [30,31,32,33,34,35].

Fig. 1
figure 1

Diagrams of top squark pair production with further decay of each top squark into a top quark and a neutralino (left), of each top squark into a bottom quark and a chargino, with the chargino then decaying into a \({\text {W }}\)boson and a neutralino (center), and with a combination of the two top squark decay scenarios (right)

If the mass difference between the top squark and the lightest neutralino in the model is close to the mass of the top quark (\(m_\text {t}\)), the kinematic distributions of the final states of the SUSY signal are very similar to those of SM top quark pair (\({\mathrm{t}\overline{{ \mathrm t}}}\)) production processes. Therefore, this is a difficult region in which to search for a signal. In this case, the signal acceptance strongly depends on \(m_{\tilde{\text {t}}}\) and \({m}_{\tilde{\upchi }^{0}_1}\). The boundaries of the corridor are taken to be \(\varDelta m_\text {cor} = 30\,\text {Ge}\text {V} \) and \(|m_{\tilde{\text {t}}} - {m}_{\tilde{\upchi }^{0}_1} - 175\,\text {Ge}\text {V} | < \varDelta m_\text {cor} \), where 175\(\,\text {Ge}\text {V}\) is the reference value of the top quark mass. The top quark corridor was not included in the parameter space addressed by the previous inclusive searches by the CMS Collaboration [16,17,18, 22,23,24,25,26,27,28,29].

In the top quark corridor region, the signal could be observed as an excess over the \({\mathrm{t}\overline{{ \mathrm t}}}\) background prediction [36], but the sensitivity to the top squark mass is limited. A dedicated search performed with the data set collected in 2016 by CMS [37] excluded top squark production at 95% confidence level in parts of the corridor, for \(\varDelta m_\text {cor} \) values between 0 and 7.5\(\,\text {Ge}\text {V}\). An analysis of the top quark corridor by the ATLAS Collaboration has set exclusion limits for top squark masses between 170 and 230\(\,\text {Ge}\text {V}\) [38].

This paper presents a new dedicated search in events with an opposite-charge lepton pair that is sensitive to the top quark corridor region. The sensitivity in the top quark corridor is extended by using a larger data set and a more sophisticated strategy that employs a deep neural network (DNN) [39] to exploit the differences between the signal and SM \({\mathrm{t}\overline{{ \mathrm t}}}\) production, which is the main background.

In order to reduce the background from \({\mathrm{t}\overline{{ \mathrm t}}}\) events, the missing transverse momentum (\({\vec p}_{\mathrm {T}}^{\text {miss}}\)) is used together with the so-called “stransverse” mass of the leptons (\(m_{\mathrm {T2}} (\ell \ell )\)) [40], defined as

$$\begin{aligned} m_{\mathrm {T2}} (\ell \ell )= & {} \min _{\vec {p}_{\text {T1}}^{\text {miss}} + \vec {p}_{\text {T2}}^{\text {miss}} = {\vec p}_{\mathrm {T}}^{\text {miss}}} \\&\left( \max \left[ m_{\mathrm {T}} (p_{\mathrm {T}} ^{{\ell }1},\vec {p}_{\text {T1}}^{\text {miss}}) , m_{\mathrm {T}} (p_{\mathrm {T}} ^{{\ell }2},\vec {p}_{\text {T2}}^{\text {miss}}) \right] \right) , \end{aligned}$$

where \(\ell \) refers to an electron or a muon, \(m_{\mathrm {T}}\) is the transverse mass, and \(\vec {p}_\text {T1}^\text {miss}\) and \(\vec {p}_\text {T2}^\text {miss}\) correspond to the estimated transverse momenta of the two invisible particles (neutrinos in the case of \({\mathrm{t}\overline{{ \mathrm t}}}\) events) that are presumed to determine the total \({\vec p}_{\mathrm {T}}^{\text {miss}}\) of an SM event. The transverse mass is calculated for each lepton–neutrino pair, for different assumptions of the neutrino transverse momentum (\(\vec {p}_{\text {T}i}^{\text {miss}}\)). The computation of \(m_{\mathrm {T2}} (\ell \ell ) \) is done using the algorithm discussed in Ref. [41]. A signal region is defined applying requirements on \(m_{\mathrm {T2}} (\ell \ell )\) and on \(p_{\mathrm {T}} ^\text {miss}\), the magnitude of \({\vec p}_{\mathrm {T}}^{\text {miss}}\). A DNN is used to optimize the sensitivity for signal at each mass point.
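The definition above can be made concrete with a short numerical sketch. The function below evaluates \(m_{\mathrm {T2}} (\ell \ell )\) by a brute-force grid scan over splittings of the missing transverse momentum, in the massless approximation for leptons and invisible particles; it is illustrative only, since the actual computation uses the dedicated algorithm of Ref. [41]:

```python
import math

def m_t(lep, inv):
    # Transverse mass of a lepton + invisible-momentum pair (massless approx.):
    # mT^2 = 2 (pT^l pT^inv - px^l px^inv - py^l py^inv)
    arg = 2.0 * (math.hypot(*lep) * math.hypot(*inv)
                 - lep[0] * inv[0] - lep[1] * inv[1])
    return math.sqrt(max(0.0, arg))

def m_t2(lep1, lep2, met, n=120, span=300.0):
    # Minimize max(mT1, mT2) over all splittings p1 + p2 = met of the
    # missing pT, scanning a square grid of trial momenta p1 (in GeV).
    best = float("inf")
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            p1 = (i * span / n, j * span / n)
            p2 = (met[0] - p1[0], met[1] - p1[1])
            best = min(best, max(m_t(lep1, p1), m_t(lep2, p2)))
    return best
```

For dileptonic \({\mathrm{t}\overline{{ \mathrm t}}}\) events this quantity has its kinematic endpoint at the \({\text {W }}\)boson mass, which is what makes it useful for background suppression.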

We also consider the alternative model \({\mathrm{t}\overline{{ \mathrm t}}} +\text {DM}\) shown in Fig. 2, in which a DM particle is produced in association with a pair of top quarks. In this simplified model, a scalar (\(\phi \)) or pseudoscalar (a) particle mediates the interaction between SM quarks and a new Dirac fermion (\(\chi \)), which is the DM candidate particle [42,43,44,45,46]. Under the assumption of minimal flavor violation [47, 48] the spin-0 mediators couple to quarks having mass \(m_\text {q}\) with SM-like Yukawa couplings proportional to \(g_\text {q}m_\text {q}\), where the coupling strength \(g_\text {q}\) is taken to be 1. The coupling strength \(g_{\text {DM}}\) of the mediator to the DM particles is also set to 1. In the case of a scalar mediator, mixing with the SM Higgs boson is neglected. Prior searches by the ATLAS and CMS Collaborations excluded scalar and pseudoscalar mediator particles with a mass of up to 290 and 300\(\,\text {Ge}\text {V}\), respectively [30, 49,50,51,52].

Fig. 2
figure 2

Feynman diagram of direct DM production through a scalar (\(\phi \)) or pseudoscalar (a) mediator particle, in association with a top quark pair

2 The CMS detector

The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionization chambers embedded in the steel flux-return yoke outside the solenoid.

Events of interest are selected using a two-tiered trigger system. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed latency of about 4 \(\upmu \)s [53]. The second level, known as the high-level trigger, consists of a farm of processors running a version of the full event reconstruction software optimized for fast processing, and reduces the event rate to around 1 kHz before data storage [54].

A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. [55].

3 Monte Carlo simulation

Monte Carlo (MC) simulation is used to design the searches, predict or aid the prediction of the background events from SM processes, and to provide estimations of the expected SUSY and \({\mathrm{t}\overline{{ \mathrm t}}} +\text {DM}\) signal event yields.

Several models from the simplified model spectra [19, 21] are used to simulate the SUSY signals. The helicity states of the produced top quarks are not considered in these models, and in the simulation the top quarks are treated as unpolarized. The generation of signal samples is performed using the MadGraph 5_amc@nlo  generator (MadGraph) [56, 57] (version 2.2.2 for 2016 and version 2.4.2 for 2017 and 2018) at leading order (LO) in quantum chromodynamics (QCD) with up to two additional partons from initial-state radiation (ISR). To improve on the MadGraph modeling of the multiplicity of additional jets from ISR, MadGraph signal events are reweighted based on the number of ISR jets (\(N_\mathrm {J}^\mathrm {ISR}\)). These weights are obtained using a \({\mathrm{t}\overline{{ \mathrm t}}}\) MadGraph MC sample, so as to make the \({\mathrm{t}\overline{{ \mathrm t}}}\) jet multiplicity agree with data. The reweighting factors vary between 0.92 and 0.51 for \(N_\mathrm {J}^\mathrm {ISR}\) between 1 and 6, respectively.
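A minimal sketch of how such per-event ISR reweighting can be applied is given below. Only the endpoint values quoted in the text (0.92 at \(N_\mathrm {J}^\mathrm {ISR} = 1\) and 0.51 at \(N_\mathrm {J}^\mathrm {ISR} = 6\)) come from the source; the intermediate weights are illustrative placeholders, not the measured CMS factors:

```python
# ISR-jet-multiplicity weights: the values at N=1 and N=6 are from the text;
# the intermediate entries are hypothetical placeholders for illustration.
ISR_WEIGHTS = {0: 1.0, 1: 0.92, 2: 0.85, 3: 0.78, 4: 0.70, 5: 0.60, 6: 0.51}

def isr_weight(n_isr_jets):
    # Events with more than 6 ISR jets keep the last tabulated weight
    return ISR_WEIGHTS[min(n_isr_jets, 6)]

def reweight(events):
    # events: list of dicts with "n_isr" and "weight" fields (hypothetical schema);
    # returns copies with the ISR weight folded into the event weight
    return [dict(ev, weight=ev["weight"] * isr_weight(ev["n_isr"]))
            for ev in events]
```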

Signal samples of the \({\mathrm{t}\overline{{ \mathrm t}}} +\text {DM}\) model [58] are generated using MadGraph  v2.4.2 at LO with at most one additional parton in the matrix element calculations. Samples for mediator masses of 50, 100, 200, 300, and 500\(\,\text {Ge}\text {V}\) have been generated for both the scalar and pseudoscalar models. The mass of the DM particle is set to 1\(\,\text {Ge}\text {V}\) while a value of 1 is chosen for the couplings.

The SM \({\mathrm{t}\overline{{ \mathrm t}}}\) process is simulated using the powheg (v2) [59,60,61] generator at next-to-leading order (NLO) for the dilepton analyses or the MadGraph  generator at LO for the analyses of zero or one lepton events. In the top quark corridor analysis the powheg generator is used, as this analysis relies on a precise estimate of the \({\mathrm{t}\overline{{ \mathrm t}}}\) background and its associated modeling uncertainties, which are better described in CMS by the powheg generator [36, 62]. This sample is also used to calculate the dependence of the \({\mathrm{t}\overline{{ \mathrm t}}}\) acceptance on \(m_\text {t}\) and on the factorization and renormalization scales (\(\mu _\mathrm {F}\) and \(\mu _\mathrm {R}\), respectively). A parameter denoted as \(h_\mathrm {damp}\) is used to regulate the matching of the matrix elements to the parton shower [63, 64]. The central value and uncertainties in \(h_{\mathrm {damp}}\) are discussed in Sect. 6.4.2.

The powheg v1 [65] generator is used for single top quark and antiquark production in association with a \({\text {W }}\) boson (\(\text {t}{\text {W }}\)) at NLO. The MadGraph  v2.2.2 [56] generator is used at NLO for modeling the Drell–Yan (DY) process, the production of \({\text {W }}\) or \({\text {Z }}\) bosons in association with \({\mathrm{t}\overline{{ \mathrm t}}}\) events (\({\mathrm{t}\overline{{ \mathrm t}}} {\text {W }}\), \({\mathrm{t}\overline{{ \mathrm t}}} {\text {Z }}\)), and the \({\text {W }}{\text {W }}\), \({\text {W }}{\text {Z }}\), and \({\text {Z }}{\text {Z }}\) production processes. The DY process is simulated with up to two additional partons [66], and the FxFx scheme is used for the matching of jets from the matrix element calculations and from parton showers. Samples of \({\text {W }}\)+jets and \({\text {Z }}\)+jets events (with \({\text {Z }}\rightarrow \nu \bar{\nu }\)), as well as QCD multijet production, are simulated with up to four extra partons in the matrix element calculations using the MadGraph  (v2.2.2 in 2016 and v2.4.2 in 2017 and 2018) event generator at LO. Double counting of the partons generated with MadGraph and via the parton shower is removed using the MLM [57] matching scheme.

The NNPDF 3.0 [67] parton distribution function (PDF) set is used for generating the samples corresponding to the 2016 period, while the NNPDF 3.1 NNLO [68] PDF is used for the 2017 and 2018 samples. Parton showering and hadronization are handled by pythia  v8.226 (8.230) [69, 70] using the underlying event tune CUETP8M2T4 [63] for SM \({\mathrm{t}\overline{{ \mathrm t}}}\)  events for the 2016 (2017, 2018) period, the CUETP8M1 [71] tune for all other background and signal events in the 2016 period, and the CP5 tune [64] for all background and signal events of the 2017 and 2018 periods. The nominal top quark mass is 172.5\(\,\text {Ge}\text {V}\) in all the samples.

The Geant4 package [72] is used to simulate the CMS detector for samples of the SM processes, the \({\mathrm{t}\overline{{ \mathrm t}}} +\text {DM}\) signal processes, and the SUSY signal samples in which the mass difference between the top squark and the LSP is close to the top quark mass. The CMS fast simulation program [73, 74] is used to simulate the detector response for the remaining signal samples. The effect of additional interactions in the same event (referred to as pileup) is accounted for by simulating additional minimum bias interactions for each hard scattering event. The observed distribution of the number of pileup interactions, with an average of 23 collisions per bunch crossing for the 2016 period and 32 for the 2017 and 2018 periods, is reproduced by the simulation.
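Matching the simulated pileup distribution to the observed one is typically implemented as a per-event weight given by the ratio of the normalized data and MC pileup profiles. A minimal sketch (the histogram contents below are hypothetical):

```python
def pileup_weights(data_hist, mc_hist):
    # Per-bin weight = (normalized data occupancy) / (normalized MC occupancy);
    # bins with no MC events get weight 0 to avoid division by zero.
    n_data, n_mc = float(sum(data_hist)), float(sum(mc_hist))
    return [(d / n_data) / (m / n_mc) if m > 0 else 0.0
            for d, m in zip(data_hist, mc_hist)]

def event_weight(n_pileup, weights):
    # Look up the weight for the number of pileup interactions in this event
    return weights[n_pileup] if n_pileup < len(weights) else 0.0
```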

Simulated background events are normalized according to the integrated luminosity and the theoretical cross section of each process. The latter are computed at next-to-next-to-leading order (NNLO) in QCD for DY [75], approximately NNLO for \(\text {t}{\text {W }}\)  [76], and NLO for \({\text {W }}{\text {W }}\), \({\text {W }}{\text {Z }}\), \({\text {Z }}{\text {Z }}\) [77], \({\mathrm{t}\overline{{ \mathrm t}}} {\text {W }}\) and \({\mathrm{t}\overline{{ \mathrm t}}} {\text {Z }}\) [78]. For the normalization of the simulated \({\mathrm{t}\overline{{ \mathrm t}}}\) sample, the full NNLO plus next-to-next-to-leading logarithmic (NNLL) accurate calculation [79], performed with the Top++ 2.0 program [80], is used. The PDF uncertainties are added in quadrature to the uncertainty associated with the strong coupling constant (\(\alpha _\mathrm {S}\)) to obtain a \({\mathrm{t}\overline{{ \mathrm t}}}\) production cross section of \(832~^{+20}_{-29}\,\text {(scale)}\pm 35\,\)(PDF+\(\alpha _\mathrm {S}\))pb assuming \(m_\text {t} = 172.5\,\text {Ge}\text {V} \).
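The normalization described above amounts to a per-event weight \(w = \sigma \mathcal {L}/N_\text {gen}\). A sketch, using the \({\mathrm{t}\overline{{ \mathrm t}}}\) cross section quoted in the text and a hypothetical generated-event count:

```python
def sample_weight(xsec_pb, lumi_fb_inv, n_generated):
    # sigma * L / N_gen, converting the luminosity from fb^-1 to pb^-1
    return xsec_pb * (lumi_fb_inv * 1000.0) / n_generated

# ttbar at 832 pb normalized to 137 fb^-1; N_gen is a hypothetical sample size
w_ttbar = sample_weight(832.0, 137.0, 1.0e8)
```

Summing this weight over all generated events recovers \(\sigma \mathcal {L}\), the expected event yield before any selection.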

The SUSY signal events are normalized to cross sections calculated at approximate NNLO+NNLL accuracy [81,82,83,84,85,86,87,88,89,90] obtained from the simplified model for the direct pair production of top squarks. The cross sections of the \({\mathrm{t}\overline{{ \mathrm t}}} +\text {DM}\) model are calculated at LO using MadGraph  v2.4.2.

4 Event reconstruction

In this section, the event reconstruction common to all the analyses presented in this paper is described.

An event may contain multiple primary vertices, corresponding to multiple \({\text {p}}{\text {p}}\) collisions occurring in the same bunch crossing. The candidate vertex with the largest value of summed physics-object \(p_{\mathrm {T}} ^2\) is taken to be the primary \({\text {p}}{\text {p}}\) interaction vertex. The physics objects for this determination are the jets, clustered using the anti-\(k_{\mathrm {T}}\) jet finding algorithm [91, 92] with tracks assigned to candidate vertices as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the transverse momenta of those jets.
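The vertex choice reduces to an argmax over candidate vertices of the summed squared transverse momenta of their associated physics objects; a minimal sketch with hypothetical inputs:

```python
def primary_vertex_index(vertices):
    # vertices: list of candidate vertices, each given as the list of
    # physics-object pT values (GeV) assigned to it; return the index of
    # the vertex with the largest sum of pT^2.
    return max(range(len(vertices)),
               key=lambda i: sum(pt * pt for pt in vertices[i]))
```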

The particle-flow algorithm [93] aims to reconstruct and identify each individual particle in an event, with an optimized combination of information from the various elements of the CMS detector. The energy of photons is obtained from the ECAL measurement. The energy of electrons is determined from a combination of the electron momentum at the primary interaction vertex as determined by the tracker, the energy of the corresponding ECAL cluster, and the energy sum of all bremsstrahlung photons spatially compatible with originating from the electron track. The energy of muons is obtained from the curvature of the corresponding track. The energy of charged hadrons is determined from a combination of their momentum measured in the tracker and the matching ECAL and HCAL energy deposits, corrected for the response function of the calorimeters to hadronic showers. Finally, the energy of neutral hadrons is obtained from the corresponding corrected ECAL and HCAL energies.

For each event, hadronic jets are clustered from these reconstructed particles using the infrared and collinear safe anti-\(k_{\mathrm {T}}\) algorithm [91, 92] with a distance parameter of 0.4. The jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found from simulation to be, on average, within 5–10% of the generated momentum over the whole \(p_{\mathrm {T}}\) spectrum and detector acceptance.
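The clustering can be illustrated with a toy sequential-recombination implementation of the anti-\(k_{\mathrm {T}}\) distance measures \(d_{ij} = \min (p_{\mathrm {T}i}^{-2}, p_{\mathrm {T}j}^{-2})\,\varDelta R_{ij}^2/R^2\) and \(d_{i\mathrm {B}} = p_{\mathrm {T}i}^{-2}\). Real analyses use the FastJet package with full four-momentum recombination; the \(p_{\mathrm {T}}\)-weighted averaging below is a simplification for illustration:

```python
import math

def anti_kt(particles, R=0.4):
    # particles: list of (pt, eta, phi); returns jets as (pt, eta, phi) tuples.
    # Toy recombination: scalar pt sum, pt-weighted eta/phi (no phi wrapping).
    objs = [list(p) for p in particles]
    jets = []
    while objs:
        # smallest beam distance d_iB = 1/pt^2 (hardest object clusters first)
        i_b = min(range(len(objs)), key=lambda i: objs[i][0] ** -2)
        d_min, pair = objs[i_b][0] ** -2, None
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                dphi = math.pi - abs(abs(objs[i][2] - objs[j][2]) - math.pi)
                dr2 = (objs[i][1] - objs[j][1]) ** 2 + dphi ** 2
                d_ij = min(objs[i][0] ** -2, objs[j][0] ** -2) * dr2 / R ** 2
                if d_ij < d_min:
                    d_min, pair = d_ij, (i, j)
        if pair is None:
            jets.append(tuple(objs.pop(i_b)))   # promote to a final jet
        else:
            i, j = pair
            a, b = objs[i], objs.pop(j)         # i < j, so index i stays valid
            pt = a[0] + b[0]
            objs[i] = [pt, (a[0] * a[1] + b[0] * b[1]) / pt,
                           (a[0] * a[2] + b[0] * b[2]) / pt]
    return jets
```

Because the distance weights hard particles with \(p_{\mathrm {T}}^{-2}\), soft radiation clusters around the hard cores, giving the characteristic cone-like anti-\(k_{\mathrm {T}}\) jets.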

Additional \({\text {p}}{\text {p}}\) interactions within the same or nearby bunch crossings can contribute additional tracks and calorimetric energy depositions to the jet momentum. To mitigate this effect, charged particles identified as originating from pileup vertices are discarded, and an offset correction is applied for the contribution from neutral particles [94]. Jet energy corrections are derived from simulation to bring the measured energy of a jet, on average, to that of a particle-level jet. In situ measurements of the momentum balance in dijet, photon+jets, \({\text {Z }}\)+jets, and multijet events are used to account for any residual differences in jet energy scale between data and simulation [95]. The jet energy resolution amounts typically to 15% at 10\(\,\text {Ge}\text {V}\), 8% at 100\(\,\text {Ge}\text {V}\), and 4% at 1\(\,\text {Te}\text {V}\)  [95]. Additional selection criteria are applied to each jet to remove jets potentially dominated by anomalous contributions from various subdetector components or reconstruction failures [96]. Jets produced by the hadronization of b quarks are identified using multivariate b tagging algorithms: DeepCSV [97] for the inclusive searches and DeepJet [98, 99] for the corridor search. The more recently developed DeepJet algorithm has slightly better performance in some parts of the phase space than the DeepCSV algorithm. All analyses use a medium working point for the tagger, corresponding to a misidentification probability of 1% for jets originating from gluons or up, down, and strange quarks, and a b tagging efficiency of about 70%. A tight working point, corresponding to a misidentification rate of 0.1%, is also used in the analysis of Sect. 5.2.

The missing transverse momentum vector is defined as the negative vector \(p_{\mathrm {T}}\) sum of all particle-flow candidates reconstructed in an event with jet energy corrections applied. Events with serious \(p_{\mathrm {T}} ^\text {miss}\) reconstruction failures are rejected using dedicated filters [100].
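In terms of reconstructed candidates, the missing transverse momentum is simply the negative vector sum of their transverse momenta; a minimal sketch using (\(p_{\mathrm {T}}\), \(\phi \)) pairs:

```python
import math

def met(candidates):
    # candidates: list of (pt, phi) for all particle-flow candidates;
    # return (magnitude, phi) of the negative vector pT sum.
    px = -sum(pt * math.cos(phi) for pt, phi in candidates)
    py = -sum(pt * math.sin(phi) for pt, phi in candidates)
    return math.hypot(px, py), math.atan2(py, px)
```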

The requirements imposed to select reconstructed particle objects specific to the separate search strategies incorporated into the present combination are given in the following sections. Brief summaries of the previously published searches are given in Sect. 5, and the new top quark corridor search is presented in detail in Sect. 6.

5 Inclusive top squark searches

Three analyses targeting final states without leptons [16], with one [17], or with two charged leptons [18] have been previously published. The main features are briefly discussed in this section.

5.1 Fully hadronic analysis

The search in the fully hadronic final state [16] targets events with hadronic jets and large reconstructed \(p_{\mathrm {T}} ^\text {miss}\). The SM backgrounds with intrinsic \(p_{\mathrm {T}} ^\text {miss}\) generated through the leptonic decay of a \({\text {W }}\) boson, where the neutrino escapes detection, are significantly suppressed by rejecting events containing isolated electrons or muons. The contribution from events in which a \({\text {W }}\) boson decays to a \(\tau \) lepton is suppressed by rejecting events containing isolated hadronically decaying \(\tau \) candidates [101, 102]. A central feature of this analysis is the deployment of advanced jet tagging algorithms to identify hadronically decaying top quarks and \({\text {W }}\) bosons, with different algorithms covering both the highly Lorentz-boosted regime and the resolved regime. For the highly Lorentz-boosted regime, where the decay products of the particle in question are expected to merge into a single large-R jet with a distance parameter of \(R = 0.8\), the DeepAK8 algorithm [103] is used to identify these large-R jets originating from top quarks or \({\text {W }}\) bosons. In the resolved regime, where the decay products of the top quark are separately reconstructed using jets with \(R = 0.4\), the DeepResolved algorithm [17] is used to tag these top quarks with intermediate \(p_{\mathrm {T}}\), ranging from 150 to 450\(\,\text {Ge}\text {V}\).

To enhance sensitivity to signal models with a compressed mass spectrum where the mass of the top squark is close to the sum of the masses of the LSP and the \({\text {W }}\)boson, a dedicated “soft b tag” algorithm developed to identify b hadrons with very low \(p_{\mathrm {T}}\) is also used for the event categorization [104]. The analysis includes a total of 183 nonoverlapping signal regions, defined in Ref. [16] and optimized for different SUSY models and ranges of mass splittings between SUSY particles. A large \(p_{\mathrm {T}} ^\text {miss}\), due to the presence of a pair of neutralinos in the signal model, is required.

The dominant sources of SM background with intrinsic \(p_{\mathrm {T}} ^\text {miss}\) are the inclusive production of top quark pairs, \({\text {W }}\) and \({\text {Z }}\) bosons, single top quark production, and the \({\mathrm{t}\overline{{ \mathrm t}}} {\text {Z }}\) process. The contribution from \({\mathrm{t}\overline{{ \mathrm t}}}\), \({\text {W }}\)+jets, \({\mathrm{t}\overline{{ \mathrm t}}} {\text {W }}\), and single top quark processes arises from events in which a \({\text {W }}\) boson decays leptonically to produce \(p_{\mathrm {T}} ^\text {miss}\) associated with an energetic neutrino, but the charged lepton either falls outside of the kinematic acceptance or fails the lepton identification criteria. This background is collectively referred to as the “lost-lepton” background. The contributions from \({\text {Z }}\)+jets and \({\mathrm{t}\overline{{ \mathrm t}}} {\text {Z }}\) events arise when the \({\text {Z }}\) boson decays to neutrinos, resulting in large genuine \(p_{\mathrm {T}} ^\text {miss}\). Contributions from the QCD multijet process enter the search sample in cases where severe mismeasurements of jet momenta (e.g., for jets passing through poorly instrumented regions of the detector) produce significant artificial \(p_{\mathrm {T}} ^\text {miss}\), or when neutrinos arise from leptonic decays of heavy-flavor hadrons produced during jet fragmentation. The contribution of each SM background process to the search sample is estimated through measurements of event rates in dedicated background control samples that are translated to predicted event counts in the corresponding search sample with the aid of simulation. The data are found to be in good agreement with the predicted backgrounds.

5.2 Single-lepton analysis

The search for top squark pair production in the single-lepton final state [17] focuses on final states with exactly one lepton, coming from the decay of a \({\text {W }}\) boson produced in either of the two decay chains described in Sect. 1. Since the pair of neutralinos in the final state of the signal gives rise to substantial \(p_{\mathrm {T}} ^\text {miss}\) compared with SM processes, \(p_{\mathrm {T}} ^\text {miss} >250\,\text {Ge}\text {V} \) is required. The transverse mass (\(m_{\mathrm {T}}\)) computed from the lepton \({\vec p}_{\mathrm {T}}\) and \({\vec p}_{\mathrm {T}}^{\text {miss}}\) is required to be larger than \(150\,\text {Ge}\text {V} \) to reduce the lepton+jets background from \({\mathrm{t}\overline{{ \mathrm t}}} \) and \({\text {W }}\)+jets processes, for which \(m_{\mathrm {T}}\) has a natural cutoff at the \({\text {W }}\)boson mass (\(m_{{\text {W }}}\)).
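In the massless approximation the transverse mass used here is \(m_{\mathrm {T}} = \sqrt{\smash [b]{2 p_{\mathrm {T}} ^{\ell } p_{\mathrm {T}} ^\text {miss} (1 - \cos \varDelta \phi )}}\), which for on-shell \({\text {W }}\rightarrow \ell \nu \) decays is bounded from above by \(m_{{\text {W }}}\). A minimal sketch:

```python
import math

def transverse_mass(pt_lep, phi_lep, pt_miss, phi_miss):
    # mT^2 = 2 pT^l pT^miss (1 - cos dphi); maximal for back-to-back topologies
    dphi = phi_lep - phi_miss
    return math.sqrt(max(0.0, 2.0 * pt_lep * pt_miss * (1.0 - math.cos(dphi))))
```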

The dileptonic \({\mathrm{t}\overline{{ \mathrm t}}}\) process, where one of the leptons is lost, is the largest remaining SM background. In these lost-lepton events \(m_{\mathrm {T}}\) is not bound by \(m_{{\text {W }}}\) because of the additional \(p_{\mathrm {T}} ^\text {miss}\) arising from the presence of a second neutrino. The modified topness (\(t_{\text {mod}} \)) variable, introduced in Ref. [17], is a measure of the likelihood that a single-lepton event originates from dileptonic \({\mathrm{t}\overline{{ \mathrm t}}}\) production and is used to improve the discrimination against this background.

The dileptonic \({\mathrm{t}\overline{{ \mathrm t}}}\) background is estimated through a set of dedicated control regions that require two isolated leptons. The second lepton is added to \(p_{\mathrm {T}} ^\text {miss}\) in the calculation of variables that depend on \(p_{\mathrm {T}} ^\text {miss}\), e.g. \(m_{\mathrm {T}}\) and \(t_{\text {mod}}\), to mimic the lost-lepton scenario.

The subleading SM background comes from \({\text {W }}\)+jets production, where the \({\text {W }}\) boson decays leptonically. While the single-lepton events from the \({\text {W }}\)boson are largely suppressed by the \(m_{\mathrm {T}}\) requirement, events where the \({\text {W }}\) boson is produced off-shell can still enter the signal regions. The requirement of at least one b-tagged jet significantly reduces this type of background. Events are further categorized in terms of the invariant mass of the lepton and the b-tagged jet, which helps to further reduce the \({\text {W }}\)+jets background. The \({\text {W }}\)+jets background is estimated using control regions with an inverted b-tagged jet requirement, which yield a high-purity sample of \({\text {W }}\)+jets events.

Irreducible SM backgrounds arise from the \({\mathrm{t}\overline{{ \mathrm t}}} {\text {Z }}\) and \({\text {W }}{\text {Z }}\) processes when the \({\text {Z }}\) boson decays into a pair of neutrinos. These rare backgrounds and the remaining events from the single-lepton \({\mathrm{t}\overline{{ \mathrm t}}}\) process are subdominant contributions in most search regions and are estimated using simulated samples.

This analysis also makes use of the same jet tagging algorithms, described above for the fully hadronic channel, to identify hadronic top quark decays in the final state. This is motivated by the fact that none of the leading SM backgrounds, except \({\mathrm{t}\overline{{ \mathrm t}}} {\text {Z }}\) production, has a hadronically decaying top quark in the final state, while in some signal scenarios one hadronically decaying top quark is expected. Events in the lower \(p_{\mathrm {T}} ^\text {miss}\) search regions are categorized according to the presence of at least one merged or resolved top quark candidate.

Finally, a dedicated search strategy is used for signal scenarios with a small mass splitting between the top squark and the LSP, to optimize sensitivity. In these compressed scenarios, where the mass splitting is close to \(m_{{\text {W }}}\) or smaller, \(p_{\mathrm {T}} ^\text {miss}\) can be small when the neutralinos are back-to-back, and therefore \(t_{\text {mod}}\) and the merged and resolved top quark tags are not used. Instead, one non-b-tagged jet, which could arise from ISR for a signal event, is required, and a requirement on the proximity of the lepton to the \({\vec p}_{\mathrm {T}}^{\text {miss}}\) direction is introduced. In this case, at least one “soft b tag”, such as a secondary vertex, is required instead of a standard b-tagged jet, to improve the acceptance for b quarks that do not carry sufficient momentum to be reconstructed as a jet. In order to enhance the sensitivity to different signal scenarios, events are categorized into 39 nonoverlapping signal regions based on the values of \(p_{\mathrm {T}} ^\text {miss}\) and several of the variables introduced above.

5.3 Dilepton analysis

The search in the dilepton final state [18] is carried out using events containing a pair of leptons (electrons or muons) with opposite charges. The invariant mass of the lepton pair (\(m_{\ell \ell } \)) is required to be greater than 20\(\,\text {Ge}\text {V}\) to suppress backgrounds with misidentified or nonprompt leptons from the hadronization of heavy-flavor jets in multijet events. Events with additional leptons, including candidates with looser requirements on transverse momentum and isolation, are rejected. Events with a same-flavor lepton pair that is consistent with SM DY production are removed by requiring \(|m_{{\text {Z }}} - m_{\ell \ell } | > 15\,\text {Ge}\text {V} \), where \(m_{{\text {Z }}}\) is the mass of the \({\text {Z }}\)boson. To further suppress DY and other vector boson backgrounds, the number of jets is required to be at least two and, among them, the number of b-tagged jets at least one.
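The baseline requirements listed above can be summarized as a simple event filter. The sketch below uses hypothetical event fields and \(m_{{\text {Z }}} = 91.19\,\text {Ge}\text {V} \):

```python
M_Z = 91.19  # Z boson mass in GeV

def passes_baseline(ev):
    # ev: dict with lepton charges q1/q2, dilepton mass m_ll (GeV), a
    # same_flavor flag, and jet / b-tag multiplicities (hypothetical schema)
    if ev["q1"] * ev["q2"] >= 0:          # opposite-charge lepton pair
        return False
    if ev["m_ll"] <= 20.0:                # suppress misid./nonprompt leptons
        return False
    if ev["same_flavor"] and abs(M_Z - ev["m_ll"]) <= 15.0:  # DY (Z peak) veto
        return False
    return ev["n_jets"] >= 2 and ev["n_btags"] >= 1
```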

The \(p_{\mathrm {T}} ^\text {miss}\) significance, denoted as \(\mathcal {S}\), is used to suppress events where detector effects and misreconstruction of particles from pileup interactions are the main source of reconstructed \(p_{\mathrm {T}} ^\text {miss}\). The algorithm used to obtain \(\mathcal {S}\) is described in Ref. [100]. A requirement of \(\mathcal {S} > 12\) is used in order to suppress the otherwise overwhelming DY background in the same-flavor channel. This requirement exploits the stability of \(\mathcal {S}\) with respect to the pileup rate for events with no genuine \(p_{\mathrm {T}} ^\text {miss}\). The DY background is further reduced through a requirement on the azimuthal angular separation between \({\vec p}_{\mathrm {T}}^{\text {miss}}\) and the momentum of the leading (subleading) jet of \(\cos \varDelta \phi (p_{\mathrm {T}} ^\text {miss}, \mathrm {j}) < 0.80\,(0.96)\). These criteria reject a small background of DY events with significantly mismeasured jets.

The main variable in this analysis is \(m_{\mathrm {T2}} (\ell \ell )\), which is defined in Eq. (1) and extensively discussed in Ref. [23]. The key feature of the \(m_{\mathrm {T2}} (\ell \ell )\) observable is that it retains a kinematic endpoint at the \({\text {W }}\)boson mass for background events from the leptonic decays of two \({\text {W }}\)bosons, produced directly or through a top quark decay. Similarly, the \(m_{\mathrm {T2}} (\text {b}\ell \text {b}\ell )\) observable, defined as in Eq. (1) but using the vector sum of the lepton and b-tagged jet momenta instead of the lepton momenta alone [18], is bounded by the top quark mass if the leptons, neutrinos, and b-tagged jets originate from the decay of top quarks. By contrast, signal events do not have the respective endpoints and are expected to populate the tails of these distributions.

Signal regions based on \(m_{\mathrm {T2}} (\ell \ell )\), \(m_{\mathrm {T2}} (\text {b}\ell \text {b}\ell )\), and \(\mathcal {S}\) are defined to enhance the sensitivity to different signal scenarios. The regions are further divided into different categories separately for events with a same-flavor and a different-flavor lepton pair, to account for the different SM background composition. The signal regions are defined such that there is no overlap between them, nor with the background-enriched control regions.

Events with an opposite-charge lepton pair are abundantly produced by the DY and \({\mathrm{t}\overline{{ \mathrm t}}}\) processes. The event selection rejects the vast majority of DY events. Therefore, the major backgrounds from SM processes in the search regions are top quark pair and single top quark events that pass the \(m_{\mathrm {T2}} (\ell \ell )\) threshold because of severely mismeasured \(p_{\mathrm {T}} ^\text {miss}\) or a misidentified lepton. In signal regions with high \(m_{\mathrm {T2}} (\ell \ell )\) and \(\mathcal {S}\), events with a \({\text {Z }}\)boson decaying to neutrinos produced in association with top quarks are the main SM background. Smaller contributions arise from remaining DY events with large \(p_{\mathrm {T}} ^\text {miss}\) from mismeasurement, from multiboson production, and from \({\mathrm{t}\overline{{ \mathrm t}}}\) or single top quark production in association with a \({\text {W }}\), \({\text {Z }}\), or Higgs boson. A detailed description of the background estimation method is given in Ref. [18].

6 Top quark corridor analysis

The top quark corridor analysis is discussed in this section in more detail, as it is presented for the first time in this paper. In this search, events containing an opposite-charge lepton pair and \(p_{\mathrm {T}} ^\text {miss}\) are selected, and a DNN algorithm is used to increase the sensitivity to the signal. The full DNN score distribution for events in the signal region is used to test for the presence of the signal.

6.1 Object and event selection

The object selection and baseline requirements of the event selection are the same as those for the dilepton analysis summarized in the first paragraph of Sect. 5.3, and are detailed in this section. Electron and muon candidates are required to have \(p_{\mathrm {T}} > 20\,\text {Ge}\text {V} \) and \(|\eta | < 2.4\). In addition, the \(p_{\mathrm {T}}\) of the leading lepton must be at least 25\(\,\text {Ge}\text {V}\). The leptons are required to be isolated: the relative isolation is measured as the scalar \(p_{\mathrm {T}}\) sum, normalized to the lepton \(p_{\mathrm {T}}\), of the photons and of the neutral and charged hadrons within a cone of radius \(\varDelta R=\sqrt{\smash [b]{(\varDelta \eta )^2+(\varDelta \phi )^2}}=0.3\) (0.4) around the candidate electron (muon). To reduce the dependence on the number of pileup interactions, charged hadron candidates are included in the sum only if they are consistent with originating from the selected primary vertex in the event. The expected contribution from neutral hadrons due to pileup is estimated following the method described in Ref. [105]. For an electron candidate the relative isolation requirement depends on \(\eta \) (with values close to 0.04), while for a muon the relative isolation is required to be smaller than 0.15.
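The relative-isolation definition above can be sketched directly (an illustration under simplifying assumptions: the pileup subtraction of Ref. [105] is omitted, and `rel_isolation` is a hypothetical helper, not CMS software):

```python
import math

def rel_isolation(lep_pt, candidates, cone=0.3):
    """Relative isolation: scalar pT sum of photon, neutral-hadron, and
    charged-hadron candidates inside a Delta-R cone around the lepton,
    divided by the lepton pT.

    lep_pt     : lepton transverse momentum (GeV)
    candidates : iterable of (pt, delta_eta, delta_phi) relative to the
                 lepton direction
    cone       : cone radius (0.3 for electrons, 0.4 for muons)
    """
    iso_sum = sum(pt for pt, deta, dphi in candidates
                  if math.hypot(deta, dphi) < cone)
    return iso_sum / lep_pt

# A 40 GeV lepton with 1 GeV of nearby activity inside the cone and
# 5 GeV well outside it has relative isolation 1/40 = 0.025.
iso = rel_isolation(40.0, [(1.0, 0.1, 0.1), (5.0, 0.5, 0.5)])
```

Dividing by the lepton \(p_{\mathrm {T}}\) makes the requirement tighter, in absolute terms, for soft leptons, where nonprompt backgrounds are largest.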

Selected jets are required to have \(p_{\mathrm {T}} > 30\,\text {Ge}\text {V} \) and \(|\eta | < 2.4\). Additionally, jets found within a cone of \(\varDelta R=0.4\) around the selected leptons are rejected. Jets originating from the hadronization of bottom quarks are identified as b-tagged jets using the medium working point of the DeepJet algorithm [98, 99].

Simulated events are corrected to account for differences with respect to data in the lepton reconstruction, identification, and isolation efficiencies, as well as in the b tagging efficiency. The data-to-simulation correction factors are parameterized as functions of the \(p_{\mathrm {T}}\) and \(\eta \) of the object and deviate from unity by less than 1% for leptons and less than 10% for b-tagged jets.

Selected events are classified in categories according to the flavor of the two leading leptons (\(\text {e}\text {e}\), \(\text {e}\upmu \), or \(\upmu \upmu \)) and the data-taking period (2016, 2017, or 2018). Moreover, events are required to contain at least two jets, one of which must be b tagged. This set of requirements is referred to as the baseline selection.

After the baseline selection, most of the background events (about 98%) are expected to come from \({\mathrm{t}\overline{{ \mathrm t}}}\), \(\text {t}{\text {W }}\), and DY processes. To suppress these backgrounds, the signal region is defined with the requirements \(p_{\mathrm {T}} ^\text {miss} >50\,\text {Ge}\text {V} \) and \(m_{\mathrm {T2}} (\ell \ell ) > 80\,\text {Ge}\text {V} \). As described in Sect. 5.3, \(m_{\mathrm {T2}} (\ell \ell ) \) serves to account for the multiple sources of \(p_{\mathrm {T}} ^\text {miss}\) in the signal process and to exploit the differences with respect to the background processes. For \({\mathrm{t}\overline{{ \mathrm t}}}\), \(\text {t}{\text {W }}\), or \({\text {W }}\)+jets events, the distribution of this variable has a kinematic endpoint at the \({\text {W }}\)boson mass, because the transverse mass of each lepton–neutrino pair is bounded by the transverse mass of the \({\text {W }}\)boson, whereas signal events have neutralinos contributing to the total \(p_{\mathrm {T}} ^\text {miss}\) and therefore populate larger \(m_{\mathrm {T2}} (\ell \ell ) \) values.
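The baseline and signal-region requirements combine into a short cut flow; a minimal sketch (hedged: `event` is a hypothetical dictionary of reconstructed quantities, not the analysis data format):

```python
def passes_signal_region(event):
    """Baseline selection plus the corridor signal-region cuts:
    >= 2 jets with pT > 30 GeV and |eta| < 2.4, >= 1 b tag,
    pTmiss > 50 GeV, and mT2(ll) > 80 GeV (beyond the W endpoint)."""
    jets = [j for j in event["jets"]
            if j["pt"] > 30.0 and abs(j["eta"]) < 2.4]
    n_btags = sum(1 for j in jets if j["btagged"])
    return (len(jets) >= 2 and n_btags >= 1
            and event["met"] > 50.0      # pTmiss requirement, GeV
            and event["mt2ll"] > 80.0)   # mT2(ll) requirement, GeV
```

The \(m_{\mathrm {T2}} (\ell \ell ) > 80\,\text {Ge}\text {V} \) cut is the one doing most of the background rejection, since SM dileptonic events sit below the \({\text {W }}\)boson-mass endpoint up to resolution effects.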

6.2 Background estimation

Although most \({\mathrm{t}\overline{{ \mathrm t}}}\) events are rejected by requiring \(m_{\mathrm {T2}} (\ell \ell ) > 80\,\text {Ge}\text {V} \), this process remains the dominant background contribution in the signal region, where most of the selected events have a large \(m_{\mathrm {T2}} (\ell \ell )\) value because of resolution effects in the computation of \({\vec p}_{\mathrm {T}}^{\text {miss}}\). In this region, some of the \({\mathrm{t}\overline{{ \mathrm t}}}\) events contain jets with mismeasured energy and, in a smaller proportion, there are events where one of the leptons is missed and a lepton that does not come from a \({\text {W }}\)boson decay (nonprompt lepton) is taken as the second lepton in the event. The effect of the jet mismeasurements is checked in MC simulation and an uncertainty is assigned. Events containing nonprompt leptons are considered in a separate background category.

The second-largest contribution is the \(\text {t}{\text {W }}\) background, which amounts to approximately 4% of the total background and is also estimated from MC simulation. The DY events give the third-largest background contribution in the same-flavor channel, while their contribution is negligible in the \(\text {e}\upmu \) channel.

Background with nonprompt leptons is estimated from MC simulation and validated in a control region with the same selection as the signal region, but requiring two same-charge leptons. These events include contributions from jets misidentified as leptons and from leptons originating in bottom quark decays that are mistakenly identified as coming from the hard process. In the same-charge region, most of the events come from \({\mathrm{t}\overline{{ \mathrm t}}}\) production with nonprompt leptons, with smaller contributions from processes with prompt same-charge leptons and from dileptonic \({\mathrm{t}\overline{{ \mathrm t}}}\) events with prompt leptons where the electron charge is mismeasured. A reasonable agreement with data, within 10–15%, is observed in this validation region. Minor background contributions, also estimated from MC simulation, come from diboson (\({\text {W }}{\text {W }}\), \({\text {W }}{\text {Z }}\), and \({\text {Z }}{\text {Z }}\)), \({\mathrm{t}\overline{{ \mathrm t}}} {\text {W }}\), and \({\mathrm{t}\overline{{ \mathrm t}}} {\text {Z }}\) events, with a total contribution of about 1%.

The distributions of the main observables in the signal region, namely the leading lepton \(p_{\mathrm {T}}\), \(m_{\mathrm {T2}} (\ell \ell )\), the scalar sum of the \(p_{\mathrm {T}}\) of all selected jets (\(H_{\mathrm {T}}\)), and \(p_{\mathrm {T}} ^\text {miss}\), are shown in Fig. 3. The simulation and data generally agree within the uncertainties, which are described in Sect. 6.4.

Fig. 3
figure 3

Pre-fit distributions of data and MC events in the signal region, with the signal prediction for a given mass hypothesis stacked on top of the background prediction. Events from \({\mathrm{t}\overline{{ \mathrm t}}} {\text {W }}\), \({\mathrm{t}\overline{{ \mathrm t}}} {\text {Z }}\), DY, nonprompt leptons, and diboson processes are grouped into the ‘Other’ category. The lower panel contains the data-to-SM prediction ratio. The uncertainty band includes statistical, background normalization, and all systematic uncertainties described in Sect. 6.4. From upper left to lower right: leading lepton \(p_{\mathrm {T}} \), \(m_{\mathrm {T2}} (\ell \ell ) \), \(H_{\mathrm {T}}\), and \(p_{\mathrm {T}} ^\text {miss} \)

6.3 Search strategy

In order to maximize the sensitivity and to exploit all the differences between the signal and the \({\mathrm{t}\overline{{ \mathrm t}}}\) background, a multivariate analysis is implemented using a DNN trained with events passing the baseline selection. The DNN takes into account all the shape differences between the signal and background distributions of the training variables, including correlations, in turn achieving a strong final discriminator. The signal model used was the direct pair production of top squarks, for a sequence of top squark mass values in the range 145–295\(\,\text {Ge}\text {V}\) and \(\varDelta m_\text {cor}\) ranging from 0 to 30\(\,\text {Ge}\text {V}\). The background input to the training was simulated \({\mathrm{t}\overline{{ \mathrm t}}}\) production. To avoid overfitting, 40% of the total \({\mathrm{t}\overline{{ \mathrm t}}}\) and signal events are used for the training and the rest for the signal extraction.

The training was done using events passing the baseline selection in order to use the separation power of different observables over a large range. A total of 13 variables are selected for the training: the top squark and neutralino masses, the transverse momentum of the electron–muon pair, the azimuthal angle between the lepton momenta in the transverse plane, the pseudorapidity difference between the leptons, the momenta and pseudorapidities of the individual leptons, \(m_{\ell \ell } \), \(m_{\mathrm {T2}} (\ell \ell )\), \(p_{\mathrm {T}} ^\text {miss}\), and \(H_{\mathrm {T}}\).

Figure 4 shows the normalized distributions of the most discriminating variables for \({\mathrm{t}\overline{{ \mathrm t}}}\) and signal samples with different values of the top squark mass and \({m}_{\tilde{\upchi }^{0}_1}\), after the baseline selection. The figure also shows that, for some variables, the shape of the distributions is not the same for all the signal points. The differences in \(p_{\mathrm {T}} ^\text {miss}\) and \(m_{\mathrm {T2}} (\ell \ell ) \) between signal and background are larger for signal points with large \({m}_{\tilde{\upchi }^{0}_1}\). To exploit these differences and improve the sensitivity, a parametric DNN [39] is used, in which the top squark and neutralino masses are introduced as additional inputs in the training. In this way, a dedicated model for each signal point is obtained while training only a single DNN. For background events, the top squark and neutralino mass parameters are assigned randomly, to avoid introducing correlations, using a probability distribution that matches the values of the masses in the signal sample.
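The random mass assignment for background events can be sketched as follows (a hedged illustration: `add_mass_parameters` and its arguments are hypothetical names, not the analysis code; the essential point is that background rows draw their mass columns from the same distribution as the signal sample):

```python
import numpy as np

def add_mass_parameters(features, is_signal, true_masses,
                        signal_mass_points, seed=0):
    """Append (m_stop, m_chi) columns to the feature matrix.

    Signal events keep their true generated masses; background events
    are assigned a mass point drawn from the signal mass-point
    distribution, so the mass columns carry no signal/background
    information by themselves.
    """
    rng = np.random.default_rng(seed)
    masses = np.array(true_masses, dtype=float)          # shape (N, 2)
    bkg = ~np.asarray(is_signal)
    pts = np.asarray(signal_mass_points, dtype=float)
    # draw background masses uniformly over the signal mass points
    # (in general, with the same probabilities as the signal sample)
    masses[bkg] = pts[rng.integers(len(pts), size=bkg.sum())]
    return np.hstack([features, masses])
```

At evaluation time, fixing the two mass columns to a chosen hypothesis turns the single trained network into a dedicated discriminant for that signal point.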

Fig. 4
figure 4

Normalized distributions of some of the training variables after the baseline selection. Distributions for signal points with different top squark and neutralino masses are compared with SM \({\mathrm{t}\overline{{ \mathrm t}}}\) events. From upper left to lower right: \(p_{\mathrm {T}} ^\text {miss}\) and three further training variables

The training was performed with TensorFlow [106] using the Keras interface [107]. All the hyperparameters are optimized with the aim of avoiding overfitting and achieving the highest possible classification accuracy. The final DNN structure is sequential: 7 hidden layers with a ReLU activation function [107], with 300, 200, 100, 100, 100, 100, and 10 neurons, respectively. The output consists of two neurons with a softmax normalization function [107], which allows the outputs to be interpreted as probabilities. The Adam optimizer [108] is used with a learning rate of 0.0001. Out of the 40% of events used for the DNN implementation, 60% are used for training, 15% for validation, and the remaining 25% to check that the DNN works properly and that there is no overfitting.
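The layer structure described above can be sketched as a plain numpy forward pass (an illustration of the architecture only, with random untrained weights; the actual analysis uses a trained Keras model, and the initialization scale here is an arbitrary assumption):

```python
import numpy as np

HIDDEN = [300, 200, 100, 100, 100, 100, 10]  # the 7 hidden layers
N_INPUTS = 13                                # the 13 training variables

def init_network(rng):
    """Random (untrained) weights and zero biases for each layer."""
    sizes = [N_INPUTS] + HIDDEN + [2]        # 2-neuron softmax output
    return [(rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """ReLU hidden layers, softmax output -> two class probabilities."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)                 # ReLU
    W, b = params[-1]
    z = x @ W + b
    e = np.exp(z - z.max(axis=-1, keepdims=True))      # stable softmax
    return e / e.sum(axis=-1, keepdims=True)
```

The softmax guarantees that the two outputs are nonnegative and sum to one, which is what allows the signal-class output to be read as a score in [0, 1].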

Figure 5 shows the DNN output for two different mass hypotheses in the signal region for signal and \({\mathrm{t}\overline{{ \mathrm t}}}\) background. Since both masses are introduced in the training, the shape of the DNN score depends on the mass hypothesis for both signal and background. The figure shows that the DNN score is a good discriminator between signal and background, especially at high values of the distribution.

Fig. 5
figure 5

Normalized DNN score distribution comparing the signal and the \({\mathrm{t}\overline{{ \mathrm t}}}\) background in the signal region for two mass hypotheses: a neutralino mass of 50 (100)\(\,\text {Ge}\text {V}\) and a top squark mass of 225 (275)\(\,\text {Ge}\text {V}\)

The gain in sensitivity from using the DNN score instead of only the \(p_{\mathrm {T}} ^\text {miss}\) distribution increases with increasing \(\varDelta m_\text {cor}\), and with increasing top squark mass for a fixed \(\varDelta m_\text {cor}\). For the fully degenerate case, where \(\varDelta m_\text {cor} = 0\) and the neutralino is nearly massless, the kinematics of the SUSY process are very similar to those of the \({\mathrm{t}\overline{{ \mathrm t}}}\) background, so the DNN does not help to improve the separation, and the sensitivity to that point relies completely on the total measured cross section. For larger top squark and neutralino masses, even if \(\varDelta m_\text {cor} = 0\), the DNN starts to improve the sensitivity (as shown in Fig. 5). The score shape separation between signal and background also starts to increase for relatively low masses if \(\varDelta m_\text {cor} > 0\).

The modeling of the DNN is checked in a validation region with the signal region selection applied, except that \(p_{\mathrm {T}} ^\text {miss} < 50\,\text {Ge}\text {V} \) and \(m_{\mathrm {T2}} (\ell \ell ) < 80\,\text {Ge}\text {V} \) are required, and only the \(\text {e}\upmu \) channel is used. This region is orthogonal to the signal region, and the signal contamination is expected to be small for the signal masses for which the sensitivity relies on the DNN discriminant. The region is highly dominated by \({\mathrm{t}\overline{{ \mathrm t}}}\) and \(\text {t}{\text {W }}\) events, and good agreement with data is observed. Furthermore, the DY modeling of the DNN output distribution is checked in a validation region where the invariant mass of the same-flavor lepton pair is close to the \({\text {Z }}\)boson mass. The DY background is observed to be well modeled and populates preferentially the low DNN score bins.

6.4 Systematic uncertainties

Several sources of systematic uncertainty affect the background prediction and signal yields. Uncertainties in the modeling of the trigger, the lepton reconstruction, identification, and isolation efficiencies, the jet energy scale and resolution, the b tagging efficiency and mistag rate, and the pileup modeling are considered in the estimate of both background and signal yields. All these sources are described in Sect. 6.4.1.

As the \({\mathrm{t}\overline{{ \mathrm t}}}\) background plays an essential role and needs to be accurately estimated, various modeling uncertainties are taken into account. These uncertainties consider variations of the main theoretical parameters used in the simulation and have been studied previously by the CMS Collaboration [62, 63]. These uncertainties are explained in detail in Sect. 6.4.2.

Uncertainties in the signal modeling are described in Sect. 6.4.3. Section 6.4.4 covers other sources of uncertainty, such as the background and signal normalization uncertainties.

6.4.1 Experimental uncertainties

The following experimental uncertainties are calculated for every background and signal estimate and are propagated to the final DNN output distribution in the signal region.

The uncertainties in the trigger, lepton identification, and isolation efficiencies used in simulation are estimated by varying data-to-simulation scale factors by their uncertainties, which are about 1.5% for electron identification and isolation efficiencies, 1% for muon identification and isolation efficiencies, and about 1.5% for the trigger efficiency. The uncertainties in the muon momentum scales are taken into account by varying the momenta of the muons by their uncertainties, taken from the muon momentum scale corrections [109]. All these uncertainties are considered as correlated between years.

For the b tagging efficiency and mistag rate, the uncertainties are determined by varying the scale factors for the b-tagged jets and for mistagged light-flavor quark and gluon jets, according to their uncertainties, as measured in QCD multijet events [97,98,99]. The uncertainties related to the jet energy scale and resolution are calculated by varying these quantities in bins of \(p_{\mathrm {T}}\) and \(\eta \), according to the uncertainties in the jet energy corrections, which amount to a few percent [95]. The uncertainty in the effect of the jet mismeasurements, described in Sect. 6.2, is added to the jet energy resolution uncertainties. This uncertainty is taken as partially correlated between the years.

The uncertainty in \(p_{\mathrm {T}} ^\text {miss}\)  from the contribution of unclustered energy is evaluated based on the momentum resolution of the different particle-flow candidates, according to their classification. Details on the procedure can be found in Refs. [93, 110, 111]. The uncertainty in the modeling of the contributions from pileup collisions is evaluated by varying the inelastic \({\text {p}}{\text {p}}\) cross section in the simulation by \(\pm 4.6\)% [112]. These uncertainties are treated as correlated between data periods.

A summary of the experimental uncertainties in the \({\mathrm{t}\overline{{ \mathrm t}}}\) background and signal is shown in Table 1. These uncertainties are also applied to the prediction of the other, minor backgrounds and affect both the shape and the normalization.

Table 1 Summary of the contributions of the experimental uncertainties in the DNN score distribution for signal and the \({\mathrm{t}\overline{{ \mathrm t}}}\) background. The values represent the relative variation in the number of expected events across different signal models in the signal region

6.4.2 Modeling uncertainties in the \({\mathrm{t}\overline{{ \mathrm t}}}\) background

Modeling uncertainties for the \({\mathrm{t}\overline{{ \mathrm t}}}\) background are calculated by varying different theoretical parameters in the MC generator within their uncertainties and propagating their effect to the final distributions.

The uncertainty in the modeling of the hard-interaction process is assessed in the powheg sample by varying \(\mu _\mathrm {F}\) and \(\mu _\mathrm {R}\) up and down by factors of 2 and 1/2 relative to their common nominal value of \(\sqrt{\smash [b]{m_\text {t} ^{2}+p_{\mathrm {T,t}}^{2}}}\), where \(p_{\mathrm {T,t}}\) denotes the \(p_{\mathrm {T}}\) of the top quark in the \({\mathrm{t}\overline{{ \mathrm t}}}\) rest frame. The effect of this variation is propagated to the acceptance and efficiency, estimated using the \({\mathrm{t}\overline{{ \mathrm t}}}\) powheg sample. This uncertainty is correlated among the data-taking periods.

The uncertainty in the choice of the PDFs and in the value of \(\alpha _\mathrm {S} \) is determined by reweighting the sample of simulated \({\mathrm{t}\overline{{ \mathrm t}}}\) events according to the envelope of a PDF set of 100 NNPDF3.0 replicas [67] for 2016 and 32 PDF4LHC replicas [113] for 2017 and 2018. The uncertainty in \(\alpha _\mathrm {S} \) is propagated to the acceptance by reweighting the simulated sample by sets of weights with two variations within the uncertainties of \(\alpha _\mathrm {S} \). Only the uncertainties for the 2017 and 2018 periods are taken to be correlated, while the 2016 period is kept uncorrelated, because the PDF set used is different.
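Both the scale variations of the previous paragraph and the PDF replica uncertainty reduce to taking a per-bin envelope over reweighted copies of the nominal prediction. A small sketch of that bookkeeping (illustrative; `envelope_uncertainty` is a hypothetical helper name):

```python
import numpy as np

def envelope_uncertainty(nominal, variations):
    """Per-bin envelope over reweighted predictions.

    nominal    : array of nominal bin yields
    variations : list of arrays, one per weight variation (e.g. the
                 muR/muF combinations or the PDF replicas)
    Returns (up, down) shifts relative to the nominal yields.
    """
    var = np.asarray(variations, dtype=float)
    up = var.max(axis=0) - nominal    # largest upward excursion per bin
    down = nominal - var.min(axis=0)  # largest downward excursion per bin
    return up, down
```

Because the variations are obtained by reweighting the same events, the per-bin shifts are fully correlated across bins, which is how they enter the fit as shape nuisance parameters.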

Table 2 Summary of the contribution of each modeling uncertainty source to the DNN score distribution for the \({\mathrm{t}\overline{{ \mathrm t}}}\) background
Fig. 6
figure 6

Post-fit DNN score distributions in the signal region for different mass hypotheses of, from upper left to lower right, (225, 50); (275, 100); (275, 70); and (245, 100)\(\,\text {Ge}\text {V}\). The superimposed signal prediction is scaled by the post-fit signal strength and, in the upper panels, it is also multiplied by a factor of 20 for better visibility. The post-fit uncertainty band (crosses) includes statistical, background normalization, and all systematic uncertainties described in Sect. 6.4. Events from \({\mathrm{t}\overline{{ \mathrm t}}} {\text {W }}\), \({\mathrm{t}\overline{{ \mathrm t}}} {\text {Z }}\), DY, nonprompt leptons, and diboson processes are grouped into the ‘Other’ category. The lower panel contains the data-to-prediction ratio before the fit (dotted brown line) and after the fit (dots), each with its corresponding uncertainty band (blue band for the pre-fit and crosses for the post-fit). The ratio between the sum of the signal and background predictions and the background prediction (purple line) is also shown. The masses of the signal model correspond to the values of the DNN mass parameters in each distribution

The effect of the modeling uncertainties of the initial-state and final-state radiation is evaluated by varying the parton shower scales (running \(\alpha _\mathrm {S}\)) by factors of 2 and 1/2 [59]. In addition, the impact of the matrix element and parton shower matching, which is parameterized by the powheg generator as \(h_{\mathrm {damp}} = 1.58^{+0.66}_{-0.59}~m_\text {t} \) [63, 64], is calculated by varying this parameter within the uncertainties. This uncertainty is calculated using dedicated \({\mathrm{t}\overline{{ \mathrm t}}}\) samples and is taken as correlated between the years.

The underlying event is modeled by tuning the parameters of pythia to measurements [64, 70]. An uncertainty is assigned by varying these parameters within their uncertainties using dedicated \({\mathrm{t}\overline{{ \mathrm t}}}\) samples. The uncertainty for the 2016 period applies to the CUETP8M2T4 tune and is kept uncorrelated with the uncertainty in the CP5 tune used for 2017 and 2018, which is fully correlated between those two periods.

An uncertainty in the \(p_{\mathrm {T}}\) of the top quark is also considered, to account for the known mismodeling found in the powheg MC sample [63]. A reweighting procedure exists to correct the mismodeling but, to avoid biasing the search, the unweighted distributions are used and an uncertainty is assigned from the full difference with respect to the reweighted distributions.

For the top quark mass, an uncertainty of 1\(\,\text {Ge}\text {V}\) is conservatively assumed, corresponding to twice the uncertainty of the CMS measurement [114], and is also propagated to the acceptance. The difference in the final yield in each bin of the DNN score distribution between the \({\mathrm{t}\overline{{ \mathrm t}}}\) background predictions with \(m_\text {t} = 172.5\,\text {Ge}\text {V} \) and with \(m_\text {t}\) varied by \(\pm 1.0\,\text {Ge}\text {V} \) is taken as an uncertainty, accounting for the possible bias introduced by the choice of \(m_\text {t} = 172.5\,\text {Ge}\text {V} \) in the MC simulation. The uncertainty is assessed using dedicated \({\mathrm{t}\overline{{ \mathrm t}}}\) samples produced with different values of \(m_\text {t}\).

The modeling uncertainties in the signal region yields for the \({\mathrm{t}\overline{{ \mathrm t}}}\) background are summarized in Table 2; they have an effect on the shape and also on the normalization.

6.4.3 Signal modeling

The effect on the signal model of the ISR reweighting, described in Sect. 3, is considered. Half of the deviation from unity is taken as the systematic uncertainty in these reweighting factors. This uncertainty is propagated to the final distribution and taken as a shape uncertainty.

The uncertainty in the modeling of the hard interaction in the simulated signal sample is calculated by varying \(\mu _\mathrm {F}\) and \(\mu _\mathrm {R}\) up and down by factors of 2 and 1/2 relative to their nominal value. In addition, a 6.5% uncertainty in the signal normalization is assigned, according to the uncertainties in the predicted cross section of the signal models in the top squark mass range of the analysis [87].

6.4.4 Other uncertainties

The uncertainty in the overall integrated luminosity for the combined sample, which affects the signal and background normalizations, amounts to 1.6% [115,116,117,118]. The total uncertainty is split into different sources, partially correlated across the years.

A normalization uncertainty is applied to each background and signal estimate separately. The uncertainty in the \({\mathrm{t}\overline{{ \mathrm t}}}\) normalization is taken from the uncertainty in the NNLO+NNLL cross section, as quoted in Sect. 3, and additionally the top quark mass is varied by \(\pm 1\,\text {Ge}\text {V} \), leading to a variation of the cross section of 6%.

For DY, dibosons, \({\mathrm{t}\overline{{ \mathrm t}}} {\text {W }}\), and \({\mathrm{t}\overline{{ \mathrm t}}} {\text {Z }}\) processes a 30% normalization uncertainty is assigned covering the uncertainty in the theoretical cross section and in the measurements. For the \(\text {t}{\text {W }}\) process an uncertainty of 12% is assigned. In the case of the nonprompt lepton background, a normalization uncertainty of 30% is also applied, covering the largest deviations observed in the same-charge control region described in Sect. 6.2.

Statistical uncertainties arise from the limited size of the MC samples. They are considered for each signal and background process, in each bin of the distributions. They are introduced through the Barlow–Beeston approach [119].
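In the commonly used "lite" variant of the Barlow–Beeston method, the per-bin MC statistical uncertainty is driven by the effective number of unweighted events; a small sketch of that construction (an illustration under that assumption, not the fit implementation of Ref. [119]):

```python
import numpy as np

def mc_stat_uncertainty(weights):
    """Per-bin MC statistical uncertainty from weighted events.

    weights : the MC event weights falling in one bin
    Returns (yield, n_eff, relative uncertainty), where the effective
    count n_eff = (sum w)^2 / sum(w^2) and the relative uncertainty
    is 1/sqrt(n_eff).
    """
    w = np.asarray(weights, dtype=float)
    total = w.sum()
    n_eff = total**2 / (w**2).sum()
    return total, n_eff, 1.0 / np.sqrt(n_eff)
```

For unit weights this reduces to the familiar \(\sqrt{N}/N\); a few large weights shrink \(n_\text {eff}\) and inflate the per-bin uncertainty accordingly.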

Fig. 7
figure 7

Upper limit at 95% \(\text {CL}\) on the signal cross section as a function of the top squark and neutralino masses in the top quark corridor region. The model is excluded for all of the colored region inside the black boundary

All the systematic uncertainties described in Sects. 6.4.1 and 6.4.2 are assigned to each DNN distribution bin individually, and treated as correlated among all the bins and all processes. The statistical uncertainties are treated as uncorrelated nuisance parameters in each bin of the DNN score distribution. All of the systematic uncertainties are treated as fully correlated among the different final states.

7 Results

7.1 Corridor results

The statistical interpretation is performed by testing the SM hypothesis against the SUSY hypothesis. The data and predicted distributions of the DNN response in the signal region are combined in nine channels (3 data-taking periods \(\times \) 3 flavor combinations of the two leading leptons) in order to maximize the sensitivity to the signal. Each of the distributions is computed for different values of the mass parameters and compared to the prediction for the signal model with the corresponding masses. In Fig. 6 the DNN score distributions for data are compared with those from the fit. The fit function includes the background and the signal prediction scaled by the post-fit signal strength, for different mass parameters. The points whose DNN distributions appear in the upper plots lie along the center line of the corridor, \(\varDelta m_\text {cor} = 0\), while those shown in the lower plots lie on its boundary.

A binned profile likelihood fit of the DNN output distribution is performed, where the nuisance parameters are modeled using Gaussian distributions. The correlation scheme for different data periods is described in Sect. 6.4. No significant excess is observed over the background prediction for any of the distributions.

Upper limits on the production cross section of top squark pairs are calculated at 95% confidence level (\(\text {CL}\)) using a modified frequentist approach and the \(\text {CL}_\text {s}\) criterion, implemented through an asymptotic approximation [120,121,122,123].
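The \(\text {CL}_\text {s}\) criterion can be illustrated with a single-bin counting experiment (a deliberately simplified sketch; the actual analysis uses a binned profile likelihood with the asymptotic approximation of Refs. [120,121,122,123], and the function names here are hypothetical):

```python
from scipy.stats import poisson

def cls_counting(n_obs, b, s):
    """Modified-frequentist CLs for one Poisson bin:
    CLs = CL(s+b) / CL(b), with CL(x) = P(n <= n_obs | mean x).
    Dividing by CL(b) protects against excluding signals to which the
    experiment has no real sensitivity (downward background fluctuations).
    """
    return poisson.cdf(n_obs, s + b) / poisson.cdf(n_obs, b)

def excluded_at_95(n_obs, b, s):
    """Signal hypothesis excluded at 95% CL if CLs < 0.05."""
    return cls_counting(n_obs, b, s) < 0.05
```

For example, with 100 observed events on an expected background of 100, a small signal of 5 events is not excluded, while a signal of 40 events is.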

Results are interpreted for different signals in the corridor region, characterized by \(\varDelta m_\text {cor} \le 30\,\text {Ge}\text {V} \). The observed upper limit on the signal cross section is shown in Fig. 7. The expected and observed upper limits are also shown in Fig. 8 for three slices with a fixed difference between the top squark and neutralino masses of 165, 175, and 185\(\,\text {Ge}\text {V}\). Both the observed and expected cross section limits exclude the model over the entire region of the search.

Fig. 8
figure 8

Upper limit at 95% \(\text {CL}\) on the signal cross section as a function of the top squark mass, for a difference between the top squark and neutralino masses of 175\(\,\text {Ge}\text {V}\) (upper left), 185\(\,\text {Ge}\text {V}\) (upper right), and 165\(\,\text {Ge}\text {V}\) (lower). The green and yellow bands represent the regions containing 68 and 95%, respectively, of the distribution of limits expected under the background-only hypothesis. The purple dotted line indicates the approximate NNLO + NNLL production cross section

Fig. 9
figure 9

Expected and observed limits in the top squark–neutralino mass plane, for the model in which both top squarks decay to a top quark and a neutralino (upper left), the model in which both top squarks decay to a bottom quark and a chargino (upper right), and a model with a branching fraction of 50% for each of these top squark decay modes (lower), assuming a mass difference between the neutralino and chargino of 5\(\,\text {Ge}\text {V}\). The color indicates the 95% \(\text {CL}\) upper limit on the cross section at each point in the plane. The area below the thick black curve represents the observed exclusion region at 95% \(\text {CL}\), while the dashed red lines indicate the expected limits at 95% \(\text {CL}\) and the region containing 68% of the distribution of limits expected under the background-only hypothesis of the combined analyses. The thin black lines show the effect of the theoretical uncertainties in the signal cross section

7.2 Combined results

A statistical combination of the results of the three searches described in detail in Sect. 5 is performed outside the corridor area in order to provide interpretations in the context of the signal scenarios described in Sect. 1. The signal regions of the analyses targeting different final states are designed to be mutually exclusive. Additionally, there is no significant overlap of any of the control regions with signal regions of a different analysis. The overlap between control regions of the single-lepton and dilepton analyses is found to be less than 1% and is therefore considered negligible. Correlations of systematic uncertainties in the expected signal and background yields are studied and taken into account. Uncertainties in the jet energy scale and \(p_{\mathrm {T}} ^\text {miss}\) resolution, the b tagging efficiency scale factors, the heavy resonance taggers, the integrated luminosity, and the background normalizations are treated as fully correlated. Because of differences in the lepton identification methods and working points, as well as in the triggers used to select events, the corresponding uncertainties are considered uncorrelated. Theory uncertainties in the choice of the PDFs, \({\mu }_\text {R}\) and \({\mu }_\text {F}\), and the ISR modeling of the signal prediction, as well as of the SM backgrounds that are estimated using simulation, are taken to be fully correlated.

Figure 9 (upper left) shows the combination of the results of the three searches for direct top squark pair production in the model where both top squarks decay to a top quark and a neutralino. The analysis described in Sect. 6 is exclusively used for extracting limits in the top quark corridor region. No result of the other analyses is used in this particular region of parameter space. The combined result excludes a top squark mass of 1325\(\,\text {Ge}\text {V}\) for a massless LSP, and an LSP mass of 700\(\,\text {Ge}\text {V}\) for a top squark mass of 1150\(\,\text {Ge}\text {V}\). The expected limit of the combination is dominated by the fully hadronic search for signals with large mass splitting. In regions with smaller mass splitting between the top squark and the LSP, the searches in the zero- and single-lepton channels have similar sensitivity.

Figure 9 (upper right) shows the equivalent limits for direct top squark pair production in the model where both top squarks decay via a chargino to a bottom quark, a \({\text {W }}\)boson, and a neutralino. The mass of the chargino is set to the mean of the masses of the top squark and the LSP. The combined result for this scenario excludes a top squark mass of 1260\(\,\text {Ge}\text {V}\) for a massless LSP and an LSP mass of 575\(\,\text {Ge}\text {V}\) for a top squark mass of 1000\(\,\text {Ge}\text {V}\). The combination extends the sensitivity to both the top squark and LSP masses by about 50\(\,\text {Ge}\text {V}\) compared to the most sensitive individual result, which comes from the fully hadronic channel.

Figure 9 (lower) shows the limits for the model in which the two top squark decay modes discussed previously each occur with a 50% branching fraction. In this model, the mass splitting between the neutralino and the chargino is assumed to be 5\(\,\text {Ge}\text {V}\). Because of the low acceptance for low-momentum leptons, the dilepton result is not interpreted in terms of this model. Top squark masses up to 1175\(\,\text {Ge}\text {V}\) are excluded in this model when the LSP is massless, and up to 1000\(\,\text {Ge}\text {V}\) for LSP masses up to 570\(\,\text {Ge}\text {V}\).

As shown in Fig. 9 (upper left), the exclusion limits now further constrain the region of the parameter space of the simplified SUSY models considered for interpretation in this analysis that is favored by the naturalness paradigm.

7.3 Search for dark matter in association with top quarks

The results of the inclusive top squark searches are interpreted in simplified models of the associated production of DM particles with a top quark pair, shown in Fig. 2. The interaction of the DM particles with the top quark is mediated by a scalar or pseudoscalar mediator particle. Assuming a DM particle mass of 1\(\,\text {Ge}\text {V}\), scalar and pseudoscalar mediators with masses up to 400 and 420\(\,\text {Ge}\text {V}\), respectively, are excluded at 95% CL, as shown in Fig. 10. The obtained upper limits on \(\sigma ({\text {p}}{\text {p}}\rightarrow {\mathrm{t}\overline{{ \mathrm t}}} \chi \overline{\chi })/\sigma _{\mathrm {theory}}\) are independent of the mass of the DM fermion (\(m_{\chi }\)), as long as the mediator is produced on shell [46]. Previous results [50, 51] are improved upon by more than 100\(\,\text {Ge}\text {V}\) in mediator mass, and the sensitivity extends beyond the \({\mathrm{t}\overline{{ \mathrm t}}} \) decay threshold of the mediator for the first time. The competing decay channel of the mediator into a top quark pair, \(\phi /a \rightarrow {\mathrm{t}\overline{{ \mathrm t}}} \), is taken into account in the signal simulation and the cross section calculation.

Fig. 10

The 95% \(\text {CL}\) expected (dashed line) and observed limits (solid line) on \(\sigma /\sigma _{\mathrm {theory}}\) for a fermionic DM particle with \(m_{\chi }=1\,\text {Ge}\text {V} \), as a function of the mediator mass for a scalar (left) and pseudoscalar (right). The green and yellow bands represent the regions containing 68 and 95%, respectively, of the distribution of limits expected under the background-only hypothesis. The horizontal gray line indicates \(\sigma /\sigma _{\mathrm {theory}}=1\). The mediator couplings are set to \(g_\text {q}=g_{\mathrm {DM}}=1\)
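The \(\sigma /\sigma _{\mathrm {theory}}\) exclusions above are set with the CLs criterion. A toy sketch for a single counting experiment (hypothetical yields; the published limits come from the full CMS likelihood with all nuisance parameters):

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy CLs exclusion for one counting experiment; all numbers are invented.
n_obs, b, s = 8, 10.0, 4.0   # observed count, background, signal at mu = 1

def cls(mu, n_toys=20000):
    """CLs = p_{s+b} / (1 - p_b), using the event count as test statistic."""
    p_sb = np.mean(rng.poisson(b + mu * s, n_toys) <= n_obs)
    one_minus_pb = np.mean(rng.poisson(b, n_toys) <= n_obs)
    return p_sb / max(one_minus_pb, 1e-12)

# Raise mu until CLs drops below 0.05: the 95% CL upper limit on mu.
mu_limit = next(mu for mu in np.arange(0.1, 5.0, 0.1) if cls(mu) < 0.05)
print(f"95% CL upper limit: mu < {mu_limit:.1f}")
```

Dividing by \(1-p_{\mathrm {b}}\) is what distinguishes CLs from a plain \(p_{\mathrm {s+b}}\) test: it prevents the exclusion of signals to which the experiment has no sensitivity when the data fluctuate below the background expectation, as in this toy.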

8 Summary

Four searches for top squark pair production and their statistical combination are presented. The searches use a data set of proton–proton collisions at a center-of-mass energy of 13\(\,\text {Te}\text {V}\) collected by the CMS detector and corresponding to an integrated luminosity of 137\(\,\text {fb}^{-1}\). A dedicated analysis is presented that is sensitive to signal models in which the mass splitting between the top squark and the lightest supersymmetric particle (LSP) is close to the top quark mass. A deep neural network algorithm is used to separate the signal from the top quark background in events containing an opposite-charge dilepton pair, at least two jets, at least one b-tagged jet, \(p_{\mathrm {T}} ^\text {miss} >50\,\text {Ge}\text {V} \), and a stransverse mass greater than 80\(\,\text {Ge}\text {V}\). No excess of data over the standard model prediction is observed, and upper limits are set at 95% confidence level on the top squark production cross section. Top squark masses from 145 to 275\(\,\text {Ge}\text {V}\), for LSP masses from 0 to 100\(\,\text {Ge}\text {V}\) and a top squark–LSP mass difference within 30\(\,\text {Ge}\text {V}\) of the top quark mass, are excluded for the first time by CMS. Previously published searches in final states with zero, one, or two leptons are combined, extending the exclusion to top squark masses up to 1325\(\,\text {Ge}\text {V}\) for a massless LSP and to LSP masses up to 700\(\,\text {Ge}\text {V}\) for a top squark mass of 1150\(\,\text {Ge}\text {V}\), for certain models of top squark production. In an alternative signal model of dark matter production via a spin-0 mediator in association with a top quark pair, mediator masses up to 400 and 420\(\,\text {Ge}\text {V}\) are excluded for scalar and pseudoscalar mediators, respectively, assuming a dark matter particle mass of 1\(\,\text {Ge}\text {V}\).
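The stransverse mass requirement in the dedicated dilepton selection exploits the fact that \(m_{\mathrm {T2}}\) for leptons and missing momentum from two \({\text {W }}\)boson decays has an endpoint near the \({\text {W }}\)boson mass. A brute-force numerical sketch of the \(m_{\mathrm {T2}}\) minimization, assuming massless invisible particles (all kinematic values invented; efficient bisection algorithms are used in practice):

```python
import numpy as np

def mt(pt_vis, phi_vis, m_vis, pt_inv, phi_inv):
    """Transverse mass of a visible object and a massless invisible one."""
    et_vis = np.sqrt(m_vis**2 + pt_vis**2)
    return np.sqrt(m_vis**2 + 2.0 * pt_inv
                   * (et_vis - pt_vis * np.cos(phi_vis - phi_inv)))

def mt2(leg1, leg2, met, met_phi, n_grid=120):
    """Minimize max(mT1, mT2) over all splittings of ptmiss into two
    invisible momenta, scanned on a brute-force grid; legs are (pt, phi, m)."""
    mex, mey = met * np.cos(met_phi), met * np.sin(met_phi)
    best = np.inf
    scan = np.linspace(-met - 50.0, met + 50.0, n_grid)
    for qx in scan:
        for qy in scan:
            m1 = mt(*leg1, np.hypot(qx, qy), np.arctan2(qy, qx))
            m2 = mt(*leg2, np.hypot(mex - qx, mey - qy),
                    np.arctan2(mey - qy, mex - qx))
            best = min(best, max(m1, m2))
    return best

# Hypothetical dilepton event: for leptons and ptmiss from two W decays,
# mT2 is bounded by the W boson mass, motivating the > 80 GeV requirement.
print(f"mT2 = {mt2((40.0, 0.0, 0.0), (40.0, 2.5, 0.0), 100.0, -1.0):.1f} GeV")
```

For dileptonic top quark pair events the true invisible system is a pair of neutrinos from \({\text {W }}\)boson decays, so \(m_{\mathrm {T2}}\) rarely exceeds the \({\text {W }}\)boson mass, while signal events with additional LSPs populate the region above 80\(\,\text {Ge}\text {V}\).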