1 Introduction

The production of pairs of Higgs bosons (\({\mathrm{H}} \)) in the standard model (SM) has a predicted cross section in gluon–gluon fusion at \(\sqrt{s}=8\,\text {TeV} \) [1, 2] for the Higgs boson mass \(m_{{\mathrm{H}}} \approx 125\,\text {GeV} \) [3] of only \(10.0 \pm 1.4\text {\,fb} \). Many BSM theories suggest the existence of narrow heavy particles \(\mathrm {X}\)   that can decay to a pair of Higgs bosons [412]. The natural width for such a resonance is expected to be a few percent of its pole mass \(m_\mathrm {X} \), which corresponds to a typical detector resolution. In contrast, the SM production of Higgs boson pairs results in a broad distribution of effective mass, falling mainly in the range from 300 to 600\(\,\text {GeV}\). Thus the presence of a narrow state would be readily detected, even if produced with a cross section as small as that for the SM process.

Searches for narrow particles decaying to two Higgs bosons have already been performed by the ATLAS [1315] and CMS [1619] collaborations in \(\mathrm {p}\) \(\mathrm {p}\) collisions at the CERN LHC. Until now their reach was limited to \(m_\mathrm {X} \le 1.5\,\text {TeV} \). Because longitudinal \(\mathrm {W}\)  and \({\mathrm{Z}}\)   states are provided by the Higgs field in the SM, any \({\mathrm{H}} {\mathrm{H}} \) resonance potentially also decays into \(\mathrm {W}\mathrm {W}\) and \({\mathrm{Z}} {\mathrm{Z}} \) final states. Searches for \(\mathrm {X} \rightarrow \mathrm {W}\mathrm {W}\), \({\mathrm{Z}} {\mathrm{Z}} \), and \(\mathrm {W}{\mathrm{Z}} \) states were performed by ATLAS and CMS [2024]. The combinations of these results [2427] indicate that the region around \(m_\mathrm {X} \approx 2\,\text {TeV} \) is particularly interesting to explore.

This paper reports on a search for \(\mathrm {X} \rightarrow {\mathrm{H}} {\mathrm{H}} \) covering the mass range \(1.15< m_\mathrm {X} < 3.0\,\text {TeV} \), significantly extending the reach of the present results beyond \(1.5\,\text {TeV} \). The final state that provides the best sensitivity in this mass range is \({\mathrm{H}} {\mathrm{H}} \rightarrow {\mathrm{b}} \overline{{\mathrm{b}}} {\mathrm{b}} \overline{{\mathrm{b}}} \), which benefits from the expected large branching fraction (\(\mathcal {B}\)) of 57.7 % for \({\mathrm{H}} \rightarrow {\mathrm{b}} \overline{{\mathrm{b}}} \) [28] and a relatively low background from SM processes.

Many BSM proposals explicitly considered in this paper postulate the existence of a warped extra dimension (WED) [6] and predict the existence of a scalar radion [79]. The radion is a spin-0 resonance associated with the fluctuations in the length of the extra dimension. The production cross section as a function of \(m_\mathrm {X} \) is proportional to \(1/\Lambda _\mathrm {R} ^2\), where \(\Lambda _\mathrm {R} \) is the scale parameter of the theory. In this paper we consider two cases: \(\Lambda _\mathrm {R} = 1\) and \(3\,\text {TeV} \). In the first case, the WED theory predicts a cross section that can be detected at the LHC [17], but is challenged by the constraints derived from the electroweak precision measurements [29]. This specific model is excluded up to \(m_\mathrm {X} = 1.1\,\text {TeV} \) by the previous \(\mathrm {X} \rightarrow {\mathrm{H}} {\mathrm{H}} \) searches [14, 17]. In contrast, the predicted cross section for \(\Lambda _\mathrm {R} =3 \,\text {TeV} \) is a factor of 9 times smaller, but the theory is less constrained by these searches. We consider that the radion is produced exclusively via gluon-gluon fusion processes, with \(\mathcal {B}(\text {radion} \rightarrow {\mathrm{H}} {\mathrm{H}}) \approx 25~\%\) above 1\(\,\text {TeV}\).

In the mass range of this search, the topology of the \({\mathrm{b}} \overline{{\mathrm{b}}} {\mathrm{b}} \overline{{\mathrm{b}}} \) final state is constrained by the size of the Lorentz boost of the Higgs bosons that is typically \(\gamma _{{\mathrm{H}}} \approx m_\mathrm {X}/ 2m_{{\mathrm{H}}} \gg 1\) and defines the so-called boosted regime [3032]. In this regime each Higgs boson is produced with a large momentum and its decay products are collimated along its direction of motion. The hadronization of a pair of narrowly separated \({\mathrm{b}}\) quarks will result in a single reconstructed jet of mass compatible with \(m_{{\mathrm{H}}} \). The \({\mathrm{H}} \) candidates are selected by employing jet substructure techniques to identify jets containing constituents with kinematics consistent with the decay of a highly boosted Higgs boson. These candidates are then required to be consistent with decays of \({\mathrm {B}}\) hadrons, based on our \({\mathrm{b}}\) tagging algorithms. The signal is identified in the dijet mass (\(m_\mathrm {jj} \)) spectrum as a peak above a falling background which originates mainly from multijet events and \({\mathrm{t}}\overline{{\mathrm{t}}} \) production.

2 The CMS detector

The central feature of the CMS apparatus is a superconducting solenoid of 6\(\text {\,m}\) internal diameter, providing a magnetic field of 3.8\(\text {\,T}\). A silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter, and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections, reside within the solenoid volume. Extensive forward calorimetry complements the coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A detailed description of the CMS detector, together with a definition of the coordinate system and the basic kinematic variables, can be found in Ref. [33].

3 Simulated events

Monte Carlo (MC) simulations are used to provide: predictions of background processes, optimization of the event selection, and cross-checks of data-based background estimations.

Signal, multijet and \({\mathrm{t}}\overline{{\mathrm{t}}} \) background events are generated using the leading-order matrix element generator MadGraph 5v1.3.30 [34, 34]. Parton shower and hadronization are included using pythia 6.4.26 [35], and the matrix element is matched to the parton shower using the MLM scheme [36]. The Z2* tune is used to describe the underlying event. This tune is identical to the Z1 tune [37], but uses the CTEQ6L parton distribution functions (PDF) [38]. The signal events are simulated with an intrinsic width of the radion fixed to 1\(\,\text {GeV}\), \(m_{{\mathrm{H}}} =125\,\text {GeV} \). Different samples are generated for \(m_\mathrm {X} \) ranging from 1.15 to 3 \(\,\text {TeV}\). All generated events are processed through a simulation of the CMS apparatus based on Geant4  [39]. Additional \(\mathrm {p}\mathrm {p}\) interactions within a bunch crossing (pileup) are added to the simulation, with a frequency distribution chosen to match that observed in data. During this data-taking period the mean number of interactions per bunch crossing is 21.

4 Event reconstruction and selections

The analysis is based on data from \(\mathrm {p}\mathrm {p}\) interactions observed with the CMS detector at \(\sqrt{s}=8\,\text {TeV} \). The data correspond to an integrated luminosity of \(19.7{\,\text {fb}^{-1}} \). Events are collected using at least one of the two specific trigger conditions based on jets reconstructed online: the first trigger requires a large \(m_\mathrm {jj} \) calculated for the two jets of highest transverse momentum (referred to as leading jets); the second trigger requires a large value of \(H_{\mathrm {T}} = \sum _i p_{\mathrm {T}} ^i\), where the sum runs over the reconstructed jets in the event with transverse momenta \(p_{\mathrm {T}} >40\,\text {GeV} \). The lower thresholds applied to \(m_\mathrm {jj} \) and the \(H_{\mathrm {T}} \) triggers were changed during the data-taking period to maintain a constant trigger rate while the LHC peak luminosity steadily increased. More than half of the data were collected with \(m_\mathrm {jj} > 750\,\text {GeV} \) and \(H_{\mathrm {T}} > 650\,\text {GeV} \). The remaining data were collected with the requirement \(H_{\mathrm {T}} >750\,\text {GeV} \).

Events are required to have at least one reconstructed \(\mathrm {p}\mathrm {p}\) collision vertex within \(|z | < 24\text {\,cm} \) of the center of the detector along the longitudinal beam directions. Many additional vertices, corresponding to pileup interactions, are usually reconstructed in an event using charged particle tracks. We assume that the primary interaction vertex corresponds to the one that maximizes the sum in \(p_{\mathrm {T}} ^2\) of these associated tracks.

Individual particles are reconstructed using a particle-flow (PF) algorithm [40, 41] that combines the information from all the CMS detector components. Each such reconstructed particle is referred to as a PF candidate. The five classes of PF candidates correspond to muons, electrons, photons, and charged and neutral hadrons. Charged hadron candidates not originating from the primary vertex of the event are discarded to reduce contamination from pileup [42].

The Cambridge–Aachen (CA) algorithm [43], implemented in FastJet [44], clusters PF candidates into jets using a distance parameter \(R = 0.8\). An event-by-event jet area-based correction [42, 45, 46] is applied to each reconstructed jet to remove the remaining energy originating from pileup vertices primarily consisting of neutral particles. The jet four-momenta are also corrected to account for the difference between the measured and the expected momentum at the particle level, using the standard CMS correction procedure described in Refs. [47, 48].

Fig. 1
figure 1

Simulated \(m_\mathrm {j}^\mathrm {P} \) spectrum for spin-0 radion signals, multijet and \({\mathrm{t}}\overline{{\mathrm{t}}}\) events, and the spectrum measured in data. Each event contributes twice to the distribution, once per jet. The multijet contribution is rescaled to match the event yield in data, while the signal and \({\mathrm{t}}\overline{{\mathrm{t}}}\) spectra are rescaled by the large factors indicated, to be visible in the figure

Events are required to have at least two jets, and the two leading jets each to have \(p_{\mathrm {T}} > 40\,\text {GeV} \) and pseudorapidity \(|\eta | < 2.5\). In addition, identification criteria are applied to remove spurious jets associated with calorimeter noise [40]. To reduce the contribution from multijet events, the two leading jets must be relatively close in \(\eta \), \(|\Delta \eta _\mathrm {jj} | <1.3\), a selection discussed in Refs. [23, 49]. Events with \(m_\mathrm {jj} <1\,\text {TeV} \) are rejected. Above this mass threshold, the efficiency of the trigger requirement for the chosen selections exceeds 99.5 %.

The mass and \({\mathrm{b}}\) flavour properties of the leading jets are used to suppress the multijet and \({\mathrm{t}}\overline{{\mathrm{t}}}\) backgrounds. Soft gluon radiation and a fraction of the remaining neutral pileup particles are first removed from each jet through the implementation of a jet-grooming algorithm called jet pruning [50, 51]. This technique reduces significantly the mass of jets originating from quarks and gluons [52], while improving the resolution of the jets resulting from the hadronic decays of a heavy SM boson [53]. The invariant mass \(m_\mathrm {j}^\mathrm {P} \) is calculated for the two leading pruned jets. In Fig. 1, the \(m_\mathrm {j}^\mathrm {P} \) distribution of the two leading jets is shown for data, signal, and background events. For jets initiated by a quark or a gluon, \(m_\mathrm {j}^\mathrm {P} \) peaks around 15\(\,\text {GeV}\), while jets from high-momentum Higgs boson decay usually have a pruned mass around 120\(\,\text {GeV}\). The difference of \(\approx \)5\(\,\text {GeV}\) relative to the nominal \(m_{\mathrm{H}} \) value is related to the presence of neutrinos produced by the semileptonic decays of \({\mathrm {B}}\) mesons, and the inherent nature of the pruning procedure. A small peak near 15\(\,\text {GeV}\) is also observed for signal events, and corresponds mainly to asymmetric decays in which the jet pruning algorithm removes the decay products of one of the two \({\mathrm {B}}\) mesons. Each of the leading jets has to satisfy \(110<m_\mathrm {j}^\mathrm {P} <135\,\text {GeV} \), a requirement that is chosen to maximize the sensitivity of the analysis to the presence of a narrow resonances. Some differences are observed between the data and background estimated from simulation. These discrepancies do not affect the results of this analysis since the background is estimated using techniques based on data only.

The identification of jets likely to have originated from the hadronization of a pair of \({\mathrm{b}}\) quarks exploits the combined secondary vertex (CSV) \({\mathrm{b}}\) jet tagger [54]. This algorithm combines the information from track impact parameters and secondary vertices within a given jet into a continuous output discriminant  [54, 55]. The working point used in this paper corresponds to an efficiency of 80 % for identifying b jets and a rate of 10 % for mistagging jets from light quarks or gluons as originating from \({\mathrm{b}}\) quarks. This working point was chosen to maximize the sensitivity of the analysis, while retaining a sufficient number of events to allow a reliable estimation of the background.

In the first step of the procedure used to select \({\mathrm{H}}\) jet candidates, the pruned jets are split into two subjets by reversing the final iteration in the jet clustering algorithm. The angular separation between the subjets is \(\Delta R \equiv \sqrt{{(\Delta \eta )^2 +(\Delta \phi )^2}}\), where \(\eta \) is the pseudorapidity and \(\phi \) the azimuthal angle. Two cases are considered, with the transition between them occurring at \(m_\mathrm {X} \approx 1.6\,\text {TeV} \):

  1. 1.

    \(\Delta R>0.3\): in this group the jet is considered to be \({\mathrm{b}}\) tagged if at least one subjet satisfies the requirements of the CSV working point. Moreover, the jet is considered as “double \({\mathrm{b}}\) tagged” if both subjets satisfy the CSV requirement.

  2. 2.

    \(\Delta R<0.3\): here the subjet \({\mathrm{b}}\) tagging selection is inefficient [55]. The \({\mathrm{b}}\) tagging algorithm is therefore applied directly to the jet. In this case it is not possible to distinguish between \({\mathrm{b}}\)-tagged and double \({\mathrm{b}}\)-tagged jets, and therefore either of these two possibilities are accepted.

In summary, a jet is considered an \({\mathrm{H}} \) jet candidate if it satisfies the mass and \({\mathrm{b}}\) tagging requirements. Events are selected when both leading jets are \({\mathrm{H}} \) jets, and at least one of them is double \({\mathrm{b}}\) tagged. The simulated results are corrected to match the \({\mathrm{H}} \) and \({\mathrm{b}}\) tagging efficiencies observed in data [55].

Table 1 Summary of selection requirements, with their signal and \({\mathrm{t}}\overline{{\mathrm{t}}}\) background efficiencies, and total number of events observed in data. The selection criteria are applied sequentially and the efficiencies are cumulative, except in the last section of the table dedicated to categorization

A final selection is based on the kinematic properties of the constituents of \({\mathrm{H}} \) jets. The quantity N-subjettiness [5658] \(\tau _N\) is used to quantify the degree to which constituents of a jet can be arranged into N subjets. The ratio \(\tau _{21} = \tau _2/\tau _1\) is calculated for each of the two \({\mathrm{H}} \) jet candidates. High- (HP) and low-purity (LP) Higgs boson candidates are defined as having \(\tau _{21} < 0.5\) and \(0.5 \le \tau _{21} < 0.75\), respectively. Events are required to have at least one HP \({\mathrm{H}} \) jet and another \({\mathrm{H}} \) jet that passes either the HP or LP requirements.

The sample of events satisfying the previously defined criteria is subsequently divided into three categories. Events with two high-purity \({\mathrm{H}} \) jets form the HPHP category. Among the remaining events, those for which the high-purity \({\mathrm{H}} \) jet is the leading jet constitute the HPLP category. The rest of the sample constitutes the LPHP category.

The selection criteria applied to reduce the background are summarized in Table 1. The region of phase space defined by all these criteria is referred to as the signal region. The fraction of the simulated signal and \({\mathrm{t}}\overline{{\mathrm{t}}}\) samples, satisfying these criteria, as well as the number of data events passing the selections is also provided.

The fiducial selection is defined by the two leading jets having \(|\eta | < 2.5\), \(p_{\mathrm {T}} >40\,\text {GeV} \), and a separation \(|\Delta \eta _\mathrm {jj} | <1.3\). The fraction of the signal within this fiducial region depends on its spin, and is \(\approx \)60 % for a spin-0 resonance. The efficiency of the combined \({\mathrm{H}}\) mass and \({\mathrm{b}}\) tagging criteria for events within the fiducial region, for signal and data, is shown in Fig. 2. The number of data events is reduced by four orders of magnitude while the signal efficiencies range from 10 to 20 % with a weak dependence on \(m_\mathrm {X} \), and are observed to be independent of the spin of the resonance. Finally, the total acceptance times efficiency is provided in Table 1, and varies between 4.0 and 8.8 %, with the largest fraction of events populating the HPHP category.

Fig. 2
figure 2

The efficiencies of \({\mathrm{H}} \) mass requirement and combined \({\mathrm{H}} \) mass and \({\mathrm{b}}\) tagging criteria, for data and signal. Events are required to be in the fiducial region (\(|\eta | < 2.5\), \(p_{\mathrm {T}} >40\,\text {GeV} \) for both jets and \(|\Delta \eta _\mathrm {jj} | <1.3\)). The horizontal bar on each data point indicates the width of the bin

Figure 2 shows that the probability of incorrectly identifying multijet or \({\mathrm{t}}\overline{{\mathrm{t}}} \) events as events with two Higgs bosons is less than 0.1 %, and appears to be independent of \(m_\mathrm {jj} \) within statistical uncertainties. A more precise quantification is provided in Table 1 for \({\mathrm{t}}\overline{{\mathrm{t}}}\) events. In particular, we observe that the dijet mass, the pruned jet mass, and \({\mathrm{b}}\) tagging criteria are each sufficient for reducing the \({\mathrm{t}}\overline{{\mathrm{t}}}\) background by an order of magnitude. In contrast, the N-subjettiness criterion is inefficient in reducing it.

5 Signal extraction

The signal is identified in the binned \(m_\mathrm {jj} \) spectrum in bin widths chosen to match the resolution of the dijet mass, as described in Ref. [59]. This resolution is \(\approx \)50\(\,\text {GeV}\) at \(m_\mathrm {X} = 1.15\,\text {TeV} \), increasing slowly to \(\approx \)100\(\,\text {GeV}\) for \(m_\mathrm {X} = 3\,\text {TeV} \).

The analysis defines a likelihood, for each \(m_\mathrm {X} \) hypothesis, based on the total number of events in data, signal, and background counted in a mass window in each category. These mass windows have a typical size of three or four bins centered approximatively around \(m_\mathrm {X} \) (see Table 2) and contains more than 95 % of signal events. The amount of signal is estimated in the mass window using MC simulation, while the amount of background is estimated as the integral of a parameterized model. The total likelihood combines the information from the three event categories.

Table 2 Mass windows used for different signal hypotheses

6 Parameterization of background

After event selection, \(\approx \)75, 90, and 95 % of the total background is expected to originate from multijet events in HPHP, HPLP, and LPHP categories, respectively. The remaining contribution is from \({\mathrm{t}}\overline{{\mathrm{t}}}\) production, which is modelled in simulation, and rescaled to the total next-to-next-to-leading order cross section [60]. All other backgrounds containing Higgs bosons or \(\mathrm {W}\)/\({\mathrm{Z}}\) bosons decaying into jets represent less than 1 % of the total background.

The total background is estimated from data, without separating the multijet or \({\mathrm{t}}\overline{{\mathrm{t}}}\) fractions. The expected \(m_\mathrm {jj} \) background spectrum is approximated by a falling exponential for \(1< m_\mathrm {jj} < 3\,\text {TeV} \),

$$\begin{aligned} \frac{{\mathrm{d}}N_\text {Background}}{{\mathrm{d}}m_\mathrm {jj}} = N_B\, a\, {\mathrm{e}}^{-a(m_\mathrm {jj}-1000\,\text {GeV})}, \end{aligned}$$
(1)

where the parameterization has been chosen to minimize the correlation between the normalization \(N_B\) and slope a. We obtain a from a fit to the \(m_\mathrm {jj} \) distribution in a control region, defined as the portion of phase space where one of the jets satisfies \(110< m_\mathrm {j}^\mathrm {P} < 135\,\text {GeV} \) and the other jet is required to have \(60< m_\mathrm {j}^\mathrm {P} < 100\,\text {GeV} \). This choice of the window for \(m_\mathrm {j}^\mathrm {P} \) results from a compromise between limited signal contamination, sufficiently large statistics, and similarity in substructure properties between the sideband jet and the \({\mathrm{H}}\) jet. To use this control region we assume that there is no resonant signal in the Z\({\mathrm{H}}\) final state.

The control region contains between 1.1–2 times the number of events in the signal region depending on the category. The result of the fit and the uncertainty band associated with the uncertainty in the parameter a are shown in Fig. 3. The effect of a residual contamination of the control region by the signal is explicitly checked by adding an \({\mathrm{H}} {\mathrm{H}} \) signal to the control region at different masses, with a typical \(\sigma ({\mathrm{g}} {\mathrm{g}} \rightarrow \mathrm {X} \rightarrow {\mathrm{H}} {\mathrm{H}}) \, \mathcal {B}(\mathrm {X} \rightarrow {\mathrm{H}} {\mathrm{H}} \rightarrow {\mathrm{b}} \overline{{\mathrm{b}}} {\mathrm{b}} \overline{{\mathrm{b}}} )\), corresponding to the sensitivity of the analysis at a given \(m_\mathrm {X} \). The change in the slope parameter a is observed to be negligible.

We extract \(N_B\) for each signal hypothesis from the fit to the data that excludes events in the counting window described in Sect. 5. This background extraction procedure motivates the choice of the lower value of the \(m_\mathrm {X} \) window for which the search is performed. In order to improve the constraint on \(N_B\), there must be at least one bin on the left side of the mass window to be retained.

Fig. 3
figure 3

Observed \(m_\mathrm {jj} \) spectrum (black points) in the control region together with the superimposed background fit (red line) and the uncertainty associated with the variation of the slope parameter a (red shaded area) for HPHP (top), HPLP (bottom-left), and LPHP (bottom-right) categories

This background estimation procedure assumes, on the one hand, that the \(m_\mathrm {jj} \) spectrum is similar in the signal and the control regions, and on the other hand, that it is similar for multijet and \({\mathrm{t}}\overline{{\mathrm{t}}}\) event samples. The following cross-checks are performed to validate these hypotheses:

  • The similarity of distributions for the signal and control regions are confirmed in the simulated multijet sample.

  • The parameters a and \(N_B\) are extracted from the signal region (using an approach similar to that of Ref. [23]), and found to be compatible within statistical uncertainties with the parameters obtained through the normal method of background estimation.

  • The bin-by-bin normalization between the signal and control regions is calculated using a sideband obtained by inverting the \({\mathrm{b}}\) tagging criterion on one of the jets (using a technique similar to that in Ref. [61]), and the normalization factor found to be independent of \(m_\mathrm {jj} \), within the statistical uncertainties.

  • The \({\mathrm{t}}\overline{{\mathrm{t}}} \) contribution in the signal region obtained from simulation is fitted by the function in Eq. (1) and the resulting fit is found to be consistent with the distribution of the overall background within the statistical uncertainties.

Closure checks of the background-estimation procedure are performed using simulated multijet events. These are also performed directly in data in the control region. For this purpose, the control region is split in two, a low mass control region with \(60<m_\mathrm {j}^\mathrm {P} <90\,\text {GeV} \), and a pseudo-signal region with \(90<m_\mathrm {j}^\mathrm {P} <100\,\text {GeV} \). In both cases, the predicted background is found to be compatible with that observed, within the statistical uncertainties.

7 Systematic uncertainties

The largest contributions to the systematic uncertainty in the signal yields are the uncertainties associated with the classification of the events into the purity categories, the estimation of the efficiency to identify a \({\mathrm{H}}\) jet, and the calculation of the total integrated luminosity (2.6 %) [62], as well as with the determination of the jet energy scale (JES) and resolution (JER). The major systematic uncertainties are summarized in Table 3.

Table 3 Typical uncertainties in different categories
Fig. 4
figure 4

Observed \(m_\mathrm {jj} \) spectrum (black points) compared with a background estimate (black line), obtained in background only hypothesis, for HPHP (top), HPLP (bottom-left), and LPHP (bottom-right) categories. The simulated radion resonances of \(m_\mathrm {X} = 1.5\) and 2\(\,\text {TeV}\) are also shown. The lower panel in each plot shows the difference between the number of observed and estimated background events divided by the statistical uncertainty estimated from data

Fig. 5
figure 5

Observed and expected 95 % CL upper limits on the product of cross section of a narrow resonance and the branching fraction \(\sigma ({\mathrm{g}} {\mathrm{g}} \rightarrow \mathrm {X}) \, \mathcal {B}(\mathrm {X} \rightarrow {\mathrm{H}} {\mathrm{H}} \rightarrow {\mathrm{b}} \overline{{\mathrm{b}}} {\mathrm{b}} \overline{{\mathrm{b}}} )\). Theory curves corresponding to WED models with radion are superimposed

The uncertainty in the \({\mathrm{b}}\) tagging efficiency originates from the uncertainty in the data-to-simulation scale factors that are applied to the simulated signal [55]. The scale factors are \(\approx \)90 % with an absolute uncertainty between ±3.8 % and ±14 %, depending on the value of \(m_\mathrm {X} \). The uncertainty increases at large \(m_\mathrm {X} \) because of the limited amount of data available to constrain the scale factors.

The uncertainty in the mass selection efficiency is 2.6 % for each jet and 5.2 % for the event. This uncertainty is estimated by studying high \(p_{\mathrm {T}}\) \(\mathrm {W}\) bosons in a \({\mathrm{t}}\overline{{\mathrm{t}}} \) data control sample [53] and comparing to MC predictions. It includes the effect of the difference in fragmentation between light and \({\mathrm{b}} \) quarks. This uncertainty is fully correlated for all \({\mathrm{H}} \) jets. In addition, the impact of the pileup modelling uncertainty in the Higgs boson mass-tagging efficiency is assumed to be 1.5 % per jet, i.e., 3 % for the event [23].

An uncertainty accounting for possible migration of signal events from the HPHP to the HPLP and LPHP categories results in uncertainties of \(+25\) and \(-19~\%\), and of \(+59\) and \(-37~\%\) in the normalization of the HPHP category, and of both the HPLP and LPHP categories, respectively. These uncertainties are estimated by comparing the \(\tau _{21} \) distribution in measured and simulated \({\mathrm{t}}\overline{{\mathrm{t}}}\) events  [23, 53]. It also includes a quantification of the difference between the fragmentation of \(\mathrm {W}\) and Higgs bosons decaying hadronically. The fraction of signal events that do not enter any of the three categories changes from 2 % at 1.1\(\,\text {TeV}\) to 20 % at 3.0\(\,\text {TeV}\). The uncertainty associated with migration out of the three categories is estimated to be much smaller than that associated with migration within them.

The uncertainties in the JES (1–2 %) [48] and JER (10 %) [47] impact the signal acceptance in the \(m_\mathrm {jj} \) counting window. Each of these systematic contributions provide less than 1 % uncertainty in the normalization of the expected signal events.

In summary, the uncertainty in the signal normalization associated with the migration of signal events between categories is larger than the total contribution of all other uncertainties, which varies from 7 % at \(m_\mathrm {X} = 1.1\,\text {TeV} \) to 15 % at \(m_\mathrm {X} = 3\,\text {TeV} \).

The statistical uncertainty in the total background ranges from 15 % at 1.3\(\,\text {TeV}\) up to 100 % at 3\(\,\text {TeV}\). It is calculated by generating pseudo-experiments in the signal and control regions, assuming Poisson fluctuations in the number of events in each bin about its central value. For low \(m_\mathrm {jj} \), the statistical precision is limited by the uncertainty in the parameter \(N_B\), and for high masses, by the uncertainty in the slope parameter a. The impact of the choice of the functional form used in the parameterization of the background distribution is evaluated by comparing the results from the exponential fit to those from an alternative power-law function, and is found to be negligible compared to the statistical uncertainty.

The uncertainty related to the efficiency of the \(\tau _{21} \) tagger is assumed to be fully correlated between the HPLP and LPHP categories and anticorrelated with the HPHP category. The uncertainties in the background estimate are uncorrelated between categories, while all other uncertainties are expected to be fully correlated among all three categories.

8 Results

The observed data are shown separately for the three event categories in Fig. 4. For comparison, we also show the predictions obtained for the background-only hypothesis. The \(N_B\) normalization parameter is extracted for all events in the signal region with \(1< m_\mathrm {jj} < 3\,\text {TeV} \). The bottom panel of each plot shows the difference between the observed data and the predicted background, divided by the statistical uncertainty estimated in the data. The background model describes the data within their statistical uncertainties. The events with the largest masses in the HPHP, HPLP, and LPHP categories are at \(m_\mathrm {jj} = 1780\), 1560, and 1800\(\,\text {GeV}\), respectively.

Fig. 6
figure 6

Observed and expected 95 % CL upper limits on the product of cross section of a narrow resonance and the branching fraction \(\sigma ({\mathrm{g}} {\mathrm{g}} \rightarrow \mathrm {X}) \, \mathcal {B}(\mathrm {X} \rightarrow {\mathrm{H}} {\mathrm{H}} \rightarrow {\mathrm{b}} \overline{{\mathrm{b}}} {\mathrm{b}} \overline{{\mathrm{b}}} )\). Theory curves corresponding to WED models with radion are also shown

Upper limits on the cross section for the production of resonances are extracted using the asymptotic approximation of the CL\(_\mathrm {s}\) method [63, 64]. Figure 5 shows the observed and expected 95 % confidence level (CL) upper limits on the product of the cross section and the branching fraction \(\sigma ({\mathrm{g}} {\mathrm{g}} \rightarrow \mathrm {X}) \, \mathcal {B}(\mathrm {X} \rightarrow {\mathrm{H}} {\mathrm{H}} \rightarrow {\mathrm{b}} \overline{{\mathrm{b}}} {\mathrm{b}} \overline{{\mathrm{b}}} )\) obtained for each event category. The HPHP category is always the most sensitive, nevertheless above 2\(\,\text {TeV}\) the HPLP and LPHP categories are also important because of inefficiencies in N-subjettiness at high \(p_{\mathrm {T}}\). Figure 6 and Table 4 provide the combined limits. The excluded cross sections at 95 % CL vary from 10\(\text {\,fb}\) at 1.15\(\,\text {TeV}\) to 1.5\(\text {\,fb}\) at 2\(\,\text {TeV}\). Above 2\(\,\text {TeV}\) the excluded cross sections increase to 2.8\(\text {\,fb}\) at 3\(\,\text {TeV}\), since the sensitivity is limited by the increasing inefficiency of \({\mathrm{H}}\) jet identification, as described in Sect. 4.

Figure 7 extends the \(\mathrm {X} \rightarrow {\mathrm{H}} {\mathrm{H}} \rightarrow {\mathrm{b}} \overline{{\mathrm{b}}} {\mathrm{b}} \overline{{\mathrm{b}}} \) search down to \(m_\mathrm {X} = 260\,\text {GeV} \) by including limits from Ref. [17]. This search, referred to as the resolved analysis, considers a case where the decay products from two Higgs bosons are reconstructed as four jets. It is interesting to observe that the sensitivity of the resolved analysis starts to degrade at \(m_\mathrm {X} \approx 1\,\text {TeV} \). At this point the typical angular distance between two jets from one Higgs boson reaches \(\Lambda _{R} = 4m_{{\mathrm{H}}}/m_\mathrm {X} \approx 0.5\) and the two jets overlap [30]. Above 1.1\(\,\text {TeV}\) the boosted analysis becomes more sensitive.

Table 4 Observed and expected 95 % CL upper limits on the product of cross section and the branching fraction \(\sigma ({\mathrm{g}} {\mathrm{g}} \rightarrow \mathrm {X}) \, \mathcal {B}(\mathrm {X} \rightarrow {\mathrm{H}} {\mathrm{H}} \rightarrow {\mathrm{b}} \overline{{\mathrm{b}}} {\mathrm{b}} \overline{{\mathrm{b}}} )\) for HPHP, HPLP and LPHP categories combined. The one standard deviation on the 95 % CL upper limit is also provided
Fig. 7
figure 7

Observed and expected 95 % CL upper limits on the product of cross section of a narrow resonance and the branching fraction \(\sigma ({\mathrm{g}} {\mathrm{g}} \rightarrow \mathrm {X}) \, \mathcal {B}(\mathrm {X} \rightarrow {\mathrm{H}} {\mathrm{H}} \rightarrow {\mathrm{b}} \overline{{\mathrm{b}}} {\mathrm{b}} \overline{{\mathrm{b}}} )\). Theory predictions corresponding to WED models with a radion are also shown. Results from the resolved analysis of Ref. [17] are shown by blue squares. For clarity, only a representative subset of the points are provided from the resolved analysis. The result from this paper is shown in black dots

To quantify the sensitivity of this analysis to new physics, the limits are compared to predictions of radion production for \(\Lambda _\mathrm {R} = 1\) and \(3\,\text {TeV} \), as shown in Fig. 6. We find that a radion corresponding to \(\Lambda _\mathrm {R} = 1\,\text {TeV} \) is excluded by the boosted analysis alone, for masses between 1.15 and 1.55\(\,\text {TeV}\). This result extends the limits already set by the resolved analysis from 0.3 to 1.1 \(\,\text {TeV}\).

9 Summary

A search is presented for narrow heavy resonances decaying into a pair of Higgs bosons in proton-proton collisions collected by the CMS experiment at \(\sqrt{s}=8\,\text {TeV} \). The full data sample of \(19.7{\,\text {fb}^{-1}} \)is explored. The background from multijet and \({\mathrm{t}}\overline{{\mathrm{t}}}\) events is significantly reduced by applying requirements related to the flavor of the jet, its mass, and its substructure. No significant excess of events is observed above the background expected from the SM processes. The results are interpreted as exclusion limits at 95 % confidence on the production cross section for \(m_\mathrm {X} \) between 1.15 and 3.0\(\,\text {TeV}\), extending significantly beyond 1.5 TeV the reach of previous searches. A radion with scale parameter \(\Lambda _\mathrm {R} = 1\,\text {TeV} \) decaying into \({\mathrm{H}} {\mathrm{H}} \) is excluded for \(1.15< m_\mathrm {X} <1.55\,\text {TeV} \) for the first time in direct searches.