1 Introduction

Signatures with high \({p_{\mathrm {T}}}\), massive particles such as Higgs bosons, top quarks, and W or Z bosons have become ubiquitous during Run 2 of the Large Hadron Collider (LHC). These particles most often decay hadronically. Due to their large transverse momentum, the decay products become collimated and may be reconstructed as a single jet with large radius parameter R [1, 2] (a ‘large-\(R\) ’ jet). The sensitivity of searches and measurements that use large-\(R\) jets depends on an accurate knowledge of the transverse momentum \(p_{{{\text {T}}}}\) and mass m responses of the detector [3]. A calibration of the large-\(R\) energy and mass scales derived using Monte Carlo simulation yields uncertainties as large as 10%. The calibration described in this paper results in a reduction of these uncertainties by more than a factor of three.

Fig. 1 Overview of the large-\(R\) jet reconstruction and calibration procedure described in this paper. The calorimeter energy clusters from which jets are reconstructed have already been adjusted to point at the event’s primary hard-scatter vertex

In this paper, a suite of in situ calibration techniques is described which measure the response in proton–proton (pp) collision data at \(\sqrt{s}=\) 13 \({\text {TeV}}\). The results of several methods are combined to provide a calibration that defines the nominal large-\(R\) jet energy scale (JES) and the jet mass scale (JMS). These measurements provide a significant increase in the precision with which the large-\(R\) jet \(p_{{{\text {T}}}}\) and mass scales are known across most of the kinematically accessible phase space. The jet energy and mass resolutions (JER, JMR) are also measured in situ and compared with the predictions of Monte Carlo simulations (MC). Additional uncertainties on jet substructure observables used to identify boosted objects are derived from data in Ref. [4].

Jet reconstruction starts with clusters of topologically connected calorimeter cell signals. These topological clusters, or ‘topo-clusters’, are brought to the hadronic scale using the local hadronic cell weighting scheme (LCW) [5]. Large-\(R\) jets are reconstructed with the anti-\(k_{t}\) algorithm [6] using a radius parameter \(R = 1.0\). The jets are groomed with the ‘trimming’ algorithm of Ref. [7], which removes regions of the jet with a small relative contribution to the jet transverse momentum. This procedure reduces the impact from additional pp interactions in the event and from the underlying event, improving the energy and mass resolution.

The several stages of the ATLAS large-\(R\) jet calibration procedure are illustrated in Fig. 1. The trimmed large-\(R\) jets are calibrated to the energy scale of stable final-state particles using corrections based on simulations. This jet-level correction is referred to as the simulation-based calibration and includes a correction to the jet mass [8]. Finally, the jets are calibrated in situ using response measurements in pp collision data. A correction based on a statistical combination of data-to-simulation ratios of these response measurements is applied only to data and adjusts for the residual (typically 2–3%) mismodelling of the response. Uncertainties in the JES and JMS are derived by propagating uncertainties from the individual in situ response measurements through the statistical combination.

The in situ calibration is determined in two separate steps. In the first step, the JES is measured with the same methods used to calibrate small-\(R\) jets [9]. These techniques rely on the transverse momentum balance in a variety of final states, illustrated in Fig. 2. The JES correction factor is a product of two terms. The absolute calibration is derived from a statistical combination of three measurements from Z+jet, \(\gamma \)+jet, and multijet events in the central region of the detector. A relative intercalibration, derived using dijet events, propagates the well-measured central JES into the forward region of the detector. The in situ calibration accounts for detector effects which are not captured by simulation. The JES correction is applied as a four-momentum scale factor to jets in data; therefore, it also affects the jet mass calibration.

Fig. 2 Schematic representation of the events used to measure the JES and JER: (a) a dijet event, (b) a Z+jet or \(\gamma \)+jet event and (c) a multijet event with several jets recoiling against the leading (large-\(R\)) jet. The labels J\(_{i}\) refer to the ith leading large-\(R\) jet, while j\(_i\) refers to the ith leading small-\(R\) jet that fulfils \(\Delta R ({\mathrm {J}}_1,{\mathrm {j}}) > 1.4\). \(\Delta \phi \) is the difference between the azimuthal angle of the jet and the reference object, while \(\Delta \alpha \) is the difference between the azimuthal angle of the jet and the vectorial sum of the recoil system momenta

In the second step of the in situ calibration, the jet mass response is measured using two methods following the application of the in situ JES correction. The mass response is measured in lepton+jets top quark pair production (\(t\bar{t}\) production) [10] with a fit to the peaks in the jet mass distribution formed by high-\(p_{{{\text {T}}}}\) W bosons and top quarks decaying into fully hadronic final states. A second measurement is performed with the \(R_{{\mathrm {trk}}}\) method [3], which takes advantage of the independent measurements by the calorimeter and the inner tracker. This method provides a calibration for the calorimeter jet mass measurement over a broad \(p_{{{\text {T}}}}\) range. The results from the two methods are combined as a smooth function of \(p_{{{\text {T}}}}\) in two mass bins, which could be applied to data as an in situ correction as outlined in Sect. 8.

The JER and JMR are also measured in situ and compared with the prediction of the simulation. The dijet balance method takes advantage of the transverse momentum balance in dijet events to extract the JER. The JMR is obtained from fits to the top quark and W boson mass peaks in high-\(p_{{{\text {T}}}}\) lepton+jets \(t\bar{t}\) events.

Sections 2 and 3 provide overviews of the ATLAS detector, the data set studied, and the simulations used in this paper. Section 4 describes the reconstruction of large-\(R\) jets in ATLAS. Section 5 presents the results of the balance methods that measure the jet energy scale: the intercalibration, which uses dijet events to ensure a uniform response over the central and forward regions of the detector, in Sect. 5.1, the Z+jet balance method in Sect. 5.2, the \(\gamma \)+jet balance method in Sect. 5.3, and the multijet balance method in Sect. 5.4. Section 6 presents the methods that are used to measure the jet mass response: the \(R_{{\mathrm {trk}}}\) method and its results for the energy and mass scale in Sect. 6.1 and the fits to the W boson and top quark mass peaks in high-\(p_{{{\text {T}}}}\) lepton+jets \(t\bar{t}\) events in Sect. 6.2, which are also used to measure the JMR. The measurement of the JER in dijet events is discussed in Sect. 7. The methodology of the combination procedure is presented in Sect. 8, as well as the resultant combined in situ calibration of the JES and JMS. Section 9 summarizes the results.

2 The ATLAS detector and data set

The ATLAS experiment consists of three major sub-detectors: the inner detector, the calorimeters, and the muon spectrometer. The inner detector, closest to the interaction point, is used to track charged particles in a 2 T axial magnetic field produced by a thin superconducting solenoid. It consists of a pixel detector, a silicon tracker equipped with micro-strip detectors, and a transition radiation tracker that provides a large number of space points in the outermost layers of the tracker. It covers the pseudorapidity range \(|\eta |<\) 2.5. Surrounding the tracker and solenoid, a sampling calorimeter measures the energy of particles produced in the collisions with \(|\eta |<\) 4.9. The energies of electrons and photons are measured precisely in a high-granularity liquid-argon electromagnetic calorimeter. The cylindrical “barrel” covers \(|\eta |<1.475\), and the “endcaps” on either end of the detector cover \(1.375< |\eta |<3.2\). An iron/scintillator tile calorimeter measures the energy of hadrons in the central rapidity range, \(|\eta |<1.7\), and a liquid-argon hadronic endcap calorimeter provides coverage for \(1.5<|\eta |<3.2\). The forward liquid-argon calorimeter measures electrons, photons, and hadrons for \(3.2< |\eta |< 4.9\). Finally, a muon spectrometer in the magnetic field of a system of superconducting air-core toroid magnets identifies muons in the range \(|\eta | < 2.7\) and measures their transverse momenta. The ATLAS trigger system consists of a hardware-based first-level trigger followed by a software-based high-level trigger, which apply a real-time selection to reduce the up to 40 MHz LHC collision rate to an average rate of events written to storage of 1 kHz [11]. A detailed description of the ATLAS experiment is given in Ref. [12].

The data set used in this analysis consists of pp collisions delivered by the LHC at a centre-of-mass energy of \(\sqrt{s}=\) 13 \({\text {TeV}}\) during 2015 and 2016. The specific trigger requirements vary among the various in situ analyses and are described in the relevant sections. All data are required to meet ATLAS standard quality criteria. Data taken during periods in which detector subsystems were not fully functional are discarded. Data quality criteria also reject events that have significant contamination from detector noise or with issues in the read-out. The remaining data correspond to an integrated luminosity of \(36.2~\text{ fb }^{-1}\).

Due to the high luminosity of the LHC, multiple pp collisions occur during each bunch crossing. Interactions which occur within the bunch crossing of interest (in-time pile-up) or in neighbouring bunch crossings (out-of-time pile-up) may alter the measured energy or mass scale of jets or lead to the reconstruction of additional ‘stochastic’ jets, seeded by upwards fluctuations in the local pile-up energy density. The average number of additional pp collisions per bunch crossing is 24 in the Run 2 data from 2015 and 2016 analysed here.

3 Simulations

The data are compared with detailed simulations of the ATLAS detector response [13] based on the Geant4 [14] toolkit. Hard-scatter events for all processes studied were simulated with several different event generators to assess possible systematic effects due to limitations in the physics modelling. Several different simulation packages were also used to hadronize final-state quarks and gluons in order to compare the impact of various models of hadronization and parton showering on the measurements.

Dijet events were generated using several different generator configurations. Depending on the analysis, nominal dijet samples were generated using either Pythia 8 (v8.186) [15] or Powheg-Box 2.0 [16,17,18] interfaced with Pythia 8. These samples were generated with the A14 set of tuned parameters [19] and the NNPDF2.3 LO parton distribution function (PDF) set [20]. Samples generated with Herwig 7 [21] and Sherpa v2.1 [22] were used for comparison. The Herwig 7 sample used the UE-EE-5 set of tuned parameters [23] and CTEQ6L1 PDF set [24]. The Sherpa leading-order multileg generator includes \(2\rightarrow 2\) and \(2\rightarrow 3\) processes at matrix element level, combined using the CKKW prescription [25].

Z+jets events are generated using Powheg-Box 2.0 interfaced to the Pythia 8.186 parton shower model. The CT10 PDF set is used in the matrix element [26]. The AZNLO set of tuned parameters [27] is used, with PDF set CTEQ6L1, for the modelling of non-perturbative effects. The EvtGen 1.2.0 program [28] is used for the properties of b- and c-hadron decays. Photos++ 3.52 [29] is used for QED emissions from electroweak vertices and charged leptons. Samples of Z+jet events are compared to a second sample generated using Sherpa 2.2.1. Matrix elements are calculated for up to 2 partons at NLO and 4 partons at LO using Comix [30] and OpenLoops [31] and merged with the Sherpa parton shower [32] according to the ME+PS@NLO prescription [33]. The NNPDF30nnlo PDF set is used in conjunction with dedicated parton shower tuning developed by the Sherpa authors. \(\gamma \)+jets events are compared to a sample generated with the Sherpa 2.1.1 event generator. Matrix elements are calculated with up to 3 or 4 partons at LO and merged with the Sherpa parton shower according to the ME+PS@LO prescription. The CT10 PDF set is used in conjunction with dedicated parton shower tuning developed by the Sherpa authors.

For \(\gamma \)+jet events, Pythia 8 was used as the nominal generator, where the \(2\rightarrow 2\) matrix element is convolved with the NNPDF2.3LO PDF set. The A14 event tune was used. These events are compared to a sample generated with Sherpa v2.1.1, which includes up to four jets in the matrix element. These events were generated using the default Sherpa tune and the CT10 PDF set.

Top quark pair production and single top production in the s-channel and Wt final state were simulated at NLO accuracy with Powheg-Box v2 [34] and the CT10 PDF set. For electroweak t-channel single top quark production, Powheg-Box v1 was used, which utilizes the four-flavour scheme for NLO matrix element calculations together with the fixed four-flavour PDF set CT10f4. In all cases, the nominal sample was interfaced with Pythia 8 with the CTEQ6L1 PDF set, which simulates the parton shower, fragmentation, and underlying event. The \(h_{\text {damp}}\) parameter in Powheg, which regulates the \(p_{{{\text {T}}}}\) of the first additional emission beyond the Born level and thus the \(p_{{{\text {T}}}}\) of the recoil emission against the \(t\bar{t}\) system, was set to the mass of the top quark (172.5 \({\text {GeV}}\)). Systematic uncertainties in the modelling of hadronization were evaluated using a Powheg sample interfaced to Herwig 7. W+jet events, simulated in Sherpa v2.2.0, are considered as a background to \(t\bar{t}\) production.

The effect of pile-up on reconstructed jets was modelled by overlaying multiple simulated minimum-bias inelastic pp events on the signal event. These additional events were generated with Pythia 8, using the A2 set of tuned parameters [35] and MSTW2008LO PDF set [36]. The distribution of the average number of interactions per bunch crossing in simulated samples is reweighted to match that of the analysed data set.

4 Large-\(R\) jet reconstruction and simulation calibration

This section describes the reconstruction of large-\(R\) jets and the grooming procedure. Three classes of jets are used: calorimeter jets, particle-level (or ‘truth’) jets, and track jets. The large-\(R\) jets considered in this paper are reconstructed using the anti-\(k_{t}\) algorithm [6] with a radius parameter \(R = 1.0\). For balancing and veto purposes, jets reconstructed with radius parameter \(R = 0.4\) (‘small-\(R\) jets’) are used in some parts of the analysis with their own calibration procedures applied [9]. The specific implementation of the jet clustering algorithm used is taken from the FastJet package [37, 38].

4.1 Large-\(R\) jets

Calorimeter jets are formed from topological clusters of calorimeter cells. The clusters are seeded by cells with an energy significantly above the calorimeter noise. The large-\(R\) jets used in this paper are reconstructed using topological clusters that are calibrated to correct for response differences between energy deposition from electromagnetic particles (electrons and photons) and hadrons with the LCW scheme of Ref. [5]. Small-\(R\) jets reconstructed from “electromagnetic scale” topo-clusters are used as a reference system in the multijet balance method of Sect. 5.4. Results are labelled with “LCW” or “EM” to indicate the calibration of the clusters. Topological clusters are defined to be massless. The four-momenta of these topo-clusters, initially defined as pointing to the geometrical centre of the ATLAS detector, are adjusted to point towards the hard-scatter primary vertex of the event, which is defined as the primary vertex with the largest associated sum of track \(p_{{{\text {T}}}} ^2\).

To reduce the effects of pile-up, soft emissions, and the underlying event on jet substructure measurement, the trimming algorithm is applied to the jets. Trimming reclusters the jet constituents of each \(R = 1.0\) jet using the \(k_{t}\) algorithm [39] and \(R_{{\mathrm {sub}}}=\) 0.2, producing a collection of subjets for each jet. Subjets with \(p_{{{\text {T}}}} ^{{\mathrm {subjet}}} / p_{{{\text {T}}}} ^{{\mathrm {jet}}} < 0.05\) are removed, and the jet four-momentum is recalculated from the remaining constituents.
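The subjet removal step of trimming can be illustrated with a minimal sketch. It assumes that the \(k_{t}\) reclustering with \(R_{{\mathrm {sub}}}=0.2\) has already been performed (for instance with FastJet) and that the subjets are supplied as four-momenta; the class `FourMomentum` and the function `trim_jet` are hypothetical names introduced for illustration and are not part of any ATLAS software.

```python
import math
from dataclasses import dataclass


@dataclass
class FourMomentum:
    """Cartesian four-momentum (hypothetical helper type for this sketch)."""
    px: float
    py: float
    pz: float
    e: float

    @property
    def pt(self) -> float:
        return math.hypot(self.px, self.py)

    def __add__(self, other: "FourMomentum") -> "FourMomentum":
        return FourMomentum(self.px + other.px, self.py + other.py,
                            self.pz + other.pz, self.e + other.e)


def sum_momenta(parts):
    """Vector sum of a list of four-momenta."""
    total = FourMomentum(0.0, 0.0, 0.0, 0.0)
    for p in parts:
        total = total + p
    return total


def trim_jet(subjets, f_cut=0.05):
    """Drop subjets with pT(subjet)/pT(jet) < f_cut and rebuild the jet.

    The ungroomed jet pT is taken from the sum of all subjets, which
    assumes the subjets together contain every jet constituent.
    """
    jet_pt = sum_momenta(subjets).pt
    if jet_pt == 0.0:
        return FourMomentum(0.0, 0.0, 0.0, 0.0), []
    kept = [s for s in subjets if s.pt / jet_pt >= f_cut]
    return sum_momenta(kept), kept
```

Here the trimmed four-momentum is rebuilt from the surviving subjets, which is equivalent to recalculating it from their constituents.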

In this paper, trimmed large-\(R\) jets with \(p_{{{\text {T}}}}\) > 200 \({\text {GeV}}\) and \(|\eta | < 2.5\) are studied.

4.2 Particle-level jets and the simulation-based jet calibration

The reference for the simulation-based jet calibration is formed by particle-level jets. These are created by clustering stable particles originating from the hard-scatter interaction in the simulation event record which have a lifetime \(\tau \) in the laboratory frame such that \(c\tau > 10\) mm. Particles that do not leave significant energy deposition in the calorimeter (i.e. muons and neutrinos) are excluded. Particle-level jets are reconstructed and trimmed using the same algorithms as those applied to large-\(R\) jets built from topological clusters, incorporating the grooming procedure within the jet definition.

After reconstruction of the calorimeter jets, a correction derived from a sample of simulated dijet events is applied to restore the average reconstructed calorimeter jet energy scale to that of particle-level jets. A correction is also applied to the \(\eta \) of the reconstructed jet to correct for a bias relative to particle-level jets in certain regions of the detector [40]. Both corrections are applied as a function of the reconstructed jet energy and the detector pseudorapidity, \(\eta _{{\mathrm {det}}}\), defined as the pseudorapidity calculated relative to the geometrical centre of the ATLAS detector. This yields a better location of the energy-weighted centroid of the jet than the use of the pseudorapidity calculated relative to the hard-scatter primary vertex.

Reconstructed jets are matched to particle-level jets using an angular matching procedure that minimizes the distance \(\Delta R = \sqrt{(\Delta \phi )^2 + (\Delta \eta )^2}\). The energy response is defined as \(E_{{\mathrm {reco}}}/E_{{\mathrm {truth}}}\), where \(E_{{\mathrm {reco}}}\) is the reconstructed jet energy prior to any calibration (later denoted \(E_0\)) and \(E_{{\mathrm {truth}}}\) is the energy of the corresponding particle-level jets. The mass response is defined as \(m_{{\mathrm {reco}}}/m_{{\mathrm {truth}}}\), where \(m_{{\mathrm {reco}}}\) and \(m_{{\mathrm {truth}}}\) represent the jet mass of the matched detector-level and particle-level jets, respectively. The average response is determined in a Gaussian fit to the core of the response distribution. The parameterization of the average jet energy response \(R_E = \langle E_{\text {reco}}/E_{\text {truth}} \rangle \) used for the simulation calibration is presented as a function of \(\eta _{{\text {det}}}\) and for several values of the truth jet energy in Fig. 3a. The correction is typically 5–10%, with a weak dependence on the jet energy and a characteristic structure in \(\eta _{{\text {det}}}\) that reflects the calorimeter geometry.
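The matching and the per-jet response definitions can be sketched as follows. The dictionary layout of the jets and the function names are assumptions made for illustration only; the Gaussian fit to the core of the response distribution is not reproduced here.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(dphi^2 + deta^2), with dphi wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def match_to_truth(reco, truth_jets):
    """Return the particle-level jet closest in Delta R to the reconstructed jet.

    Jets are plain dicts with keys 'eta', 'phi', 'e', 'm' (hypothetical layout).
    Returns None if there are no particle-level jets.
    """
    if not truth_jets:
        return None
    return min(truth_jets,
               key=lambda t: delta_r(reco["eta"], reco["phi"], t["eta"], t["phi"]))


def responses(reco, truth):
    """Per-jet energy and mass responses, E_reco/E_truth and m_reco/m_truth."""
    return reco["e"] / truth["e"], reco["m"] / truth["m"]
```

The average responses \(R_E\) and \(R_m\) would then be extracted from the distributions of these per-jet ratios in bins of the truth-jet kinematics.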

The simulation-based JES correction factor \(c_{{\mathrm {JES}}}\) is determined as a function of the jet energy and pseudorapidity \(\eta _{{\text {det}}}\). It is applied to the jet four-momentum as a multiplicative scale factor. The pseudorapidity correction \(\Delta \eta \) only changes the direction. This means that the reconstructed large-\(R\) jet energy, mass, \(\eta \), and \(p_{{{\text {T}}}}\) become

$$\begin{aligned} E_{{\mathrm {reco}}}= & {} c_{\text {JES}}\,E_0,\quad m_{{\mathrm {reco}}} = c_{\text {JES}}\,m_0,\quad \eta _{{\mathrm {reco}}} = \eta _0+\Delta \eta ,\\ p_{{{\text {T}}}} ^{{\mathrm {reco}}}= & {} c_{\text {JES}}\,|\vec {p}_0| / \cosh {(\eta _0+\Delta \eta )}, \end{aligned}$$

where the quantities \(E_0\), \(m_0\), \(\eta _0\), and \(\vec {p}_0\) refer to the jet properties prior to any calibration, as determined by the trimming algorithm. The quantities \(c_{\text {JES}}\) and \(\Delta \eta \) are smooth functions of the large-\(R\) jet kinematics. None of the calibration steps affect the azimuthal angle \(\phi \) of the jet.

The large-\(R\) jet invariant mass is calibrated in a final step. This is important when using the jet mass in physics analyses, because the jet mass is more sensitive than the transverse momentum to soft, wide-angle contributions and to cluster merging and splitting, as well as to the calorimeter geometry. For the mass correction the jet mass response \(R_m = \langle m_{\text {reco}}/m_{\text {truth}} \rangle \) is determined using the same procedure as for the jet energy calibration. The mass calibration is applied after the standard JES calibration. The mass response is presented in Fig. 3 for three representative values of the truth jet mass: 40 \({\text {GeV}}\) in panel (b), the W boson mass in panel (c), and the top quark mass in panel (d). The mass response is close to unity for jets with \(p_{{{\text {T}}}}\) between 200 and 800 \({\text {GeV}}\) and as large as 1.5 for very energetic jets with relatively low mass. Several effects can impact the jet mass response. The reconstructed mass can be artificially increased by the splitting of topo-clusters during their creation. This effect is particularly important for jets with small particle-level mass relative to their \(p_{{{\text {T}}}}\) (\(m/p_{{{\text {T}}}} \lessapprox 0.05\)). Similarly, when several particles form one topo-cluster, or when particles fail to produce any topo-cluster, the mass response is decreased. This effect is significant for jets with large particle-level mass relative to their \(p_{{{\text {T}}}}\) (\(m/p_{{{\text {T}}}} \gtrapprox 0.5\)).

The simulation-based correction to the large-\(R\) jet mass \(c_{{\mathrm {JMS}}}\) is applied as a function of the jet \(E_{{\mathrm {reco}}}\), \(\eta _{{\text {det}}}\), and \(\log (m_{{\mathrm {reco}}}/E_{{\mathrm {reco}}})\), keeping the large-\(R\) jet energy fixed and thus allowing the \(p_{{{\text {T}}}} \) to vary [40]. This factor is also a smooth function of the large-\(R\) jet kinematics. This has the following impact on the reconstructed jet kinematics:

$$\begin{aligned} E_{{\mathrm {reco}}}= & {} c_{\text {JES}}\,E_0,\quad m_{{\mathrm {reco}}} = c_{\text {JES}}\,c_{\text {JMS}}\,m_0,\quad \eta _{{\mathrm {reco}}} = \eta _0+\Delta \eta ,\\ p_{{{\text {T}}}} ^{{\mathrm {reco}}}= & {} c_{\text {JES}} \sqrt{E_0^2-c^2_{\text {JMS}}\,m_0^2} / \cosh {(\eta _0+\Delta \eta )}. \end{aligned}$$

All results that correspond to jets that are brought to the particle-level with the simulation-based calibration are labelled with “JES+JMS”.
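The effect of the simulation-based corrections on the jet kinematics can be written out as a short sketch. It takes \(c_{\text {JES}}\), \(c_{\text {JMS}}\) and \(\Delta \eta \) as given numbers (in practice they are smooth functions of the jet kinematics), and the function name is hypothetical.

```python
import math


def apply_simulation_calibration(e0, m0, eta0, c_jes, c_jms, delta_eta):
    """Return (E, m, eta, pT) after the JES, eta and JMS corrections.

    The energy is fixed by the JES step; the mass correction therefore
    changes the momentum magnitude and hence the pT.
    """
    e_reco = c_jes * e0
    m_reco = c_jes * c_jms * m0
    eta_reco = eta0 + delta_eta
    p2 = e0 ** 2 - (c_jms * m0) ** 2
    if p2 < 0.0:
        raise ValueError("corrected mass exceeds the jet energy")
    pt_reco = c_jes * math.sqrt(p2) / math.cosh(eta_reco)
    return e_reco, m_reco, eta_reco, pt_reco
```

By construction the returned quantities satisfy \((p_{{\mathrm {T}}}^{{\mathrm {reco}}}\cosh \eta _{{\mathrm {reco}}})^2 + m_{{\mathrm {reco}}}^2 = E_{{\mathrm {reco}}}^2\), which is a convenient consistency check.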

Fig. 3 The response for (a) the jet energy and (b–d) the jet mass of large-\(R\) jets. The jet energy response is presented as a function of jet detector pseudorapidity \(\eta _{{\text {det}}} \) for several values of the truth jet energy, ranging from 200 \({\text {GeV}}\) to 2 \({\text {TeV}}\). The jet mass response is presented as a function of jet pseudorapidity for several values of the jet transverse momentum from 200 \({\text {GeV}}\) to 2 \({\text {TeV}}\) and for three representative values of the truth jet mass: (b) 40 \({\text {GeV}}\), representing a typical value for quark or gluon jets, (c) the W boson mass, and (d) the top quark mass. The response is determined in simulation of dijet events as the ratio of the reconstructed jet mass to the mass of the corresponding particle-level jet. These results are used to define the jet-level mass correction applied in the simulation calibration

4.3 Tracks and track jets

Tracks are reconstructed from the hits generated by charged particles passing through the inner tracking detector (ID). They are required to have \(p_{{\mathrm {T}}}>\) 500 \({\text {MeV}}\). To reduce fake tracks, candidate tracks must be composed of at least one pixel detector hit and at least six hits in the silicon tracker. The track transverse impact parameter \(|d_0|\) relative to the primary vertex must be less than 1.5 mm and the longitudinal impact parameter \(|z_0|\) multiplied by \(\sin {\theta }\) relative to the primary vertex must be less than 3 mm [41, 42].
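The track requirements listed above amount to a simple set of cuts, sketched below. The `Track` container and its field names are hypothetical and chosen for readability; they do not correspond to an actual ATLAS data format.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    """Minimal track record (hypothetical field names)."""
    pt_mev: float          # transverse momentum in MeV
    n_pixel_hits: int      # hits in the pixel detector
    n_silicon_hits: int    # hits in the silicon micro-strip tracker
    d0_mm: float           # transverse impact parameter w.r.t. the primary vertex
    z0_mm: float           # longitudinal impact parameter w.r.t. the primary vertex
    theta: float           # polar angle in radians


def passes_track_selection(t: Track) -> bool:
    """Apply the pT, hit-count and impact-parameter requirements from the text."""
    return (t.pt_mev > 500.0
            and t.n_pixel_hits >= 1
            and t.n_silicon_hits >= 6
            and abs(t.d0_mm) < 1.5
            and abs(t.z0_mm * math.sin(t.theta)) < 3.0)
```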

Jets reconstructed from charged-particle tracks are used as a reference in calibration and uncertainty studies, taking advantage of the independence of instrumental systematic effects between the ID and the calorimeter. Track jets are reconstructed by applying the same jet reconstruction procedure to tracks as those used when constructing the topo-cluster jets described above, including the jet trimming algorithm. Track jets are not calibrated.

4.4 The combined jet mass

The jet mass resolution is improved by combining the jet mass measurement in the calorimeter with the measurement of the charged component of the jet within the ID [43,44,45,46,47,48,49,50,51]. A track jet is reconstructed from ID tracks with \(p_{{{\text {T}}}} > 500~{\text {MeV}}\) which are ghost-associated [52] to the topo-cluster large-\(R\) jet. The measurement of this track jet’s mass is multiplied by the ratio of the transverse momenta of the calorimeter jet and the track jet to obtain the track-assisted mass:

$$\begin{aligned} m^{{\mathrm {TA}}} = m^{{\mathrm {track}}} \, \frac{p_{{\mathrm {T}}}^{{\mathrm {calo}}}}{p_{{\mathrm {T}}}^{{\mathrm {track}}}}, \end{aligned} \tag{1}$$

where \(m^{{\mathrm {TA}}}\) is the track-assisted mass, \(m^{{\mathrm {track}}}\) the mass obtained from the tracker, and \(p_{{{\text {T}}}} ^{{\mathrm {calo}}}\) and \(p_{{{\text {T}}}} ^{{\mathrm {track}}}\) are the transverse momenta measured respectively by the calorimeter and tracker. This alternative mass measurement has better resolution for high-\(p_{{{\text {T}}}}\) jets with low values of \(m/p_{{{\text {T}}}} \). A weighted least-squares combination of the mass measurements is subsequently performed with weights:

$$\begin{aligned} m^{{\mathrm {comb}}} = w_{{\text {calo}}}\,m^{{\mathrm {calo}}} + w_{{\text {TA}}}\,m^{{\mathrm {TA}}}, \end{aligned}$$

where \(w_{{\text {calo}}}\) and \(w_{{\text {TA}}}\) are determined by the expected mass resolutions \(\sigma _{{\mathrm {calo}}}\) and \(\sigma _{{\mathrm {\mathrm {TA}}}}\) of the calorimeter and track-assisted measurements, using the central 68% inter-quantile range of the jet mass response distribution in dijet events:

$$\begin{aligned} w_{{\text {calo}}} = \frac{\sigma _{{\mathrm {calo}}}^{-2}}{\sigma _{{\mathrm {calo}}}^{-2} + \sigma _{{\mathrm {TA}}}^{-2}}, \quad w_{{\text {TA}}} = \frac{\sigma _{{\mathrm {TA}}}^{-2}}{\sigma _{{\mathrm {calo}}}^{-2} + \sigma _{{\mathrm {TA}}}^{-2}}, \end{aligned}$$

such that the resolution of the combined mass measurement is always better than either of the two inputs within the sample from which the weights are derived. In this paper, in situ measurements are presented for the jet mass reconstructed from topo-clusters and for the track-assisted mass. The constraint \( w_{{\text {calo}}} + w_{{\text {TA}}} = 1\) ensures that the combined mass is calibrated, if the scales of both mass definitions are fixed.
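The track-assisted mass and the inverse-variance weighting can be illustrated with a minimal sketch. The resolutions are passed in as plain numbers; in the analysis they are derived from the response distribution in dijet events. All function names are hypothetical.

```python
def track_assisted_mass(m_track, pt_calo, pt_track):
    """Track-jet mass scaled by the calorimeter-to-track pT ratio (Eq. 1)."""
    return m_track * pt_calo / pt_track


def combination_weights(sigma_calo, sigma_ta):
    """Inverse-variance weights for the calorimeter and track-assisted masses."""
    inv_calo = sigma_calo ** -2
    inv_ta = sigma_ta ** -2
    total = inv_calo + inv_ta
    return inv_calo / total, inv_ta / total


def combined_mass(m_calo, m_ta, sigma_calo, sigma_ta):
    """Weighted combination of the two mass measurements."""
    w_calo, w_ta = combination_weights(sigma_calo, sigma_ta)
    return w_calo * m_calo + w_ta * m_ta
```

For example, with \(\sigma _{{\mathrm {calo}}} = 5\) and \(\sigma _{{\mathrm {TA}}} = 10\) (illustrative numbers in arbitrary units, not measured values) the weights are 0.8 and 0.2, so the better-resolved measurement dominates the combination.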

5 In situ \(p_{{{\text {T}}}}\) response measurements

In this section, the methods used to derive the in situ calibration for the energy (or transverse momentum) response are presented. These methods use \(p_{{{\text {T}}}}\) conservation in events where a large-\(R\) jet recoils against a well-measured reference object. The first method is based on the \(p_{{{\text {T}}}}\) balance in dijet events with a central (\(|\eta _{{\text {det}}} | \le 0.8\)) and a forward (\(|\eta _{{\text {det}}} | > 0.8\)) jet. It is applied after the simulation calibration described in Sect. 4. The \(\eta \)-intercalibration corrects the \(p_{{{\text {T}}}}\) of forward jets to make the jet energy response uniform as a function of pseudorapidity. After the \(\eta \)-intercalibration procedure, three further balance methods are used to provide an absolute \(p_{{{\text {T}}}}\) scale calibration. In the Z+jet balance method, the recoiling system is a reconstructed \(Z \rightarrow \mu ^+\mu ^-\) or \(Z \rightarrow e^+e^-\) decay, in the \(\gamma \)+jet balance method it is a photon, and in the multijet balance method the system is formed by several calibrated small-\(R\) jets with low \(p_{{{\text {T}}}}\). These three methods offer complementary coverage over a broad \(p_{{{\text {T}}}}\) range. The Z+jet balance method provides the most precise results in the low-\(p_{{{\text {T}}}}\) interval between 200 and 500 \({\text {GeV}}\), the \(\gamma \)+jet balance between 500 \({\text {GeV}}\) and 1 \({\text {TeV}}\), and the multijet balance extends to 2.5 \({\text {TeV}}\). Results of the three methods are presented in this section and are combined into a global constraint on the JES in Sect. 8.

5.1 Dijet \(\eta \)-intercalibration

The relative \(\eta \)-intercalibration extends the jet calibration to the forward detector region, \(0.8< |\eta | < 2.5\). It is derived from the differences in the \(p_{{{\text {T}}}}\) balance between a central reference and a forward jet in data and simulations. The \(\eta \)-intercalibration is determined in dijet events using a procedure similar to that used for small-\(R\) jets [53]. The \(p_{{{\text {T}}}}\) balance of the dijet system is characterized by its asymmetry \(\mathcal {A}\), defined in terms of the forward (probe) and central (reference) jet \(p_{{{\text {T}}}}\) (\(p_{{{\text {T}}}} ^{{\mathrm {probe}}}\) and \(p_{{{\text {T}}}} ^{{\mathrm {ref}}}\)) as

$$\begin{aligned} \mathcal {A} = \frac{p_{{{\text {T}}}} ^{{\mathrm {probe}}} - p_{{{\text {T}}}} ^{{\mathrm {ref}}}}{p_{{{\text {T}}}} ^{{\mathrm {avg}}}}, \end{aligned}$$

where \(p_{{{\text {T}}}} ^{{\mathrm {avg}}} = (p_{{{\text {T}}}} ^{{\mathrm {probe}}} + p_{{{\text {T}}}} ^{{\mathrm {ref}}})/2\). The central reference jets are required to be within \(|\eta |<0.8\). The balancing probe jet \(\eta _{\text {det}}\) defines the detector region whose response is being probed. The asymmetry distribution is studied in bins of \(p_{{{\text {T}}}} ^{{\mathrm {avg}}}\) and the probe jet \(\eta _{{\text {det}}}\). In each bin, the relative response difference between the central and forward jets is

$$\begin{aligned} R_{{\mathrm {rel}}} = \left\langle \frac{p_{{{\text {T}}}} ^{{\mathrm {probe}}}}{p_{{{\text {T}}}} ^{{\mathrm {ref}}}} \right\rangle = \frac{2+\langle \mathcal {A}\rangle }{2-\langle \mathcal {A}\rangle }, \end{aligned} \tag{2}$$

where \(\langle \mathcal {A}\rangle \) is the mean value of the asymmetry. The asymmetry distribution is approximately Gaussian, and the mean value is extracted using a Gaussian fit to the core of the distribution.
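As a numerical illustration of Eq. (2), the following Python sketch computes the asymmetry event by event and converts its mean into \(R_{{\mathrm {rel}}}\). The function names and the toy \(p_{{{\text {T}}}}\) values are illustrative assumptions, and the arithmetic mean stands in for the Gaussian fit to the core of the distribution that is used in the analysis.

```python
from statistics import fmean


def asymmetry(pt_probe, pt_ref):
    """Dijet asymmetry A = (pT_probe - pT_ref) / pT_avg for one event."""
    pt_avg = 0.5 * (pt_probe + pt_ref)
    return (pt_probe - pt_ref) / pt_avg


def relative_response(mean_asymmetry):
    """Relative response R_rel = (2 + <A>) / (2 - <A>), as in Eq. (2)."""
    return (2.0 + mean_asymmetry) / (2.0 - mean_asymmetry)


# Toy (probe, reference) pT pairs in GeV; the values are invented.
events = [(285.0, 300.0), (310.0, 305.0), (290.0, 310.0)]

# Assumption: a plain mean replaces the Gaussian fit to the core.
mean_a = fmean(asymmetry(probe, ref) for probe, ref in events)
r_rel = relative_response(mean_a)
```

For a single event the expression reduces exactly to \(p_{{{\text {T}}}} ^{{\mathrm {probe}}}/p_{{{\text {T}}}} ^{{\mathrm {ref}}}\), because \(2 \pm \mathcal {A}\) equals twice the probe (reference) \(p_{{{\text {T}}}}\) divided by \(p_{{{\text {T}}}} ^{{\mathrm {avg}}}\).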

Large-\(R\) jets with \(p_{{{\text {T}}}}\) from 180 \({\text {GeV}}\) to 2 \({\text {TeV}}\) within \(|\eta |<2.5\) are considered. Dijet events in data are selected using several dedicated single-jet triggers based on small-\(R\) jets. Their efficiency has been evaluated for large-\(R\) jets and each trigger is used in its region of full efficiency for those jets. These triggers provide enough events for this technique to be used over a wide range of \(p_{{{\text {T}}}}\). To ensure a \(2\rightarrow 2\) body topology, events with energetic additional radiation are vetoed with an upper cut on the transverse momentum of the third jet \(J_3\), and the leading two jets are required to satisfy a minimum angular separation in azimuth. Both of these requirements are varied in order to derive systematic uncertainties accounting for their impact on the response measurements. These selections and systematic variations are summarized in Table 1. No pile-up jet tagging employing the Jet Vertex Tagger likelihood measure (JVT) [54, 55] is applied for large-\(R\) jets, since in this kinematic region the contamination by pile-up jets is negligible.

Table 1 Summary of the dijet topology selection and systematic variations considered for the \(\eta \)-intercalibration analysis. The label \({\text {J}}_{3}\) refers to the third trimmed \(R=1.0\) jet in the event after ordering the jets in \(p_{{{\text {T}}}}\)

The relative jet-\(p_{{{\text {T}}}}\) response \(R_{{\mathrm {rel}}}\) is shown in Fig. 4 as a function of the large-\(R\) jet pseudorapidity for data, Powheg+Pythia 8, and Sherpa for two \(p_{{{\text {T}}}}\) intervals. The relative jet response as a function of the large-\(R\) jet \(p_{{{\text {T}}}}\) is shown in Fig. 5 for two pseudorapidity ranges of the probe jet. In the central region, the relative responses of all three samples agree by design. The relative response in data increases in the forward region due to features of the experimental response which are not well reproduced in the simulation and hence not accounted for in the simulation-based JES calibration factor \(c_{\text {JES}}\). Compared to the measured response, the prediction remains relatively constant around unity. The difference between the simulated and measured responses reaches about 5% around \(|\eta | = 2.5\). Similar trends are observed for \(R=0.4\) jets in Ref. [9]. In the lower panels of Figs. 4 and 5, the ratio of simulation to data is shown. An interpolation using a filter with a sliding Gaussian kernel across \(\eta _{{\text {det}}}\) yields a smooth function of jet \(p_{{{\text {T}}}}\) and \(\eta _{{\text {det}}}\). The inverse of this smooth function is taken as the \(\eta \)-intercalibration correction factor \(c_{\text {rel}} (p_{{{\text {T}}}},\eta _{{\text {det}}})\), which is applied as a jet four-momentum scale factor.
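The smoothing and inversion step can be sketched as follows. This is a minimal one-dimensional illustration in \(\eta _{{\text {det}}}\) only, whereas the correction in the text is a function of both \(p_{{{\text {T}}}}\) and \(\eta _{{\text {det}}}\). The kernel width, the bin centres, the ratio values, and the function names are all assumptions made for the example.

```python
import math


def gaussian_kernel_smooth(x_points, y_points, x_eval, width):
    """Gaussian-kernel weighted average of y_points, evaluated at x_eval."""
    weights = [math.exp(-0.5 * ((x_eval - x) / width) ** 2) for x in x_points]
    return sum(w * y for w, y in zip(weights, y_points)) / sum(weights)


def eta_intercalibration_factor(eta_centres, mc_over_data, eta_det, width=0.3):
    """Inverse of the smoothed simulation-to-data ratio at eta_det.

    The default width of 0.3 is an arbitrary choice for this sketch.
    """
    smooth = gaussian_kernel_smooth(eta_centres, mc_over_data, eta_det, width)
    return 1.0 / smooth


# Invented simulation-to-data ratios of R_rel in bins of eta_det.
eta_centres = [0.2, 0.6, 1.0, 1.4, 1.8, 2.2]
mc_over_data = [1.00, 1.00, 0.99, 0.98, 0.97, 0.96]

c_rel = eta_intercalibration_factor(eta_centres, mc_over_data, eta_det=1.6)
```

Because the kernel estimate is a weighted average with positive weights, the smoothed ratio always lies between the smallest and largest binned values, so the derived factor cannot overshoot the measured points.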

The uncertainties associated with the \(\eta \)-intercalibration are shown in Fig. 6 for two representative \(p_{{{\text {T}}}}\) bins. The uncertainties associated with the veto on additional radiation and the \(\Delta \phi \) requirement placed on the dijet topology are derived by varying these selection criteria to the values listed in Table 1 and re-deriving the calibration. An additional systematic uncertainty accounts for the choice of event generator and parton shower models; it is derived by comparing the relative jet-\(p_{{{\text {T}}}}\) response for two event generators, Powheg+Pythia 8 and Sherpa. In general, the uncertainties associated with the derived calibration are small, amounting to a \(\sim \) 1% uncertainty within the region of interest for large-\(R\) jets (\(|\eta |<2.0\)). Uncertainties originating from the kinematic requirements made to select events are typically negligible, except in the highest \(p_{{{\text {T}}}} ^{{\mathrm {avg}}}\) bins.

Fig. 4

The relative large-\(R\) jet response \(R_{{\mathrm {rel}}}\) as a function of the large-\(R\) jet detector pseudorapidity \(\eta _{{\text {det}}}\) in two representative average transverse momentum \(p_{{{\text {T}}}} ^{{\mathrm {avg}}}\) bins: (a) \(280~{\text {GeV}}< p_{{{\text {T}}}} ^{{\mathrm {avg}}} < 380~{\text {GeV}}\) and (b) \(550~{\text {GeV}}< p_{{{\text {T}}}} ^{{\mathrm {avg}}} < 700~{\text {GeV}}\). The average response within the reference region \(|\eta _{{\text {det}}} |<0.8\) is unity by construction. In the lower panels, the dotted lines interpolating between Powheg+Pythia 8 markers are obtained by smoothing with a filter using a sliding Gaussian kernel

Fig. 5

The relative large-\(R\) jet response \(R_{{\mathrm {rel}}}\) as a function of the large-\(R\) jet \(p_{{{\text {T}}}}\) in two representative detector pseudorapidity \(\eta _{{\text {det}}} \) bins, one in the forward region and one in the central reference region: (a) \(1.7< \eta _{{\text {det}}} < 1.8\) and (b) \(-0.6< \eta _{{\text {det}}} < -0.4\). In the lower panels, the lines interpolating between Powheg+Pythia 8 markers are obtained by smoothing with a filter using a sliding Gaussian kernel

Fig. 6

Uncertainties associated with the large-\(R\) jet \(\eta \)-intercalibration as a function of detector pseudorapidity \(\eta _{{\text {det}}} \) in two representative average transverse momentum \(p_{{{\text {T}}}} ^{{\mathrm {avg}}}\) bins: (a) \(280~{\text {GeV}}< p_{{{\text {T}}}} ^{{\mathrm {avg}}} < 380~{\text {GeV}}\) and (b) \(550~{\text {GeV}}< p_{{{\text {T}}}} ^{{\mathrm {avg}}} < 700~{\text {GeV}}\). The uncertainties evaluated using variations of the dijet topology selection are negligible relative to the simulation modelling uncertainty, which typically amounts to a 1% uncertainty for large-\(R\) jets within \(0.8< |\eta _{{\text {det}}} | < 2.0\)

5.2 Z+jet balance

For large-\(R\) jets within \(|\eta _{{\text {det}}} |<0.8\), an in situ calibration is derived by examining the \(p_{{{\text {T}}}}\) balance of a large-\(R\) jet and a leptonically decaying Z boson, either \(Z\rightarrow e^+e^-\) or \(Z\rightarrow \mu ^+\mu ^-\) (Fig. 2b). Both of these channels provide a precise, independent reference measurement of the jet energy, either from the inner detector and muon spectrometer tracks used to reconstruct muons or from the well-measured electromagnetic showers and inner detector tracks used to reconstruct electrons. The applicable range of this calibration is limited by the kinematic range where Z boson production is relatively abundant, that is, up to a Z boson \(p_{{{\text {T}}}}\) of about 500 \({\text {GeV}}\). Electrons used to reconstruct the Z boson are required to pass ‘medium likelihood identification’ quality and ‘Loose’ isolation requirements and must be reconstructed within \(|\eta |<2.47\) (excluding the transition region \(1.36< |\eta | < 1.52\) between the barrel and endcap electromagnetic calorimeters) with at least \(20~{\text {GeV}}\) of \(p_{{{\text {T}}}}\)  [56, 57]. Similarly, ‘VeryLoose’ quality and ‘Loose’ isolation requirements are placed on muons, which must be reconstructed within \(|\eta |<2.4\) with \(p_{{{\text {T}}}} >20~{\text {GeV}}\) [58]. The lepton pair must have opposite charge and be kinematically consistent with the decay of a Z boson, requiring the invariant mass of the lepton pair to satisfy \(66< m_{\ell ^+\ell ^-} < 116~{\text {GeV}}\). Large-\(R\) jets studied here are calibrated with the simulation calibration and \(\eta \)-intercalibration described in Sects. 4 and 5.1.

The direct balance method used here closely follows the methodology outlined in Ref. [9]. The average momentum balance between the large-\(R\) jet and Z boson is

$$\begin{aligned} R_{{\mathrm {DB}}} = \left\langle \frac{p_{{{\text {T}}}} ^{\mathrm {J}}}{p_{{{\text {T}}}} ^{{\text {ref}}}} \right\rangle , \end{aligned} \tag{3}$$

where \(p_{{{\text {T}}}} ^{\mathrm {J}}\) is the large-\(R\) jet \(p_{{{\text {T}}}}\) and \(p_{{{\text {T}}}} ^{{\text {ref}}} = p_{{{\text {T}}}} ^{Z}\,\big |\cos \left( \Delta \phi \right) \big |\) is the component of the reference momentum collinear with the jet, with \(\Delta \phi \) being the azimuthal angle between the large-\(R\) jet and reference Z boson. The average value is determined using a Gaussian fit.

Even with an ideal detector, the momentum balance \(R_{{\mathrm {DB}}}\) of Eq. 3 will only equal unity for an ideal \(2 \rightarrow 2\) process. In practice, there tends to be more QCD radiation in the hemisphere opposite to the colour-neutral Z boson, and therefore \(R_{{\mathrm {DB}}}\) tends to be below unity. The event selection imposes a veto on the \(p_{{{\text {T}}}}\) of additional sub-leading jets. A minimum requirement is also imposed on the angular separation \(\Delta \phi \) of the large-\(R\) jet and reference Z boson. Any mismodelling in the jet energy scale may be evaluated using the balance double ratio of \(R_{{\mathrm {DB}}}\) in data and simulation \(R_{{\mathrm {DB}}} ^{{\mathrm {data}}} / R_{{\mathrm {DB}}} ^{{\mathrm {MC}}}\). If the event selection criteria are met and the reference object is well measured and correctly modelled in simulation, any deviation from unity in the double ratio can be attributed to a mismodelling of the jet response in simulation and may be taken as an in situ correction.
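The direct balance and the double ratio can be written down compactly. The sketch below is illustrative only: the event tuples and function names are assumptions, the toy numbers are invented, and a plain mean is used in place of the Gaussian fit.

```python
import math
from statistics import fmean


def reference_pt(pt_boson, delta_phi):
    """Component of the reference pT collinear with the jet."""
    return pt_boson * abs(math.cos(delta_phi))


def direct_balance(events):
    """Mean of pT_jet / pT_ref over events given as (pt_jet, pt_boson, delta_phi).

    Assumption: an arithmetic mean replaces the Gaussian fit of the analysis.
    """
    return fmean(
        pt_jet / reference_pt(pt_boson, dphi) for pt_jet, pt_boson, dphi in events
    )


def double_ratio(r_db_data, r_db_mc):
    """Data-to-simulation double ratio of the direct balance."""
    return r_db_data / r_db_mc


# Invented events: (large-R jet pT [GeV], Z pT [GeV], delta phi [rad]).
data_events = [(285.0, 300.0, 3.05), (240.0, 255.0, 2.95)]
mc_events = [(292.0, 300.0, 3.05), (247.0, 255.0, 2.95)]

ratio = double_ratio(direct_balance(data_events), direct_balance(mc_events))
```

Taking the ratio of data to simulation cancels effects that are common to both, such as the physical imbalance from additional radiation, provided that these are modelled correctly.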

Calibrated anti-\(k_{t}\) \(R=0.4\) jets constructed from electromagnetic-scale topo-clusters are used to veto additional radiation. These jets are required to be separated by \(\Delta R > 1.4\) from the large-\(R\) jet whose response is being probed (\({\mathrm {J}}_1\)), which ensures that there is no overlap. Such small-\(R\) jets with \(p_{{{\text {T}}}} < 60\) \({\text {GeV}}\) must also satisfy a requirement on the jet vertex tagger (JVT) [54], which is designed to reject additional jets produced by pile-up interactions using information from the inner detector. The \(2 \rightarrow 2\) topology selection only accepts events in which every such small-\(R\) jet has \(p_{{{\text {T}}}} < \max (0.1 \, p_{{{\text {T}}}} ^{{\text {ref}}},15~{\text {GeV}})\) and the \(\Delta \phi \) between the large-\(R\) jet and Z boson is greater than 2.8. A summary of the event selection is presented in Table 2. This table also reports the variations associated with each criterion; the full analysis is repeated for each variation and the difference between the varied and nominal results is taken as the systematic uncertainty.
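The nominal topology requirements quoted above can be expressed as a short event filter. The following is a hedged sketch: the dictionary layout and the `passes_jvt` flag are hypothetical, and the JVT working point itself is not specified here, so it is represented by a precomputed boolean.

```python
import math


def delta_phi(phi_a, phi_b):
    """Absolute azimuthal separation wrapped into [0, pi]."""
    return abs((phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi)


def delta_r(eta_a, phi_a, eta_b, phi_b):
    """Angular distance in the (eta, phi) plane."""
    return math.hypot(eta_a - eta_b, delta_phi(phi_a, phi_b))


def passes_topology(large_jet, boson, small_jets):
    """Nominal 2 -> 2 selection: delta phi > 2.8 and a veto on extra jets.

    large_jet has 'eta' and 'phi'; boson has 'pt' and 'phi'; each small jet
    has 'pt', 'eta', 'phi' and a boolean 'passes_jvt' (hypothetical field).
    """
    dphi = delta_phi(large_jet["phi"], boson["phi"])
    if dphi <= 2.8:
        return False
    pt_ref = boson["pt"] * abs(math.cos(dphi))
    veto_threshold = max(0.1 * pt_ref, 15.0)  # GeV
    for jet in small_jets:
        # Only small-R jets well separated from the probe jet are considered.
        if delta_r(jet["eta"], jet["phi"], large_jet["eta"], large_jet["phi"]) <= 1.4:
            continue
        # Below 60 GeV, jets failing JVT are treated as pile-up and ignored.
        if jet["pt"] < 60.0 and not jet["passes_jvt"]:
            continue
        if jet["pt"] >= veto_threshold:
            return False
    return True
```

The systematic variations of Table 2 would correspond to changing the numerical constants in this filter and repeating the measurement.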

Table 2 Summary of the \(2 \rightarrow 2\) topology selection and systematic variations considered for the Z+jet direct balance analysis. The labels J\(_{i}\) refer to the ith leading large-\(R\) jet, and j\(_i\) to the ith leading small-R jet that fulfils \(\Delta R ({\mathrm {J}}_1,{\mathrm {j}}_i) > 1.4\)

Measurements of \(R_{{\mathrm {DB}}}\) are carried out separately in the electron and muon channels. They are found to be consistent and thus combined to provide a single measurement of the JES. The average momentum balance in Z+jet events after this combination is shown in Fig. 7. The balance is found to be consistently below unity as a function of \(p_{{{\text {T}}}} ^{{\text {ref}}}\). The ratio of the predicted balance to the measured balance is consistently 1–4% above unity. The uncertainties associated with this measurement are shown in Fig. 8, where the modelling systematic uncertainty and the statistical uncertainty are the dominant sources over the \(p_{{{\text {T}}}}\) range considered.

Fig. 7

The momentum balance \(R_{{\mathrm {DB}}}\) as a function of the large-\(R\) jet transverse momentum \(p_{{{\text {T}}}}\) in Z+jet events for the combined \(e^+e^-\) and \(\mu ^+\mu ^-\) channels. Only statistical uncertainties are shown. For each \(p_{{{\text {T}}}} ^{{\text {ref}}}\) bin, the measured \(R_{{\mathrm {DB}}}\) is plotted against the average jet \(p_{{{\text {T}}}}\) of the bin. The horizontal error bars give an indication of the width of the associated \(p_{{{\text {T}}}} ^{{\text {ref}}}\) bin

Fig. 8

Breakdown of the uncertainties in the JES measurement with the Z+jet direct balance method as a function of the large-\(R\) jet transverse momentum \(p_{{{\text {T}}}}\). The sources include the statistical uncertainty, variations of the generator (simulation modelling), variations of the event selection (pile-up (JVT), sub-leading jet veto, \(\Delta \phi \)), the uncertainties in the energy scale and resolution of electrons (e E-scale and e E-resolution) and muons (\(\mu \) E-scale and \(\mu \) E-resolution), and the uncertainty in the pile-up conditions (\(N_{\text {PV}}\) shift). These uncertainties are also discussed in the context of small-\(R\) jets in Ref. [9]. The lines are obtained by smoothing a binned representation of these uncertainties using a sliding Gaussian kernel

5.3 \(\gamma \)+jet balance

The large-\(R\) jet energy scale can be measured using the \(\gamma \)+jet final state (Fig. 2b). This method exploits the fact that the energy of photons is measured more precisely than that of jets. As the cross-section for this process is larger than that for Z+jet production, this balance technique probes higher large-\(R\) jet \(p_{{{\text {T}}}}\). The \(\gamma \)+jet method is based on the balance between photons and large-\(R\) jets, using the ratio \(R_{{\mathrm {DB}}}\) defined in Eq. (3), where the reference momentum \(p_{{{\text {T}}}} ^{{\text {ref}}} = p_{{{\text {T}}}} ^{\gamma } \big |\cos \left( \Delta \phi \right) \big |\) is the component of \(p_{{{\text {T}}}} ^{\gamma }\) collinear with the jet.

The double ratio of \(R_{{\mathrm {DB}}} ^{\mathrm {data}} / R_{{\mathrm {DB}}} ^{{\mathrm {MC}}}\) measures any residual modelling effects in the jet energy scale calibration. If the reference photon is well measured experimentally and the \(\gamma \)+jet events are correctly modelled in simulation, any deviation from unity in the double ratio can be attributed to a mismodelling of the jet response in the Monte Carlo simulation.

Events are selected using the lowest unprescaled single-photon trigger. The offline selection requires the presence of a photon satisfying the ‘tight’ identification and isolation requirements [59, 60] with at least 140 \({\text {GeV}}\) of \(E_{{\mathrm {T}}}\). This criterion ensures full trigger efficiency. As in the case of Z+jet balance (Sect. 5.2), the presence of significant additional radiation in the event invalidates the assumption of a balanced topology. Events are therefore vetoed if a reconstructed, calibrated \(R=0.4\) jet built from electromagnetic-scale topo-clusters has a \(p_{{{\text {T}}}}\) which satisfies \(p_{{{\text {T}}}} >\max (0.1 \, p_{{{\text {T}}}} ^{{\text {ref}}},15~{\text {GeV}})\). Small-\(R\) jets with \(p_{{{\text {T}}}} < 60\) \({\text {GeV}}\) must also satisfy a JVT requirement. Photons must be separated from reconstructed large-\(R\) jets by \(\Delta \phi ({\mathrm {J}},\gamma )>2.8\). The simulation calibration and \(\eta \)-intercalibration described in Sects. 4 and 5.1 are applied to the large-\(R\) jets studied here.

A photon purity correction is applied to the mean balance results in data to correct for contamination from misidentified jets or electrons that may skew the nominal \(p_{{{\text {T}}}}\) balance. The contamination of the photon sample by fakes is derived from data using the double-sideband, or ABCD, method [61, 62] in the plane spanned by the photon isolation and the photon identification measure. The purity correction results in a shift of the relative \(R_{{\mathrm {DB}}}\) value between data and simulation of about 2%.
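In its simplest form, the double-sideband estimate can be sketched as below. The region labels follow a common convention (A: tight and isolated signal region; B: tight, non-isolated; C: non-tight, isolated; D: non-tight, non-isolated) and are an assumption here, as are the event counts. The sketch also assumes that isolation and identification are uncorrelated for background and that signal leakage into the control regions is negligible; a full analysis may correct for both effects.

```python
def abcd_background(n_b, n_c, n_d):
    """Estimated background yield in the signal region A: N_B * N_C / N_D."""
    return n_b * n_c / n_d


def photon_purity(n_a, n_b, n_c, n_d):
    """Fraction of region-A candidates estimated to be genuine photons."""
    return 1.0 - abcd_background(n_b, n_c, n_d) / n_a


# Invented event counts in the four regions.
purity = photon_purity(n_a=1000, n_b=50, n_c=100, n_d=100)
```

The estimate relies on the ratio of isolated to non-isolated background candidates being the same for tight and non-tight identification, which is what allows the three control regions to predict the background in the fourth.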

In Fig. 9 the result is shown as a function of the reference \(p_{{{\text {T}}}}\) for large-\(R\) jets in the region \(|\eta |<0.8\). The ratio of the predicted response in the simulation to the measured response is shown in the inset below the main panel. As already observed in Sect. 5.2, the ratio of simulation to data is above unity over the whole \(p_{{{\text {T}}}}\) range. These results are included in the in situ calibration that corrects the jet energy response in data.

The uniformity of the large-\(R\) jet response across the detector geometry is shown in Fig. 10, as a validation of the \(\eta \)-intercalibration procedure (Sect. 5.1). The relative response across the detector is constant and well behaved.

Fig. 9

The momentum balance \(R_{{\mathrm {DB}}}\) extracted from \(\gamma \)+jet events in data and simulations as a function of the transverse momentum \(p_{{{\text {T}}}}\) of the large-\(R\) jet. The ratio of the results obtained from the nominal Pythia simulation and from data is shown in the bottom panel. The ratio of Pythia to Sherpa results, taken as a systematic uncertainty associated with modelling, is included in the shaded band in the ratio panel, which also includes statistical and systematic uncertainties from other sources. For each \(p_{{{\text {T}}}} ^{{\text {ref}}}\) bin, the measured \(R_{{\mathrm {DB}}}\) is plotted against the average jet \(p_{{{\text {T}}}}\) of the bin. The horizontal error bars give an indication of the width of the associated \(p_{{{\text {T}}}} ^{{\text {ref}}}\) bin

Fig. 10

The momentum balance \(R_{{\mathrm {DB}}}\) extracted from \(\gamma \)+jet balance distributions in data and simulation as a function of the large-\(R\) jet detector pseudorapidity \(\eta _{{\text {det}}} \). The ratio of the results obtained from the nominal Pythia simulation to the results from data is shown in the bottom panel. The ratio of Pythia to Sherpa results, taken as a systematic uncertainty associated with modelling, is included in the shaded band in the ratio panel, which also includes statistical and systematic uncertainties from other sources

There are three main categories of systematic uncertainties in the \(R_{{\mathrm {DB}}}\) measurement: those related to the modelling of additional QCD radiation which affects the balance, uncertainties associated with the photons [63, 64], and effects due to the presence of pile-up jets. The effects of extra radiation on the balance are assessed by varying the topological selections and the overlap removal as described in Table 3. Repeating the analysis separately using \(\Delta \phi ({\mathrm {J}},{\mathrm {j}})>1.2\) and \(\Delta \phi ({\mathrm {J}},{\mathrm {j}})>1.6\) produces a negligible systematic shift relative to the nominal result. The effects of the photon measurement are assessed by varying the energy scale and resolution of the photon calibration, as well as by varying the measured photon purity in the purity correction. The effects of pile-up jets on the calibration are estimated by varying the JVT selection threshold for the small-R jets. Lastly, the analysis is repeated with Sherpa 2.1 MC samples, in place of the nominal Pythia 8 samples, to assess the modelling uncertainty. As shown in Fig. 11, the overall combined systematic and statistical uncertainty is approximately 1% for the \(p_{{{\text {T}}}}\) range from 150 to 880 \({\text {GeV}}\). The photon energy scale uncertainty is the dominant source over the entire \(p_{{{\text {T}}}}\) range.

Table 3 Summary of the selection and systematic variations considered for the \(\gamma \)+jet direct balance analysis. The label J\(_{1}\) refers to the leading large-\(R\) jet and j\(_1\) to the leading small-\(R\) jet that fulfils \(\Delta R ({\mathrm {J}}_1,{\mathrm {j}}) > 1.4\)

Fig. 11

Systematic uncertainties in the in situ measurement of the jet energy scale obtained with the \(\gamma \)+jet method as a function of the large-\(R\) jet transverse momentum \(p_{{{\text {T}}}}\). The lines shown are obtained by smoothing a binned representation of these uncertainties using a sliding Gaussian kernel

5.4 Multijet balance

The Z+jet and \(\gamma \)+jet techniques provide precise constraints on the jet energy scale for jets with \(p_{{{\text {T}}}}\) up to 1 \({\text {TeV}}\). The energy scale of higher-\(p_{{{\text {T}}}} \) large-\(R\) jets is measured using multijet events. A schematic representation of the event topology used in this method is shown in Fig. 2c. The multijet balance (MJB) method takes advantage of events where an energetic large-\(R\) jet is balanced against a system that consists of multiple lower-\(p_{{{\text {T}}}}\) jets.

For the calibration of large-\(R\) jets the reference \(p_{{{\text {T}}}} ^{{\mathrm {recoil}}}\) is obtained as the four-vector sum of calibrated small-\(R\) anti-\(k_{t}\) jets. The transverse momentum balance is

$$\begin{aligned} R_{{\mathrm {MJB}}} = \left\langle \frac{p_{{{\text {T}}}} ^{{\mathrm {J}}}}{p_{{{\text {T}}}} ^{{\mathrm {recoil}}}} \right\rangle , \end{aligned}$$

where \(p_{{{\text {T}}}} ^{\mathrm {J}}\) is the transverse momentum of the leading large-\(R\) jet and \(p_{{{\text {T}}}} ^{{\mathrm {recoil}}}\) is the magnitude of the vectorial sum of the transverse momenta of the recoil system of small-R jets. The average value of the ratio is taken to be the mean value of a Gaussian fit. The value of \(R_{{\mathrm {MJB}}}\) is measured in data and determined in simulation in several bins of \(p_{{{\text {T}}}} ^{{\mathrm {recoil}}}\). The data-to-simulation double ratio \(R_{{\mathrm {MJB}}} ^{{\mathrm {data}}}/R_{{\mathrm {MJB}}} ^{{\mathrm {MC}}}\) allows estimation of the response for high-\(p_{{{\text {T}}}} \) jets.

Events are selected using single small-\(R\) jet triggers. Each bin of \(p_{{{\text {T}}}} ^{{\mathrm {recoil}}}\) is defined to correspond to a fully efficient single small-\(R\) jet trigger. The triggers used for \(200~{\text {GeV}}< p_{{{\text {T}}}} ^{{\mathrm {recoil}}} < 550~{\text {GeV}}\) are prescaled, whereas an unprescaled jet trigger is used for \(p_{{{\text {T}}}} ^{{\mathrm {recoil}}} > 550~{\text {GeV}}\).

The event selection is summarized in Table 4. For small-\(R\) jets with \(p_{{{\text {T}}}} < 60\) \({\text {GeV}}\) within \(|\eta |<2.4\), the JVT selection is applied to suppress pile-up jets. The large-\(R\) probe jet is required to have \(|\eta _{{\mathrm {det}}}|< 0.8\), while the small-\(R\) jets that constitute the recoil system are required to have \(|\eta _{{\mathrm {det}}}|< 2.8\) and \(p_{{{\text {T}}}} >25\) \({\text {GeV}}\). To select events with multijet recoil systems, the leading jet in the recoil system (\({\mathrm {j_1}}\)) is allowed to carry no more than 80% of the total transverse momentum of the recoil system. This selection ensures that the recoil system consists of several jets with lower \(p_{{{\text {T}}}}\) than the large-\(R\) jet, each of which is well calibrated by the small-\(R\) jet in situ techniques [9]. The angle \(\alpha \) in the azimuthal plane between the leading large-\(R\) jet and the vector defining the recoil system is required to satisfy \(|\alpha -\pi |< 0.3\). The \(\Delta R\) distance \(\beta \) between the leading large-\(R\) jet and the nearest small-\(R\) jet from the recoil system is required to be greater than 1.5. The simulation calibration and \(\eta \)-intercalibration described in Sects. 4 and 5.1 are applied to the large-\(R\) jets studied using this technique.
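The construction of the recoil system and the nominal requirements listed above can be sketched as follows. The dictionary layout and function names are assumptions, the JVT requirement is omitted for brevity, and the plain `eta` field stands in for the detector pseudorapidity. The requirement of at least two recoil jets follows the caption of Fig. 12.

```python
import math


def delta_phi(phi_a, phi_b):
    """Absolute azimuthal separation wrapped into [0, pi]."""
    return abs((phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi)


def recoil_system(jets):
    """(pT, phi) of the vector sum of the jets' transverse momenta."""
    px = sum(j["pt"] * math.cos(j["phi"]) for j in jets)
    py = sum(j["pt"] * math.sin(j["phi"]) for j in jets)
    return math.hypot(px, py), math.atan2(py, px)


def select_recoil_jets(small_jets):
    """Small-R jets entering the recoil: pT > 25 GeV and |eta| < 2.8."""
    return [j for j in small_jets if j["pt"] > 25.0 and abs(j["eta"]) < 2.8]


def passes_mjb_selection(large_jet, small_jets):
    """Nominal multijet-balance topology requirements (JVT not modelled)."""
    recoil = select_recoil_jets(small_jets)
    if abs(large_jet["eta"]) >= 0.8 or len(recoil) < 2:
        return False
    pt_recoil, phi_recoil = recoil_system(recoil)
    leading_fraction = max(j["pt"] for j in recoil) / pt_recoil
    alpha = delta_phi(large_jet["phi"], phi_recoil)
    beta = min(
        math.hypot(j["eta"] - large_jet["eta"], delta_phi(j["phi"], large_jet["phi"]))
        for j in recoil
    )
    return leading_fraction <= 0.8 and abs(alpha - math.pi) < 0.3 and beta > 1.5


def multijet_balance(large_jet, small_jets):
    """Per-event ratio pT_J / pT_recoil entering the average R_MJB."""
    pt_recoil, _ = recoil_system(select_recoil_jets(small_jets))
    return large_jet["pt"] / pt_recoil
```

In this sketch the recoil \(p_{{{\text {T}}}}\) is the magnitude of the transverse vector sum, so two recoil jets that are not collinear give a recoil \(p_{{{\text {T}}}}\) smaller than the scalar sum of their transverse momenta.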

Table 4 Summary of the event selection and systematic variations considered for the multijet direct balance analysis. The label j\(_i\) refers to the ith leading small-R jet

Figure 12 shows the distribution of \(R_{{\mathrm {MJB}}}\) as a function of the large-\(R\) jet \(p_{{{\text {T}}}}\). The balance in data decreases from approximately 1.01 at \(p_{{{\text {T}}}}\) = 300 \({\text {GeV}}\) to about 0.99 for jets with \(p_{{{\text {T}}}}\) = 2 \({\text {TeV}}\). The simulation shows a similar downward trend. The response in simulations is 2% higher than in data, consistent with the findings of the other methods where they overlap.

The total uncertainty in the \(R_{{\mathrm {MJB}}}\) measurement is approximately \(\pm \,2\%\) or lower for \(p_{{{\text {T}}}} < 2~{\text {TeV}}\). The uncertainty in the energy scale of the recoil jets, derived with the small-\(R\) in situ procedure, is propagated through the large-\(R\) MJB procedure. Uncertainties associated with high-\(p_{{{\text {T}}}}\) jets in the recoil system which lie beyond the region covered by the \(R=0.4\) in situ analyses are derived from measurements of the calorimeter response to isolated single charged particles [65]; these are also propagated through this large-\(R\) jet analysis to provide coverage at the highest values of jet \(p_{{{\text {T}}}}\) (> 1 \({\text {TeV}}\)). No assumption is made about the flavour of the recoil jets (originating from a gluon, a light quark, or a heavy-flavour quark). This lack of knowledge is a source of systematic uncertainty. The uncertainty in the multijet-balance observable due to the jet flavour response is evaluated using a correlated propagation of the small-\(R\) jet flavour response uncertainties, i.e. all jets are shifted simultaneously.

In addition to the jet calibration and uncertainties in the reference scale, the event selection criteria and the modelling in the event generators directly affect the \(p_{{{\text {T}}}}\) balance used to obtain the multijet-balance results. The impact of the event selection criteria is investigated by shifting each event selection criterion up and down by a specified amount and observing the change in the multijet-balance variable. Using an approach to systematic uncertainties similar to that in the small-\(R\) in situ analysis, the transverse momentum threshold for recoil jets is shifted by ± 5 \({\text {GeV}}\), the requirement on \(p_{{{\text {T}}}} ^{{\mathrm {j1}}}/p_{{{\text {T}}}} ^{{\mathrm {recoil}}}\) by ± 0.1, the requirement on the angle \(\alpha \) by ± 0.1, and the requirement on \(\beta \) by ± 0.4. The uncertainty due to modelling of multijet events in simulations is estimated from the largest difference between the multijet-balance results obtained from the nominal Pythia 8 simulation and those obtained from Sherpa v2.1 and Herwig 7. Figure 13 shows the breakdown of the fractional uncertainties in the jet energy scale derived from this method. Various uncertainties propagated from the reference jet system dominate the measurement across the entire \(p_{{{\text {T}}}}\) range.
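The two kinds of variation described above can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the function names and the numerical values are invented, and the way the up and down shifts are subsequently combined is not specified here, so both are simply returned.

```python
def selection_shifts(nominal, varied_up, varied_down):
    """Change in the balance when one selection criterion is shifted up or down."""
    return varied_up - nominal, varied_down - nominal


def modelling_uncertainty(nominal, alternatives):
    """Largest absolute difference between the nominal and alternative generators."""
    return max(abs(alt - nominal) for alt in alternatives)


# Invented R_MJB values in one pT bin.
r_nominal = 1.000                      # nominal generator, nominal selection
shift_up, shift_down = selection_shifts(r_nominal, 1.002, 0.999)
model_unc = modelling_uncertainty(r_nominal, [1.004, 0.992])
```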

Fig. 12

Mean transverse momentum balance \(R_{{\mathrm {MJB}}}\) for leading-\(p_{{{\text {T}}}}\) large-\(R\) jets (\(|\eta |< 0.8\)) balanced against a system of at least two small-\(R\) jets (\(p_{{{\text {T}}}} \ge 25\) \({\text {GeV}}\), \(|\eta |< 2.8\)) as a function of the large-\(R\) jet transverse momentum \(p_{{{\text {T}}}}\). The measured balance is compared with the prediction of Monte Carlo simulations based on the event generators Pythia 8, Sherpa 2.1, and Herwig 7. Below, the ratio of response measurements in data and simulation is presented. The shaded band indicates the total uncertainty of the measurement, described in detail in the text. For each \(p_{{{\text {T}}}} ^{{\mathrm {recoil}}}\) bin, the measured \(R_{{\mathrm {MJB}}}\) is plotted against the average jet \(p_{{{\text {T}}}}\) of the bin. The horizontal error bars give an indication of the width of the associated \(p_{{{\text {T}}}} ^{{\mathrm {recoil}}}\) bin

Fig. 13

The fractional uncertainty in \(R_{{\mathrm {MJB}}}\) as a function of the large-\(R\) jet transverse momentum \(p_{{{\text {T}}}}\). The lines shown are obtained by smoothing a binned representation of these uncertainties using a sliding Gaussian kernel

6 In situ jet mass calibration

In this section, two methods to derive an in situ calibration for the large-\(R\) jet mass are presented. The first method, known as the \(R_{{\text {trk}}}\) method, relies on the tracker to provide an independent measurement of the jet mass scale and its associated uncertainty. The second method, known as forward folding, fits the mass peaks and jet mass response of the W boson and top quark to measure the relative energy and mass scales and resolutions between data and simulations. Both measurements are performed after applying the in situ calibration for the energy scale, which also affects the jet mass scale. The results in this section are combined into a global jet mass calibration, detailed in Sect. 8.

6.1 Calorimeter-to-tracker response ratios

The calorimeter-to-tracker response double-ratio method (or \(R_{{\text {trk}}}\) method) is built around the fact that the ATLAS detector provides two independent measurements of the properties of the same jet from the calorimeter and the tracker [3]. Jets formed from inner detector tracks only take into account the hits from their charged-particle constituents. Calibrated jets formed from energy depositions within the calorimeter provide a measure of the properties from the full shower. The average calorimeter-to-track jet response

$$\begin{aligned} R_{{\text {trk}}} = \left\langle \frac{p_{{{\text {T}}}} ^{{\mathrm {calo}}}}{p_{{{\text {T}}}} ^{{\mathrm {track}}}}\right\rangle \end{aligned}$$

is proportional to the average calorimeter-to-truth jet response. Therefore, the double ratio of \(R_{{\text {trk}}}\) in data and simulations provides a way to validate the modelling of large-\(R\) jet properties in situ. The ratio of \(R_{{\text {trk}}} \) values determined in data and simulations should be equal to unity for well-modelled observables. Any deviation from this expectation can be taken as a scale uncertainty in the measurement. This method is versatile and allows the determination of uncertainties for several variables, such as the \(p_{{{\text {T}}}}\), mass, and substructure information of large-\(R\) jets. Moreover, the dijet process provides a very large sample, such that the analysis can be performed in a large number of \(p_{{{\text {T}}}}\) and mass or \(m/p_{{{\text {T}}}} \) regions.
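A minimal sketch of the double-ratio logic is given below. It applies equally to \(p_{{{\text {T}}}}\) or mass; the function names and the toy values are assumptions, and the average is a plain mean over jets.

```python
from statistics import fmean


def r_trk(calo_values, track_values):
    """Average calorimeter-to-track ratio of a jet property over a sample."""
    return fmean(c / t for c, t in zip(calo_values, track_values))


def double_ratio(r_trk_data, r_trk_mc):
    """Ratio of R_trk in data to R_trk in simulation."""
    return r_trk_data / r_trk_mc


def scale_uncertainty(ratio):
    """Deviation of the double ratio from unity, taken as a scale uncertainty."""
    return abs(ratio - 1.0)


# Invented calorimeter and track-jet pT values in GeV.
data_ratio = r_trk([500.0, 620.0], [310.0, 380.0])
mc_ratio = r_trk([505.0, 615.0], [305.0, 370.0])
uncertainty = scale_uncertainty(double_ratio(data_ratio, mc_ratio))
```

The absolute value of \(R_{{\text {trk}}}\) is not itself of interest, since track jets only capture the charged component; only its agreement between data and simulation enters the uncertainty.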

Figure 14 shows \(R_{{\text {trk}}}\) as a function of the large-\(R\) jet \(p_{{{\text {T}}}}\) in dijet events for data and several simulation samples. The maximum spread between the two generators and three tracking variations that assume three different types of mismodelling (resolution [66], efficiency within dense environments [67], and alignment [68]) is about 8%. A steady increase in the calorimeter-to-track jet response \(R_{{\text {trk}}}\) with increasing large-\(R\) jet \(p_{{{\text {T}}}}\) is observed, going well beyond the expected ratio of the total and charged transverse momenta of a jet, caused by inefficiencies in the tracker response at high jet \(p_{{{\text {T}}}}\).

Figure 15 shows a breakdown of the uncertainties in the large-\(R\) jet \(p_{{{\text {T}}}}\) derived from this method for large-\(R\) jets with \(m/p_{{{\text {T}}}} \approx 0.2\). The main source of uncertainty across the entire \(p_{{{\text {T}}}}\) range originates from differences between data and the nominal Monte Carlo generator considered in this study. As this uncertainty was expected to be large, the \(R_{{\text {trk}}}\) method is neither included in the in situ JES combination nor used as a source of systematic uncertainty for the JES of large-\(R\) jets. Rather, the \(R_{{\text {trk}}}\) \(p_{{{\text {T}}}}\) results are used as an independent cross-check to validate the JES calibration techniques.

Fig. 14

Measurement of \(R_{{\mathrm {trk}}}^{p_{{{\text {T}}}}}\) as a function of the large-\(R\) jet transverse momentum \(p_{{{\text {T}}}}\) for large-\(R\) jets with \(m/p_{{{\text {T}}}} =0.2\). The large-\(R\) jet \(p_{{{\text {T}}}}\) is corrected using the simulation calibration, \(\eta \)-intercalibration, and a combination of in situ direct balance techniques. Data are compared with three generators and with three tracking variations for the default generator Pythia 8 (shown as a band around these points). The double ratio of \(R_{{\mathrm {trk}}}^{p_{{{\text {T}}}}}\) measured in simulations and data is shown in the lower panel

Fig. 15

The total uncertainty in the relative jet energy scale in data and simulations associated with the \(R_{{\mathrm {trk}}}\) method is plotted as a function of jet transverse momentum \(p_{{{\text {T}}}}\). The large-\(R\) jet \(p_{{{\text {T}}}}\) is corrected using the simulation calibration, \(\eta \)-intercalibration, and a combination of in situ direct balance techniques. The contributions from several sources are indicated. The baseline uncertainty represents the deviation of the double ratio from unity for the baseline simulations. The lines shown are obtained by smoothing a binned representation of these uncertainties using a sliding Gaussian kernel

The same method is also applied to the large-\(R\) jet calorimeter mass, and the result is shown in Fig. 16. The largest difference between the considered generators is \(\sim \) 2 to 3%. Figure 17 shows the various uncertainties in the large-\(R\) jet mass derived from the \(R_{{\text {trk}}}\) mass response for large-\(R\) jets with \(m/p_{{{\text {T}}}} =0.2\). Again, the main source of uncertainty originates from differences between data and the nominal simulation.

Fig. 16

Measurement of \(R_{{\mathrm {trk}}}^m\) as a function of the large-\(R\) jet transverse momentum \(p_{{{\text {T}}}}\) for large-\(R\) jets with \(m/p_{{{\text {T}}}} =0.2\). The large-\(R\) jet \(p_{{{\text {T}}}}\) is corrected using the simulation calibration, \(\eta \)-intercalibration, and a combination of in situ direct balance techniques. Data are compared with three generators and with three tracking variations for the default generator Pythia 8 (shown as a band around these points). The double ratio of \(R_{{\mathrm {trk}}}^m\) measured in simulations and data is shown in the lower panel

Fig. 17

The total uncertainty in the relative jet mass scale between data and simulation associated with the \(R_{{\mathrm {trk}}}\) method is plotted as a function of jet transverse momentum \(p_{{{\text {T}}}}\) for large-\(R\) jets with \(m/p_{{{\text {T}}}} =0.2\). The large-\(R\) jet \(p_{{{\text {T}}}}\) is corrected using the simulation calibration, \(\eta \)-intercalibration, and a combination of in situ direct balance techniques. The contributions from several sources are indicated. The baseline uncertainty represents the deviation of the double ratio from unity for the baseline simulations. The lines shown are obtained by smoothing a binned representation of these uncertainties using a sliding Gaussian kernel

The \(R_{{\mathrm {trk}}}\) method can also be used to study the topology dependence of the response modelling. The double ratio is constructed in two event samples, with different jet flavours (jets originating from light quarks or gluons and jets containing a hadronic top quark decay). The dijet sample used for Fig. 14 is dominated by gluon jets at low transverse momenta, while at higher momenta the fraction of light-quark jets in the sample increases. The \(t\bar{t}\) sample of Sect. 6.2 is enriched in large-\(R\) jets that contain a complete high-\(p_{{{\text {T}}}}\) object’s decay (either a top quark or W boson). In Fig. 18 the double ratios of the two samples are compared for jet \(p_{{{\text {T}}}}\) and jet mass. The jets in the samples correspond to the same pseudorapidity range \(|\eta |<\) 2.0 and the same \(p_{{{\text {T}}}}\) and jet mass intervals. In both samples, the double ratio is constructed with the nominal simulation events, which rely on Pythia 8 for hadronization. As systematic uncertainties are expected to partially cancel out, only statistical uncertainties are shown.

Fig. 18

The simulation/data ratio of \(R_{{\mathrm {trk}}}\) for a large-\(R\) jet \(p_{{{\text {T}}}}\) and b calorimeter mass as a function of the large-\(R\) jet transverse momentum \(p_{{{\text {T}}}}\). Two sets of results are derived from a dijet sample, dominated by light-quark and gluon jets, and a \(t\bar{t}\) sample, where the large-\(R\) jets contain a boosted W boson or top quark. The large-\(R\) jet \(p_{{{\text {T}}}}\) is corrected using the simulation calibration, \(\eta \)-intercalibration, and a combination of in situ direct balance techniques. The jets in both samples correspond to the same pseudorapidity range \(|\eta |<\) 2.0 and the same \(p_{{{\text {T}}}}\) and jet mass intervals. The double ratio is constructed with the nominal Pythia 8 samples for dijet events and Powheg +Pythia 8 samples for the \(t\bar{t}\) sample. The error bars indicate statistical uncertainties

There is a mild tension between the double-ratio results from the two samples. The double ratio in the \(t\bar{t}\) sample is systematically somewhat higher than the equivalent result in the dijet sample. The difference is typically 1% or less, except in the first bin of the double ratio for jet mass. This is significant compared to the statistical uncertainties but is small in comparison with the modelling uncertainties of the \(R_{{\mathrm {trk}}}\) method. Some properties of these two jet populations differ, such as the distribution of their \(m/p_{{{\text {T}}}} \) and their flavour composition, and so it is not expected that the modelling uncertainties will cancel out exactly. No additional uncertainty is assigned to account for the topology dependence.

6.2 Forward folding

A high-purity signal sample of large-\(R\) jets with high-\(p_{{{\text {T}}}}\), hadronically decaying W bosons and top quarks is obtained by selecting \(t\bar{t}\) events in the lepton+jets final state, where a hadronically decaying top quark balances one which decays to a leptonically decaying W boson and b-quark. This sample is used to measure the response for jets in signal-like topologies which contain jets consisting of multiple regions of high energy density [69, 70]. The jet mass response is determined by fits to the W boson and top quark mass peaks in the large-\(R\) jet invariant mass distribution of the hadronically decaying top quark candidate.

The event selection is based on the ATLAS search for \(t\bar{t}\) resonances [71] and is summarized in Table 5. It requires a central high-\(p_{{{\text {T}}}}\), isolated muon, and significant missing transverse momentum (\(E_{{\text {T}}}^{{\text {miss}}}\)) [72]. The W boson transverse mass obtained from \(m_{{\text {T}}}^2=2p_{{\text {T}}}^{{\text {lep}}}E_{{\text {T}}}^{{\text {miss}}}(1-\cos (\Delta \phi ))\), where \(\Delta \phi \) is the azimuthal angle between the charged lepton and the direction of the missing transverse momentum, must be greater than 60 \({\text {GeV}}\). A multivariate b-tagging algorithm is used to identify \(R=0.4\) jets which originate from the decays of b-quarks based on information about the impact parameters of inner detector tracks matched to the jet, the presence of displaced secondary vertices, and the reconstructed flight paths of b- and c-hadrons inside the jet; the 70% signal tagging efficiency working point is used here [73].
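As an illustration of the transverse-mass requirement, the expression above can be evaluated in a few lines of Python. This is a minimal sketch and not part of the ATLAS software: the function names `w_transverse_mass` and `passes_mt_requirement` are hypothetical, the numerical inputs are invented, and the 60 GeV threshold is simply the value quoted in the text.

```python
import math


def w_transverse_mass(pt_lep, met, dphi):
    """Return m_T in GeV from m_T^2 = 2 * pT^lep * ETmiss * (1 - cos(dphi)).

    pt_lep : charged-lepton transverse momentum in GeV
    met    : missing transverse momentum in GeV
    dphi   : azimuthal angle between lepton and missing momentum, in radians
    """
    return math.sqrt(2.0 * pt_lep * met * (1.0 - math.cos(dphi)))


def passes_mt_requirement(pt_lep, met, dphi, threshold=60.0):
    """Apply the requirement quoted in the text (m_T greater than 60 GeV)."""
    return w_transverse_mass(pt_lep, met, dphi) > threshold


# Illustrative inputs: a back-to-back lepton and missing momentum give
# m_T = 2 * sqrt(pt_lep * met), the largest value for fixed momenta.
mt_back_to_back = w_transverse_mass(40.0, 40.0, math.pi)
# A collinear configuration gives m_T = 0 and fails the requirement.
collinear_passes = passes_mt_requirement(40.0, 40.0, 0.0)
```

The transverse mass depends only on transverse quantities, so no assumption about the longitudinal momentum of the neutrino is needed at this stage.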

Table 5 Summary of the event selection for the top quark events decaying into lepton+jets, to be used for top and W mass calibration. The \(p_{{{\text {T}}}}\) bins into which events are divided are also shown

The large-\(R\) jet mass distribution of the highest-\(p_{{{\text {T}}}}\) large-\(R\) jet in the hemisphere opposite to the charged lepton is shown in Fig. 19 for two categories of events, and for both the calorimeter-only and track-assisted jet masses. For large-\(R\) jets with intermediate \(p_{{{\text {T}}}}\) (200 \({\text {GeV}}\) \(< p_{{{\text {T}}}} < \) 350 \({\text {GeV}}\)), in Fig. 19a, c, the decay products of the hadronic W boson are captured in a single large-\(R\) jet. For high-\(p_{{{\text {T}}}}\) jets with \(p_{{{\text {T}}}}\) > 350 \({\text {GeV}}\), in Fig. 19b, d, the complete hadronic top decay is captured in the main large-\(R\) jet. The high-\(p_{{{\text {T}}}}\) W boson and top quark topologies are confirmed by, respectively, vetoing or requiring a b-tagged small-R jet that overlaps with the large-\(R\) jet.

Fig. 19

The distributions of the jet invariant mass for large-\(R\) jets in samples enriched in a, c boosted W bosons and b, d boosted top quarks. The distribution of the calorimeter mass is shown in (a) and (b), and the distribution of the track-assisted mass of the same jets is shown in (c) and (d). The large-\(R\) jet transverse momentum \(p_{{{\text {T}}}}\) is corrected using the simulation calibration, \(\eta \)-intercalibration, and a combination of in situ direct balance techniques. The template estimated from simulations is rescaled to match the observed yield. The lower panels display the data-to-simulation ratio. The error bars on the data represent the statistical uncertainty. The dashed uncertainty band on the simulation template includes the systematic uncertainties due to signal and detector modelling

The track-assisted mass (Eq. (1)) is obtained by scaling the invariant mass of the charged-particle jet by the ratio of the \(p_{{{\text {T}}}}\) of the calorimeter and charged-particle jets. The resulting jet mass distributions in the W boson and top quark large-\(R\) jet samples are presented in Fig. 19c, d. The selection for this second set of plots is entirely based on the properties of the matched calorimeter jet, such that plots (a) and (c) and plots (b) and (d) are populated by the same jets. The track-assisted mass peaks in (c) and (d) are slightly broader than the calorimeter-based mass peaks in (a) and (b) for large-\(R\) jets with a large invariant mass and relatively low \(p_{{{\text {T}}}}\).
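The scaling that defines the track-assisted mass can be written as a one-line function. The sketch below is illustrative only: the function and argument names are assumptions made for this example, and the numerical inputs are invented rather than taken from the measurement.

```python
def track_assisted_mass(m_track, pt_calo, pt_track):
    """Scale the charged-particle jet mass by the calorimeter-to-track pT ratio.

    m_track  : invariant mass of the charged-particle jet (GeV)
    pt_calo  : transverse momentum of the calorimeter jet (GeV)
    pt_track : transverse momentum of the charged-particle jet (GeV)
    """
    if pt_track <= 0.0:
        raise ValueError("charged-particle jet pT must be positive")
    return m_track * pt_calo / pt_track


# Invented example: the charged particles carry two thirds of the jet pT,
# so the charged-particle mass of 50 GeV is scaled up by a factor 1.5.
m_ta = track_assisted_mass(m_track=50.0, pt_calo=300.0, pt_track=200.0)
```

The ratio of transverse momenta corrects for the neutral component that the tracker does not measure, which is why this observable is mainly sensitive to the calorimeter \(p_{{{\text {T}}}}\) response.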

The position and shape of the mass peaks provide information about the large-\(R\) jet mass scale and resolution. Values for the ratio of the response in data and simulations (\(s = R^{m}_{{\mathrm {data}}}/R^{m}_{{\mathrm {MC}}}\)) and the ratio of the resolution in data and simulations (\(r = \sigma ^{m}_{{\mathrm {data}}}/\sigma ^{m}_{{\mathrm {MC}}}\)) are extracted from the jet mass spectrum. These two parameters are extracted simultaneously in a fit referred to as forward folding [10]. This method produces simulation-based predictions of the jet mass spectrum with variable response and resolution. This is achieved by folding particle-level jets with a response function. The default response function is taken from the nominal simulations. The predicted detector-level jet mass spectrum for arbitrary values of s and r is obtained by modifying the response function by

$$\begin{aligned} m^{{\mathrm {fold}}} = s \, m^{{\mathrm {reco}}} + \left( m^{{\mathrm {reco}}} - m^{\text {truth}} \, R_m\left( m^{\text {truth}},p_{{{\text {T}}}} ^{\text {truth}}\right) \right) \,(r - s), \end{aligned}$$

where \(m^{{\mathrm {reco}}}\) is the detector-level large-\(R\) jet mass and \(R_m\) is the large-\(R\) jet mass response. The value of \(R_m\) is obtained from simulations, as discussed in Sect. 4. Typical values of \(R_m\) are in the range 0.8–1.5, depending on jet \(p_{{{\text {T}}}}\) and mass. The forward-folding procedure does not require the response to be Gaussian. The scale factors s and r also modify the non-Gaussian tails of the response function, if these are present in the simulations.
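The folding relation can be applied jet by jet, as in the following minimal Python sketch. It is not the analysis code: the function name `fold_mass` and the numerical inputs are assumptions made for illustration. Rearranging the relation gives \(m^{{\mathrm {fold}}} = s \, m^{\text {truth}} R_m + r \, (m^{{\mathrm {reco}}} - m^{\text {truth}} R_m)\), which shows that s scales the expected detector-level mass while r scales the fluctuation about it.

```python
def fold_mass(m_reco, m_truth, response, s, r):
    """Apply the folding relation to a single simulated jet.

    m_reco   : detector-level jet mass from simulation (GeV)
    m_truth  : particle-level jet mass (GeV)
    response : R_m evaluated at (m_truth, pT_truth), taken from simulation
    s, r     : relative scale and resolution factors under test
    """
    expected = m_truth * response  # mean detector-level mass for this truth jet
    return s * m_reco + (m_reco - expected) * (r - s)


# With s = r = 1 the simulated mass is returned unchanged.
unchanged = fold_mass(90.0, 80.0, 1.0, s=1.0, r=1.0)
# With s = 1 and r = 2 the 10 GeV fluctuation about the expected mass doubles.
widened = fold_mass(90.0, 80.0, 1.0, s=1.0, r=2.0)
# A general (s, r) pair, compared with the rearranged form of the relation.
general = fold_mass(90.0, 80.0, 1.0, s=1.02, r=1.10)
rearranged = 1.02 * 80.0 * 1.0 + 1.10 * (90.0 - 80.0 * 1.0)
```

Because the relation acts on each jet individually, any non-Gaussian tails present in the simulated response are carried through to the folded spectrum.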

The prediction from simulation is fit to the data by minimizing the \(\chi ^2\) built with the predicted and observed distributions. The best-fit values for s and r are taken as the data-to-simulation scale factors for the large-\(R\) jet mass response and jet mass resolution. This method has the advantage that the response for the \(t\bar{t}\) events and events from other Standard Model processes is varied consistently. It was first applied to 2012 data [10]. Further details of the forward-folding procedure are in Refs. [43, 74].
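A toy version of the fit illustrates the idea. The sketch below scans a small grid of (s, r) values, folds a simulated sample for each pair, and keeps the pair with the smallest \(\chi ^2\) with respect to a pseudo-data histogram. Everything in it is an assumption made for illustration: the Gaussian toy sample, the binning, the grid, the Poisson variance taken from the observed counts, and the helper names. The actual measurement also includes backgrounds and systematic uncertainties, which are not modelled here.

```python
import random


def fold(m_reco, expected, s, r):
    """Folding relation for one jet; `expected` stands for m_truth * R_m."""
    return s * m_reco + (m_reco - expected) * (r - s)


def histogram(values, lo, hi, nbins):
    """Count entries in nbins equal-width bins over [lo, hi)."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), nbins - 1)] += 1
    return counts


def chi2(observed, predicted):
    """Chi-square with an assumed Poisson variance, floored at one entry."""
    return sum((o - p) ** 2 / max(o, 1.0) for o, p in zip(observed, predicted))


def scan(data_counts, sim_jets, s_grid, r_grid, lo, hi, nbins):
    """Return (chi2, s, r) for the grid point that best matches the data."""
    n_data = sum(data_counts)
    best = None
    for s in s_grid:
        for r in r_grid:
            folded = [fold(m, e, s, r) for m, e in sim_jets]
            pred = histogram(folded, lo, hi, nbins)
            norm = n_data / max(sum(pred), 1)  # rescale template to observed yield
            value = chi2(data_counts, [norm * p for p in pred])
            if best is None or value < best[0]:
                best = (value, s, r)
    return best


# Toy simulation: a W-like peak at 80 GeV with an 8 GeV detector resolution.
rng = random.Random(1)
sim_jets = [(rng.gauss(80.0, 8.0), 80.0) for _ in range(5000)]
# Pseudo-data built from the same sample with known scale factors (closure test).
pseudo_data = histogram([fold(m, e, 1.02, 1.10) for m, e in sim_jets], 40.0, 120.0, 40)
best_chi2, best_s, best_r = scan(
    pseudo_data, sim_jets, [0.98, 1.00, 1.02, 1.04], [0.9, 1.0, 1.1, 1.2], 40.0, 120.0, 40
)
```

In this closure test the scan recovers the injected values because the pseudo-data are built from the same simulated jets. A grid scan is used only for transparency; a numerical minimizer would normally replace it.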

The results of the fits are shown in Fig. 20. The data sample is divided into several \(p_{{{\text {T}}}}\) bins. The W boson peak is fitted in two intervals: 200 \({\text {GeV}}\) \(<~p_{{{\text {T}}}} < \) 250 \({\text {GeV}}\) and 250 \({\text {GeV}}\) \(<~p_{{{\text {T}}}} < \) 350 \({\text {GeV}}\). The top quark peak is fitted for \(p_{{{\text {T}}}}\) between 350 and 500 \({\text {GeV}}\) and between 500 \({\text {GeV}}\) and 1 \({\text {TeV}}\). The small error bars on the points represent the statistical uncertainty, and the larger error bars represent the total uncertainty. The dominant systematic effect is expected to be due to the modelling of top quark pair production, estimated by repeating the analysis with Powheg + Herwig 7, Sherpa, and several variations of the generator settings that regulate the probability of hard initial- and final-state radiation.

Fig. 20

Summary of the in situ measurements of the large-\(R\) jet mass response in \(t\bar{t}\) events with a lepton+jets final state as a function of the large-\(R\) jet transverse momentum \(p_{{{\text {T}}}}\). The large-\(R\) jet \(p_{{{\text {T}}}}\) is corrected using the simulation calibration, \(\eta \)-intercalibration, and a combination of in situ direct balance techniques. The closed circles correspond to the JMS and JMR of trimmed large-\(R\) jets reconstructed from calorimeter clusters. The open circles represent the equivalent result for the track-assisted mass. The dashed lines, corresponding to ±1% for the JMS and ±10% for the JMR, are drawn for reference. The results in the first two \(p_{{{\text {T}}}}\) bins (200 \({\text {GeV}}\) \(< p_{{{\text {T}}}} < \) 250 \({\text {GeV}}\) and 250 \({\text {GeV}}\) \(< p_{{{\text {T}}}} < \) 350 \({\text {GeV}}\)) correspond to a sample of high-\(p_{{{\text {T}}}}\) W bosons, and the highest two bins (350 \({\text {GeV}}\) \(<~p_{{{\text {T}}}} < \) 500 \({\text {GeV}}\) and 0.5 \({\text {TeV}}\) \(<~p_{{{\text {T}}}} < \) 1 \({\text {TeV}}\)) correspond to high-\(p_{{{\text {T}}}}\) top quarks. In each subsample, the JMS and JMR are extracted simultaneously in a two-parameter fit to the mass distribution. The statistical and total uncertainties are indicated with the small and large error bars on the data points, respectively

An in situ calibration is also derived for the track-assisted mass in a completely analogous fashion. The JMS and JMR results are shown with open circles in Fig. 20. The statistical and systematic uncertainties are indicated on the data points. The systematic uncertainties are dominated by modelling uncertainties and are expected to be strongly correlated between the two measurements. The in situ scales of the two mass measurements are found to be within 1% for all points and within 0.5% for three out of four. As the track-assisted mass is primarily sensitive to the \(p_{{{\text {T}}}}\) response of the calorimeter, this level of agreement implies that the \(p_{{{\text {T}}}}\) and mass scales are closely connected for these high-mass jets with relatively low \(p_{{{\text {T}}}}\).

Measurements of the \(p_{{{\text {T}}}}\) response of high-\(p_{{{\text {T}}}}\) W bosons or top quarks can be obtained directly by fitting the balance distribution of the two top quark candidates. This provides a cross-check of the direct balance methods discussed previously in Sects. 5.1–5.4 in a topology with a very different radiation pattern. The reference system is formed by the b-jet, the charged lepton, and the neutrino from the semileptonic top quark decay. It is reconstructed by adding the four-vectors of the charged lepton, the leading (and possibly b-tagged) small-\(R\) jet in a cone of size \(\Delta R =\) 1.5 around the charged lepton, and the neutrino [75]. The transverse momentum of the neutrino is inferred by assigning the \(E_{{\text {T}}}^{{\text {miss}}}\) to the neutrino \(p_{{{\text {T}}}}\), and its \(p_z\) can be reconstructed using a W-mass constraint (but does not affect the balance measurement). The resulting balance distribution of the probe jet \(p_{{{\text {T}}}}\) and the recoiling semileptonic top quark decay system has a distinctive peak around 1. The peak position is sensitive to the large-\(R\) jet energy scale, and its width is sensitive to the resolution. Measurements of the relative jet energy scale and resolution obtained by fitting the balance distribution with the same forward-folding technique are shown in Fig. 21, after the application of the in situ JES calibration derived from light quark and gluon jets (Sect. 5). The results are compatible with unit JES within the precision of the measurement. This provides another confirmation that the Monte Carlo modelling of the response of high-\(p_{{{\text {T}}}}\), hadronically decaying W bosons or top quarks is adequate within 2–3%, and that a calibration derived from jets without hard substructure is applicable to topologies with hard substructure.

Fig. 21

Summary of the in situ measurements of the large-\(R\) jet response in \(t\bar{t}\) events with a lepton+jets final state as a function of the large-\(R\) jet transverse momentum \(p_{{{\text {T}}}}\). The closed circles correspond to the JES and JER of trimmed large-\(R\) jets reconstructed from calorimeter clusters. The large-\(R\) jet \(p_{{{\text {T}}}}\) is corrected using the simulation calibration, \(\eta \)-intercalibration, and a combination of in situ direct balance techniques. The error bars represent the total uncertainty. Statistical uncertainties are indicated with the inner error bar (only visible on some of the points). The dashed lines, corresponding to ± 1% for the JES and ± 10% for the JER, are drawn for reference. The results in the first two \(p_{{{\text {T}}}}\) bins (200 \(< p_{{{\text {T}}}} < \) 250 \({\text {GeV}}\) and 250 \(< p_{{{\text {T}}}} < \) 350 \({\text {GeV}}\)) correspond to a sample of high-\(p_{{{\text {T}}}}\) W bosons, and the highest two bins correspond to high-\(p_{{{\text {T}}}}\) top quarks

7 Measurement of the large-\(R\) jet \(p_{{{\text {T}}}}\) resolution

The in situ measurement of the ATLAS jet \(p_{{{\text {T}}}}\) resolution (see footnote 4) relies on a measurement that exploits the momentum balance between the leading and sub-leading large-\(R\) jets in dijet events. This measurement follows the event selection criteria outlined for the \(\eta \)-intercalibration provided in Sect. 5.1, including the trigger strategy. The simulation calibration and \(\eta \)-intercalibration described in Sects. 4 and 5.1 are applied to the large-\(R\) jets studied here, and the large-\(R\) jet \(p_{{{\text {T}}}}\) is also corrected using the combination of the in situ direct balance techniques discussed in Sects. 5.2, 5.3, and 5.4, which is presented in Sect. 8.

The asymmetry distribution of Sect. 5.1 is studied in dijet events in bins of the dijet system \(p_{{{\text {T}}}} ^{\text {avg}}\) and the probe large-\(R\) jet \(\eta _{{\text {det}}} \). The width of the asymmetry distribution depends on the resolution of the jet \(p_{{{\text {T}}}}\) measurement and on the intrinsic particle-level width, which arises due to balance fluctuations and out-of-cone effects. Since the latter effect is uncorrelated with the detector response, the component of the asymmetry width due to the detector resolution can be determined by subtracting in quadrature the asymmetry width of particle-level (‘truth-level’) jets from that of reconstructed jets, giving

$$\begin{aligned} \sigma _{\mathcal {A},{\text {det}}}&= \sqrt{\sigma _{\mathcal {A},{\text {reco}}}^2 - \sigma _{\mathcal {A},{\text {truth}}}^2}. \end{aligned}$$

The jet energy resolution is measured in two \(\eta _{{\text {det}}}\) bins: the central reference region \(|\eta _{{\text {det}}} | < 0.8\), denoted “ref”, and a forward region \(0.8< |\eta _{{\text {det}}} | < 2.0\), denoted “fwd”. If both large-\(R\) jets are within the central reference region, they have the same \(p_{{{\text {T}}}}\) resolution. In this case, the determination of the probe jet is arbitrary, and the assignment proceeds using a random-number generator. Since both jets contribute the same amount to the asymmetry distribution, the relative jet-\(p_{{{\text {T}}}}\) resolution of the reference region is defined by

$$\begin{aligned} \left( \frac{\sigma _{p_{{{\text {T}}}}}}{p_{{{\text {T}}}}}\right) _{\text {ref}} = \frac{\sigma _{\mathcal {A},{\text {det}}}^{\text {ref}}}{\sqrt{2}}. \end{aligned}$$

The resolution of forward jets is extracted from the width of the asymmetry distribution in events where a central reference jet balances a forward probe jet (in the region \(0.8< |\eta _{{\text {det}}} | < \) 2.0). The result is corrected for the resolution of central jets by subtracting the asymmetry of central dijet systems, giving

$$\begin{aligned} \left( \frac{\sigma _{p_{{{\text {T}}}}}}{p_{{{\text {T}}}}}\right) _{\text {fwd}} = \sqrt{(\sigma _{\mathcal {A},{\text {det}}}^{\text {fwd}})^2 - \frac{(\sigma _{\mathcal {A},{\text {det}}}^{\text {ref}})^2}{2}}. \end{aligned}$$
(4)
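The three relations above can be chained to go from measured asymmetry widths to relative resolutions. The following Python sketch does this under the stated assumption that the particle-level and detector contributions are uncorrelated; the function names and all numerical widths are invented for illustration and are not measured values.

```python
import math


def detector_asymmetry_width(sigma_reco, sigma_truth):
    """Subtract the particle-level asymmetry width in quadrature."""
    diff = sigma_reco ** 2 - sigma_truth ** 2
    if diff < 0.0:
        raise ValueError("particle-level width exceeds reconstructed width")
    return math.sqrt(diff)


def resolution_ref(sigma_det_ref):
    """Both central jets contribute equally, so divide by sqrt(2)."""
    return sigma_det_ref / math.sqrt(2.0)


def resolution_fwd(sigma_det_fwd, sigma_det_ref):
    """Remove the central reference jet contribution, as in Eq. (4)."""
    diff = sigma_det_fwd ** 2 - sigma_det_ref ** 2 / 2.0
    if diff < 0.0:
        raise ValueError("reference contribution exceeds forward width")
    return math.sqrt(diff)


# Invented asymmetry widths for one pT^avg bin.
sigma_det_ref = detector_asymmetry_width(sigma_reco=0.13, sigma_truth=0.05)
sigma_det_fwd = detector_asymmetry_width(sigma_reco=0.16, sigma_truth=0.06)
res_ref = resolution_ref(sigma_det_ref)                 # central jets
res_fwd = resolution_fwd(sigma_det_fwd, sigma_det_ref)  # forward probe jets
```

As a consistency check, if the forward and central detector-level widths are equal, `resolution_fwd` reduces to `resolution_ref`, as expected when both jets have the same resolution.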

Figure 22 shows \(\sigma _{\mathcal {A}}\) for reconstructed- and truth-level dijet systems as a function of \(p_{{{\text {T}}}} ^{\text {avg}}\) in two \(\eta _{{\text {det}}}\) bins, as well as for data. For each of the event generators, the width of the detector-level asymmetry is shown as a solid line, while the particle-level asymmetry is indicated by a dashed line. For forward jets, the additional correction shown in Eq. (4) is applied to account for the effect of the resolution of the large-\(R\) jet within the central reference region.

Fig. 22

Width of the dijet asymmetry distribution obtained using reconstructed (\(\sigma _{\mathcal {A},{\mathrm {reco}}}\)) and particle-level jets (\(\sigma _{\mathcal {A},{\mathrm {truth}}}\)) as a function of the average jet transverse momentum \(p_{{{\text {T}}}} ^{\mathrm {avg}}\). Results are shown a for events where both jets have detector pseudorapidity in the range \(|\eta _{{\text {det}}} | < 0.8\) and b for events where the probe jet has \(0.8< |\eta _{{\text {det}}} | < 2.0\), and the reference jet is still within \(|\eta _{{\text {det}}} | < 0.8\). The measurement is compared with the prediction from simulations based on the three generators Pythia 8, Herwig 7, and Sherpa 2.1. An unweighted average of the three is also shown. The large-\(R\) jet \(p_{{{\text {T}}}}\) is corrected using the simulation calibration, \(\eta \)-intercalibration, and a combination of in situ direct balance techniques. Statistical errors are usually smaller than the size of the marker. The resolution at the particle level is also shown as a dashed line

Following the correction for the particle-level width, the results of a fit to the asymmetry distribution obtained in data and from several event generators (Pythia 8, Herwig 7, and Sherpa 2.1) are shown in Fig. 23, where the measured relative resolution \( \sigma ( p_{{{\text {T}}}} {} ) / p_{{{\text {T}}}} \) is plotted as a function of the average \(p_{{{\text {T}}}}\) of the two jets, \(p_{{{\text {T}}}} ^{{\mathrm {avg}}}\). The correction for the particle-level resolution is estimated using the Pythia sample. The measured resolution in the central region is in fair agreement with the predicted resolution. The resolution of forward jets in data and simulations is compatible within the observed uncertainties. The choice of event generator has a small effect on the resolution.

In Fig. 23, the relative \(p_{{{\text {T}}}}\) resolution predicted by the simulations, taken from the width of the \(p_{{{\text {T}}}} ^{{\mathrm {reco}}} / p_{{{\text {T}}}} ^{{\mathrm {true}}}\) distribution, is compared with the resolution extracted from the asymmetry in simulated events. The difference between the two indicates a bias in the method that is taken as an additional uncertainty (labelled non-closure).

Fig. 23

Comparison of the measured jet \(p_{{{\text {T}}}}\) resolution with the resolution determined in simulation, averaged between different generators as a function of the average jet \(p_{{{\text {T}}}} \) and in two bins of detector pseudorapidity \(\eta _{{\text {det}}} \) from a \(|\eta _{{\text {det}}} | < 0.8\) and from b \(0.8< |\eta _{{\text {det}}} | < 2.0\). The large-\(R\) jet \(p_{{{\text {T}}}}\) is corrected using the simulation calibration, \(\eta \)-intercalibration, and a combination of in situ direct balance techniques. The error band, drawn as a light band, represents the statistical and systematic uncertainties added in quadrature. The determination of the relative resolution using the in situ technique for an average of three simulations and their envelope is also shown as a dark band. Inconsistencies between the resolution determined using the in situ technique and the resolution determined from the response in simulation by matching particle-level jets to reco-level jets (light dotted line) are taken as an additional uncertainty in the measurement. The lines shown are obtained by smoothing a binned representation of these uncertainties using a sliding Gaussian kernel

The total uncertainty in the determination of the JER is shown in Fig. 24 as a function of the average \(p_{{{\text {T}}}}\) and in the two \(\eta _{{\text {det}}}\) regions. A breakdown of the uncertainties into individual sources is presented. The large-\(R\) jet energy scale is varied according to its uncertainty, leading to a 10–15% variation in the measured resolution due to its impact on the asymmetry (labelled as ‘JES uncertainty’). The non-closure uncertainty is found to be a nearly constant 10% effect in the central region and to be 5–10% in the forward region. The \(\Delta \phi \) requirement is also varied by ± 0.5, which has a small effect primarily for low-\(p_{{{\text {T}}}}\) jets. The modelling uncertainty is estimated as the variation of the result when using different generators for the particle-level momentum imbalance, where Pythia 8 is chosen as a nominal sample and Herwig 7 and Sherpa 2.1 are chosen as the variations.

Fig. 24

The relative uncertainty in the ratio of the jet transverse momentum \(p_{{{\text {T}}}}\) resolution measured in dijet events and in simulations as a function of the average jet \(p_{{{\text {T}}}} \) in pseudorapidity \(\eta \) bins a \(|\eta | < 0.8\) and b \(0.8< |\eta | < 2.0\). The large-\(R\) jet \(p_{{{\text {T}}}}\) is corrected using the simulation calibration, \(\eta \)-intercalibration, and a combination of in situ direct balance techniques. Contributions from three sources are estimated separately by propagating the uncertainty in the energy scale to the measurement, by varying the \(\Delta \phi \) selection, and by varying the event generator. The lines shown are obtained by smoothing a binned representation of these uncertainties using a sliding Gaussian kernel

8 Combined large-\(R\) jet calibration results

The measurements of the trimmed large-\(R\) jet response relative to simulation obtained using the different in situ methods presented in Sects. 5 and 6 are combined to determine the relative jet energy and mass scales over a broad range of jet transverse momenta. The combination procedure is described in detail in Ref. [76].

The data-to-simulation response ratios obtained from the \(\gamma \)+jet, Z+jet, and multijet balance methods are combined to produce a jet \(p_{{{\text {T}}}}\)-dependent calibration curve. The uncertainties in the \(p_{{{\text {T}}}}\) calibration are obtained by error propagation of the uncertainties associated with the in situ methods. A jet mass calibration is derived analogously using the jet mass response measurements provided by the forward-folding and \(R_{{\text {trk}}}\) methods.

The measurements of the \(p_{{{\text {T}}}}\) response are performed in bins of the jet transverse momentum (the \(p_{{{\text {T}}}} ^{{\mathrm {ref}}}\) values are translated to jet \(p_{{{\text {T}}}}\)) and evaluated inclusively in mass. The jet mass response combination is performed in bins of the jet transverse momentum and in two bins of the jet mass. The combination proceeds in three steps which take into account correlations between uncertainties and possible inconsistencies between the in situ methods:

  • Simple Monte Carlo method: Pseudo-experiments are created that represent the ensemble of measurements and contain the full data-treatment chain including interpolation and averaging (described in the following steps). These pseudo-experiments are used to consistently propagate all uncertainties into the evaluation of the average. They are generated taking into account all known correlations by coherently shifting all correction factors by one standard deviation. The difference between the shifted-correction result and the nominal result provides an estimate of the propagated systematic uncertainty.

  • Interpolation: The relative \(p_{{{\text {T}}}}\) (mass) response is defined in fine \(p_{{{\text {T}}}} \) bins, separately for each in situ method using interpolating splines based on first- or second-order polynomials.

  • Averaging: The actual combination is carried out using a weighted average of the in situ measurements based on a \(\chi ^2\)-minimization. The weights take into account the statistical and systematic uncertainties, as well as correlations and differing bin sizes. The local \(\chi ^2\) is also useful to define the level of agreement between in situ measurements where they overlap.
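The propagation of a single uncertainty source described in the first step can be sketched as follows. This is a simplified illustration rather than the procedure of Ref. [76]: the measurements are averaged with fixed inverse-variance weights, interpolation is omitted, and the helper names and numbers are invented for the example.

```python
def weighted_average(values, sigmas):
    """Inverse-variance weighted average of the in situ measurements."""
    weights = [1.0 / s ** 2 for s in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def propagated_shift(values, sigmas, one_sigma_shifts):
    """Shift every measurement coherently by one standard deviation of a
    single source and return the change in the combined result."""
    nominal = weighted_average(values, sigmas)
    shifted = weighted_average(
        [v + d for v, d in zip(values, one_sigma_shifts)], sigmas
    )
    return shifted - nominal


# Two invented response ratios in one pT bin with equal uncertainties.
values = [0.97, 0.98]
sigmas = [0.01, 0.01]
# A source that affects only the first method moves the average by half its size.
only_first = propagated_shift(values, sigmas, [0.01, 0.0])
# A source common to both methods moves the average by its full size.
common = propagated_shift(values, sigmas, [0.01, 0.01])
```

The comparison of the two cases shows why correlations matter: a source shared by all methods is not reduced by averaging, whereas a source specific to one method is diluted by that method's weight.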

The uncertainty sources are treated according to the Hessian formalism: each uncertainty source is fully correlated across kinematic regions (i.e. as a function of \(p_{{{\text {T}}}}\) and \(\eta \)) but is uncorrelated with other sources. Sources of uncertainty that affect both the small-\(R\) and large-\(R\) jet in situ calibration are treated as fully correlated. The reduced \(\chi ^2\) is estimated as \(\chi ^2/N_{{\mathrm {dof}}}\), where \(N_{{\mathrm {dof}}}\) is the number of degrees of freedom (in this case, the number of combined measurements contributing to the average in a particular \(p_{{{\text {T}}}}\) bin). In case of disagreement between different in situ measurements, i.e. when the reduced \(\chi ^2\) value is larger than 1, the uncertainty sources are rescaled by \(\sqrt{\chi ^2/N_{{\mathrm {dof}}}}\).
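The averaging and rescaling logic can be illustrated for a single \(p_{{{\text {T}}}}\) bin. The Python sketch below assumes uncorrelated measurements with Gaussian uncertainties, which is a simplification of the actual combination, and takes \(N_{{\mathrm {dof}}}\) to be the number of measurements in the bin as stated in the text. The function name `combine` and the input values are invented for illustration.

```python
import math


def combine(values, sigmas):
    """Weighted average in one bin with chi-square based uncertainty rescaling.

    Returns (mean, uncertainty, chi2 / ndof). The uncertainty is inflated by
    sqrt(chi2 / ndof) only when chi2 / ndof exceeds 1.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    sigma = math.sqrt(1.0 / wsum)
    chi2 = sum(((v - mean) / s) ** 2 for v, s in zip(values, sigmas))
    ndof = len(values)  # counting used in the text: measurements in this bin
    reduced = chi2 / ndof
    scale = max(1.0, math.sqrt(reduced))
    return mean, sigma * scale, reduced


# Two invented measurements in tension: the uncertainty is doubled.
mean_t, sigma_t, reduced_t = combine([0.96, 1.00], [0.01, 0.01])
# Three invented compatible measurements: no rescaling is applied.
mean_c, sigma_c, reduced_c = combine([0.97, 0.98, 0.99], [0.01, 0.01, 0.01])
```

With a single measurement in a bin the \(\chi ^2\) vanishes, so the reduced value is zero and the uncertainty is left unchanged.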

A smoothing procedure using a variable-size sliding interval with a Gaussian kernel is applied to the response ratio and its associated systematic uncertainties. This smoothing removes spikes due to statistical fluctuations in the measurements, as well as discontinuities at the first and last point in a given measurement.
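The effect of such a smoothing can be sketched with a simple kernel-weighted average. The version below uses a fixed kernel width for clarity, whereas the procedure described above uses a variable-size interval; the function name, the width, and the input points are assumptions made for this example.

```python
import math


def gaussian_smooth(x, y, width):
    """Replace each y value by a Gaussian-weighted average of all points.

    x     : bin centres (for example jet pT values)
    y     : values to smooth (for example a response ratio)
    width : standard deviation of the kernel, in the units of x
    """
    if width <= 0.0:
        raise ValueError("kernel width must be positive")
    smoothed = []
    for xi in x:
        weights = [math.exp(-0.5 * ((xj - xi) / width) ** 2) for xj in x]
        total = sum(weights)
        smoothed.append(sum(w * yj for w, yj in zip(weights, y)) / total)
    return smoothed


# Invented points with a single spike in the middle bin.
centres = [1.0, 2.0, 3.0, 4.0, 5.0]
ratio = [1.0, 1.0, 1.1, 1.0, 1.0]
smooth_ratio = gaussian_smooth(centres, ratio, width=1.0)
# A flat input is returned unchanged, up to rounding.
flat = gaussian_smooth(centres, [0.97] * 5, width=1.0)
```

The spike is pulled towards its neighbours while a flat input is preserved, which is the behaviour wanted when removing statistical fluctuations without biasing the overall scale.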

In Fig. 25, the ratio of the jet \(p_{{{\text {T}}}}\) response in data and simulations is shown as a function of the jet transverse momentum. Data points are shown for the \(\gamma \)+jet, Z+jet, and multijet balance methods, and the band corresponds to the result of the combination.

Fig. 25

Data-to-simulation ratio of the average jet \(p_{{{\text {T}}}}\) response as a function of large-\(R\) jet \(p_{{{\text {T}}}} \). The combined result (band) is based on three in situ techniques: the Z+jet balance method (open squares), \(\gamma \)+jet balance method (closed triangles), and the multijet balance (open triangles). The errors represent the statistical (inner error bars) and the total uncertainty (statistical and systematic uncertainties added in quadrature, outer error bars). The results apply to trimmed anti-\(k_t\) jets with \(R=1.0\). The lines shown are smoothed using a sliding Gaussian kernel

The relative weight in the fit of the three methods is shown in Fig. 26. The Z+jet balance makes the largest contribution up to transverse momenta of approximately 500 \({\text {GeV}}\). Between 500 \({\text {GeV}}\) and 1 \({\text {TeV}}\), the \(\gamma \)+jet balance receives the largest weight. At higher \(p_{{{\text {T}}}}\), the multijet balance method acquires more weight in the combination. Beyond 1 \({\text {TeV}}\), it provides the only measurement and extends the jet energy scale beyond 2 \({\text {TeV}}\).

The local \(\chi ^2\) per degree of freedom in Fig. 27 quantifies the level of agreement between the three sets of measurements. The results of the three methods agree in the whole \(p_{{{\text {T}}}}\) range \(0.1~{\text {TeV}}< p_{{{\text {T}}}} < 1\) \({\text {TeV}}\), where all three provide results.

Fig. 26

The weight assigned to different techniques in the combination of in situ measurements of the relative \(p_{{{\text {T}}}}\) response of large-\(R\) jets in data and simulations, as a function of the jet transverse momentum \(p_{{{\text {T}}}}\). For each \(p_{{{\text {T}}}}\) bin, the weights of the Z+jet, \(\gamma \)+jet, and multijet balance methods are shown. The slight discontinuities observed in the weights correspond to the onset of the multijet balance method at \(p_{{{\text {T}}}} \sim \) 300 \({\text {GeV}}\) and to the upper end of the Z+jets method for \(p_{{{\text {T}}}} \sim \) 800 \({\text {GeV}}\)

The combined \(p_{{{\text {T}}}}\) response in data is approximately 3% lower than in the simulation over most of the \(p_{{{\text {T}}}}\) range. The deviation from unity in the data/MC ratio is significant, as the total uncertainty approaches 1% in the intermediate \(p_{{{\text {T}}}}\) region. These observations are consistent with previous in situ measurements of the \(R=0.4\) JES during Run 2 [9] with similar levels of associated uncertainty. At low \(p_{{{\text {T}}}}\), the uncertainty reaches about 1% at 200 \({\text {GeV}}\). Above 1.5 \({\text {TeV}}\), the uncertainty increases, reaching over 2% at 2.4 \({\text {TeV}}\).

A breakdown of the total JES uncertainty is presented graphically in Figs. 28 and 29. This includes uncertainties in \(\gamma \)+jet, Z+jet, and multijet balance methods associated with the simulation modelling, reference system construction and calibration, and the event selection. Furthermore, as the large-\(R\) multijet balance method uses small-R jets as a reference system, all nuisance parameters from the small-R jet calibration enter as uncertainties in the combination presented here.
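How a nuisance parameter enters the combined uncertainty can be sketched with linear error propagation through a weighted average. The sketch assumes that the nuisance parameters are mutually independent and that the weights are fixed. The function name `combined_uncertainty` and all numbers are hypothetical.

```python
import numpy as np

def combined_uncertainty(weights, np_shifts):
    """Propagate nuisance parameters through a weighted average.

    weights:   shape (n_methods,), weight of each method in one pT bin.
    np_shifts: shape (n_nuisance, n_methods), relative shift of each method's
               result when a nuisance parameter is varied by one sigma
               (zero where the method is unaffected).
    Returns (per-nuisance contributions, total from their quadrature sum).
    """
    weights = np.asarray(weights, dtype=float)
    np_shifts = np.asarray(np_shifts, dtype=float)
    # A nuisance parameter shared by several methods shifts them coherently,
    # so its weighted shifts are summed linearly before taking the magnitude.
    contributions = np.abs(np_shifts @ weights)
    total = np.sqrt(np.sum(contributions**2))
    return contributions, total

# Invented weights for the Z+jet, gamma+jet and multijet balance methods
w = [0.5, 0.3, 0.2]
shifts = [
    [0.006, 0.000, 0.000],  # hypothetical Z+jet-only nuisance parameter
    [0.000, 0.005, 0.005],  # hypothetical gamma+jet one, also in the multijet balance
]
contributions, total = combined_uncertainty(w, shifts)
```

The second row illustrates a nuisance parameter that affects one method directly and another through its reference system, so both weighted shifts contribute to the same component.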

Fig. 27

The \(\chi ^2/N_{{\mathrm {dof}}}\) of the combination of in situ measurements of the relative jet-\(p_{{{\text {T}}}}\) response of large-\(R\) jets in data and simulations as a function of the jet transverse momentum \(p_{{{\text {T}}}}\). The \(\chi ^2/N_{{\mathrm {dof}}}\) indicates the level of tension between the results of the \(\gamma \)+jet and multijet balance methods in each \(p_{{{\text {T}}}}\) bin. For transverse momenta beyond 1 \({\text {TeV}}\), only one method is available, and the \(\chi ^2/N_{{\mathrm {dof}}}\) goes to zero

Fig. 28

Breakdown of the combined uncertainty in the large-\(R\) jet \(p_{{{\text {T}}}}\) response as a function of the jet transverse momentum \(p_{{{\text {T}}}} \), for the (a) \(\gamma \)+jet and (b) Z+jet analyses. Contributions are shown for each of the nuisance parameters of the \(\gamma \)+jet and Z+jet balance methods. The vertical axis reflects the uncertainty introduced by a given nuisance parameter in the combination, incorporating the weight of the method from which it originates. The nuisance parameters related to the \(\gamma \)+jet method (both directly and through their effect on the multijet balance) are shown in the left panel, and those of the Z+jets method are shown in the right panel. The lines shown are smoothed using a sliding Gaussian kernel

Fig. 29

Breakdown of the combined uncertainty in the large-\(R\) jet \(p_{{{\text {T}}}}\) response as a function of the jet transverse momentum \(p_{{{\text {T}}}} \). Contributions are shown for nuisance parameters of the multijet balance method (a) originating from the MJB selection and (b) propagated from the small-R jets which constitute the recoil system. The vertical axis reflects the uncertainty introduced by a given nuisance parameter in the combination, incorporating the weight of the method from which it originates. Since the multijet balance method relies on the small-R jet \(p_{{{\text {T}}}}\), nuisance parameters from all associated uncertainties are propagated. The lines shown are smoothed using a sliding Gaussian kernel

The combination of the jet mass response includes results from two methods. Forward folding provides four measurements in the \(p_{{{\text {T}}}}\) range below 1 \({\text {TeV}}\). The \(R_{{\text {trk}}}\) method takes advantage of a large data sample and can be finely binned in mass and \(p_{{{\text {T}}}}\), extending to over 2 \({\text {TeV}}\). The combined result is shown in Fig. 30 for two jet mass intervals: the plot in the upper panel corresponds to the W boson mass window with \(50~{\text {GeV}}< m < 120~{\text {GeV}}\), and the lower panel corresponds to the top quark mass window with \(120~{\text {GeV}}< m < 300~{\text {GeV}}\).

The in situ jet mass calibration factor is defined from the combined mass response shown in Fig. 30 as \(c_m = R_{\text {MC}}^m/R_{\text {data}}^m\). It is applied as a scale factor to the jet mass but does not affect the jet momentum vector. The full calibration applied to large-\(R\) jets in data impacts the reconstructed jet energy, mass, pseudorapidity, and \(p_{{{\text {T}}}}\) according to

$$\begin{aligned} \begin{aligned} E_{{\mathrm {reco}}}&= c_s\sqrt{E_0^2+c_{\text {JMS}}^2\,m_0^2\left( c_m^2-1\right) },\\ m_{{\mathrm {reco}}}&= c_s\,c_{\text {JMS}}\,c_m\,m_0,\\ \eta _{{\mathrm {reco}}}&=\eta _0+\Delta \eta ,\\ p_{{{\text {T}}}} ^{{\mathrm {reco}}}&= \frac{c_s\,\sqrt{E_0^2-c_{\text {JMS}}^2\,m_0^2}}{\cosh {(\eta _0+\Delta \eta )}}, \end{aligned} \end{aligned}$$

where \(c_s =c_{\text {JES}}\,c_{\text {abs}}\,c_{\text {rel}}\) is the product of several calibration factors. The factor \(c_{\text {JES}}\) corresponds to the simulation-based JES calibration, \(c_{\text {rel}}\) to the relative in situ correction obtained from the \(\eta \)-intercalibration, and \(c_{\text {abs}}\) to the absolute in situ correction from the balance methods. All c-factors and the factor \(\Delta \eta \) are smooth functions of the large-\(R\) jet kinematics. The terms \(E_0\), \(m_0\), \(\eta _0\) and \(\vec {p}_0\) refer to the jet properties prior to any calibration, as returned by the trimming algorithm.
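The following minimal sketch applies these factors to a single jet. It assumes the standard relations \(E^2 = |\vec {p}|^2 + m^2\) and \(p_{{{\text {T}}}} = |\vec {p}|/\cosh \eta \), together with the statement above that \(c_m\) rescales the mass without changing the momentum vector. The function name and the numerical values are illustrative only; in practice all factors are functions of the jet kinematics rather than constants.

```python
import math

def calibrate_large_r_jet(e0, m0, eta0, c_jes, c_abs, c_rel, c_jms, c_m, delta_eta):
    """Return (E, m, eta, pT) after applying the calibration factors.

    e0, m0, eta0 are the uncalibrated energy, mass and pseudorapidity.
    """
    c_s = c_jes * c_abs * c_rel            # overall scale factor
    m_scaled = c_jms * m0                  # mass after c_JMS, before c_m and c_s
    p2 = e0**2 - m_scaled**2               # squared momentum, unaffected by c_m
    if p2 < 0.0:
        raise ValueError("unphysical input: scaled mass exceeds energy")
    # Energy recomputed so that c_m changes the mass at fixed momentum;
    # equals c_s * sqrt(E0^2 + c_JMS^2 m0^2 (c_m^2 - 1))
    e_reco = c_s * math.sqrt(p2 + (c_m * m_scaled) ** 2)
    m_reco = c_s * c_jms * c_m * m0
    eta_reco = eta0 + delta_eta
    pt_reco = c_s * math.sqrt(p2) / math.cosh(eta_reco)
    return e_reco, m_reco, eta_reco, pt_reco

# Invented inputs in GeV; the c-factors are placeholders, not measured values
e, m, eta, pt = calibrate_large_r_jet(
    e0=600.0, m0=85.0, eta0=0.5,
    c_jes=1.02, c_abs=0.97, c_rel=1.00, c_jms=1.05, c_m=0.99, delta_eta=0.01,
)
```

A useful consistency check is that \(E_{{\mathrm {reco}}}^2 - m_{{\mathrm {reco}}}^2\) equals \(\left( p_{{{\text {T}}}} ^{{\mathrm {reco}}}\cosh \eta _{{\mathrm {reco}}}\right) ^2\), and that varying `c_m` alone leaves the returned transverse momentum unchanged.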

The measured JMS correction is consistent with unity within the precision of the combined measurements. This suggests that the application of an in situ JES correction is sufficient to correct the JMS of these trimmed large-\(R\) jets in the mass and \(p_{{{\text {T}}}}\) ranges considered here. The level of precision with which the JMS is measured depends on the kinematic region in question. For large-\(R\) jets in the high-mass bin with \(p_{{{\text {T}}}}\) between 400 \({\text {GeV}}\) and 1 \({\text {TeV}}\), the uncertainties are 2–5%. In other kinematic regions the uncertainty is larger, approaching 10% at high \(p_{{{\text {T}}}}\) in both mass bins.

Fig. 30

Data-to-simulation ratio of the average jet mass response as a function of the large-\(R\) jet \(p_{{{\text {T}}}} \). Corrections using a combination of two in situ methods, the \(R_{{\text {trk}}}\) and forward-folding approaches, are applied. The fit is performed for large-\(R\) jet mass in the W mass range 50–120 \({\text {GeV}}\) (upper), and the top mass range 120–300 \({\text {GeV}}\) (lower). The error bars represent the statistical and systematic uncertainties added in quadrature. The results apply to anti-\(k_{t}\) jets with \(R=1.0\) calibrated with the LC+JES+JMS scheme. The lines shown are smoothed using a sliding Gaussian kernel

Fig. 31

Breakdown of the combined JMS uncertainty shown in Fig. 30 as a function of jet transverse momentum \(p_{{{\text {T}}}} \) for the jet mass bin 50–120 \({\text {GeV}}\). Contributions are shown for each of the nuisance parameters of the (a) \(R_{{\text {trk}}}\) and (b) forward-folding methods. The vertical axis reflects the uncertainty introduced by a given nuisance parameter in the combination, incorporating the weight of the method from which it originates. This weight is dominated at high \(p_{{{\text {T}}}}\) by the \(R_{{\text {trk}}}\) method. The lines shown are smoothed using a sliding Gaussian kernel

The contributions of several sources to the uncertainty in the combined jet mass scale are presented in Figs. 31 and 32. In both the \(R_{{\text {trk}}}\) and forward-folding techniques, the leading systematic uncertainties arise from the modelling in the event generators, across most of the \(p_{{{\text {T}}}}\) range and for both mass intervals considered.

Fig. 32

Breakdown of the combined JMS uncertainty shown in Fig. 30 as a function of jet transverse momentum \(p_{{{\text {T}}}} \) for the jet mass bin 120–300 \({\text {GeV}}\). Contributions are shown for each of the nuisance parameters of the (a) \(R_{{\text {trk}}}\) and (b) forward-folding methods. The vertical axis reflects the uncertainty introduced by a given nuisance parameter in the combination, incorporating the weight of the method from which it originates. This weight is dominated at high \(p_{{{\text {T}}}}\) by the \(R_{{\text {trk}}}\) method. The lines shown are smoothed using a sliding Gaussian kernel

9 Conclusion

Several in situ calibration methods are used to measure the response of the ATLAS detector to trimmed large-\(R\) jets using 36.2 \(\text{ fb }^{-1}\) of \(\sqrt{s}=\) 13 \({\text {TeV}}\) proton–proton collision data provided by the LHC and collected by the ATLAS experiment during 2015 and 2016. These methods exploit the transverse momentum balance in events where a jet recoils against a reference system with a precisely known energy scale, the independence of measurements performed with different sub-detectors, or the position and width of known mass peaks. With this ensemble of techniques, dedicated jet energy scale and jet mass scale calibrations are derived for large-\(R\) jets. The results of several techniques applied to a variety of final states are consistent within the uncertainties, indicating that after calibration, the simulations model the flavour dependence of the jet \(p_{{{\text {T}}}}\) and mass response to within a few percent.

The results of all methods are combined taking into account correlations between uncertainties and possible discrepancies between the results of different in situ methods. The combined measurement of the ratio of the energy scales in data and simulations is used to derive an in situ correction to the response, which determines the large-\(R\) jet energy and mass scales. The residual uncertainty in the ratio of the energy scales in data and simulations is 1–2% for transverse momenta from 150 \({\text {GeV}}\) to 2 \({\text {TeV}}\). The precision of the jet mass scale varies from 2 to 10% over the same \(p_{{{\text {T}}}}\) range. The results of the simulations for jet \(p_{{{\text {T}}}}\) and mass resolution are also validated in situ and found to agree with the measured resolution within 10–15%. The in situ JES calibration, derived from light quark and gluon jets, is found to fully correct the energy and mass scales of high-\(p_{{{\text {T}}}}\) W bosons and top quarks to within the precision of the present measurement (1–3%).

Large-\(R\) jets are a vital ingredient of the ATLAS physics programme. This new in situ calibration leads to significantly reduced uncertainties in the reconstructed large-\(R\) jet \(p_{{{\text {T}}}}\) and mass, thus increasing the sensitivity of searches and the precision of Standard Model measurements using large-\(R\) jets.