1 Introduction

The precise reconstruction of the trajectories of charged particles created in proton–proton (pp) and heavy-ion collisions at CERN’s Large Hadron Collider (LHC) is a key ingredient in many of the physics processes studied by the ATLAS Collaboration. Almost every measurement performed using the ATLAS detector [1], from Standard Model processes to searches for new physics phenomena, relies on the accurate reconstruction of charged particles.

In order to reconstruct the trajectories of charged particles, ATLAS uses the Inner Detector (ID) tracking system to provide efficient, robust and precise position measurements of charged particles as they traverse the detector. The energy deposits from charged particles (hits) recorded in individual detector elements of the ID are used to reconstruct their trajectories (tracks) and estimate the associated track parameters. The precision achieved for the track parameters is determined by several factors: the intrinsic resolution of sensitive devices; the knowledge of the magnetic field; the distribution of material in and before the ID and the knowledge of it; and the knowledge of the geometry, i.e. the location and orientation, of the detector elements. The purpose of the detector alignment is to determine, as precisely as possible, the actual geometry of the active detector elements of the tracking system, and to follow changes in the geometry with time.

Poor knowledge of the actual geometry of the active detector elements results in a deterioration of the resolution of reconstructed track parameters. The criteria for the minimum precision required were defined in order to limit the degradation of the resolution of the track parameters for high-momentum tracks to less than 20% in comparison to a perfectly aligned detector [2]. In addition, correlated geometrical distortions can lead to systematic biases in the reconstructed track parameters. Correlated systematic biases can be introduced either by real detector deformations to which the alignment procedure has little sensitivity or by the procedure used to determine the alignment parameters. These correlated biases are referred to as ‘weak modes’ of the alignment.

In this document, the ATLAS ID alignment procedure and its performance during Run 2 of the LHC are presented. A new layer of pixel sensors was included in the detector for Run 2, which posed additional challenges for the alignment of the detector compared to those faced during Run 1 [3, 4]. The greatest new challenge was the short-timescale movement of parts of the detector during data taking.

This paper is organised as follows: a brief description of the ATLAS detector is given in Sect. 2. Section 3 presents the formalism of the ATLAS track-based ID alignment. Section 4 introduces the different alignment levels and Sect. 5 discusses the detector stability and describes the time-dependent alignment. The performance of the ATLAS Run 2 alignment is presented in terms of track parameter biases in Sects. 6 and 7. Concluding remarks are made in Sect. 8.

2 The ATLAS detector

The ATLAS detector [1] at the LHC is a multipurpose particle detector with a forward–backward symmetric cylindrical geometry that covers nearly the entire solid angle around the collision point. The global ATLAS reference frame is a right-handed Cartesian coordinate system, where the origin is at the nominal pp interaction point, corresponding to the centre of the detector. The positive x-axis points to the centre of the LHC ring, the positive y-axis points upwards and the z-axis points along the beam direction. Polar coordinates \((r,\phi )\) are used in the transverse plane, \(\phi \) being the azimuthal angle around the beam pipe. The pseudorapidity is defined in terms of the polar angle \(\theta \) as \(\eta =-\ln \tan (\theta /2)\). Angular distance is measured in units of \(\Delta R \equiv \sqrt{(\Delta \eta )^2 + (\Delta \phi )^2}\).
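The pseudorapidity and angular-distance definitions above can be illustrated with a short sketch. The function names are hypothetical, and wrapping \(\Delta \phi \) into \([-\pi ,\pi )\) is an assumption made here (common practice, but not stated in the text).

```python
import math

def pseudorapidity(theta):
    """eta = -ln tan(theta/2) for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi) (assumed convention)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

# Polar angle corresponding to the edge of the ID acceptance, |eta| = 2.5
theta_edge = 2.0 * math.atan(math.exp(-2.5))
print(pseudorapidity(theta_edge), delta_r(0.0, 0.0, 0.3, 0.4))
```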

ATLAS consists of the ID (described in Sect. 2.1), electromagnetic and hadronic calorimeters, a muon spectrometer and a magnet system. Lead/liquid-argon sampling calorimeters provide electromagnetic energy measurements with high granularity and a steel/scintillator-tile hadronic calorimeter covers the central pseudorapidity range of \(\vert \eta \vert < 1.7\). The endcap and forward regions are instrumented with liquid-argon calorimeters for measurements of both electromagnetic and hadronic showers up to \(\vert \eta \vert = 4.9\). The outer part of the detector consists of a muon spectrometer with high-precision tracking chambers for coverage up to \(\vert \eta \vert = 2.7\), fast detectors for triggering over \(\vert \eta \vert < 2.4\), and three large superconducting toroid magnets with eight coils each. The ATLAS detector has a two-level trigger system to select events for offline analysis [5].

2.1 Inner Detector structure

The ATLAS ID [2, 6] consists of three subdetectors utilising three technologies: silicon pixel detectors, silicon strip detectors and straw drift tubes, all surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field [7]. The ID is designed to reconstruct charged particles within a pseudorapidity range of \(\vert \eta \vert < 2.5\) (see Fig. 1 for a schematic view of the ID barrel region and Table 1 for a list of the main detector characteristics). The material distribution inside the ID has been studied in data through use of hadronic interactions and photon conversion vertices [8, 9]. During the second LHC data-taking run (2015–2018) with pp collisions at a centre-of-mass energy \(\sqrt{s}=13\,\text {TeV}\), the ID collected data with an efficiency greater than 99% [10].

Fig. 1

A 3D visualisation of the structure of the barrel of the ID. The beam pipe, the IBL, the Pixel layers, the four cylindrical layers of the SCT and the three layers of TRT barrel modules consisting of 72 straw layers are shown

Table 1 Summary of the main characteristics of the ID subdetectors. The intrinsic resolutions of the IBL and Pixel sensors are reported along r\(\phi \) and z, while for the SCT and TRT only the resolution along r\(\phi \) is given [1, 11]. For the SCT and TRT the element size refers to the spacing of the read-out strips and the diameter of the straw tube, respectively
Fig. 2

Schematic representation of the ATLAS global reference frame (xyz) and the local reference frame of each component of the ID. The Pixel, IBL, and SCT modules are grouped in the ‘Silicon’ category. For each component, the local-x axis points along the most sensitive direction; the local-z axis points away from the ATLAS centre; and the local-y direction is chosen to complete the right-handed frame. For TRT tubes, the local reference frame is determined by the orientation of the module they are mounted on. For visualisation purposes only, the local reference frame is referred to as \((x',y',z')\) in the drawing

The innermost part of the Inner Detector consists of a high-granularity silicon pixel detector and includes the insertable B-layer (IBL) [11, 12], a new tracking layer added for Run 2 which is closest to the beam line and designed to improve the precision and robustness of track reconstruction. The IBL consists of 280 silicon pixel modules arranged on 14 azimuthal carbon fibre staves surrounding the beam pipe at a radius of 33.25 mm. Each stave is instrumented with 12 two-chip planar modules, covering the region of \(\vert \eta \vert <2.7\), and 8 single-chip modules with 3D sensors [13, 14], four at each end of the stave (\(2.7<\vert \eta \vert <3\)). The remainder of the Pixel detector [2, 6, 15] consists of 1744 silicon pixel modules arranged in three barrel layers and two endcaps with three disks each. Each pixel module comprises 16 front-end chips bump-bonded to the sensor substrate. The barrel modules were assembled on staves of 13 modules each, whilst the endcap modules were assembled directly on the disks. In order to simplify the notation throughout the rest of the paper, the term Pixel is used to refer only to the detector already in place during Run 1 and the new layer is referred to explicitly as the IBL.

The Semiconductor Tracker (SCT) [16,17,18] consists of 4088 silicon strip modules. They are arranged in four barrel layers and two endcaps with nine disks each. Each module consists of two pairs of single-sided strip sensors glued back-to-back with a 40 mrad angle between them. Each module comprises twelve 128-channel chips. Owing to the stringent build tolerances, each SCT module is considered a solid object for the purposes of alignment. The barrel modules are mounted directly on the cylindrical support structures of each layer in 12 rings, whilst the endcap modules are assembled in 3 rings on the disks. The barrel SCT sensors have a uniform strip pitch of \(80\,\upmu \hbox {m}\), while the endcap sensor strips run radially with a \(161.5\,\upmu \hbox {rad}\) angular pitch.

The Transition Radiation Tracker (TRT) [19,20,21] is the outermost subdetector and extends track reconstruction radially outwards to a radius of 1082 mm. It is made of 350,848 gas-filled straw tubes of 4 mm diameter. The tubes are arranged in 96 barrel modules in 3 layers (32 modules per layer) and 40 disks in each endcap. The expected hit resolutions for each subdetector are summarised in Table 1.

2.2 Local coordinate system

The local coordinate system of an individual sensor of the detector is a right-handed frame with its origin at the geometrical centre of the sensor. The local coordinate system for each subsystem component is illustrated in Fig. 2. The convention used is the following: the local-x axis points along the most sensitive direction of the sensor. This corresponds to the shorter-pitch side for Pixel and IBL modules, and to the direction perpendicular to the strip orientation for the SCT. In the silicon detectors, the local-y axis is oriented along the long side of the sensor (i.e. the longer-pitch direction for the Pixel and IBL modules and the strip direction in the SCT), while the local-z direction is orthogonal to the local xy plane. In the case of the TRT, the local-y axis points along the wire: either in the same direction as the global z-axis (barrel) or radially outwards (endcaps). In the barrel, the local-z axis points radially outwards (from the origin of the global frame to the straw centre). In the endcaps, the local-z axis points outwards (parallel to the beam line). The local-x axis is perpendicular to both the TRT wire and the radial direction.

Hits are reconstructed in the local reference frame. The TRT measures the radial distance of the primary ionisation from the wire as \(\sqrt{x^2+z^2}\), taking both x and z in the local frame.

3 Alignment principles and formalism

This section reviews the formalism for in situ alignment of the ATLAS ID using reconstructed tracks. The concept of Global \(\chi ^2\) alignment is introduced, followed by a discussion of ‘weak modes’ (Sects. 3.2.4, 6) and how they can be avoided by adding constraints on track parameters. The section closes with a detailed description of the alignment procedure and its implementation within the ATLAS software.

The approach used is based on the Newton–Raphson method and determines both the trajectory parameters and a set of alignment parameters, \(\varvec{\alpha }\). In this context, \(\varvec{\alpha }\) are chosen as the six degrees of freedom (DoF) of each alignable structure that uniquely define its position and orientation in space. These correspond to three translations \((T_x,T_y,T_z)\) and three rotations \((R_x,R_y,R_z)\). Translations are relative to the origin of the reference frame of each alignable structure and rotations are around the Cartesian axes.

3.1 Track fitting with the Newton–Raphson method

The Newton–Raphson method uses an iterative approach to find the best fit to a set of measurements of a track left in the detector by a charged particle traversing active detector elements. The quality of the fit is characterised by a track \(\chi ^2\), determined from the distances (residuals) between the hits in the detector, which constitute the track measurements, and the fitted track. The trajectory of a track in a magnetic field is parameterised by a set of five parameters. The chosen parameterisation in ATLAS is: \(\varvec{\tau }=(d_0,~z_0,~\phi _0,~\theta _0,~q/p)\), where \(d_0\) and \(z_0\) are the transverse and longitudinal impact parameters and \(\phi _0\) and \(\theta _0\) the azimuthal and polar angles of the track, all defined at the point of closest approach to the z-axis of the reference frame [22]. The ratio q/p is the inverse of the particle momentum (p) multiplied by its charge (q) (see Ref. [4] for more details).

The track \(\chi ^2\) is calculated from all measured track-hit residuals, \(r_i=e_i(\varvec{\tau })-m_i\), where \(m_i\) is the position of the \(i{\mathrm {th}} \) measurement, and \(e_i\) is the position of the intersection of the fitted track with the surface on which the \(i{\mathrm {th}} \) measurement is made. The determination of the intersection position (\(e_i\)) includes the measurement in question, which causes \(r_i\) to be a biased residual. The track \(\chi ^2\) is defined, using vector notation, as

$$\begin{aligned} \chi ^2 = {\varvec{r}}^\top \Omega ^{-1} {\varvec{r}}, \end{aligned}$$
(1)

where \({\varvec{r}}\) is the vector of track residuals and \(\Omega \) is the covariance matrix of the corresponding measurements. The parameters of a track’s trajectory, \(\varvec{\tau }\), are those that minimise this \(\chi ^2\). The minimisation is done using the first and second derivatives of the \(\chi ^2\) with respect to \(\varvec{\tau }\). Defining the derivative \(G=\hbox {d}{\varvec{r}}/\hbox {d}\varvec{\tau }\), the condition for the minimisation of the \(\chi ^2\) is

$$\begin{aligned} \left( \frac{\hbox {d}\chi ^2}{\hbox {d}\varvec{\tau }}\right) ^\top = 2~G^\top \Omega ^{-1}{\varvec{r}}= 0\,\,. \end{aligned}$$
(2)
Fig. 3

Schematic representation of a charged particle crossing detector planes. The measurement, \(m_i\), on the \(i{\mathrm {th}} \) layer is indicated by a red star. Also shown are the fitted track trajectory for a given set of track parameters, \(\varvec{\tau }\) (black line), the position of the intersection of the fitted track with the surface on which the \(i{\mathrm {th}} \) measurement is made, \(e_i(\varvec{\tau })\) (green ellipse), and the residuals, \(r_i\) (blue line)

In practical terms, the values of \(\varvec{\tau }\) satisfying Eq. (2) are found using an iterative procedure by evaluating the first and second derivatives of \(\chi ^2\) with respect to the track parameters of the current iteration, \(\varvec{\tau }_0\). If the derivative G were constant, then the problem would be linear and the solution would be exact. In general, the derivative G depends on the track parameters themselves. Therefore, the procedure is repeated until a convergence criterion is met.
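As an illustration of this iterative procedure, the following minimal sketch fits a two-parameter toy track. The track model \(e_i = y_0 + x_i\tan \phi \), the function name `fit_track` and all numerical values are assumptions chosen for illustration; they are not the ATLAS parameterisation or software. Because G depends on \(\phi \), more than one iteration is needed.

```python
import numpy as np

def fit_track(x, m, sigma, tau0, max_iter=20, tol=1e-12):
    """Toy Newton-Raphson fit of tau = (y0, phi) to hits m on planes at positions x.

    Hypothetical track model (not the ATLAS one): e_i(tau) = y0 + x_i * tan(phi).
    """
    x = np.asarray(x, dtype=float)
    m = np.asarray(m, dtype=float)
    omega_inv = np.eye(len(x)) / sigma**2        # inverse hit covariance
    tau = np.array(tau0, dtype=float)
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        y0, phi = tau
        r = y0 + x * np.tan(phi) - m             # residuals r = e(tau) - m
        G = np.column_stack([np.ones_like(x), x / np.cos(phi) ** 2])  # dr/dtau
        grad = 2.0 * G.T @ omega_inv @ r         # first derivative of chi2, cf. Eq. (2)
        hess = 2.0 * G.T @ omega_inv @ G         # second derivative, d2r/dtau2 neglected
        step = -np.linalg.solve(hess, grad)      # Newton-Raphson update
        tau += step
        if np.max(np.abs(step)) < tol:           # convergence criterion (assumed form)
            break
    return tau, n_iter

planes = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
hits = 0.1 + planes * np.tan(0.3)                # noise-free hits for tau = (0.1, 0.3)
tau_fit, n_iter = fit_track(planes, hits, sigma=0.01, tau0=(0.0, 0.0))
print(tau_fit, n_iter)
```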

The track fit is further improved by taking into account the impact of material interactions on the trajectory of the particle. Energy loss is treated as point-like at the centre of material layers. For hadrons and muons, which are used during the alignment, it is also treated as deterministic because the variance of the energy-loss processes is small. Additional parameters, \(\varvec{\theta }\), are added to account for the effects of multiple Coulomb scattering (MCS) of the particle with the detector components, as detailed in Appendix A. Consequently, the residuals now also depend on \(\varvec{\theta }\), and the \(\chi ^2\) gains a term weighted by the variance of the scattering angles, \(\Theta \):

$$\begin{aligned} \chi ^2_{\mathrm {Track}}= {\varvec{r}}^\top \Omega ^{-1} {\varvec{r}}+ \varvec{\theta }^\top \Theta ^{-1} \varvec{\theta }. \end{aligned}$$
(3)

Thus, \(\chi ^2_{\mathrm {Track}}\) has to be minimised for \(\varvec{\tau }\) and \(\varvec{\theta }\) simultaneously. The derivatives of residuals with respect to track and scattering parameters are defined as \(G \equiv \partial {\varvec{r}}/ \partial \varvec{\tau }\) and \(S \equiv \partial {\varvec{r}}/ \partial \varvec{\theta }\), respectively. In the following, the Global \(\chi ^2\) method for alignment including MCS is described and the following simplified notation is adopted:

$$\begin{aligned} \varvec{\rho }\equiv & {} \left( \begin{array}{c} {\varvec{r}}\\ \varvec{\theta }\end{array} \right) , \quad V \equiv \left( \begin{array}{cc} \Omega &{} 0 \\ 0 &{} \Theta \end{array} \right) , \quad \varvec{\pi }\equiv \left( \begin{array}{c} \varvec{\tau }\\ \varvec{\theta }\end{array} \right) , \quad \mathrm {and}\nonumber \\ H\equiv & {} \left( \begin{array}{cc} G &{} S \\ \frac{\partial \varvec{\theta }}{\partial \varvec{\tau }}=0 &{} \frac{\partial \varvec{\theta }}{\partial \varvec{\theta }}=\mathbf{I } \end{array} \right) . \end{aligned}$$
(4)

It should be noted that refinements are made to the track fit by performing multiple iterations of the fit during which: material effects are recalculated; measurements are recalibrated based on the particle’s incident angle; and outliers are removed.

3.2 The Global \(\chi ^2\) method for alignment

3.2.1 General definition of the Global \(\chi ^2\) method for alignment

The Global \(\chi ^2\) is a track-based alignment method which uses a \(\chi ^2\) built from a large sample of reconstructed tracks and their associated hits in the detector elements being aligned. The alignment parameters are determined by minimising the Global \(\chi ^2\) with respect to the alignment parameters:

$$\begin{aligned} \chi ^2_{\mathrm {Global}}= \sum _i \chi ^2_{\mathrm {Track}\,i}~, \end{aligned}$$
(5)

where \(\chi ^2_{\mathrm {Track}\,i}\) is the \(\chi ^2\) of the \(i{\mathrm {th}} \) track as given by Eq. (3). The residuals used in Eq. (5) depend on the alignment parameters (\(\varvec{\alpha }\)) because both the measurements and the track extrapolations depend on \(\varvec{\alpha }\), the former directly and the latter through the fitted track parameters. Therefore, the minimisation of \(\chi ^2_{\mathrm {Global}}\) with respect to \(\varvec{\alpha }\) uses the total derivative operator with respect to \(\varvec{\alpha }\), which can be expressed as:

$$\begin{aligned} \frac{\text{ d }}{\text{ d }\alpha _i } = \frac{\partial }{\partial \alpha _i } + \sum _{j}\frac{\text{ d }\pi _j }{\text{ d }\alpha _i }\frac{\partial }{\partial \pi _j }~. \end{aligned}$$
(6)

The \(\text{ d }\varvec{\pi }/ \text{ d }\varvec{\alpha }\) term is determined from the condition that, once \(\chi ^2_{\mathrm {Global}}\) is at a minimum, \(\chi ^2_{\mathrm {Track}}\) is also at a minimum with respect to the track parameters:

$$\begin{aligned} \frac{\text{ d }}{\text{ d }\varvec{\alpha }} \frac{\partial \chi ^2_{\mathrm {Track}}}{\partial \varvec{\pi }} = 0~. \end{aligned}$$
(7)

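Expanding the total derivative in Eq. (7) with the operator of Eq. (6) and solving for \(\text{ d }\varvec{\pi }/ \text{ d }\varvec{\alpha }\) gives: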

$$\begin{aligned} \frac{\text{ d }\varvec{\pi }}{\text{ d }\varvec{\alpha }} = - \left( \frac{\partial ^2 \chi ^2_{\mathrm {Track}}}{\partial \varvec{\pi }^2}\right) ^{-1} \frac{\partial ^2 \chi ^2_{\mathrm {Track}}}{\partial \varvec{\alpha }\partial \varvec{\pi }}~, \end{aligned}$$

which allows the nested dependence of the \(\varvec{\pi }\) on \(\varvec{\alpha }\) to be resolved, thereby removing the need to determine both (the track parameters and alignment parameters) simultaneously.

Ignoring second-order derivatives in the residuals, using the covariance matrix of the track parameters, C, expressed as

$$\begin{aligned} C = 2 \left( \frac{\partial ^2\chi ^2_{\mathrm {Track}}}{\partial \varvec{\pi }^2}\right) ^{-1} = \left( H^\top V^{-1} H \right) ^{-1} , \end{aligned}$$
(8)

and defining A as the derivative of the residuals with respect to the alignment parameters:

$$\begin{aligned} A \equiv \frac{\partial \varvec{\rho }}{\partial \varvec{\alpha }}~, \end{aligned}$$
(9)

the total derivative operator with respect to \(\varvec{\alpha }\) can be written as:

$$\begin{aligned} \left( \frac{\text{ d }}{\text{ d }\varvec{\alpha }}\right) ^\top = \left( \frac{\partial }{\partial \varvec{\alpha }} \right) ^\top - A^\top V^{-1} H C \left( \frac{\partial }{\partial \varvec{\pi }} \right) ^{\top }~. \end{aligned}$$

The first- and second-order derivatives of \(\chi ^2_{\mathrm {Global}}\) with respect to \(\varvec{\alpha }\) are thus:

$$\begin{aligned} {\varvec{Y}}\equiv & {} \left( \frac{\text{ d }\chi ^2_{\mathrm {Global}}}{\text{ d }\varvec{\alpha }} \right) ^\top \nonumber \\= & {} 2 \sum _{\mathrm {tracks}} A^\top V^{-1} (V - HCH^\top ) V^{-1} \varvec{\rho }~, \end{aligned}$$
(10)
$$\begin{aligned} {\mathcal {M}}\equiv & {} \frac{\text{ d}^2\chi ^2_{\mathrm {Global}}}{\text{ d }\varvec{\alpha }^2} \nonumber \\= & {} 2 \sum _{\mathrm {tracks}} A^\top V^{-1} (V - HCH^\top ) V^{-1} A~. \end{aligned}$$
(11)

Here, the term \(HCH^\top \) represents the covariance of the track parameters in the measurement space, whereas the covariance of the residuals of the track fit is given by

$$\begin{aligned} R = V - HCH^\top ~. \end{aligned}$$

3.2.2 Newton–Raphson method for Global \(\chi ^2\) alignment

In analogy to the general method for track fitting (Sect. 3.1), an iterative approach is used to solve for the alignment parameters. The first- and second-order derivatives are obtained using Eqs. (10) and (11) and evaluated for an initial set of alignment parameters, \(\varvec{\alpha }_0\). Such an initial geometry description is available from design drawings, survey measurements, or previous alignment results. The alignment corrections to the initial geometry are given by

$$\begin{aligned} {\varvec{X}}\equiv & {} \Delta \varvec{\alpha }= - \left( \left. \frac{\text{ d}^2\chi ^2_{\mathrm {Global}}}{\text{ d }\varvec{\alpha }^2} \right| _{\varvec{\alpha }_0}\right) ^{-1} \left. \left( \frac{\text{ d }\chi ^2_{\mathrm {Global}}}{\text{ d }\varvec{\alpha }}\right) ^{\top }\right| _{\varvec{\alpha }_0} \nonumber \\\equiv & {} - {\mathcal {M}}^{-1} {\varvec{Y}}. \end{aligned}$$
(12)

The above step is repeated for successive iterations until a convergence criterion is met and \(\Delta \varvec{\alpha }\) is negligible. This requires re-fitting the tracks using the updated geometry (initial alignment constants \(\varvec{\alpha }_0\) plus their corrections \(\Delta \varvec{\alpha }\)), to obtain new residuals and new derivatives, and solving again to compute the next set of corrections to the alignment constants.
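A toy version of one such alignment step is sketched below, assuming straight-line tracks through six parallel planes, no MCS (so \(\varvec{\rho }={\varvec{r}}\) and \(\varvec{\pi }=\varvec{\tau }\)), one translational DoF per plane and two outer planes held fixed as references. All names and numbers are hypothetical. Because this toy model is linear, a single evaluation of Eqs. (10)–(12) already recovers the offsets up to statistical noise.

```python
import numpy as np

rng = np.random.default_rng(1)
z = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])       # plane positions along the track
sigma = 0.01                                        # hit resolution
true_offsets = np.array([0.0, 0.05, 0.03, 0.02, 0.04, 0.0])
aligned = [1, 2, 3, 4]                              # planes 0 and 5 are fixed references

n = len(z)
H = np.column_stack([np.ones(n), z])                # d rho / d pi for a straight line
V = sigma**2 * np.eye(n)
V_inv = np.linalg.inv(V)
C = np.linalg.inv(H.T @ V_inv @ H)                  # track covariance, Eq. (8)
weight = V_inv @ (V - H @ C @ H.T) @ V_inv          # V^-1 (V - H C H^T) V^-1
A = np.zeros((n, len(aligned)))
A[aligned, np.arange(len(aligned))] = -1.0          # A = d rho / d alpha

Y = np.zeros(len(aligned))
M = np.zeros((len(aligned), len(aligned)))
for _ in range(2000):
    tau_true = rng.normal(0.0, [1.0, 0.1])          # random intercept and slope
    m = H @ tau_true - true_offsets + rng.normal(0.0, sigma, n)
    tau_fit = C @ H.T @ V_inv @ m                   # track fit in the initial geometry
    rho = H @ tau_fit - m                           # residuals at alpha_0 = 0
    Y += 2.0 * A.T @ weight @ rho                   # Eq. (10)
    M += 2.0 * A.T @ weight @ A                     # Eq. (11)

delta_alpha = -np.linalg.solve(M, Y)                # Eq. (12)
print(delta_alpha)
```

In a realistic, non-linear problem the tracks would be re-fitted with the updated geometry and the step repeated until \(\Delta \varvec{\alpha }\) is negligible.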

3.2.3 Locality ansatz

If the initial track parameters, \(\varvec{\pi }_0\), minimise \(\chi ^2_{\mathrm {Global}}\) for a given \(\varvec{\alpha }_0\), Eq. (10) simplifies to

$$\begin{aligned} \left. \left( \frac{\text{ d }\chi ^2_{\mathrm {Global}}}{\text{ d }\varvec{\alpha }}\right) ^{\top }\right| _{\varvec{\pi }_0, \varvec{\alpha }_0} = 2 \sum _{\mathrm {tracks}} A^\top V^{-1} \varvec{\rho }\, , \end{aligned}$$
(13)

as the term \(H^\top V^{-1}\varvec{\rho }\) is zero. Consequently, if the measurements are independent and V is diagonal, the derivative with respect to a particular parameter \(\varvec{\alpha }_i\) only receives contributions from residuals for which the related entries in the derivative matrix A are non-zero. In other words, if \(\varvec{\alpha }_i\) is an alignment parameter of a given detector module, only the measurements in this module contribute to the first-order derivative of \(\chi ^2_{\mathrm {Global}}\) with respect to \(\varvec{\alpha }_i\). Therefore, contributions to the \(\chi ^2_{\mathrm {Global}}\) from measurements in other subdetectors and MCS effects can be ignored. This useful property is known as the locality ansatz [24] and provides an important simplification for the software implementation.
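The simplification can be checked numerically for a single toy track. The sketch below assumes a straight-line track model without MCS and one translational DoF per plane; names and values are hypothetical. At the fitted track parameters Eq. (10) and Eq. (13) agree, whereas away from the minimum they do not.

```python
import numpy as np

rng = np.random.default_rng(7)
z = np.linspace(0.0, 5.0, 6)
sigma = 0.02
n = len(z)
H = np.column_stack([np.ones(n), z])                # d rho / d pi
V = sigma**2 * np.eye(n)
V_inv = np.linalg.inv(V)
C = np.linalg.inv(H.T @ V_inv @ H)                  # Eq. (8)
A = -np.eye(n)                                      # one translation per plane

m = H @ np.array([0.3, -0.1]) + rng.normal(0.0, sigma, n)
pi_fit = C @ H.T @ V_inv @ m                        # parameters minimising the track chi2

def gradients(pi):
    """Single-track terms of Eq. (10) and Eq. (13), plus H^T V^-1 rho."""
    rho = H @ pi - m
    full = 2.0 * A.T @ V_inv @ (V - H @ C @ H.T) @ V_inv @ rho
    local = 2.0 * A.T @ V_inv @ rho
    return full, local, H.T @ V_inv @ rho

full_fit, local_fit, normal_fit = gradients(pi_fit)
full_off, local_off, normal_off = gradients(pi_fit + np.array([0.01, 0.0]))
print(np.abs(full_fit - local_fit).max(), np.abs(full_off - local_off).max())
```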

3.2.4 Adding constraints on track parameters

It is of particular importance to ensure that the determination of the track parameters is free from systematic biases that can occur due to poorly determined ‘weak modes’ of the alignment. These modes are geometry distortions that leave the \(\chi ^2\) of the fitted tracks nearly unchanged and typically lead to an incorrect solution of the alignment. They can be controlled by imposing constraints on track parameters [25]. Examples of such constraints, discussed in detail in Sects. 6 and 7, are the beam-spot constraint, track parameter constraints from external detector systems (e.g. calorimeters), and constraints determined using reconstructed physics events (e.g. mass constraints from narrow resonances). These constraints are included in the Global \(\chi ^2\) method by adding extra terms to the expression for the \(\chi ^2\) in Eq. (5). For one track the modified contribution to \(\chi ^2\) is

$$\begin{aligned} \chi ^2_{\mathrm {cons}} = \varvec{\rho }^\top V^{-1}\varvec{\rho }+ (\varvec{\pi }- {\varvec{q}})^\top T^{-1} (\varvec{\pi }- {\varvec{q}})~, \end{aligned}$$
(14)

where \({\varvec{q}}\) is a vector defining the constraint on \(\varvec{\pi }\) and T is its covariance matrix.

In the ATLAS implementation, this constraint is implemented by adding a pseudo-measurement on a track [22]. The solution for the alignment parameters is given by Eq. (12), where for each constrained track the covariance matrix is now defined as

$$\begin{aligned} C = 2 \left( \frac{\text{ d}^2\chi ^2_{\mathrm {cons}}}{\text{ d }\varvec{\pi }^2} \right) ^{-1} = \left( H^\top V^{-1} H + T^{-1} \right) ^{-1}. \end{aligned}$$

In this context, the first-order derivative of the Global \(\chi ^2\) is given by

$$\begin{aligned} \left. \left( \frac{\text{ d }\chi ^2}{\text{ d }\varvec{\alpha }} \right) ^{\top } \right| _{\varvec{\alpha }_0}= & {} 2 \sum _{\mathrm {tracks}} \Big [ A^\top V^{-1} (V - HCH^\top ) V^{-1} \varvec{\rho }(\varvec{\alpha }_0) \nonumber \\&- A^\top V^{-1} H C T^{-1}(\varvec{\pi }(\varvec{\alpha }_0) - {\varvec{q}}) \Big ]. \end{aligned}$$
(15)

If the tracks have been re-fitted with the imposed constraint, the locality ansatz drastically simplifies Eq. (15), reducing it to Eq. (13). This property is used in the ATLAS implementation.
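The effect of a track-parameter constraint on the fit can be sketched for a linear toy model with \(\varvec{\rho }=H\varvec{\pi }-{\varvec{m}}\), for which minimising Eq. (14) gives \(\varvec{\pi }=C\,(H^\top V^{-1}{\varvec{m}}+T^{-1}{\varvec{q}})\) with the constrained covariance C defined above. The function name, the straight-line model and the numbers are assumptions for illustration only; \(T^{-1}\) is passed directly so that unconstrained parameters can be given zero weight.

```python
import numpy as np

def constrained_fit(H, V_inv, m, q, T_inv):
    """Minimise (H pi - m)^T V^-1 (H pi - m) + (pi - q)^T T^-1 (pi - q)."""
    C = np.linalg.inv(H.T @ V_inv @ H + T_inv)      # covariance including the constraint
    pi = C @ (H.T @ V_inv @ m + T_inv @ q)
    return pi, C

z = np.array([1.0, 2.0, 3.0, 4.0])
H = np.column_stack([np.ones_like(z), z])           # straight line: intercept and slope
V_inv = np.eye(len(z)) / 0.01**2
m = 0.5 + 0.2 * z                                   # hits favour an intercept of 0.5
q = np.array([0.0, 0.0])                            # constraint: intercept at 0
no_constraint = np.zeros((2, 2))
tight_intercept = np.diag([1.0 / 1e-4**2, 0.0])     # T^-1 acting on the intercept only

pi_free, C_free = constrained_fit(H, V_inv, m, q, no_constraint)
pi_tight, C_tight = constrained_fit(H, V_inv, m, q, tight_intercept)
print(pi_free, pi_tight)
```

With the tight constraint the intercept is pulled to the constrained value and the slope changes to compensate, which is how a constraint can expose a geometry distortion that the unconstrained residuals would absorb.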

3.2.5 Constraints on alignment parameters

Often one has some prior knowledge of the geometry from either survey measurements or mechanical constraints. These constraints can be included by adding terms to the \(\chi ^2\) in Eq. (5). In the general case, one can write

$$\begin{aligned} \chi ^2_{\mathrm {cons}} = \sum _{\mathrm {tracks}} \varvec{\rho }^\top V^{-1}\varvec{\rho }+ (\varvec{\alpha }- {\varvec{a}})^\top W^{-1} (\varvec{\alpha }- {\varvec{a}})~, \end{aligned}$$
(16)

where \({\varvec{a}}\) is a vector defining the constraint on \(\varvec{\alpha }\) and W is its covariance matrix. The added term leads to extended expressions for the first and second derivatives of \(\chi ^2\) with respect to \(\varvec{\alpha }\) (Eqs. (10), (11)):

$$\begin{aligned}&{\varvec{Y}} \longrightarrow {\varvec{Y}} + 2~W^{-1} (\varvec{\alpha }- {\varvec{a}})~,\nonumber \\&{\mathcal {M}} \longrightarrow {\mathcal {M}} + 2~W^{-1}, \end{aligned}$$
(17)

while the solution is computed using \({\varvec{Y}}\) and \({\mathcal {M}}\) in Eq. (12).

The special case when \({\varvec{a}}\equiv \varvec{\alpha }_0\) and W is diagonal, i.e. when the alignment parameters are constrained to their initial values, is further discussed in Sect. 3.3.3.
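A two-parameter toy example illustrates how Eq. (17) regularises a singular system. The matrix below is a hypothetical case in which the tracks measure only the difference between two alignment parameters, so that a common shift is a singular mode; the function name and all values are assumptions.

```python
import numpy as np

def solve_with_constraint(M, Y, alpha, a, sigma_alpha):
    """Apply Eq. (17) for a diagonal W, then solve Eq. (12)."""
    W_inv = np.diag(1.0 / np.asarray(sigma_alpha, dtype=float) ** 2)
    Y_c = Y + 2.0 * W_inv @ (alpha - a)
    M_c = M + 2.0 * W_inv
    return -np.linalg.solve(M_c, Y_c)

k = 1.0e4                                           # statistical weight of the track sample
M = k * np.array([[2.0, -2.0], [-2.0, 2.0]])        # singular: common shift unconstrained
alpha_true = np.array([0.3, 0.1])
alpha0 = np.zeros(2)
Y = M @ (alpha0 - alpha_true)                       # gradient of a quadratic chi2 at alpha0

X = solve_with_constraint(M, Y, alpha0, a=alpha0, sigma_alpha=[1.0, 1.0])
print(np.linalg.matrix_rank(M), X)
```

The well-measured difference between the two parameters is recovered almost unchanged, while the unmeasured common shift stays at its initial value.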

3.2.6 The Local \(\chi ^2\) method

The main advantage of the Global \(\chi ^2\) method arises from its rigorous treatment of correlations between alignable objects through the tracks connecting them. However, this approach becomes technically challenging when the number of alignment parameters is very large, such as in the case of the alignment of individual TRT straws (\(\approx 700{,}000\) parameters). In order to overcome this challenge, a simplified version of the \(\chi ^2\) approach (the Local \(\chi ^2\) method) is used. It is based on the minimisation of the same \(\chi ^2\), Eq. (5), but the implicit dependence on the fitted track parameters is dropped, reducing Eq. (6) to a simpler form:

$$\begin{aligned} \frac{\text{ d }}{\text{ d }\varvec{\alpha }} = \frac{\partial }{\partial \varvec{\alpha }}~. \end{aligned}$$

Consequently, Eqs. (10) and (11) are reduced to:

$$\begin{aligned} \begin{aligned} \left( \frac{\text{ d }\chi ^2_{\mathrm {Local}}}{\text{ d }\varvec{\alpha }} \right) ^\top&= 2 \sum _{\mathrm {tracks}} A^\top \Omega ^{-1} {\varvec{r}}\,\\ \frac{\text{ d}^2 \chi ^2_{\mathrm {Local}}}{\text{ d }\varvec{\alpha }^2}&= 2 \sum _{\mathrm {tracks}} A^\top \Omega ^{-1} A~. \end{aligned} \end{aligned}$$

In addition, the problem is reduced to separate systems of equations describing individual alignable modules. The Local \(\chi ^2\) method eliminates the numerical challenges of the Global \(\chi ^2\) since only systems of equations with up to six parameters (albeit many of them) need to be solved. However, due to the loss of the correlations between alignable objects, the Local \(\chi ^2\) method needs a much larger number of iterations to converge.
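The slower convergence can be seen in a toy set-up, assuming straight-line tracks through six planes, one translational DoF per plane and the two outer planes held fixed. All names and values are hypothetical. Each plane is corrected independently using only its own residuals, and the tracks are re-fitted after every iteration.

```python
import numpy as np

rng = np.random.default_rng(3)
z = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
sigma = 0.01
true_offsets = np.array([0.0, 0.05, 0.03, 0.02, 0.04, 0.0])
aligned = [1, 2, 3, 4]                              # planes 0 and 5 are fixed references

H = np.column_stack([np.ones(len(z)), z])
fit_matrix = np.linalg.inv(H.T @ H) @ H.T           # least-squares fit for equal hit errors
taus = rng.normal(0.0, [1.0, 0.1], size=(2000, 2))  # true intercepts and slopes
hits = taus @ H.T - true_offsets + rng.normal(0.0, sigma, size=(2000, len(z)))

alpha = np.zeros(len(z))
history = []                                        # largest remaining offset per iteration
for iteration in range(40):
    corrected = hits + alpha                        # hits in the current geometry
    fitted = corrected @ fit_matrix.T @ H.T         # re-fit every track
    r = fitted - corrected                          # residuals r = e - m
    for k in aligned:                               # one independent 1x1 system per plane
        y_k = 2.0 * np.sum(-1.0 * r[:, k]) / sigma**2   # A^T Omega^-1 r with A = -1
        m_k = 2.0 * len(hits) / sigma**2                # A^T Omega^-1 A
        alpha[k] += -y_k / m_k
    history.append(np.max(np.abs(alpha - true_offsets)))

print(history[0], history[-1])
```

In this toy each local correction is simply the mean residual of the plane. Part of every offset is absorbed by the re-fitted tracks, so the remaining misalignment shrinks only geometrically from one iteration to the next.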

3.3 Solving the linear system of alignment equations

In general, the properties of the matrix representing a system of linear equations determine the most suitable solution technique. The matrix \({\mathcal {M}}\) in Eq. (11) as defined in the Global \(\chi ^2\) ansatz is found to be symmetric and singular and to have a poor matrix condition number if no constraints are applied. The addition of appropriate constraints generally renders the matrix positive definite. The singular nature of the matrix is the result of detector movements that leave a track’s \(\chi ^2\) unchanged. The simplest examples are global transformations of the detector (either translations or rotations), which are generally singular modes. A trivial way to remove these global degrees of freedom is to fix a detector element, making it the reference point for all other detector elements. This method has the unwanted drawback of arbitrarily selecting one detector module as the reference frame. In the following sections, two methods used to obtain a solution to the alignment system of linear equations are discussed along with how ‘weak modes’ are removed or mitigated.

3.3.1 Diagonalisation of the matrix

The symmetric matrix \({\mathcal {M}}\) is decomposed into its diagonal basis: \(P{\mathcal {D}}P^{\top }\) where \({\mathcal {D}}\) is a diagonal matrix containing the eigenvalues of \({\mathcal {M}}\), and P is a matrix containing the eigenvectors of \({\mathcal {M}}\). In the diagonal basis all parameters (directions) are linearly independent, and the solutions and their associated uncertainties follow from the eigenvalues (\(\lambda _i\)):

$$\begin{aligned} X_{{\mathcal {D}}}^i=-\frac{1}{\lambda _i}Y_{{\mathcal {D}}}^i~~~~~{\mathrm {and}} ~~~~~\sigma (X_{{\mathcal {D}}}^i)=\frac{1}{\sqrt{\lambda _i}}\,, \end{aligned}$$
(18)

where \(X_{{\mathcal {D}}}^i\) and \(Y_{{\mathcal {D}}}^i\) are the \(i{\mathrm {th}} \) component of vectors \(\pmb {X}_{{\mathcal {D}}}\) and \(\pmb {Y}_{{\mathcal {D}}}\) in the diagonal basis, with \(\pmb {Y}_{{\mathcal {D}}} = P^{\top }~\pmb {Y}\).

Singular and weak modes must be excluded as their eigenvalues are zero or have an arbitrarily large associated uncertainty, respectively. Although this can be achieved in many ways, the primary method employed is to set \({{\mathcal {D}}}^{-1}_{i,i} = 0\) for the modes that need to be removed, thereby creating a new diagonal matrix \({\mathcal {D'}}^{-1}\) which provides the solution:

$$\begin{aligned} \pmb {X} = - P\mathcal {D'}^{-1}P^{\top }\pmb {Y} . \end{aligned}$$
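The mode removal can be sketched with NumPy's symmetric eigensolver on a toy problem in which all six planes of a straight-line telescope are aligned at once, so that a common translation and a shear are singular modes. The set-up, the relative eigenvalue cut of \(10^{-8}\) and all names are assumptions for illustration, not the ATLAS configuration.

```python
import numpy as np

rng = np.random.default_rng(5)
z = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
sigma = 0.01
true_offsets = np.array([0.02, 0.05, 0.03, 0.02, 0.04, -0.01])
n = len(z)

H = np.column_stack([np.ones(n), z])
V = sigma**2 * np.eye(n)
V_inv = np.linalg.inv(V)
C = np.linalg.inv(H.T @ V_inv @ H)
weight = V_inv @ (V - H @ C @ H.T) @ V_inv
A = -np.eye(n)                                      # every plane is an alignable object

Y = np.zeros(n)
M = np.zeros((n, n))
for _ in range(1000):
    m = H @ rng.normal(0.0, [1.0, 0.1]) - true_offsets + rng.normal(0.0, sigma, n)
    rho = H @ (C @ H.T @ V_inv @ m) - m             # residuals of the fitted track
    Y += 2.0 * A.T @ weight @ rho                   # Eq. (10)
    M += 2.0 * A.T @ weight @ A                     # Eq. (11)

eigenvalues, P = np.linalg.eigh(M)                  # M = P D P^T
keep = eigenvalues > 1e-8 * eigenvalues.max()       # assumed cut on singular modes
d_inv = np.zeros(n)
d_inv[keep] = 1.0 / eigenvalues[keep]               # D'^-1 with removed modes set to zero
X = -P @ (d_inv * (P.T @ Y))                        # X = -P D'^-1 P^T Y
print(int((~keep).sum()), X)
```

The solution reproduces the true offsets with their straight-line component removed: the translation and shear that no track can see are left at zero rather than being assigned arbitrary values.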

The DSPEV function in the LAPACK [26] software package is used as a baseline in the ATLAS implementation to diagonalise large matrices. Alternative implementations using ROOT [27], EIGEN [28] and CLHEP [29] linear algebra classes are also available. In general, the computation time for matrix diagonalisation scales as \({\mathcal {O}}(\mathrm {DoF}^3)\) and solutions for very large systems become untenable on a single machine. If the initial matrix is poorly conditioned, the accuracy of the numerical solution can be limited by the precision of 64-bit floating-point computations for problems exceeding \({\mathcal {O}}\)(10,000) DoFs.

3.3.2 Direct solving

Even for very large problems, direct solvers offer an accurate and CPU-efficient method for solving sparse linear equations. In addition, less memory is required as no matrix is inverted or diagonalised in the process. The LDLT Cholesky factorisation method provided within EIGEN [28] is used within the ATLAS ID alignment and takes less than 10 min to solve an alignment problem with 35,000 parameters (the approximate number of parameters needed to align all modules in the ID simultaneously) on a modern CPU. Direct solving is used when aligning thousands of degrees of freedom (usually when aligning at individual module level). Obtaining a direct solution does not offer the possibility of eliminating specific eigenmodes. Thus, other preconditioning techniques are used in order to extract a meaningful solution (e.g. Sect. 3.3.3). It is noteworthy that, although not extensively utilised within ATLAS, it is possible to iteratively find the eigenvalues and associated eigenvectors of large systems by solving \({\mathcal {M}}{\mathbf {x}}=\lambda {\mathbf {x}}\) for \({\mathbf {x}}\) and \(\lambda \) [30], which can be useful in understanding the weak modes of very large systems and identifying the underconstrained degrees of freedom.
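As an illustration of direct solving, the following sketch factorises a small symmetric positive-definite matrix as \(LDL^{\top }\) and solves by substitution. It is a dense, unpivoted, pure-Python toy and not the EIGEN implementation; the function name and the example system are assumptions.

```python
def ldlt_solve(A, b):
    """Solve A x = b for symmetric positive-definite A via A = L D L^T.

    Dense and without pivoting: an illustration of the idea only.
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    D = [0.0] * n
    for j in range(n):
        D[j] = A[j][j] - sum(L[j][k] ** 2 * D[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * D[k] for k in range(j))
            L[i][j] = (A[i][j] - s) / D[j]
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][k] * y[k] for k in range(i))
    # Diagonal scaling: D z = y
    z = [y[i] / D[i] for i in range(n)]
    # Back substitution: L^T x = z
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))
    return x

# Toy alignment system M X = -Y
M = [[4.0, 2.0, 0.0],
     [2.0, 5.0, 1.0],
     [0.0, 1.0, 3.0]]
Y = [0.0, 5.0, -7.0]
X = ldlt_solve(M, [-y for y in Y])
```

No inverse or eigen-decomposition of the matrix is formed, which is why the approach needs less memory but cannot remove individual eigenmodes.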

Table 2 Typical alignment configurations used throughout Run 2 data taking to derive baseline alignment constants. Translational degrees of freedom (DoF) are denoted by a T, rotational ones by an R. As shown in Fig. 2, TRT barrel straws run parallel to the beam line. That corresponds to \(T_z\) at level 1 and \(T_y\) at level 2

3.3.3 Constraining alignment parameters in the solution (the Soft Mode Cut)

As introduced in Sect. 3.2.5, setting \({\varvec{a}}\equiv \varvec{\alpha }_0\) and having a diagonal W constrains the alignment parameters to their initial values. Here, W denotes a diagonal matrix with diagonal elements: \(\sigma (\alpha _i)^2\), providing the tolerances to the corrections of the alignment parameters.

For this special case, the top row of Eq. (17) simplifies to Eq. (10) and the diagonal of the matrix \({\mathcal {M}}\) in Eq. (11) is incremented by the reciprocal of assumed variances of alignment corrections:

$$\begin{aligned} \left( {\mathcal {M}} + 2~W^{-1} \right) \pmb {X} = -\pmb {Y} ~. \end{aligned}$$

The above equation can be rearranged as

$$\begin{aligned} \sum _{j} \left( \sigma (\alpha _i)\sigma (\alpha _j)~{\mathcal {M}}_{ij} + 2~I_{ij}\right) ~ \frac{X_j}{\sigma (\alpha _j)} = -\sigma (\alpha _i)~Y_i \end{aligned}$$
(19)

yielding an equation in which the corrections to the alignment parameters are normalised to their assumed uncertainties \(\Delta \alpha _i \longrightarrow \Delta \alpha _i/{\sigma (\alpha _i)}\). Apart from the extra identity matrix I, Eq. (19) is exactly equivalent to Eq. (12).

To illustrate the effect of such a constraint, consider the case that all \(\sigma (\alpha _i)\) are equal (\(\sigma (\alpha _i)=\sigma _c\)). The extra identity matrix does not affect the eigenmodes of \({\mathcal {M}}\), but adds an offset to its spectrum of eigenvalues:

$$\begin{aligned} \mathcal {M'}={\mathcal {M}}+2~I/\sigma _c^2, ~~~~ \mathcal {D'}={\mathcal {D}}+ I /\sigma _c^2, ~~~~ \lambda '_i=\lambda _i + 1/\sigma _c^2. \end{aligned}$$

The solution in the diagonal basis, Eq. (18), takes the form:

$$\begin{aligned} X_{{\mathcal {D}}}^i=-\frac{1}{\lambda _i+1/\sigma _c^2}~Y_{{\mathcal {D}}}^i ~~~~~{\mathrm {and}}~~~~~\sigma (X_{{\mathcal {D}}}^i)=\frac{1}{\sqrt{\lambda _i+1/ \sigma _c^2}}\,. \end{aligned}$$
(20)

Hence, one obtains a solution explicitly free from ill-defined (weak) modes. This operation does not require an explicit diagonalisation and can be used as preconditioning prior to fast solving, providing powerful control over solutions for an arbitrarily large number of DoFs. Due to the typically exponential nature of the eigenspectrum, Eq. (20) represents a solution with a clear cut-off in the diagonal basis for \(\lambda _i \ll 1/\sigma _c^2\). This technique is extensively used in the ATLAS implementation.
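The equivalence between adding a constant to the diagonal and damping the weak modes in the diagonal basis can be checked numerically. The sketch below assumes NumPy; `offset` stands for the constant added to each eigenvalue (written \(1/\sigma _c^2\) in Eq. (20)), and the toy matrix and function name are hypothetical.

```python
import numpy as np

def soft_mode_cut_solve(M, Y, offset):
    """Solve (M + offset * I) X = -Y without diagonalising M."""
    n = M.shape[0]
    return np.linalg.solve(M + offset * np.eye(n), -Y)

# Toy system: two well-constrained modes (eigenvalues 1e4 and 3e4)
# and one weak mode (eigenvalue 1e-6)
M = 1e4 * np.array([[2.0, 1.0, 0.0],
                    [1.0, 2.0, 0.0],
                    [0.0, 0.0, 1e-10]])
Y = np.array([1.0, -1.0, 5.0])
offset = 1.0  # hypothetical value of the added diagonal term

X = soft_mode_cut_solve(M, Y, offset)

# Cross-check in the diagonal basis: each mode is scaled by 1/(lambda_i + offset)
lam, P = np.linalg.eigh(M)
X_check = P @ (-(P.T @ Y) / (lam + offset))
```

With these numbers the well-constrained modes change by about one part in \(10^4\), whereas the correction along the weak mode is limited to roughly \(|Y|/\mathrm{offset}\) instead of growing as \(1/\lambda _i\).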

4 Inner Detector alignment

The ID is composed of a large number of active detector components (see Sect. 2.1 for details). Each component or grouped collection of modules (e.g. a subdetector) can be treated as an alignable structure. The alignment is performed at different hierarchical levels following the assembly structure of the ID. Level 1 starts with the largest physical structures: the detector subsystems, split into endcap and barrel regions, are aligned in order to correct for collective movements. Level 2 treats individual barrel layers and endcap disks as physical structures (barrel modules and endcap wheels in the case of the TRT). Level 3 corresponds to the alignment of individual silicon modules or TRT wires. In this context, each SCT module is considered as a single element in the alignment procedure due to its high construction precision [16, 17]. The levels are addressed sequentially during the alignment procedure, see Table 2.

In Run 2, the alignment levels were updated to accommodate the IBL. These changes are straightforward for levels 2 and 3, as the IBL represents merely an additional Pixel layer or additional silicon modules, respectively. The support structure of the IBL is mechanically independent from the previously installed Pixel subdetector, so IBL movement is not expected to be correlated with collective Pixel movements. Consequently, the IBL is treated as a separate physical structure at level 1.

4.1 Time-dependent alignment

Time-dependent alignment is performed for each LHC fill prior to data reconstruction to determine if the detector, or individual subsystems, have moved significantly compared to a reference alignment. Such detector movements occur on different timescales, which are classified as short, medium, or long.

Short timescales describe movements during a single LHC fill that are a result of variations of the thermal load of the ID. These movements are caused by fluctuations in the power consumption of the front-end electronics, due to variations in the trigger rate, that additionally affect the temperature of the cooling system. On medium timescales, in the range of days to a month, changes to the environmental conditions of the detector, such as ramping the magnetic field or cycling the power or cooling systems, often lead to significant movements of the detector. Slow gradual movements of the subsystems over several months (long timescales) were also observed and attributed to mechanical relaxations after sudden changes.

An automated time-dependent level 1 alignment is performed within the ATLAS prompt calibration loop [10] to address all known time-dependent movements, as detailed in Sect. 5. These results are monitored and new alignment corrections are automatically obtained during the calibration period. They serve as input for the bulk reconstruction of the corresponding dataset.

4.2 Baseline alignment constants

The baseline alignment constants are a set of reference constants that serve as initial estimates for the time-dependent refinements of the alignment. In order to achieve an accurate detector alignment and a minimisation of track parameter biases over a data-taking period, a large quantity of data is used (typically \(\sim 2\hbox { fb}^{-1}\)). The levels of alignment performed are summarised in Table 2. The alignment using the global \(\chi ^{2}\) method typically converges within two to four iterations for levels 1 and 2, while at least four iterations are required at level 3 (silicon). The TRT level 3 (straw level) uses the local \(\chi ^{2}\) method and requires up to 30 iterations to converge, owing to the large number of DoFs.

Depending on the alignment level, some DoFs may be fixed during the alignment procedure if poor sensitivity is expected. Alignment levels targeting the silicon subdetectors use all tracks, whereas alignment levels including the TRT require tracks based on silicon and TRT hits. In order to remove weak modes from the alignment solution, appropriate constraints are added to the global \(\chi ^{2}\) method (see Sect. 3.2.4). Different constraints are considered depending on the expected misalignment and DoF for each alignable structure, listed in Table 2. Additionally, each subsystem can be aligned at any required level independently from the others. Further subdivision of alignment levels into smaller physical detector components, e.g. the division of individual barrel layers into staves, is also supported and used. At level 1, the SCT barrel is kept fixed due to its good stability and to serve as reference for the rest of the structures.

4.3 Residuals

As described in Sect. 3.2, the solution of the Global \(\chi ^2\) is the one that minimises the unbiased track-hit residuals. Figures 4, 5 and 6 show track-hit residual distributions for data and simulation in different ID subdetectors.

Data and simulation correspond to a set of muons selected in \(Z\rightarrow \mu ^+\mu ^-\) candidate events triggered by the lowest-threshold unprescaled single and dimuon triggers. The simulation sample was generated with the Powheg-Box v1 Monte Carlo event generator [31,32,33] at next-to-leading order (NLO) in \(\alpha _\mathrm {S}\) interfaced to Pythia 8.186 [34] for the modelling of the parton shower, hadronisation, and underlying event, with parameter values set according to the AZNLO tune [35]. The CT10 (NLO) set of parton distribution functions (PDF) [36] was used for the hard-scattering processes, whereas the CTEQ6L1 PDF set [37] was used for the parton shower. Events are required to contain two muons (satisfying ‘medium’ quality criteria as defined in Ref. [38]) with opposite charge and \(p_{\text {T}} >20\,\text {GeV}\). In addition, requirements on the opening angle between the two muons, \(\gamma (\mu ^{+}, \mu ^{-})> 45^\circ \), and their invariant mass, \(70~\text {GeV}< m_{\mu ^{+} \mu ^{-}}< 110~\text {GeV}\), are imposed. In Figs. 4, 5 and 6, both data and simulation correspond to \(2\hbox { fb}^{-1}\) of data collected during 2018. Statistical uncertainties in data and simulation are included in all the figures, although barely visible as they are negligible.

Fig. 4

The IBL local-x (left) and local-y (right) residual distributions for the \(Z\rightarrow \mu ^+\mu ^-\) data sample compared with simulated data. The distributions are integrated over all hits on tracks in barrel modules

Fig. 5

The Pixel local-x (left) and local-y (right) residual distributions for the \(Z\rightarrow \mu ^+\mu ^-\) data sample compared with simulated data. The distributions are integrated over all hits on tracks in barrel modules

Fig. 6

The local-x residual distributions in the SCT (left) and TRT (right) for the \(Z\rightarrow \mu ^+\mu ^-\) data sample compared with simulated data. The distributions are integrated over all hits on tracks in barrel modules

Adequate agreement is seen between data and simulation in the residual distributions, where differences are quantified in terms of the ‘full width at half maximum’ (FWHM) figure of merit. A similar level of agreement is observed for the data collected during the other years of Run 2. In the case of the IBL, Pixel and SCT barrel, larger residual widths are observed in data. As shown in Sect. 5.3, the Run 2 alignment accuracy and stability in the central pseudorapidity range for both the Pixel and SCT barrel modules is controlled to a precision better than \(0.5\,\upmu \hbox {m}\) and \(2\,\upmu \hbox {m}\) in local-x and local-y, respectively. Consequently, several other possible causes of the observed discrepancy between data and simulation are considered, such as imperfect modelling of the interactions of muons with detector material in the simulation, the material description, delta ray production modelling, mis-modelling of the detector response (and resolution) in simulation, and residual biases not uniform across individual modules in data. The latter particularly impacts the local-y track-hit residuals in Fig. 4. The poorest agreement is seen for the IBL residuals, which have not yet been corrected for sensor distortions, in contrast to the Pixel layers. The sensor distortion can result in track-hit residual biases of up to \(10\,\upmu \hbox {m}\) within a given module, thus causing a broadening of the overall distribution [39]. The shape of the IBL modules was recently parameterised with Bernstein–Bézier functions and will be corrected in the track fitting procedure for Run 3 data taking. The cause of the small bias of \(4\,\upmu \hbox {m}\) in the IBL local-y track-hit residuals in simulation in Fig. 4 is currently unidentified. Simulated samples use a perfectly aligned detector with no track-based alignment correction, hence this bias originates from the track or cluster reconstruction. 
In data, this small reconstruction bias is removed by the alignment without a significant effect on the alignment precision.

5 Detector stability and time-dependent alignment

This section discusses the main sources of time variation in ID geometry and the methods implemented to mitigate these effects within the ATLAS prompt calibration loop [10]. In addition, the stability of the ID in Run 2 is summarised, final time-dependent corrections for all subsystems are presented, and the precision of the alignment is determined. All results use pp collision data at \(\sqrt{s} = 13\,\text {TeV}\). The alignment precision for heavy-ion data in Run 2 is at least as good as the final precision of pp collision data, as the instantaneous luminosity, and therefore the thermal load variations in the ID, is typically lower.

5.1 Short-timescale movements

Detector movements on short timescales are particularly challenging, since the ID track-based alignment calculates an average position correction for the time interval under study.

The procedure used to correct for rapid movements must balance two competing effects: the alignment corrections must be determined in time intervals that are short enough to capture the motion of the particular deformation, but long enough to include sufficient data to obtain precise corrections.

5.1.1 Temperature-dependent IBL distortions

During the Run 2 commissioning of the IBL, it was noticed that the IBL staves can be distorted by hundreds of micrometres relative to the design geometry. It was soon observed that these distortions depend on the operating temperature and correspond to module displacement in the azimuthal direction of the staves, equivalent to their local-x direction. The distortion was understood to be caused by tight, asymmetric mechanical coupling of materials with different coefficients of thermal expansion (CTE). The correlation between temperature and the size of the IBL distortion was studied using cosmic-ray data in March 2015 with a controlled variation of the IBL temperature, \(T_{\mathrm {set}}\), in the range \(-20{^{\circ }}\hbox {C}\) to \(+15{^{\circ }}\hbox {C}\) [40]. The size of the distortion was measured in situ using the track-based alignment and a fit to a model determined from a three-dimensional finite-element analysis. This model parameterises the IBL distortion in local-x, \(\delta x(z)\), using a parabolic function,

$$\begin{aligned} \delta x(z) = B - \frac{M}{z_{0}^{2}} \left( z^2 - z_{0}^{2} \right) , \end{aligned}$$
(21)

where z is the global-z coordinate of the module, \(z_0 \equiv 366.5\hbox { mm}\) is the coordinate of the stave mount at both ends, B is the baseline describing the overall translation of the stave in local-x, and M is the magnitude of the distortion at the stave centre. The temperature gradient of M with respect to \(T_{\mathrm {set}}\) is found to be \(\mathrm {d}M/\mathrm {d}T_{\mathrm {set}} = (-10.6\pm 0.7)\,\upmu \hbox {m/K}\). The local-y position shows no temperature-dependent effect within \(20\,\upmu \hbox {m}\) uncertainty, whereas the local-z (bending out of the plane of the stave) was not included in this study. The IBL distortion is shown in Fig. 7 for different \(T_{\mathrm {set}}\) values using 2015 and 2016 pp collision data.
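The parabolic model of Eq. (21) and the quoted temperature gradient can be evaluated with a short sketch. The function and constant names are hypothetical, and the example values of M and B are chosen only for illustration; the central value of the gradient is used without its uncertainty.

```python
Z0_MM = 366.5            # stave mount position in global z [mm], from the text
DM_DT_UM_PER_K = -10.6   # dM/dT_set [micrometres per kelvin], central value

def ibl_distortion(z_mm, magnitude_um, baseline_um=0.0):
    """Local-x displacement of Eq. (21) at global z, in micrometres."""
    return baseline_um - magnitude_um / Z0_MM**2 * (z_mm**2 - Z0_MM**2)

def magnitude_change(delta_t_set_k):
    """Change in the distortion magnitude M for a change in T_set."""
    return DM_DT_UM_PER_K * delta_t_set_k

# Illustrative values: M = 10 um, B = 2 um
centre = ibl_distortion(0.0, 10.0, 2.0)     # B + M at the stave centre
mount = ibl_distortion(Z0_MM, 10.0, 2.0)    # B at the stave mounts
shift = magnitude_change(0.2)               # effect of a 0.2 K change
```

The displacement equals B + M at the stave centre and reduces to B at \(z=\pm z_0\), and a 0.2 K change in \(T_{\mathrm {set}}\) shifts M by about \(2\,\upmu \hbox {m}\).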

From the initial data taking in Run 2 through September 2015, the IBL power consumption per module was found to be stable, and fluctuations in \(T_{\mathrm {set}}\) were within \(\sim 0.2\hbox { K}\) resulting in a stable detector \((\delta x(z) < 3\,\upmu \hbox {m})\) [40]. This situation changed with the rapid increase in integrated luminosity per LHC fill after September, which induced an increase in the low-voltage (LV) currents in the IBL module front-end electronics. This increase was traced back to radiation-induced leakage current in transistors [41]. The change in LV currents depends on the total ionisation dose. Studies show that the increase reaches a peak value for radiation doses between 10 and 30 kGy and decreases for higher doses to a value close to the pre-irradiation case.

Fig. 7

IBL local-x position in the transverse plane averaged over all 14 IBL staves for 2015 data using \(T_{\mathrm {set}}= -10{^{\circ }}\,\hbox {C}\) (red open squares), and for 2016 data using different \(T_{\mathrm {set}}\) (\(+15{^{\circ }}\hbox {C}\), solid blue circles; \(+5{^{\circ }}\,\hbox {C}\), solid green triangles). No error bars associated with data are shown. The IBL distortion was constant during all three LHC fills. Here, only the correction due to the IBL distortion is shown. The baseline, which describes the overall translation of the whole stave, is subtracted using Eq. (21). The fit represents only a first-order correction. Additional corrections are computed as part of the detailed alignment corrections at a later stage

Fig. 8

The Pixel detector vertical (\(T_{y}\)) movement as a function of the time since the start of an LHC fill. The average Pixel \(T_{y}\) for the entire run (dashed blue line) is compared with its time evolution and with the instantaneous LHC luminosity. The error bars represent the statistical uncertainty

These variations in the LV currents caused an increase in IBL module temperatures that resulted in changes in IBL distortions on short timescales. In this context, values of \(\delta x(z)\) of up to \(30\,\upmu \hbox {m}\) were observed between LHC fills and up to \(10\,\upmu \hbox {m}\) within a single fill, corresponding to a variation of \(0.5\,\upmu \hbox {m}\hbox { h}^{-1}\).

5.1.2 Vertical movements of the Pixel detector

Another systematic deformation on short timescales is a change in the vertical position (global-y direction) of the Pixel detector by up to \(8\,\upmu \hbox {m}\) at the start of an LHC fill. Figure 8 shows the Pixel detector vertical movement from the start of an LHC fill. The position is computed every 20 min, which is the shortest time interval used in the ATLAS prompt calibration loop. As is evident from Fig. 8, the average position across an LHC fill does not accurately describe the position of the Pixel detector.

The cause of this movement is understood to be the following. When the Pixel detector is switched on at the start of a fill, modules reach their new temperature almost immediately as a result of the strong thermal coupling between the modules and the evaporative cooling system [1, 42]. The LV current in the read-out electronics also increases immediately, while the temperature in the Pixel detector volume rises gradually during the first 60 min. The smaller mass load due to the change in density of the bi-phase cooling liquid causes the Pixel detector to rise. After this initial rise, as the instantaneous luminosity and thus the occupancy decreases over the course of the fill, LV digital currents, module temperature and Pixel volume temperature gradually decrease as well. This in turn causes an additional slow drift in the direction opposite to the initial movement. The speed of this slow drift depends on the peak luminosity per LHC fill. This speed increased during 2016 to reach values of \(0.2\,\upmu \hbox {m}\hbox { h}^{-1}\), as shown in Fig. 9. The vertical speed is determined as the average speed of the Pixel detector excluding the first hour after the start of data taking. This vertical drift was monitored and corrected for throughout Run 2.

Fig. 9

Vertical speed of the Pixel detector as a function of the peak luminosity of an LHC fill, extracted from alignment corrections. Only fills longer than 8 h are considered

Table 3 Typical alignment configurations used throughout Run 2 data taking to derive dynamic alignment corrections

Fig. 10

Bowing magnitude averaged over the 14 IBL staves relative to the baseline alignment (blue full circles) and the geometry after dynamic alignment (red open circles) with its statistical uncertainty. The IBL operation temperature (\(\hbox {T}_{\mathrm {set}}\)) for each period is shown

5.2 Dynamic alignment on short timescales

In an effort to mitigate the effects of systematic short-timescale distortions and ensure adequate data quality for all analyses relying on tracking, conceptual improvements were made to the alignment framework and strategy. A key improvement was the introduction of a new alignment DoF, \(B_x\), to parameterise the magnitude M of the IBL distortion. The \(B_x\) DoF correlates the local-x coordinate of each module along the IBL stave using the parabolic function defined in Eq. (21). Minimising the global \(\chi ^{2}\) with respect to \(B_x\) provides corrections for varying degrees of IBL stave distortion using a single DoF, which can be done with small amounts of data. Previously, a full level 3 alignment had been required; because it relies on a large amount of data, it did not allow short-timescale movements to be determined.

The automated alignment scheme that is performed within the ATLAS prompt calibration loop in Run 2 data taking determines level 1 and IBL \(B_x\) (per stave) dynamic alignment constants every 20 min at the start of a fill and every 100 min for the rest of the fill. This level of granularity in time is adequate to mitigate the effects of short-timescale vertical movements on track parameter resolution. The alignment is performed in two iterations of the level 1 calibration loop (level 1 CL) followed by two dedicated iterations to correct for IBL distortions. The \(B_x\) correction in the level 1 CL corresponds to a collective, uniform correction for all IBL staves. The dedicated IBL bowing iterations determine \(B_x\) individually for each stave, as summarised in Table 3. The SCT barrel is used as the reference in the dynamic alignment.
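The time granularity of this scheme can be sketched as a simple interval generator. The 60 min duration of the fine-grained period is taken from the caption of Fig. 12; the function name and the treatment of the last, shorter interval are assumptions rather than the behaviour of the ATLAS calibration loop.

```python
def alignment_intervals(fill_length_min, fine_step=20, fine_period=60,
                        coarse_step=100):
    """Return (start, end) boundaries in minutes since the start of a fill.

    Intervals are fine_step long during the first fine_period minutes and
    coarse_step long afterwards; the final interval is truncated at the end
    of the fill (an assumption made for this sketch).
    """
    intervals = []
    start = 0
    while start < fill_length_min:
        step = fine_step if start < fine_period else coarse_step
        end = min(start + step, fill_length_min)
        intervals.append((start, end))
        start = end
    return intervals

intervals = alignment_intervals(300)
# [(0, 20), (20, 40), (40, 60), (60, 160), (160, 260), (260, 300)]
```

Each interval would then receive its own set of level 1 and \(B_x\) corrections.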

Fig. 11

IBL distortion magnitude in the transverse plane per luminosity block (LB) range (left) and the IBL local-x unbiased residual distributions (right) for an LHC fill averaged over all 14 IBL staves. The open blue squares (baseline alignment) show the average IBL distortion in the transverse plane after the baseline ID alignment. The open red circles show the fill-averaged correction and the solid black circles show the full dynamic alignment correction

Fig. 12

Average correction of the IBL bowing magnitude, \(B_x\), (top), IBL and Pixel detector’s horizontal position, \(T_x\), (middle), IBL and Pixel detector’s vertical position, \(T_y\), (bottom) relative to the baseline alignment in 2016 pp collision runs between LHC technical shutdown period 1 and LHC machine development period 1. The correction is calculated every 20 min for the first 60 min of the data taking, and every 100 min for the rest of the data-taking period. Each connected series of points represents a continuous data-taking period

5.3 Inner Detector stability during Run 2 data taking

5.3.1 Time-dependent corrections for all subsystems

The performance of the dynamic alignment scheme using 2016 pp collision data is shown in Figs. 10 and 11. The average bowing magnitude of the 14 IBL staves relative to the baseline alignment is compared with the results of the dynamic alignment in Fig. 10. Figure 11 shows the average IBL distortion computed after different alignment corrections versus time in the form of luminosity blocks (LB), which correspond to stable data-taking conditions in periods of approximately 1 min. It also compares the unbiased local-x residuals computed using a fill-averaged correction (for illustration only) with those obtained after computing the full dynamic alignment correction, which is derived in short time intervals. A clear improvement in the residual distributions is seen after applying dynamic alignment corrections. Figures 10 and 11 illustrate that, averaged over an LHC fill, even very large values of M (up to \(30\,\upmu \hbox {m}\)) are accurately corrected for using \(B_x\) as an alignment DoF. These features were present for all Run 2 data, although there was some saturation of the effect in the later years of Run 2, as observed in the radiation damage studies of the IBL [41].

The long-term trend of the Pixel and IBL detector movements relative to the baseline alignment correction is shown in Fig. 12 for the average \(B_x\) correction, the global \(T_x\) correction, and the global \(T_y\) correction. For the sake of clarity, the plots in Fig. 12 show only a fraction of the Run 2 data; the remaining data follow the same trend.

5.3.2 Final alignment precision of each subsystem

The final alignment precision of each ID subsystem is determined from the track-hit residuals of individual silicon modules for each LHC fill in 2015 and 2016 data taking after the dynamic alignment corrections are applied. These dynamic alignment corrections are computed either for large structures (e.g. the Pixel detector) as a collective movement of all modules or using a simplified parameterisation (like \(B_x\)). In this context, less significant module-to-module movements remain uncorrected by the dynamic alignment. This effect is seen as a residual time-dependent misalignment or ‘instability’ of the modules. This instability is estimated for each silicon layer and module z-position by integrating modules over \(\phi \) into one group. Results are presented for the ‘in-plane’ translation DoFs only (local-x and local-y).

Fig. 13

Estimated \(\sigma _{x}^{\mathrm {time}}\) as a function of the module global-z position for the IBL and Pixel barrel layers. The vertical bar on each marker represents the standard deviation of the estimated value over modules at the same z-position along different staves. The global-z position is slightly modified from its true value for visualisation purposes

Fig. 14

Estimated \(\sigma _{y}^{\mathrm {time}}\) as a function of the module global-z position for Pixel barrel layers. The vertical bar on each marker represents the standard deviation of the estimated value over modules at the same z-position along different staves. The global-z position is slightly modified for the different Pixel layers for visualisation purposes

Fig. 15

Estimated \(\sigma _{x}^{\mathrm {time}}\) as a function of module global-z position for SCT barrel layers. The vertical bar on each marker represents the standard deviation of the estimated value over modules at the same z-position along different staves. The global-z position is slightly modified for the different SCT layers for visualisation purposes

For each module, the average track-hit residual, \(\langle r_{x,y}\rangle \), is computed for each LHC fill, for both local-x and local-y, on a set of calibration data, whose size is approximately independent of the fill conditions. Its statistical uncertainty, \(\sigma _{r_{x,y}}/\sqrt{N}\), where N is given by the number of tracks per module and \(\sigma _{r_{x,y}}\) is the standard deviation of the residuals, is computed assuming that the residual distribution is approximately Gaussian. The dispersion \(\sigma _{\langle r_{x,y}\rangle }\) of the distribution in \(\langle r_{x,y}\rangle \) obtained from all LHC fills is an estimate of the total instability of the module position after all alignment corrections are applied. This total uncertainty can be divided into a statistical component \((\sigma _{r_{x,y}}/\sqrt{N})\) and a component describing residual instability due to uncorrected time-dependent movements and stochastic fluctuations, \(\sigma _{x,y}^{\mathrm {time}}\):

$$\begin{aligned} \sigma _{\langle r_{x,y}\rangle } \sim \sigma _{x,y}^{\mathrm {time}} \oplus \frac{\sigma _{r_{x,y}}}{\sqrt{N}}. \end{aligned}$$

As the size of the statistical contribution per module per LHC fill is generally small, \(\sigma _{x,y}^{\mathrm {time}}\) is estimated by

$$\begin{aligned} \sigma _{x,y}^{\mathrm {time}} \equiv \sqrt{\sigma _{\langle r_{x,y}\rangle }^{2} - \left( \frac{\sigma _{r_{x,y}}}{\sqrt{N}}\right) ^{2}}. \end{aligned}$$
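The quadrature subtraction above can be sketched for a single module as follows. The function names are hypothetical, a single typical statistical uncertainty is assumed for all fills, the population standard deviation is used for the dispersion over fills, and a negative difference of variances is clamped to zero; all of these are choices made for this sketch only.

```python
import math
import statistics

def stat_uncertainty(residual_sd_um, n_tracks):
    """Statistical uncertainty of the mean residual, sigma_r / sqrt(N)."""
    return residual_sd_um / math.sqrt(n_tracks)

def time_instability(mean_residuals_um, stat_unc_um):
    """Estimate sigma_time from the fill-by-fill mean residuals of a module."""
    # Dispersion of the mean residual over all fills
    total = statistics.pstdev(mean_residuals_um)
    variance = total**2 - stat_unc_um**2
    # Clamp to zero if statistical fluctuations dominate (assumption)
    return math.sqrt(variance) if variance > 0 else 0.0

# Illustrative numbers in micrometres, one mean residual per LHC fill
means = [0.5, -0.5, 0.5, -0.5]
stat = stat_uncertainty(9.0, 900)          # 0.3 um
sigma_time = time_instability(means, stat)
```

In this toy example the total dispersion of \(0.5\,\upmu \hbox {m}\) and a statistical component of \(0.3\,\upmu \hbox {m}\) leave an instability of about \(0.4\,\upmu \hbox {m}\).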
Fig. 16

Estimated \(\sigma _{x,y}^{\mathrm {time}}\) as a function of module \(\eta \) index for Pixel endcap layers. The vertical bar on each marker represents the standard deviation of the estimated value over modules of the same layer

Figures 13, 14 and 15 show the estimated instability in local-x and local-y of the Pixel and SCT barrel layers as a function of module z-position. Figure 16 shows the local-x and local-y instabilities of the modules in the Pixel endcap layers. The alignment in the central pseudorapidity range for both the Pixel and SCT barrel modules is controlled to a precision better than \(0.5\,\upmu \hbox {m}\) and \(2\,\upmu \hbox {m}\) in local-x and local-y, respectively. This level of control is considered to be very good given the time-dependent corrections of \({\mathcal {O}}(10\,\upmu \hbox {m})\) due to the IBL distortion and the vertical movement of the Pixel detector. The same level of precision is not achieved for the outermost IBL modules (3D sensors) corresponding to the range \(|\eta |>2.5\). There the alignment precision in local-x (local-y) is measured with an uncertainty better than \(3\,\upmu \hbox {m}\) (\(15\,\upmu \hbox {m}\)). This region is particularly challenging due to the low number of tracks. Moreover, the tracks in this region have only small overlap with other ID tracking layers. Furthermore, the large IBL local-y uncertainty may be related to a deformation of IBL staves in local-z over time, which is not corrected for in the prompt calibration loop.

The instability for Pixel endcap modules is larger than for barrel modules; the local-x and local-y instabilities are 2–4 \(\upmu \hbox {m}\) and 4–7 \(\upmu \hbox {m}\), respectively. This instability corresponds to the size of the movements of the Pixel endcap modules relative to the baseline alignment over time. The precision achieved is nearly one order of magnitude better than the required precision [2]. This required precision was defined in order to limit the degradation of the resolution of the track parameters for high-momentum tracks to less than 20% in comparison with a perfectly aligned detector. While the dynamic alignment succeeds in its primary goal, these results, specifically the residual Pixel endcap movements, also imply that the current scheme, which allows time-dependent alignment of the entire Pixel detector as one unit, is not optimal. A higher level of precision might be achieved if the Pixel endcap disks were aligned individually. This improvement is under study for LHC Run 3, including detailed cross-checks for new weak modes that may be introduced due to the additional DoFs within the calibration loop.

6 Momentum biases

The alignment corrections described in Sects. 4 and 5 target misalignments that change the \(\chi ^2\) of the track fit in Eq. (1). In contrast, correlated geometrical distortions referred to as weak modes leave the \(\chi ^2\) of the fitted tracks virtually unchanged and can systematically bias the reconstructed track parameters. Momentum biases induced by correlated detector misalignments can generally be classified into two categories:

  • Sagitta deformations consist of detector geometry distortions in the bending plane that affect the reconstructed track curvature differently for positively and negatively charged particles (Fig. 17 left).

  • Length scale biases are characterised by detector geometry distortions along the track trajectory and affect the reconstructed curvature identically for positively and negatively charged particles (Fig. 17 right).

These biases can be mitigated through the use of constraints either on track parameters (Sect. 3.2.4) or on alignment parameters (Sect. 3.2.5), or on both simultaneously. Residual momentum biases, after corrections to the detector alignment have been made, are sufficiently small that they can be accounted for by directly correcting track parameters.

Fig. 17

A simplified representation of two common weak modes that bias the track momentum. A sagitta bias (left) is caused by a deformation in the bending plane of the tracks, e.g. a rotation of the detector layers depending linearly on the radius. A length scale bias (right) is caused by a deformation along the track trajectory, e.g. a radial expansion of the detector layers depending linearly on the radius. The real (dashed black line) and fitted (solid black line) particle trajectories are shown. Red stars indicate real measurement positions and grey stars show the reconstructed hit positions (biased measurements)

6.1 Sagitta bias

Displacements of the reconstructed hits in the bending plane orthogonal to the track path result in a charge-antisymmetric alteration of the track curvature, which is parameterised as

$$\begin{aligned} p' = p ~(1+q ~p_{\mathrm {T}} ~\delta _{\mathrm {sagitta}})^{-1} , \end{aligned}$$
(22)

where the un-primed quantities correspond to the true values, the primed quantities correspond to the reconstructed values, q refers to the sign of the electric charge of the particle and \(\delta _{\mathrm {sagitta}}\) is a bias parameter common to all measured momenta and uniquely defines the detector geometry deformation.
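
To make the charge-antisymmetric character of Eq. (22) concrete, the following minimal sketch evaluates the relative momentum shift for a positively and a negatively charged track. The function name, the chosen momentum of 45 GeV and the bias value of \(0.1\,\text {TeV}^{-1}\) are illustrative assumptions, not values taken from the analysis code; momenta are expressed in TeV so that \(\delta _{\mathrm {sagitta}}\) is in \(\text {TeV}^{-1}\).

```python
def biased_momentum(p, p_t, q, delta_sagitta):
    """Reconstructed momentum p' from Eq. (22): p' = p / (1 + q * pT * delta)."""
    return p / (1.0 + q * p_t * delta_sagitta)


# Hypothetical central track: p = pT = 45 GeV, expressed in TeV.
p = p_t = 0.045
delta = 0.1  # assumed sagitta bias in TeV^-1

# Relative shift of the reconstructed momentum for each charge.
shift_plus = biased_momentum(p, p_t, +1, delta) / p - 1.0
shift_minus = biased_momentum(p, p_t, -1, delta) / p - 1.0

print(f"relative shift, q=+1: {shift_plus:+.5f}")
print(f"relative shift, q=-1: {shift_minus:+.5f}")
```

Under these assumptions the two shifts have opposite sign and nearly equal magnitude, of order \(p_{\mathrm {T}}\,\delta _{\mathrm {sagitta}}\), and the effect grows linearly with \(p_{\mathrm {T}}\).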

Table 4 Event selection criteria for \(Z\rightarrow \mu ^{+} \mu ^{-}\) (Sect. 6.1.1) and \(Z\rightarrow e^{+} e^{-}\) (Sect. 6.1.2) candidate events for the analyses of the sagitta biases in data. Events triggered by the lowest-threshold unprescaled single and double electron and muon triggers are used to select \(Z\rightarrow \mu ^{+} \mu ^{-}\) and \(Z\rightarrow e^{+} e^{-}\) event candidates. \(\gamma (\mu ^{+}, \mu ^{-})\) is the opening angle between the muons. \(\Delta d_0(\mu ^{+}, \mu ^{-})\) and \(\Delta z_0(\mu ^{+}, \mu ^{-})\) are defined as the difference in \(d_0\) and \(z_0\) between the two muons

Two iterative methods are used to determine the sagitta biases. The first method uses \(Z\rightarrow \mu ^+\mu ^-\) decays. The second uses the electromagnetic calorimeter as a reference, and utilises the ratio of the measured energy deposited in the calorimeter (E) to the momentum (p) measured by the ID for electrons and positrons. Both methods allow the detector to be segmented arbitrarily in \(\eta \) and \(\phi \), allowing the study of localised sagitta biases. Sagitta biases have, to a great extent, been corrected for during the determination of the alignment constants by adding constraints to the parameters of the tracks used to perform the detector alignment, as given by Eqs. (14) and (16) and explained in Sects. 3.2.4 and 3.2.5, and also in Refs. [3, 43]. The methods used to calculate the constraints are described below, and the residual sagitta biases after alignment corrections are shown.

6.1.1 Measuring sagitta biases using \(Z \rightarrow \mu ^{+} \mu ^{-}\) decays

The invariant mass, m, of two highly relativistic opposite-charge particles is given approximately by

$$\begin{aligned} m^2 = 2 p^+ p^- (1-\cos \gamma ), \end{aligned}$$

where \(p^+\) and \(p^-\) are the magnitudes of the momenta of the positively and negatively charged particles, and \(\gamma \) is defined as their opening angle. In the following, \(+\) and − superscripts refer to the properties of the positively and negatively charged muons respectively. Sagitta biases can be measured using any particle (of reasonably narrow width) that decays into pairs of stable particles. In LHC conditions, resonances that decay into pairs of muons (such as \(J/\psi \), \(\Upsilon \) and Z) present the advantage that the dimuon signature can be clearly distinguished from the large hadronic background. For \(\delta _{\mathrm {sagitta}}\) studies, \(Z\rightarrow \mu ^+\mu ^-\) decays are preferred due to the high momentum of the Z decay products. Data quality selection criteria, summarised in Table 4, are applied to both the selected muon candidates and the dimuon system. In total, more than 70 million \(Z\rightarrow \mu ^{+} \mu ^{-}\) candidate events were selected.

In general, geometrical distortions that bias sagitta measurements can be localised in specific regions of the detector. As a result, the sagitta bias parameter explicitly depends on the path of the track, which can be approximated by the direction of the track at the pp interaction point, given by \(\eta \) and \(\phi \): \(\delta _{\mathrm {sagitta}} \rightarrow \delta _{\mathrm {sagitta}} (\eta ,\phi )\). The difference at leading order in \(\delta _{\mathrm {sagitta}} (\eta ,\phi )\) between the reconstructed dimuon invariant mass using the uncorrected geometry (\(m_{\mu \mu }\)) and the expected mass (\(m_{Z}\)) for each event is given by:

$$\begin{aligned}&m_{\mu \mu }^{2}- m^{2}_{Z} \approx -m^{2}_{Z}~\left( p'^{+}_{\mathrm {T}} ~\delta _{\mathrm {sagitta}} (\eta ^+, \phi ^+) \right. \nonumber \\&\left. \quad -p'^{-}_{\mathrm {T}}~ \delta _{\mathrm {sagitta}}(\eta ^-, \phi ^-) \right) \, , \end{aligned}$$
(23)

where \(m_{Z}\) is a reference mass (in this case the world average mass for the Z boson [44]).

An iterative procedure is used to determine \(\delta _{\mathrm {sagitta}} (\eta ,\phi )\). For the i-th iteration, a correction \(\Delta \delta _{\mathrm {sagitta},i}(\eta ,\phi )\) is computed for every muon in the \(Z\rightarrow \mu ^+\mu ^-\) sample with:

$$\begin{aligned}&\Delta \delta _{\mathrm {sagitta},i} (\eta ,\phi ) \nonumber \\&\quad =-q~\frac{m_{\mu \mu }^{2}- m^{2}_{Z}}{2~m^{2}_{Z}}~ \frac{\left( 1+~q~p'_{\mathrm {T}} ~ \langle \delta _{\mathrm {sagitta},{i-1}} (\eta ,\phi )\rangle \right) }{p_{\text {T}} '}, \end{aligned}$$
(24)

where \(\langle \delta _{\mathrm {sagitta},{i-1}} (\eta ,\phi )\rangle \) is the result of the previous iteration. The corrections are updated by adding the average of the current iteration to the result of the previous iteration:

$$\begin{aligned} \langle \delta _{\mathrm {sagitta},{i}} (\eta ,\phi )\rangle = \langle \delta _{\mathrm {sagitta},{i-1}} (\eta ,\phi )\rangle + \langle \Delta \delta _{\mathrm {sagitta},i} (\eta ,\phi ) \rangle , \end{aligned}$$
(25)

where \(\langle \Delta \delta _{\mathrm {sagitta},i} (\eta ,\phi ) \rangle \) is the average bias in an \((\eta ,\phi )\) region. The value of \(m_{\mu \mu }^{2}\) is computed using Eq. (23) with the values of \(\delta _{\mathrm {sagitta}}\) from the previous iteration. The iterations are repeated until convergence is reached.
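
The iteration defined by Eqs. (24) and (25) can be illustrated with a toy study. The sketch below is not the ATLAS implementation: it assumes massless muons from a zero-width Z boson with only a longitudinal boost, bins the detector in \(\eta \) only, injects a hypothetical bias pattern through Eq. (22), and recomputes the dimuon mass at each iteration from momenta corrected by inverting Eq. (22) with the current estimate. All names, bin edges and bias values are invented for illustration.

```python
import numpy as np

M_Z = 0.0911876  # reference Z mass in TeV, so that delta_sagitta is in TeV^-1


def mass2(pt_plus, pt_minus, eta_plus, eta_minus):
    """Squared mass of two massless muons that are back-to-back in phi."""
    return 2.0 * pt_plus * pt_minus * (np.cosh(eta_plus - eta_minus) + 1.0)


def fit_sagitta(pt_p, pt_m, eta_p, eta_m, bin_p, bin_m, n_bins, n_iter=50):
    """Iterate Eqs. (24) and (25) on reconstructed transverse momenta."""
    delta = np.zeros(n_bins)
    for _ in range(n_iter):
        d_p, d_m = delta[bin_p], delta[bin_m]
        # Undo Eq. (22) with the current estimate: 1/pT = 1/pT' - q * delta.
        m2 = mass2(pt_p / (1.0 - pt_p * d_p), pt_m / (1.0 + pt_m * d_m),
                   eta_p, eta_m)
        r = (m2 - M_Z**2) / (2.0 * M_Z**2)
        # Eq. (24), evaluated for the mu+ (q = +1) and the mu- (q = -1).
        dd_p = -r * (1.0 + pt_p * d_p) / pt_p
        dd_m = +r * (1.0 - pt_m * d_m) / pt_m
        # Eq. (25): add the per-bin average correction to the previous result.
        sums = (np.bincount(bin_p, weights=dd_p, minlength=n_bins)
                + np.bincount(bin_m, weights=dd_m, minlength=n_bins))
        counts = (np.bincount(bin_p, minlength=n_bins)
                  + np.bincount(bin_m, minlength=n_bins))
        delta = delta + sums / np.maximum(counts, 1)
    return delta


# Toy sample (all values hypothetical).
rng = np.random.default_rng(1)
n_events = 100_000
edges = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])
delta_true = np.array([0.3, -0.1, 0.0, 0.1, -0.3])  # TeV^-1 per eta bin

eta_star = rng.uniform(-1.5, 1.5, n_events)  # mu+ pseudorapidity in Z frame
y_z = rng.uniform(-1.0, 1.0, n_events)       # Z rapidity (longitudinal boost)
eta_p, eta_m = eta_star + y_z, -eta_star + y_z
pt_true = 0.5 * M_Z / np.cosh(eta_star)      # same true pT for both muons

n_bins = len(delta_true)
bin_p = np.clip(np.digitize(eta_p, edges) - 1, 0, n_bins - 1)
bin_m = np.clip(np.digitize(eta_m, edges) - 1, 0, n_bins - 1)

# Reconstructed momenta biased according to Eq. (22).
pt_p = pt_true / (1.0 + pt_true * delta_true[bin_p])
pt_m = pt_true / (1.0 - pt_true * delta_true[bin_m])

delta_est = fit_sagitta(pt_p, pt_m, eta_p, eta_m, bin_p, bin_m, n_bins)
print("estimated minus mean:", np.round(delta_est - delta_est.mean(), 3))
print("true minus mean:     ", np.round(delta_true - delta_true.mean(), 3))
```

In this toy, only the differences between bins are compared with the injected values, because a bias common to all bins leaves the reconstructed mass unchanged to first order. A realistic study would also need the Z natural width, detector resolution and a two-dimensional \((\eta ,\phi )\) binning.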

The method, as described by Eq. (23), is only sensitive to relative sagitta biases in different sectors of the detector. An alternative method, comparing the \(p_{\text {T}} \) spectra of the \(\mu ^+\) and \(\mu ^-\) [45], was also tested. This method is sensitive to global sagitta biases, although it is also subject to detector acceptance effects and requires more data to achieve the same statistical precision as the mass-based method.

Figure 18 (left) shows the measured sagitta distortions as a function of the track direction using this technique. The central barrel region of the detector is largely free of sagitta bias, while the endcap regions exhibit some areas of small residual sagitta bias. The distribution of \(\delta _{\mathrm {sagitta}}\) for the full Run 2 data is shown in Fig. 18 (right). Figure 19 shows the average \(\delta _{\mathrm {sagitta}}\) versus \(\eta \) and \(\phi \), as well as its RMS. The distributions, split by data-taking year, have compatible shapes, indicating a consistent and stable detector geometry during Run 2.

Fig. 18

Sagitta biases versus \(\eta \) and \(\phi \) (left) for 2018 data and the overall sagitta biases in the Run 2 data (right) for the \(Z\rightarrow \mu ^+\mu ^-\) method. The error bars represent the statistical uncertainty

Fig. 19

Measured sagitta biases versus \(\eta \) (top) and \(\phi \) (bottom) using the \(Z\rightarrow \mu ^+\mu ^-\) method. The average (left) and the RMS (right) of the sagitta bias are shown. The markers of the data points of the different years are slightly shifted in \(\eta \) and \(\phi \) for better visibility. The error bars represent the statistical uncertainty

6.1.2 Measuring sagitta biases using the E/p ratio of electrons and positrons

Assuming that the calorimeter response is independent of the charge of the incoming particle and that a perfectly aligned detector reconstructs the momentum of charged particles correctly, charge-dependent momentum biases are expected to result in differences in the E/p ratio of positive and negative particles. This ratio is defined as the ratio of the calorimeter energy measurement (E) to the track momentum measurement (p). This technique is mainly suitable for electrons and positrons. In the presence of a sagitta bias, the \(\langle E/p \rangle \) ratio would be modified as \(\langle E/p' \rangle = \langle E/p \rangle + q ~\langle E_\mathrm {T} \rangle ~\delta _{\mathrm {sagitta}}\), where \(E_\mathrm {T} \equiv E/\cosh {\eta }\) is referred to as the transverse energy. Assuming that the average transverse momentum of positrons and electrons is equal, the sagitta bias can be estimated [3] as

$$\begin{aligned} \delta _{\mathrm {sagitta}} = \frac{ \left\langle E/p' \right\rangle ^{+} - \left\langle E/p' \right\rangle ^{-} }{2 \left\langle E_\mathrm {T}\right\rangle }. \end{aligned}$$

To take into account any biases introduced by the aforementioned assumptions, the value of \(\delta _{\mathrm {sagitta}}\) is determined iteratively, correcting the momentum using Eq. (22) at each iteration. It should be noted that biases in the calorimeter energy scale cancel out to first order and any residual dependence would be reduced by this iterative procedure. In addition, this method is, by construction, sensitive to global sagitta biases. Data quality selection criteria are applied to both the selected electron candidates and the electron–positron system and are summarised in Table 4.
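
The E/p estimator above can be sketched on a toy sample as follows. This is a single-pass illustration under simplifying assumptions, not the iterative procedure used in the analysis: the toy assumes \(E = p\) for an unbiased track (no bremsstrahlung), a flat transverse-momentum spectrum, a Gaussian calorimeter resolution of 2%, and a common energy-scale error of 1%. The function name and all numerical values are hypothetical.

```python
import numpy as np


def sagitta_from_e_over_p(e_over_p, e_t, q):
    """Estimate delta_sagitta from the charge asymmetry of <E/p'>."""
    pos, neg = q > 0, q < 0
    return (e_over_p[pos].mean() - e_over_p[neg].mean()) / (2.0 * e_t.mean())


# Toy sample (all values hypothetical); momenta and energies in TeV.
rng = np.random.default_rng(7)
n = 200_000
q = rng.choice([-1, 1], n)
pt_true = rng.uniform(0.03, 0.06, n)
delta_true = 0.1     # injected global sagitta bias in TeV^-1
energy_scale = 1.01  # assumed common calorimeter scale error

# Measured transverse energy, including scale error and resolution.
e_t = energy_scale * pt_true * rng.normal(1.0, 0.02, n)
# Track momentum biased according to Eq. (22).
pt_reco = pt_true / (1.0 + q * pt_true * delta_true)
# E and p are measured at the same eta, so E/p equals E_T/p_T.
e_over_p = e_t / pt_reco

delta_est = sagitta_from_e_over_p(e_over_p, e_t, q)
print(f"estimated delta_sagitta = {delta_est:.3f} TeV^-1")
```

In this toy the energy-scale factor multiplies both the numerator and \(\langle E_\mathrm {T} \rangle \), which illustrates why a common calorimeter scale error largely cancels, and a global bias is recovered because the calorimeter provides a charge-independent reference.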

Figure 20 shows the \(\delta _{\mathrm {sagitta}}\) as obtained from the E/p method. These results support the observations from Sect. 6.1.1: the central barrel region of the detector is largely free of sagitta bias, while the endcap regions exhibit areas of small residual sagitta bias. Compared to Fig. 18 (right) a global offset of \({\sim }0.05\,\text {TeV}^{-1}\) can be seen in Fig. 20 (right) indicating the presence of a small global sagitta bias. Figure 21 shows the average \(\delta _{\mathrm {sagitta}}\) versus \(\eta \) and \(\phi \), as well as its RMS. The \(\delta _{\mathrm {sagitta}}\) distributions from the E/p method split by data-taking year have shapes comparable to those obtained from the \(Z\rightarrow \mu ^+\mu ^-\) mass method, further supporting the observation of a consistent and stable detector geometry during Run 2. The change in position due to the residual sagitta bias (\({\sim }0.1\,\text {TeV}^{-1}\)) when extrapolating a track from the detector origin to the outermost SCT endcap disk (radius of 500 mm and a z-axis distance of 2720 mm from the detector origin) is less than \(10\,\upmu \hbox {m}\).
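
The size of this displacement can be cross-checked with a rough circular-arc estimate. The following is a sketch under stated assumptions rather than the calculation used in the analysis: it assumes a uniform solenoidal field of \(B = 2\,\text {T}\) (an idealisation, since the field is weaker towards the endcaps) and the standard relation \(\kappa \approx 0.3\,B/p_{\mathrm {T}}\) between the track curvature \(\kappa \) (in \(\text {m}^{-1}\)), the field (in T) and the transverse momentum (in GeV). The symbols \(\kappa \) and \(\Delta x\) are introduced here for illustration only. Equation (22) implies \(1/p'_{\mathrm {T}} = 1/p_{\mathrm {T}} + q\,\delta _{\mathrm {sagitta}}\), so the curvature is shifted by a momentum-independent amount \(\Delta \kappa \), and the transverse displacement \(\Delta x\) at a radius \(r = 0.5\,\text {m}\) grows quadratically with \(r\):

$$\begin{aligned} \Delta \kappa&\approx 0.3 \times 2 \times 10^{-4}\,\text {m}^{-1} = 6 \times 10^{-5}\,\text {m}^{-1}, \\ \Delta x&\approx \tfrac{1}{2}~\Delta \kappa ~r^{2} = \tfrac{1}{2} \times 6 \times 10^{-5}\,\text {m}^{-1} \times (0.5\,\text {m})^{2} = 7.5\,\upmu \hbox {m}, \end{aligned}$$

where \(\delta _{\mathrm {sagitta}} = 0.1\,\text {TeV}^{-1} = 10^{-4}\,\text {GeV}^{-1}\) has been used. Under these assumptions the result is consistent with the quoted bound of \(10\,\upmu \hbox {m}\).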

Fig. 20

Sagitta biases versus \(\eta \) and \(\phi \) (left) for 2018 data and the overall sagitta biases of the Run 2 data (right) for the E/p method. The error bars represent the statistical uncertainty

Fig. 21

Sagitta biases versus \(\eta \) (top) and \(\phi \) (bottom) for the E/p method. The average (left) and the RMS (right) of the sagitta bias are shown. The markers of the data points of the different years are slightly shifted in \(\eta \) and \(\phi \) for better visibility. The error bars represent the statistical uncertainty

Fig. 22

Ratio of the measured mass to the reference as a function of \(\sin ^2\alpha \). Due to event kinematics, \(J/\psi \rightarrow \mu ^+\mu ^-\) events (left) cover the entire \(\sin ^2\alpha \) range while \(Z\rightarrow \mu ^+\mu ^-\) events (right) cover a smaller range. Error bars represent the statistical uncertainty. The red lines show the fit to Eq. (28) from which the values of \(\varepsilon _s\) and \(\varepsilon _{r'}\) are extracted

Fig. 23

The measured momentum scale bias \(\varepsilon _{s}\) as a function of track \(p_{\text {T}}\). Error bars represent the statistical uncertainty. Left: \(J/\psi \rightarrow \mu ^+\mu ^-\) decays; right: \(Z\rightarrow \mu ^+\mu ^-\) decays

6.2 Length scale biases

Displacements of the reconstructed hits parallel to the track direction result in a charge-symmetric alteration of the measured track curvature. In a tracker with a solenoidal magnetic field these can be induced by changes in the radial or longitudinal length scale of the detector with little impact on the track fit quality.

If the actual radius of a detector module, R, is assumed to be \(R~(1+\varepsilon _{r})\), then for small distortions (\(|\varepsilon _{r}| \ll 1\)), the reconstructed momentum will be:

$$\begin{aligned} \begin{aligned} p'_{\mathrm {T}}&= p_{\mathrm {T}} \left( 1+ \varepsilon _{r}\right) \\ p'_{\mathrm {z}}&= p_{\mathrm {z}}. \end{aligned} \end{aligned}$$
(26)

Equation (26) assumes that the length scale in the bending plane also expands by a factor \((1+\varepsilon _{r})\), which implies that dimensions of sensitive detector modules would also expand by the same factor. If it is assumed that detector modules do not expand in the bending plane then the reconstructed transverse momentum will be biased by a factor of \((1+2\varepsilon _{r})\).

Similarly, if the actual longitudinal dimension of a detector module, z, is assumed to be \(z~(1+\varepsilon _{z})\), the reconstructed momentum will be:

$$\begin{aligned} \begin{aligned} p'_{\mathrm {T}}&= p_{\mathrm {T}} \\ p'_{\mathrm {z}}&= p_{\mathrm {z}} \left( 1+ \varepsilon _{z}\right) \,. \end{aligned} \end{aligned}$$
(27)

Linear combinations of both the radial and longitudinal biases are also considered. It is worth noting that there is a degeneracy between the effects of a bias in the magnetic field and a global scaling of the detector (radial and longitudinal: \(\varepsilon _{s}\)), as both lead to a momentum bias of the form \(\pmb {p}~(1+ \varepsilon _{s})\). Consequently, if B is assumed to be \(B~(1+\varepsilon _s)\) the particle momentum scales as \(\pmb {p}~(1+ \varepsilon _{s})\).

The relationship between the reconstructed invariant mass of a particle decaying into two muons (\({m'}_{\mu \mu }\)), and the true mass (\(m_{\mu \mu }\)), assuming that the radial and longitudinal biases in Eqs. (26) and (27) are both small, is given by:

$$\begin{aligned} \begin{aligned} {m'}_{\mu \mu }^2&\approx m_{\mu \mu }^2 + 2 E^+E^- \left[ \left( \beta _{\mathrm {T}}^+ \right) ^2 - \pmb {\beta }_{\mathrm {T}}^+ \cdot \pmb {\beta }_{\mathrm {T}}^- \right] \varepsilon _{r}(\eta ^+,\phi ^+) \\&\quad + 2 E^+E^- \left[ \left( \beta _{\mathrm {T}}^- \right) ^2 - \pmb {\beta }_{\mathrm {T}}^+ \cdot \pmb {\beta }_{\mathrm {T}}^- \right] \varepsilon _{r}(\eta ^-,\phi ^-) \\&\quad + 2 E^+E^- \left[ \left( \beta _{\mathrm {z}}^+ \right) ^2 - \pmb {\beta }_{\mathrm {z}}^+ \cdot \pmb {\beta }_{\mathrm {z}}^- \right] \varepsilon _{z}(\eta ^+,\phi ^+) \\&\quad + 2 E^+E^- \left[ \left( \beta _{\mathrm {z}}^- \right) ^2 - \pmb {\beta }_{\mathrm {z}}^+ \cdot \pmb {\beta }_{\mathrm {z}}^-\right] \varepsilon _{z}(\eta ^-,\phi ^-) \, , \end{aligned} \end{aligned}$$

where \(\pmb {\beta } = \pmb {p}/E\) is the velocity of the particle. This approximation is valid to first order in \(\varepsilon \).

In a simpler case, where only a global radial and longitudinal bias are present, the reconstructed mass is:

$$\begin{aligned} \begin{aligned} {m'}_{\mu \mu }^2&\approx m_{\mu \mu }^2 + 2 E^+E^- \left[ \pmb {\beta }_{\mathrm {T}}^+ - \pmb {\beta }_{\mathrm {T}}^- \right] ^2 \varepsilon _{r} \\&\quad + 2 E^+E^- \left[ \pmb {\beta }_{\mathrm {z}}^+ - \pmb {\beta }_{\mathrm {z}}^- \right] ^2 \varepsilon _{z} \,, \end{aligned} \end{aligned}$$

which, in the limit where the muon mass is ignored, leads to

$$\begin{aligned} {m'}_{\mu \mu }^2\approx & {} m_{\mu \mu }^2 + 2 m_{\mu \mu }^2\varepsilon _{r}~\sin ^2\alpha + 2 m_{\mu \mu }^2~\varepsilon _{z}~\cos ^2\alpha \\\approx & {} m_{\mu \mu }^2 + 2 m_{\mu \mu }^2 \left( \varepsilon _{s}\ + \varepsilon _{r'}~\sin ^2\alpha \right) \, , \end{aligned}$$

where

$$\begin{aligned} \sin ^2\alpha = E^+E^- \left[ \pmb {\beta }_{\mathrm {T}}^+ - \pmb {\beta }_{\mathrm {T}}^- \right] ^2 /m_{\mu \mu }^2\,, \end{aligned}$$
(28)

\(\varepsilon _{s} = \varepsilon _{z}\), and \(\varepsilon _{r'} = \varepsilon _{r}-\varepsilon _{z}\) is the difference between the radial and longitudinal components of the momentum scale.Footnote 6

Thus, by measuring the mass as a function of \(\sin ^2\alpha \) it is possible to differentiate between radial and scale biases. Figure 22 shows the measured scale using \(J/\psi \) and Z decays into \(\mu ^+\mu ^-\) in the barrel of the ID. The results show a clear momentum scale bias but no significant radial scale (\(\varepsilon _{r'}\)) as the reconstructed mass is constant as a function of \(\sin ^2\alpha \).
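
The extraction of \(\varepsilon _s\) and \(\varepsilon _{r'}\) can be sketched as a straight-line fit. Taking the square root of the first-order relation above gives \(m'_{\mu \mu }/m_{\mu \mu } \approx 1 + \varepsilon _s + \varepsilon _{r'}\sin ^2\alpha \), so the intercept and slope of the mass ratio versus \(\sin ^2\alpha \) yield the two parameters. The example below uses a weighted least-squares fit from NumPy on invented data points; the binning, the uncertainties and the injected values are assumptions made for illustration and are not taken from Fig. 22.

```python
import numpy as np


def fit_scale_biases(sin2_alpha, mass_ratio, mass_ratio_err):
    """Fit m'/m = 1 + eps_s + eps_r' * sin^2(alpha); return (eps_s, eps_r')."""
    slope, intercept = np.polyfit(sin2_alpha, mass_ratio, 1,
                                  w=1.0 / mass_ratio_err)
    return intercept - 1.0, slope


# Toy data points (hypothetical): ten bins in sin^2(alpha).
rng = np.random.default_rng(5)
sin2_alpha = np.linspace(0.05, 0.95, 10)
eps_s_true, eps_r_true = -0.9e-3, 0.0
err = np.full_like(sin2_alpha, 5e-5)  # assumed statistical uncertainty
ratio = 1.0 + eps_s_true + eps_r_true * sin2_alpha + rng.normal(0.0, err)

eps_s, eps_r = fit_scale_biases(sin2_alpha, ratio, err)
print(f"eps_s  = {eps_s:+.2e}")
print(f"eps_r' = {eps_r:+.2e}")
```

A slope compatible with zero corresponds to the situation described in the text, in which the reconstructed mass is constant in \(\sin ^2\alpha \) and only a global scale bias is present.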

An analysis using an iterative procedure similar to the \(\delta _{\mathrm {sagitta}}\) method, Eq. (24), is also performed. Here, the momentum scale factor (\(\varepsilon _s\)) is computed and consequently used to update the momentum of the tracks at the next iteration. This method allows biases to be measured as a function of any kinematic or geometric parameter. The results as a function of the track \(p_{\text {T}}\) are presented in Fig. 23. The magnitude of the momentum scale bias is observed to be constant as a function of track \(p_{\text {T}}\) as expected from a length scale or magnetic field strength bias.

The magnitude of the measured scale bias is consistent for the two studies, demonstrating that there is a global momentum scale bias of \(\varepsilon _s \approx - 0.9 \times 10^{-3}\). This result is in agreement with the momentum scale at the ID for muons [38]. As previously highlighted, the origin of such a global momentum scale bias cannot be unambiguously resolved by these studies. It should be noted that the measurement of the absolute scale of the magnetic field has an uncertainty about four times smaller than the observed scale bias [7].

7 Impact parameter biases

The weak modes of the alignment can also lead to a bias in the transverse (\(d_0\)) and longitudinal (\(z_0\)) impact parameters. For example, a rotation of the IBL or radial distortions of the Pixel layers can lead to transverse impact parameter biases. The quality of the detector alignment can be assessed by analysing impact parameter biases as a function of track \(p_{\mathrm {T}}\) and \(\eta \). For this study, events are selected using a combination of single-jet triggers with several jet \(p_{\mathrm {T}}\) thresholds starting at \(100\,\text {GeV}\). The standard ATLAS event cleaning selection is applied, ensuring that all detectors were fully operational. In order to disentangle the biases due to residual misalignment from those originating from the track reconstruction algorithms, recorded data are compared with a dijet Monte Carlo simulation sample generated with Pythia [47]. The primary vertex of each selected event must have at least three tracks associated with it. Tracks are selected by requiring them to be assigned to jets using ghost association [48], a procedure that treats them as four-vectors of infinitesimal momentum magnitude during the jet reconstruction and assigns them to the jet with which they are clustered. Jets are reconstructed using the anti-\(k_t\) algorithm [49] with radius parameter \(R = 0.4\). In addition, tracks are required to have at least 9 silicon (Pixels + SCT) hits for \(|\eta |\le 1.65\), at least 11 silicon hits for \(|\eta |>1.65\), a maximum of 2 SCT holes,Footnote 7 no Pixel hole, \(p_{\mathrm {T}}>3\,\text {GeV}\), \(|\eta |<2.5\), and an opening angle \(\Delta R({\text {track},\text {jet}})<0.4\) relative to the reconstructed jet axis. A track \(p_{\mathrm {T}}\) of \(3\,\text {GeV}\) corresponds to the lowest momentum threshold typically used within the alignment to reduce MCS effects (see Sect. 3.1). 
The impact parameters are obtained relative to the primary vertex by extrapolating the particle trajectory to its position. This is particularly relevant for the longitudinal impact parameter, as the width of the luminous region in the z direction is very broad. The impact parameter biases are extracted by iterativelyFootnote 8 fitting the distribution of impact parameters relative to the primary vertex with a Gaussian function within a \(\pm 2\sigma \) range until the fitted \(\mu \) and \(\sigma \) are stable within 1%. The resulting value of the Gaussian mean (\(\mu \)) represents the estimate of the impact parameter bias.
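
The iterative core estimate described above can be approximated as follows. This sketch replaces the Gaussian fit by truncated moments: the mean and RMS are recomputed inside a \(\pm 2\sigma \) window, and the RMS is rescaled by the known ratio between the truncated and the full width of a Gaussian. That substitution, the convergence test on the mean (change below 1% of \(\sigma \)), the function name and the toy distribution are all assumptions made for illustration.

```python
import math

import numpy as np


def core_gaussian(values, n_sigma=2.0, tol=0.01, max_iter=50):
    """Iteratively estimate the mean and width of the core of a distribution."""
    # Ratio of truncated to full sigma for a Gaussian cut at +-n_sigma.
    phi = math.exp(-0.5 * n_sigma**2) / math.sqrt(2.0 * math.pi)
    shrink = math.sqrt(1.0 - 2.0 * n_sigma * phi
                       / math.erf(n_sigma / math.sqrt(2.0)))
    mu, sigma = float(np.mean(values)), float(np.std(values))
    for _ in range(max_iter):
        core = values[np.abs(values - mu) < n_sigma * sigma]
        new_mu = float(core.mean())
        new_sigma = float(core.std()) / shrink
        # Stop once both parameters are stable at the 1% level.
        converged = (abs(new_mu - mu) < tol * new_sigma
                     and abs(new_sigma / sigma - 1.0) < tol)
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


# Toy d0 distribution in micrometres (hypothetical): a Gaussian core with a
# 3 um bias plus 10% of wide tails.
rng = np.random.default_rng(3)
d0 = np.concatenate([rng.normal(3.0, 20.0, 180_000),
                     rng.normal(3.0, 100.0, 20_000)])

bias, width = core_gaussian(d0)
print(f"bias = {bias:.2f} um, core width = {width:.1f} um")
```

Restricting the estimate to the core reduces the influence of the non-Gaussian tails on the mean, which is the quantity quoted as the impact parameter bias.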

Fig. 24

The transverse (left) and longitudinal (right) impact parameter biases as a function of the Run 2 delivered luminosity. The red dotted line indicates the change in the underlying ATLAS ID alignment geometry description. This splits the 2016 data in two periods. The grey dotted lines indicate the change of the data-taking years. The \(\sim \,4\hbox { fb}^{-1}\) corresponding to 2015 data are not shown in this plot. Only statistical uncertainties are shown

Fig. 25

The transverse (left) and longitudinal (right) impact parameter biases as a function of the track \(p_{\mathrm {T}}\). The 2016 data entries in this figure are taken from the second part of the 2016 data visible in Fig. 24; the first part of the 2016 data also shows no impact parameter dependence on track \(p_{\mathrm {T}}\)

Fig. 26

The transverse (left) and longitudinal (right) impact parameter biases as a function of the track \(\eta \). The 2016 data entries in this figure are taken from the second part of the 2016 data visible in Fig. 24; the first part of the 2016 data also shows no impact parameter dependence on track \(\eta \)

Figure 24 shows the transverse and longitudinal impact parameter biases as a function of the delivered luminosity in Run 2. Data collected in 2016 have a period-dependent \(d_0\) bias of \(-4\,\upmu \hbox {m}\) (early 2016) and \(+3\,\upmu \hbox {m}\) (late 2016). This bias was introduced by a change in the underlying geometry description of the ATLAS ID and a misconfiguration of the beam-spot constraint.Footnote 9 Data collected in 2017 and 2018 show overall \(d_0\) biases of less than \(1\,\upmu \hbox {m}\). The longitudinal impact parameter bias is negligible and constant across the years (below \(0.5\,\upmu \hbox {m}\)). In Figs. 25 and 26 the transverse and longitudinal impact parameter biases are shown as a function of the track transverse momentum and track \(\eta \), respectively. The small bias in the longitudinal impact parameter as a function of track \(\eta \) is present in both simulation and data; it is therefore not introduced by the track-based alignment, which is not applied to simulation (where perfect alignment is assumed). The resulting bias has no significant effect on the ATLAS tracking performance as the longitudinal impact parameter resolution is on the order of \(100\,\upmu \hbox {m}\) for tracks with \(p_{\mathrm {T}} > 10\,\text {GeV}\).

8 Conclusion

This paper describes the precision alignment of the ATLAS Inner Detector (ID) for Run 2 and quantifies the impact of alignment uncertainties on track parameter biases. The alignment procedure consists of a track-based algorithm that minimises track-hit residuals. It calculates the track parameters at each measurement surface and encodes the relationship between track-hit residuals and the alignment parameters of each alignable structure. To resolve ambiguities, it imposes externally determined constraints on track parameters, e.g. using tracks from resonance decays. The alignment procedure is performed at different hierarchical levels, starting from the largest physical structures and proceeding to individual detector modules or sensor elements. The number of degrees of freedom increases for each subsequent alignment level. In total, more than 36,000 degrees of freedom are considered when aligning all silicon modules (IBL, Pixel and SCT) and more than 700,000 degrees of freedom are added for the TRT.

It has been observed that operational conditions affect the positions of ID elements. The Pixel detector moves rapidly upwards every time the data acquisition is activated. The staves of the IBL bow depending on the temperature; the degree of variation depends on the thermal load and is a function of the accumulated radiation dose and of the luminosity. The remaining detector structures are quite stable during an LHC fill; the movements of individual modules in the barrel have an RMS at the micrometer level while those in the endcap regions range from \(2\,\upmu \hbox {m}\) to \({\mathcal {O}}(10\,\upmu \hbox {m})\).

An automated alignment procedure that corrects for relatively rapid movements of the Pixel detector and IBL and the relative positions of all other subdetectors is executed for every LHC fill for which the ID collects data. The detailed alignment of all the other structures (subdetectors, barrel, endcaps, layers, disks, modules or wires) is determined in dedicated alignment campaigns. The impact of alignment weak modes, namely distortions that leave the track fit quality largely unchanged and can bias the measured track parameters, is minimised during these campaigns by employing external constraints on track parameters. Independent measurements are performed to quantify potential biases, enabling them to be largely removed. The residual sagitta bias and momentum scale bias after the full Run 2 alignment are reduced to less than \({\sim }0.1\hbox { TeV}^{-1}\) and \(0.9\times 10^{-3}\), respectively. Remaining track parameter biases do not significantly impact ATLAS physics analyses.