THE INFLUENCE OF CONCENTRATION AND DYNAMICAL STATE ON SCATTER IN THE GALAXY CLUSTER MASS–TEMPERATURE RELATION

Hsiang-Yi Karen Yang; Paul M. Ricker; P. M. Sutter

doi:10.1088/0004-637X/699/1/315

1. INTRODUCTION

Galaxy clusters are potentially valuable cosmological probes because of their unique position in the hierarchy of structure formation. They are the largest gravitationally bound objects, having just separated from the cosmic expansion and collapsed from density fluctuations in the past few billion years. Therefore, statistical measures of clusters, such as their mass distribution as a function of redshift, are sensitive to the cosmic matter density parameter Ω_m, the dark energy density parameter Ω_de, the normalization of the primordial fluctuation spectrum σ₈, and the dark energy equation-of-state parameter w (Haiman et al. 2005). Future cluster surveys will yield cosmological parameter constraints that are complementary to those from upcoming microwave background probes (e.g., Planck) and Type Ia supernova observations.

However, clusters present us with a dilemma: their masses are well predicted by numerical simulations, but in the real world, 80%–85% of their mass is in the form of invisible dark matter. In order to make contact with observations, observational mass proxies are needed. Fortunately, cluster masses correlate with many observable quantities, such as X-ray temperature T_X, X-ray luminosity L_X, optical richness, infrared luminosity, and the Sunyaev–Zel'dovich effect (Mohr et al. 1999; Lin et al. 2003; Popesso et al. 2005; Hansen et al. 2005; Stanek et al. 2006). These mass-observable relations indicate that clusters are fairly regular and close to equilibrium, despite having dynamical timescales of order 1/10 the age of the universe.

For clusters to provide meaningful constraints on the cosmological parameters, the systematic errors in mass estimates based on these relations must be well understood. For example, masses determined under the assumption of hydrostatic equilibrium in general underestimate the spherical overdensity mass M₂₀₀ by ∼20% (Nagai et al. 2007b). The X-ray temperature determined through spectral fitting is also known to bias low with respect to the emission-weighted temperature commonly used in numerical simulations (Rasia et al. 2005). The discrepancy in the normalization of the M–T_X relation between observations and numerical simulations has been alleviated only recently by taking these two effects into account. This emphasizes the importance of using mock-observation tools to make direct comparisons between results from numerical simulations and observational data.

The origin and distribution of scatter in these relations are also important. Due to the exponential shape of the mass function, scatter in the M–X relation (for observable X) boosts the number density of clusters observed in logarithmic bins of X, as the overall number of lower-mass clusters scattering to higher values of X far exceeds the number of high-mass clusters scattering in the opposite direction. Underestimating this scatter can lead to an overestimate of σ₈, for instance (Randall et al. 2002). Attempts have been made to reduce the observed scatter to get better constraints on cluster masses. Such attempts include the use of core-excised quantities to reduce the large scatter in the L_X–T_X relation due to the effects of cool cores (Allen & Fabian 1998; O'Hara et al. 2006) and the invention of the X-ray counterpart of the Compton y-parameter, Y_X ≡ M_gasT_X, to obtain a very tight M–Y_X correlation (Kravtsov et al. 2006). These successful examples illustrate the possibility of obtaining better mass estimates if our knowledge of the physical origin of scatter is improved. It is also possible to self-calibrate cluster surveys (Levine et al. 2002; Majumdar & Mohr 2004; Lima & Hu 2004), fitting the mass-observable relation as an unknown together with the cosmological parameters. However, this technique requires assumptions about the functional form and the mass and redshift dependence of the scatter, and errors in these assumptions can lead to misinterpretation of the obtained constraints (Lima & Hu 2005).

In this paper, we begin a systematic study of the influence of internal cluster physics on the scatter in cluster mass-observable relations. We focus initially on X-ray observables, as they are less directly dependent on the uncertain details of galaxy formation than, for example, optical observables. Our aim in this paper is to examine the effects of mergers and dynamical state in isolation from effects due to radiative cooling, feedback due to stars and black holes, and diffusive transport. Future papers will examine these other effects separately. Accordingly, the simulation described here includes only dark matter and gasdynamics. While we examine substructure-based measures of dynamical state to make contact with previous work, we extend this work by considering direct measurements of dynamical state that may not be observable but that can give us physical insight into the origin and form of the scatter. For example, we use halo merger histories derived from halo catalogs created every 100 h⁻¹ Myr to identify which clusters are merging at any given epoch. Section 2 describes our numerical methods and simulation parameters. In Section 3, we describe our merger tree and virial analysis procedures, our method for generating simulated X-ray observations, and the substructure measures we employ. We present our results in Section 4 and discuss them in Section 5. Finally, we summarize our conclusions in Section 6.

Throughout this paper, we have taken the Hubble constant to be H₀ = 100 h km s⁻¹ Mpc⁻¹, with h = 0.708. When quoting masses or radii defined using an overdensity criterion (e.g., M₂₀₀, R₂₀₀), we refer to overdensities relative to the critical density at the relevant epoch.

2. COSMOLOGICAL SIMULATION

2.1. Numerical Methods

The simulation described here was performed using FLASH, an Eulerian hydrodynamics plus N-body code originally developed for simulations of Type Ia supernovae and related phenomena (Fryxell et al. 2000). The flexibility of FLASH's application framework has enabled it to be applied to a wide range of problems. In the process it has been extensively validated, both for hydrodynamical (Calder et al. 2002) and cosmological N-body (Heitmann et al. 2005, 2008) applications. We used version 2.4 of FLASH together with the local transform-based multigrid Poisson solver described by Ricker (2008). The Euler equations describing the behavior of the intracluster medium (ICM) were solved using the piecewise parabolic method (PPM); extensive details of the FLASH implementation of PPM are given by Fryxell et al. (2000). The N-body component describing the behavior of the dark matter was handled using the particle-mesh technique with cloud-in-cell interpolation. Because we are concerned in this paper only with the effect of gravity-driven variations in dynamical state on mass-observable relations, the calculation described here did not employ radiative cooling or feedback due to star formation or active galaxies.

2.2. Simulation Details

The results presented here are based on a FLASH simulation of structure formation in the ΛCDM cosmology within a three-dimensional cubical volume spanning 256 h⁻¹ Mpc. Initial conditions were generated for a starting redshift z of 66 using GRAFIC (Bertschinger 2001) with an initial power spectrum generated using CMBFAST (Seljak & Zaldarriaga 1996). The cosmological parameter values used were chosen to be consistent with the third-year Wilkinson Microwave Anisotropy Probe (WMAP) results (Spergel et al. 2007): present-day matter density parameter Ω_m0 = 0.262, present-day baryonic density parameter Ω_b0 = 0.0437, present-day cosmological constant density parameter Ω_Λ0 = 0.738, and matter power spectrum normalization σ₈ = 0.74. The simulation contains 1024³ dark matter particles with a particle mass m_p = 9.2 × 10⁸ h⁻¹M_☉. The mesh used for the gasdynamics and potential solution was fully refined to 1024³ zones, which corresponds to a zone spacing of 250 h⁻¹ kpc. Considering the effect of resolution on the computed abundances of halos of different mass (Lukić et al. 2007), with these parameters we are able to capture all halos containing more than 3150 particles (i.e., total mass 2.9 × 10¹² h⁻¹M_☉) and 1150 particles (i.e., 1.1 × 10¹² h⁻¹ M_☉) at z = 0 and z = 1, respectively. The halos are identified using the friends-of-friends (FOF) algorithm. The overdensity mass and radius, M_Δ and R_Δ, are then found by growing spheres around each FOF center until the averaged total density is Δ times the critical density of the universe.
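As an illustration of the sphere-growing step described above, the following is a minimal sketch rather than the halo-finding code actually used for the simulation. It assumes equal-mass particles, distances measured from a known halo center, and consistent units; the function name `overdensity_mass_radius` and its arguments are hypothetical.

```python
import math

def overdensity_mass_radius(distances, particle_mass, delta, rho_crit):
    """Grow a sphere outward over equal-mass particles and return (R_delta, M_delta).

    distances: particle distances from the halo center (any order).
    Returns the last radius, scanning outward, at which the mean enclosed
    density is still at least delta * rho_crit, or None if even the
    innermost particle fails the criterion.
    """
    threshold = delta * rho_crit
    result = None
    for count, r in enumerate(sorted(distances), start=1):
        if r <= 0.0:
            continue  # a particle exactly at the center has no defined mean density
        enclosed_mass = count * particle_mass
        mean_density = enclosed_mass / (4.0 / 3.0 * math.pi * r ** 3)
        if mean_density < threshold:
            break  # first crossing below the threshold: stop growing the sphere
        result = (r, enclosed_mass)
    return result
```

A production implementation would likely bin or smooth the profile to suppress particle noise at small radii; this sketch stops at the first particle for which the criterion fails.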

The simulation was carried out using 800 processors of the Cray XT4 system at Oak Ridge National Laboratory, requiring a total of 16,500 CPU-hours. Gas and particle snapshots were written to disk every 100 h⁻¹ Myr beginning at z = 2, yielding a total of 117 snapshots containing 15 TB of data.

3. ANALYSIS OF THE SIMULATION

3.1. Merger Tree Analysis

In order to directly quantify the dynamical state of clusters without relying on morphology, we generate merger trees for each cluster in our simulation and find the time since last merger. Here we summarize how the merger trees are extracted.

First, our simulation generates output files that contain particle tags and positions every 100 h⁻¹ Myr between z = 2 and z = 0. We run an FOF halo finder with linking length parameter b = 0.2 on the particle positions to find all the groups containing more than 10 particles. Although some of the very small groups are under our halo completeness limit, they have little effect on our results because we only look at minor or major mergers for which the mass of the smaller object is above our completeness limit. Thus, all mergers for the objects we study are counted. For each pair of successive outputs at times t_{n−1} and t_n, we find the progenitors at t_{n−1} of all the halos at t_n by tracing the particle tags, which are uniquely assigned to each particle at the beginning of the simulation. A halo A is identified as a progenitor of another halo B if A contains at least one particle that is also in B. For each halo we record the masses of its progenitors, their contributed masses, and the number of unbound particles. Then the merger trees are constructed by linking all the progenitors identified in the previous outputs for halos above our halo completeness limit at z = 0. The mass accretion histories are then derived by following the mass of the most massive progenitor back in time.
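The tag-tracing step can be sketched as follows. This is only an illustration under simplifying assumptions: halos are represented as sets of integer particle tags, every particle carries equal mass so that contributed mass is proportional to the number of shared particles, and the function name `find_progenitors` is hypothetical.

```python
from collections import Counter

def find_progenitors(halos_prev, halos_next):
    """Match halos between two successive outputs by shared particle tags.

    halos_prev, halos_next: dict mapping halo id -> set of particle tags.
    Returns a dict mapping each later halo id to a list of
    (progenitor id, number of shared particles), largest contribution first.
    """
    # Invert the earlier catalog: which halo did each tagged particle belong to?
    tag_to_prev = {}
    for halo_id, tags in halos_prev.items():
        for tag in tags:
            tag_to_prev[tag] = halo_id

    progenitors = {}
    for halo_id, tags in halos_next.items():
        # A single shared particle is enough to count as a progenitor.
        shared = Counter(tag_to_prev[tag] for tag in tags if tag in tag_to_prev)
        progenitors[halo_id] = sorted(shared.items(), key=lambda kv: (-kv[1], kv[0]))
    return progenitors
```

Particles in the later halo that belonged to no earlier group (for example, tag 8 in a halo `{1, 2, 3, 5, 6, 8}` when only tags 1 to 7 were grouped earlier) are simply skipped here; in a fuller treatment they would be counted as smoothly accreted material.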

To find the time since last merger for any given halo, we need to define what a "merger" is. There are many different ways to define mergers in cosmological simulations. In our analysis we adopt two definitions: the mass-jump definition, in which a merger is present if there is a mass jump in the halo's assembly history; and the mass-ratio definition, which identifies a merger if the ratio of contributed masses from the first- and second-ranked progenitors is less than a certain value (Cohn & White 2005). To study the variations in cluster observables induced by different types of mergers, we use 1.2 and 1.33 as thresholds for the mass-jump definition and 10:1, 5:1, and 3:1 in the mass-ratio definition. For each of the five criteria, the time since last merger is found for all clusters at z = 0 and z = 1.
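The two merger criteria can be written down compactly. The sketch below is illustrative only: it assumes the mass jump is evaluated as the ratio of the main-branch mass between two successive outputs (the text does not specify the interval), and all function names are hypothetical.

```python
def is_mass_jump_merger(mass_prev, mass_now, threshold=1.2):
    """Mass-jump criterion: the main-branch mass grew by at least `threshold`
    between two outputs (assumed here to be successive outputs)."""
    return mass_now / mass_prev >= threshold

def is_mass_ratio_merger(contributed_masses, max_ratio=5.0):
    """Mass-ratio criterion: the first- to second-ranked progenitor
    contribution is less than max_ratio (e.g. 5.0 for a 5:1 threshold)."""
    ranked = sorted(contributed_masses, reverse=True)
    if len(ranked) < 2 or ranked[1] <= 0.0:
        return False  # a single progenitor cannot define a merger
    return ranked[0] / ranked[1] < max_ratio

def time_since_last_merger(times, merger_flags):
    """times: output times in ascending order; merger_flags: one bool per output.
    Returns the time elapsed between the most recent flagged output and the
    final output, or None if no merger was flagged."""
    for t, flagged in zip(reversed(times), reversed(merger_flags)):
        if flagged:
            return times[-1] - t
    return None
```

For example, `is_mass_ratio_merger([8.0, 2.0, 0.5], max_ratio=5.0)` flags a 4:1 event, while the same progenitors would also pass the looser 10:1 criterion but not the 3:1 one.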

In the discussion that follows, "merging clusters" at a given lookback time are those identified by at least one of the five merger diagnostics in the preceding 3 Gyr, the typical time for clusters to return to virial equilibrium within R₅₀₀ (Poole et al. 2006). Mergers are "major" if the mass jump is larger than 1.2 or if the mass ratio is less than 5:1; "minor" mergers, on the other hand, have mass ratios between 10:1 and 5:1.

3.2. Simulated X-ray Observations

The observational X-ray temperature of the ICM, T_X, usually refers to the spectroscopic temperature, obtained by fitting a single-temperature model to the integrated cluster spectrum. Using simulation-based temperature proxies, such as emission-weighted temperature, has been shown to overestimate T_X (Mathiesen & Evrard 2001; Rasia et al. 2005). To directly compare simulated cluster properties with observations, we create mock Chandra images for our simulated clusters using the following procedure.

First, energy-dependent surface-brightness maps are constructed by projecting X-ray emission from all the gas cells associated with each cluster along each of the three orthogonal axes. The energy dependence is stored as the third dimension of the map between 0.04 and 10 keV with spacing ΔE = 0.0498 keV. With the gas density ρ and temperature T of each cell in each simulation output and an assumed metallicity Z, the X-ray emissivity, ε_E = ρ²Λ_E(T, Z, z), is computed using the single-temperature MEKAL model (Mewe et al. 1985; Kaastra & Mewe 1993; Liedahl et al. 1995) implemented in the utility XSPEC (Arnaud 1996).

With these energy-dependent surface-brightness maps in hand for each cluster, we then use MARX to simulate Chandra X-ray observations. Given the position and spectrum of a source, MARX can perform ray generation, apply built-in models for Chandra's aspect motion and mirror and detector responses, and output photon event files in FITS format for analysis with standard observational tools. The user-defined source model supported by MARX allows us to generate light rays from astronomical sources with arbitrary shape and spectrum. We assign the positions and energies of photons based on the probability distribution defined by our three-dimensional surface-brightness maps. Each cluster is "observed" with Chandra's CCD chip ACIS-S using an exposure time that ensures two million photons are collected, longer than typical deep observations. This particularly long exposure time is chosen to minimize observational uncertainties, since we are interested in the intrinsic scatter in cluster observables. The photon event files thus produced are processed with CIAO to extract the spectra within apertures of size R₅₀₀ centered on each cluster's surface-brightness peak. The spectroscopic temperatures T_X are obtained by using XSPEC to fit the spectra in the range 0.5–10 keV, the range often used by observers (e.g., Vikhlinin et al. 2006). We tested the above procedure using a set of isothermal clusters with β-model density profiles, varying the input temperature from 0.5 to 5 keV and using two different metallicities, Z = 0 and Z = 0.3 Z_☉. We have verified that for our chosen exposure time the input temperatures are recovered within 1σ errors in all cases. For simplicity we assume zero metallicity in the following analysis. A detailed description and verification tests of the same X-ray simulator can be found in the appendix of Zuhone et al. (2008).

The left panel in Figure 1 shows T_X versus the emission-weighted temperature T_ew measured within R₅₀₀ for clusters with M₅₀₀ above 2 × 10¹³ M_☉ at z = 0. We find that T_ew is on average biased high with respect to T_X with a fractional bias of (T_ew–T_X)/T_X ∼ 23%, consistent with previous findings (Mathiesen & Evrard 2001; Rasia et al. 2005). As pointed out by previous authors, this systematic shift is due to the superposition of cluster gas with different temperatures along the line of sight. To verify that this explains our result, we performed a test for a set of β-model clusters whose gas is composed of two different temperatures, with varied cool-gas fractions. We found that T_ew is higher than T_X in all cases. We can understand this result by expressing these temperature measures in the following way: T ≡ ∫W(T) T dV / ∫W(T) dV, where W(T) is a weighting function. For T_ew, $W(T)=\Lambda (T) \propto \sqrt{T}$ if the emission is dominated by bremsstrahlung. If T_X can be approximated by the spectroscopic-like temperature, then W(T) ∼ T^{−3/4} (Rasia et al. 2005); that is, T_X weights the cool gas in a cluster more heavily and is thus systematically lower than T_ew in general.
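The effect of the two weighting functions can be checked with a toy two-temperature gas. The sketch below is illustrative only: each gas cell is reduced to a temperature and an emission measure (assumed proportional to density squared times volume), and the helper name `weighted_temperature` is hypothetical.

```python
def weighted_temperature(cells, exponent):
    """Weighted mean temperature with W = EM * T**exponent.

    cells: iterable of (temperature, emission_measure) pairs, where the
    emission measure stands in for n^2 dV (an assumption of this sketch).
    exponent = 0.5 mimics emission weighting for bremsstrahlung;
    exponent = -0.75 mimics the spectroscopic-like weighting.
    """
    numerator = sum(em * temp ** exponent * temp for temp, em in cells)
    denominator = sum(em * temp ** exponent for temp, em in cells)
    return numerator / denominator

# Equal emission measures of 2 keV and 6 keV gas along one line of sight.
two_phase = [(2.0, 1.0), (6.0, 1.0)]
t_ew = weighted_temperature(two_phase, 0.5)
t_sl = weighted_temperature(two_phase, -0.75)
```

For any mixture of temperatures the spectroscopic-like value falls below the emission-weighted one, because the negative exponent shifts weight toward the cooler phase; for isothermal gas both reduce to the input temperature.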

To see whether this systematic shift depends on merger type, we also plot in the right panel the normalized distributions of deviations from the best-fit relation for relaxed clusters, minor mergers, and major mergers. Comparing their distributions using the Wilcoxon Rank-Sum (R-S) test shows that the systematic shift between T_X and T_ew is similar for relaxed clusters and minor mergers, but is larger at a statistically significant level for major mergers, based on our merger definitions. This is also in agreement with Mathiesen & Evrard (2001), who found that the difference between T_X and T_ew is larger for merging clusters because the spectral fit is dominated by X-ray photons from the bright and cool accreted subclumps.

3.3. Substructure Measures

High-resolution X-ray images of clusters taken with Chandra and XMM-Newton have revealed disturbed ICM structures such as shocks, bubbles, and cold fronts (e.g., Bîrzan et al. 2004; Hallman & Markevitch 2004). Various substructure measures have been used to quantify the irregularity, such as centroid offset (Mohr et al. 1995) and power ratios (Buote & Tsai 1995, 1996). In previous work, this irregularity is often assumed to be associated with mergers. To examine the effectiveness of the substructure measures and the adequacy of using them as indicators of dynamical state, we calculate the centroid offset and power ratios in addition to other theory-based definitions of cluster mergers in our simulation.

There are many ways to define the centroid offset. For observed clusters, it can be defined as the variance in the centroids of cluster regions above several surface-brightness isophotes (e.g., O'Hara et al. 2006). Since the offset is essentially a measure of the distance between the surface-brightness peak and the cluster centroid, Kay et al. (2007) used a simpler definition for their simulated clusters,

$\begin{equation} w = \frac{ | \vec{R}_{\Sigma,{\rm max}} - \vec{R}_{\Sigma,{\rm cen}} | }{R_{500}}, \end{equation} \tag{ 1 }$

where $\vec{R}_{\Sigma,{\rm max}}$ is the position of the surface-brightness peak and $\vec{R}_{\Sigma,{\rm cen}}$ is the surface-brightness centroid. We compared the centroid offsets calculated using both methods and found that these two definitions give similar results. We will use the latter definition hereafter since it is more strongly correlated with the power ratios for our simulated clusters.
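Equation (1) translates directly into a few lines of code. The following sketch assumes the surface-brightness map is a plain two-dimensional list of non-negative pixel values and that the centroid is taken over the whole image (the aperture actually used for the centroid is not specified here); the function name is hypothetical.

```python
import math

def centroid_offset(image, r500_pixels):
    """Centroid offset w = |R_max - R_cen| / R_500 for a 2D brightness map.

    image: list of rows of non-negative pixel values.
    r500_pixels: R_500 expressed in pixel units.
    """
    total = sum_x = sum_y = 0.0
    peak_value = -math.inf
    peak_x = peak_y = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            total += value
            sum_x += value * x
            sum_y += value * y
            if value > peak_value:  # the first maximum wins in case of ties
                peak_value, peak_x, peak_y = value, x, y
    centroid_x, centroid_y = sum_x / total, sum_y / total
    return math.hypot(peak_x - centroid_x, peak_y - centroid_y) / r500_pixels
```

A symmetric image returns zero, while a secondary clump pulls the centroid away from the peak and raises w.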

Power ratios are the multipole moments of surface brightness measured within a circular aperture centered on the cluster's centroid. The moments, a_m and b_m (defined below), are sensitive to substructures in the surface-brightness distribution. This method is motivated by the multipole expansion of the two-dimensional gravitational potential,

$\begin{eqnarray} \Psi (R,\phi) &=& -\! 2Ga_0 \ln \left(\frac{1}{R} \right) - 2G \sum ^\infty _{m=1} \frac{1}{mR^m}\nonumber\\ &&\times (a_m \cos m\phi + b_m \sin m\phi). \end{eqnarray} \tag{ 2 }$

The moments a_m and b_m are

$\begin{eqnarray} a_m(R) &=& \int _{R^{\prime } \le R} \Sigma (\vec{x}^{\prime }) (R^{\prime })^m \cos m\phi ^{\prime } d^2 x^{\prime }, \nonumber \\ b_m(R) &=& \int _{R^{\prime } \le R} \Sigma (\vec{x}^{\prime }) (R^{\prime })^m \sin m\phi ^{\prime } d^2 x^{\prime }, \end{eqnarray} \tag{ 3 }$

where $\vec{x}^{\prime } = (R^{\prime }, \phi ^{\prime })$ , R is the aperture radius, and Σ is the surface mass density, or surface brightness in the case of X-ray observations.

The mth power P_m is the azimuthal average of the amplitude of Ψ_m, the mth term in the multipole expansion of the potential given in Equation (2),

$\begin{equation} P_m(R) = \frac{1}{2\pi } \int ^{2\pi }_0 \Psi _m (R,\phi) \Psi _m (R, \phi) d\phi. \end{equation} \tag{ 4 }$

For m = 0 and m > 0, we have

$\begin{eqnarray} P_0 &=& [a_0 \ln (R)]^2, \nonumber \\ P_m &=& \frac{1}{2m^2 R^{2m}} \big(a_m^2 + b_m^2\big), \end{eqnarray} \tag{ 5 }$

respectively. The power ratios are thus P_m/P₀, the mth power normalized by the flux within R.

For each cluster we compute P₂/P₀, P₃/P₀, and P₍₁₎/P₀. The quantities without parenthetical subscripts are evaluated about the surface-brightness centroid (therefore P₁ vanishes by definition). The quadrupole power, P₂, is related to the degree of flattening or ellipticity. The next odd moment, P₃, is sensitive to unequal bimodal structures. P₍₁₎/P₀, which is calculated about the surface-brightness peak, measures gas distribution around the peak and thus is similar to the centroid offset. Since the power ratios are sensitive to substructures at the scale of the aperture radius R, each of the power ratios is calculated using three different radii, R₅₀₀, R₂₀₀, and 1 Mpc, for each cluster.
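Equations (3) and (5) can be evaluated on a pixelized map by replacing the integrals with sums over pixels inside the aperture. The sketch below is an illustration under stated assumptions, not the analysis code used for the paper: lengths are in pixel units, the aperture center is supplied by the caller, and the function name is hypothetical. Because P₀ contains ln(R), its value depends on the length unit, and an aperture of exactly R = 1 would make P₀ vanish.

```python
import math

def power_ratio(image, center, radius, m):
    """Return P_m / P_0 for a 2D surface-brightness map.

    image: list of rows of pixel values; center: (x, y) in pixel coordinates;
    radius: aperture radius in pixels (must differ from 1); m: multipole order >= 1.
    """
    center_x, center_y = center
    a0 = a_m = b_m = 0.0
    for y, row in enumerate(image):
        for x, sigma in enumerate(row):
            dx, dy = x - center_x, y - center_y
            r = math.hypot(dx, dy)
            if r > radius:
                continue  # outside the circular aperture
            phi = math.atan2(dy, dx)
            a0 += sigma
            a_m += sigma * r ** m * math.cos(m * phi)
            b_m += sigma * r ** m * math.sin(m * phi)
    p0 = (a0 * math.log(radius)) ** 2
    p_m = (a_m ** 2 + b_m ** 2) / (2.0 * m ** 2 * radius ** (2 * m))
    return p_m / p0
```

An image with fourfold symmetry about the center gives P₂/P₀ consistent with zero, whereas an elongated image gives a positive quadrupole ratio.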

4. RESULTS

Figure 2 shows the M₅₀₀–T_X,500 relation for clusters with M₅₀₀ above 2 × 10¹³ M_☉ at z = 0. All quantities are viewed along the x-direction in the simulation box. The best-fit relations for all clusters as well as different subgroups are given in Table 1. Note that the best-fit slopes for relaxed and merging clusters are different. The difference in the slopes implies that there is variation in the slopes derived from observed clusters using different selected subsamples. This systematic uncertainty in the slope, which is ∼5% by comparing the slopes of the merging and relaxed clusters in Table 1, will contribute to the uncertainties in cosmological parameters. For example, it will translate into a systematic uncertainty of ∼3% in σ₈ when using observed cluster samples including and excluding merging clusters. This is relatively small compared to the current level of uncertainty (∼10%) in determining the cosmological parameters, but as the constraints are improved to a few percent in the future, one has to keep it in mind when comparing results using different cluster selection criteria.

Table 1. Best-fit Parameters in the Mass–Temperature Relations, log(T_X,500) = a + b log(M₅₀₀), for Clusters at z = 0

Subgroup	Count	a	b
All	619	−8.067 ± 0.004	0.5842 ± 0.0003
Relaxed	478	−7.981 ± 0.004	0.5780 ± 0.0003
Merging	141	−8.409 ± 0.008	0.6093 ± 0.0006
Minor	55	−8.379 ± 0.013	0.6068 ± 0.0009
Major	86	−8.050 ± 0.011	0.6163 ± 0.0008

Note. Errors in the parameters are the 1σ errors.

While we are able to reproduce the self-similar M₂₀₀–T_ew scaling relation (e.g., Bryan & Norman 1998), the slope and normalization of the M₅₀₀–T_X,500 relation are both smaller than observed values. This deviation is mainly due to missing baryonic physics and the use of different mass estimates (e.g., Borgani et al. 2004; Nagai et al. 2007a). Matching the slope and normalization of the scaling relations with observed values is a subject of great interest on its own and will be investigated in our future papers when we incorporate more realistic models for the baryonic component. For the purpose of this paper, we will focus on the scatter in these relations and the relative importance of clusters with different merger histories.

4.1. Distribution of Intrinsic Scatter

Figure 3 shows the normalized distribution of the logarithmic deviations of temperature from the best-fit M₅₀₀–T_X,500 relation at z = 0. The rms scatter for our whole sample is 6.10%, which is smaller than the values obtained by simulations with cooling and heating, such as 13.6% in Nagai et al. (2007a) and 20% in O'Hara et al. (2006), who also used the spectroscopic temperature and true mass. It is difficult to directly compare with observed values because the observationally estimated intrinsic scatter is dependent on how the measurement errors are determined and thus displays a wide range in the literature, from 3.9% in Arnaud et al. (2005) to 17% in O'Hara et al. (2006).

In the left panel of Figure 3 we compare the distributions of merging (red) and relaxed (black) clusters, while the right panel shows the distributions for minor and major mergers separately. The two distributions on the left display no apparent difference in their mean values. The standard deviation for merging clusters appears to be larger than that of the relaxed clusters. To test the hypothesis that the merging and relaxed distributions differ, we performed the Wilcoxon R-S test and the F-variance (F-V) test to see whether these two populations have significantly different mean values or variances, respectively. A small probability (<0.05 for a significance level of 5%) returned by the tests is conventionally taken to indicate a significant difference between these two populations. We performed the tests on different merging subgroups and summarize the results in Table 2.

Table 2. Significance Tests on the Distribution of Scatter for Different Populations at z = 0 and z = 1

Subgroup	z	N	Mean	R-S Test	σ_rms(%)	F-V Test	K-S Test
Relaxed	0	478	1.31 × 10⁻²	...	5.87	...	2.5 × 10⁻²
Merging	0	141	8.67 × 10⁻³	0.134	6.90	0.014	2.5 × 10⁻²
Minor	0	55	1.11 × 10⁻²	0.146	7.21	0.029	2.0 × 10⁻¹
Major	0	86	7.09 × 10⁻³	0.253	6.72	0.086	2.5 × 10⁻²
Relaxed	1	102	7.14 × 10⁻³	...	4.85	...	2.6 × 10⁻⁴
Merging	1	121	8.72 × 10⁻³	0.493	5.71	0.091	6.5 × 10⁻³
Minor	1	46	1.41 × 10⁻²	0.230	5.25	0.504	8.0 × 10⁻²
Major	1	75	5.45 × 10⁻³	0.290	5.98	0.049	6.5 × 10⁻³

Note. The R-S and F-V test results for each subgroup are relative to the relaxed clusters.

The R-S test results show that the mean values do not differ significantly among all populations, which means that the intrinsic scatter is unbiased for merging and relaxed clusters at z = 0. The F-V tests for all mergers and minor mergers show that their standard deviations, or the amount of scatter, are significantly larger than that of the relaxed ones. For clusters at z = 1, there is also no bias between merging and relaxed populations. The amount of scatter for merging clusters also tends to be greater than relaxed clusters, although only major mergers show a significant result. We will discuss the possible reasons for this trend in the following two sections.
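For readers who wish to reproduce this kind of comparison, a rank-sum test can be sketched with the standard library alone. The version below uses the large-sample normal approximation with average ranks for ties and no tie correction to the variance; these are assumptions of the sketch and not necessarily the exact variant used to produce Table 2. The function name is hypothetical.

```python
import math

def rank_sum_test(sample_a, sample_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p), where p is the probability of a rank-sum at least this
    extreme if both samples are drawn from the same distribution.
    """
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1  # extend over a run of tied values
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n_a, n_b = len(sample_a), len(sample_b)
    expected = n_a * (n_a + n_b + 1) / 2.0
    sigma = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (rank_sum_a - expected) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p
```

Applied to the logarithmic temperature deviations of the relaxed and merging subsamples, a returned p below 0.05 would indicate a significant shift between the two populations at the 5% level.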

Since the form of scatter can affect the observed scaling relations, we also test the Gaussianity of the distribution of scatter in log space by fitting it with a Gaussian curve and using the Kolmogorov–Smirnov (K-S) test to see if the two distributions differ significantly. Again, a small value represents a significant deviation from the Gaussian distribution. The test results show that the distributions of scatter for all populations but minor mergers differ from a lognormal distribution at a significant level. There is an even stronger signature of deviation from lognormal at z = 1. We note that the deviation of the scatter from a lognormal distribution may affect results from some of the self-calibration studies assuming a lognormal distribution of scatter (e.g., Lima & Hu 2005) and should be taken into account to correctly interpret the obtained constraints on the cosmological parameters.

4.2. Intrinsic Scatter versus Halo Concentration

In the following two sections, we will investigate the physical origin of the M₅₀₀–T_X,500 scatter by correlating it with cluster properties that are related to how the clusters are formed. In particular, we first show that the intrinsic scatter depends strongly on halo concentration. Then we discuss the contribution of scatter from recent merging events in the following section.

Figure 4 shows a strong positive correlation between the scatter in the M₅₀₀–T_X,500 relation and the scatter in the M₅₀₀–(R₂₀₀/R₅₀₀) relation. The correlation coefficient is 0.64, with a probability of zero, given by the Spearman Rank-Order Correlation test (Press et al. 1992, Section 14.6; probability of 1 means no correlation). To ensure that this result is not biased by the lower-mass clusters whose R₅₀₀ values are close to the resolution of the simulation, we raised the mass threshold to M₅₀₀ ⩾ 10¹⁴ M_☉ and repeated the analyses for these well-resolved systems. We found that for the 67 selected massive clusters, the correlation still holds, with Spearman correlation coefficient 0.36 and probability 0.003. Note that we correlate with dlog(R₂₀₀/R₅₀₀) instead of the raw value of R₂₀₀/R₅₀₀ because the latter is a function of cluster mass. By doing so we exclude the effect of different cluster masses, focusing on the variation in halo concentrations. R₂₀₀/R₅₀₀ is a monotonically decreasing function of the halo concentration parameter, usually defined as c ≡ R₂₀₀/R_s, where R_s is the scale radius of a cluster. Therefore, for clusters with similar masses, more concentrated clusters tend to lie under the mean mass–temperature relation, while the puffier clusters tend to scatter high.

The relation between R₂₀₀/R₅₀₀ and c is less obvious, so we derive their relation assuming an NFW profile (Navarro et al. 1995, 1996, hereafter NFW) in the following. For a cluster that has an NFW density profile, the mass enclosed within a normalized radius of x ≡ R/R_s is

$\begin{eqnarray} M(<x) &=& 4\pi \rho _s R_s^3 \left[ \ln (1+x) - \frac{x}{1+x} \right] \nonumber \\ &\equiv & 4\pi \rho _s R_s^3 f(x), \end{eqnarray} \tag{ 6 }$

where ρ_s is the density at the scale radius R_s. Also, the mass in the spherical overdensity definition is

$\begin{equation} M_\Delta = \frac{4\pi }{3} \Delta \rho _{\rm crit} (x_\Delta R_s)^3, \end{equation} \tag{ 7 }$

where Δ is the overdensity and ρ_crit is the critical density of the universe. Equating Equations (6) and (7) gives

$\begin{equation} \Delta \frac{x_\Delta ^3}{f(x_\Delta)} = 3 \frac{\rho _s}{\rho _{\rm crit}}, \end{equation} \tag{ 8 }$

which is a constant for the cluster under consideration. Therefore, the relation between x₅₀₀ and x₂₀₀ (or c, recall c ≡ R₂₀₀/R_s) is

$\begin{equation} \frac{x_{500}^3}{f(x_{500})} = \frac{2}{5} \frac{c^3}{f(c)}. \end{equation} \tag{ 9 }$

We can use this relation to numerically solve for x₅₀₀ as a function of c, and then the relation between R₂₀₀/R₅₀₀ and c is simply

$\begin{equation} \frac{R_{200}}{R_{500}} = \frac{c}{x_{500}(c)}. \end{equation} \tag{ 10 }$

This relation is plotted in the right panel of Figure 4. We choose to use the parameter R₂₀₀/R₅₀₀ instead of the original halo concentration parameter c because it has two advantages. First, it avoids the uncertainty introduced by fitting an NFW profile, especially for less massive clusters, since the fitting is very sensitive to the resolution in the central region of the cluster. Second, our analyses involve not only relaxed clusters but also merging ones, for which R₂₀₀/R₅₀₀ is better defined than c.
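The numerical solution of Equation (9) is straightforward because x³/f(x) increases monotonically with x for an NFW profile, so a bisection search on the interval (0, c) suffices. The following is a minimal sketch with hypothetical function names, not the code used to produce Figure 4.

```python
import math

def nfw_f(x):
    """f(x) = ln(1 + x) - x / (1 + x), as defined in Equation (6)."""
    return math.log1p(x) - x / (1.0 + x)

def x500_from_c(c, tol=1e-12):
    """Solve x^3 / f(x) = (2/5) c^3 / f(c) for x = R_500 / R_s by bisection."""
    target = 0.4 * c ** 3 / nfw_f(c)
    lo, hi = 1e-6 * c, c  # x_500 lies below c because R_500 < R_200
    while hi - lo > tol * c:
        mid = 0.5 * (lo + hi)
        if mid ** 3 / nfw_f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def r200_over_r500(c):
    """R_200 / R_500 as a function of concentration, Equation (10)."""
    return c / x500_from_c(c)
```

Evaluating `r200_over_r500` at c = 3, 5, and 10 gives values that decrease with increasing c (roughly 1.57, 1.51, and 1.46), in line with the monotonic behavior described in the text.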

One can understand why the correlation between the M₅₀₀–T_X,500 scatter and the halo concentration exists using the virial theorem. Consider the simplest case for an isolated system: 2T + W = 0, where T and W are the total kinetic and gravitational binding energy of the system, respectively. Then, in general,

$\begin{equation} \frac{k_B T_{\rm vir}}{\mu m_p} \propto \frac{GM_{\rm vir}}{R_{\rm vir}}, \end{equation} \tag{ 11 }$

where k_B is the Boltzmann constant, μ is the mean molecular mass of the gas, m_p is the mass of a proton, and T_vir, M_vir, and R_vir are the virial gas temperature, mass, and radius of the system, respectively. Since the relations between the virial quantities and the quantities in the overdensity definition depend on individual cluster profiles, or halo concentrations, the M_Δ–T_Δ relation derived from above would have a normalization which is a function of concentration. This is why we expect the scatter in the M₅₀₀–T_X,500 relation to correlate with the concentration parameter.

Halo concentrations have been shown to be related to the epoch at which the halo formed (Navarro et al. 1997; Wechsler et al. 2002; Zhao et al. 2003; Neto et al. 2007). Since we have the mass assembly histories of all the simulated clusters, it is straightforward to derive the cluster formation time based on the definition that a cluster "forms" when it first exceeds a certain fraction of its final mass. The commonly adopted thresholds include 10%, 25%, 50%, and 70%. Since more concentrated halos tend to form at higher redshift, a negative correlation between the M₅₀₀–T_X,500 scatter and the formation redshift is expected. Indeed we find significant negative correlations with the time when the cluster first reached one half, t_1/2, and one quarter, t_1/4, of its final mass (see Figure 5). These correlations are not as tight as the one with halo concentrations, probably due to the fact that the correlation between the halo concentration and the cluster formation time itself has a very large scatter, and also that the variation in halo concentrations cannot be fully accounted for by the variation in cluster formation time (Neto et al. 2007). But the significance of these correlations with cluster formation times supports our finding that the scatter correlates with halo concentrations. Therefore, we can say that cluster assembly histories leave imprints on the shapes of clusters at the present day that help to determine clusters' positions on the mass-observable scaling relation.
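Given a mass accretion history, the formation time follows from a single pass through the main-branch masses. The sketch below assumes the history is stored as parallel lists in ascending cosmic time and treats "reaching" the threshold as mass greater than or equal to the chosen fraction of the final mass; the function name is hypothetical.

```python
def formation_time(times, masses, fraction=0.5):
    """Earliest output time at which the main-branch mass reaches
    `fraction` of its final value.

    times: output times in ascending order.
    masses: mass of the most massive progenitor at each output.
    """
    threshold = fraction * masses[-1]
    for time, mass in zip(times, masses):
        if mass >= threshold:
            return time
    return None  # only reachable if fraction > 1

# Toy history: the halo reaches half its final mass at t = 3 and a quarter at t = 2.
history_times = [1.0, 2.0, 3.0, 4.0, 5.0]
history_masses = [1.0, 3.0, 6.0, 8.0, 10.0]
t_half = formation_time(history_times, history_masses, 0.5)
t_quarter = formation_time(history_times, history_masses, 0.25)
```

The time resolution of the result is limited by the output cadence, which is 100 h⁻¹ Myr for the simulation described here.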

**Figure 5.** Left: M₅₀₀–T_X,500 scatter correlated with cluster formation time, defined as the time when a cluster first obtained half of its final mass. The Spearman correlation coefficient is −0.14, with a probability of 1.95 × 10⁻⁴ (probability of 1 means no correlation). Right: same plot with the x-axis being the time a cluster reached 1/4 of its final mass. The correlation coefficient is −0.11, with a probability of 9.4 × 10⁻³. Both plots show that clusters that are formed at earlier/later times tend to scatter low/high.

The strong correlation in Figure 4 suggests that the variation in halo concentrations contributes a significant amount of the intrinsic scatter in the M₅₀₀–T_X,500 relation. It also implies that, given the best-fit relation dlog T_X = 1.81 × dlog(R₂₀₀/R₅₀₀), it is possible to use dlog(R₂₀₀/R₅₀₀) as a third parameter to normalize T_X in the M₅₀₀–T_X,500 relation, and thus reduce the scatter. After removing the effect of halo concentrations, we find that the rms scatter decreases from 6.10% to 4.49%, a reduction of ∼26% of its original value. The corrected M₅₀₀–T_X,500 relation is shown in Figure 6.
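The correction can be illustrated with a short sketch. The only numbers taken from the text are the slope of 1.81 and the rms values of 6.10% and 4.49%; the function names and the synthetic residuals are hypothetical, and we assume that dlog T_X and dlog(R₂₀₀/R₅₀₀) denote logarithmic deviations from their respective mean relations.

```python
import math
import random


def rms(values):
    """Root-mean-square of a sequence of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def correct_for_concentration(dlog_t, dlog_r, slope=1.81):
    """Subtract the best-fit concentration trend from the temperature residuals."""
    return [t - slope * r for t, r in zip(dlog_t, dlog_r)]


# Synthetic residuals (assumption): a concentration-driven part plus
# unrelated noise. These are NOT the simulated clusters of the paper.
rng = random.Random(0)
dlog_r = [rng.gauss(0.0, 0.02) for _ in range(500)]
dlog_t = [1.81 * r + rng.gauss(0.0, 0.02) for r in dlog_r]

corrected = correct_for_concentration(dlog_t, dlog_r)
rms_before = rms(dlog_t)
rms_after = rms(corrected)

# Fractional reduction implied by the rms values quoted in the text.
fractional_reduction = (6.10 - 4.49) / 6.10
print(rms_before, rms_after, round(fractional_reduction, 3))
```

In this toy setup the corrected residuals reduce to the injected noise, so the drop in rms scatter directly measures the concentration-driven share of the scatter.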

**Figure 6.** M₅₀₀–T_X,500 relation corrected for halo concentrations. Notations are the same as in Figure 2.

We can explain the trend found in the previous section using this correlation too. Figure 7 shows the distributions of halo concentration for merging and relaxed clusters at z = 0 and z = 1. For both redshifts, the distribution of halo concentration for merging clusters is more dispersed than for relaxed clusters, as also found in Neto et al. (2007). If the variation in halo concentrations is important to the M₅₀₀–T_X,500 scatter, as the above correlation suggests, then it is reasonable that merging clusters have a greater amount of scatter than relaxed ones. One may try to relate this trend to the dynamical state of clusters because the temperature excursions during mergers could also drive the scatter. However, in the following section, we will show that this is not the case.

In summary, we have found that the scatter in the M₅₀₀–T_X,500 relation partly originates from the variation in halo concentrations, with more concentrated or early-formed clusters lying below the mean (they are cooler), while puffier clusters that are formed recently tend to be hotter than clusters with similar masses. Using the strong correlation between the scatter and halo concentrations, the scatter can be greatly reduced to get a much tighter relation to be used for cosmology. The correlation can also explain the trend seen in the previous section that merging clusters have a greater amount of scatter than relaxed ones. Note that our simulation adopted the value of σ₈ = 0.74 from WMAP3 results, which is smaller than σ₈ = 0.796 from WMAP5. Although choosing a smaller σ₈ would decrease the average concentration of clusters with a fixed mass (e.g., Duffy et al. 2008), it is the variation of concentration that correlates with the scatter. Therefore, the correlation should still hold if a higher σ₈ is used.

4.3. Intrinsic Scatter Versus Recent Mergers

The other possible origin of the scatter in our simulation is the departure from hydrostatic equilibrium due to cluster mergers. Cluster mergers are among the most energetic events in the universe. When two clusters merge, their gas is compressed and heated by merger shocks. This effect can boost the luminosity and temperature of the cluster a few times higher than its pre-merger value, as found in ideal merger simulations (Ricker & Sarazin 2001; Poole et al. 2007). Therefore, our aim is to investigate how merger events statistically influence the cluster scaling relations.

In order to see how merging events influence the observable quantities of clusters, we correlate their mass–temperature scatter with their dynamical state. Two different methods are used to quantify the dynamical state. The first is based on the actual cluster merger histories, where we use the time since last merger as an indicator. The second one is motivated by observations that unrelaxed clusters often have more substructures than relaxed clusters. We discuss results using both methods in the following.

First, we correlate the M₅₀₀–T_X,500 scatter with the time since last merger, t_last (thus clusters that just underwent mergers would have t_last = 0). As shown in Figure 8, there is a trend for more recently merged clusters to lie below the mean relation, but only the correlation for major mergers is statistically significant. Note that although the scatter here is uncorrected, this correlation is not due to the effect of halo concentrations because halo concentrations work in the opposite direction, as described in Section 4.2. Therefore, this trend may be due to clusters that have just merged with a cool clump and are still on their way to virialization, as illustrated by the following example. Figure 9 is the time history constructed from our merger tree analysis for a cluster undergoing a minor merger. Here we plot the evolution of mass, temperature, and substructure measures versus lookback time (t = 0 for today). We can see that at t = 1.3 Gyr, a clump of cold gas merged into the primary cluster. The cold accreted gas caused a jump in mass and substructure measures, but it reduced the average temperature of the cluster. This behavior thus tends to make merging clusters lie below the M₅₀₀–T_X,500 relation when they have just merged and then gradually move up when they become virialized.
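For readers who wish to reproduce this kind of test, the Spearman rank correlation coefficient can be computed as below. This is a minimal standard-library sketch with hypothetical sample values, not the analysis code used here; it returns only the coefficient and does not evaluate the probability of no correlation.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman coefficient: the Pearson coefficient of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: time since last merger (Gyr) and log-temperature scatter.
t_last = [0.2, 0.9, 1.5, 2.4, 3.0]
scatter = [-0.05, -0.01, -0.02, 0.03, 0.04]
rho = spearman_rho(t_last, scatter)
print(rho)
```

A positive coefficient, as in this toy sample, corresponds to recently merged clusters (small t_last) lying below the mean relation.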

**Figure 8.** M₅₀₀–T_X,500 scatter vs. the time since last merger for major mergers (left panel) and minor mergers (right panel) at z = 0. The Spearman correlation coefficients are 0.24 and 0.008, with probabilities of no correlation being 0.02 and 0.95 for major and minor mergers, respectively.

**Figure 9.** Example time history of a cluster that undergoes a minor merger. The solid (dashed) line is M₅₀₀ (M_fof) in the first panel, and T₅₀₀ (T_max) in the second panel. The other panels show the evolution of the substructure measures.

This trend is also seen when we correlate scatter with some of the substructure indicators, although the correlations are not highly significant. Figure 10 shows the correlation with the two substructure measures that give the most significant correlations, P₂/P₀ and P₃/P₀ measured with a fixed aperture radius of 1 Mpc. Here we use a fixed aperture size because this gives power ratios that are weaker functions of cluster mass than substructures computed using R₂₀₀ or R₅₀₀. This is to minimize covariance with halo concentrations through the mass dependence. This negative correlation between scatter and substructures again supports the idea that merging clusters tend to be cooler than clusters with similar masses.

But do recent mergers cause a problem statistically? Again we want to compare the distributions of scatter for the merging and relaxed populations in the same way as in Section 4.1. But as discussed earlier, the trends are probably dominated by the effect of halo concentrations. Since we want to study the effect of recent mergers in isolation from other effects, we compute the statistics using the M₅₀₀–T_X,500 relation after correcting for halo concentrations (Figure 6) instead of the raw relation. In this way, we can see whether the dynamical state of clusters is the second dominant factor in the scatter.

The distributions of the concentration-corrected scatter for merging and relaxed clusters at z = 0 are shown in Figure 11. Again we use the R-S test and F-V test to detect whether there is any difference in the mean values and variances of these two populations. The results are summarized in Table 3. We find that after removing the effect of halo concentration, the trend that merging clusters have a larger rms scatter becomes insignificant. This supports our earlier statement that the behavior of the raw scatter is determined more by the distribution of halo concentrations than by the dynamical state of clusters. As for the mean values, there is a significant relative bias for merging clusters to have smaller means than relaxed clusters at z = 0. This bias can be due to the incomplete virialization of merging clusters we just described. However, it is also possible that we have overcorrected for the halo concentrations for merging clusters in comparison with relaxed clusters, because merging clusters are generally less concentrated (or have larger R₂₀₀/R₅₀₀, see Figure 7).

Table 3. Significance Tests on the Distribution of Scatter After Removing the Effect of Halo Concentrations for Different Populations at z = 0 and z = 1

Subgroup	z	N	Mean	R-S Test	σ_rms(%)	F-V test
Relaxed	0	478	1.44 × 10⁻²	...	4.42	...
Merging	0	141	4.27 × 10⁻³	0.012	4.67	0.403
Minor	0	55	6.39 × 10⁻³	0.058	4.63	0.612
Major	0	86	2.92 × 10⁻³	0.034	4.72	0.406
Relaxed	1	102	1.03 × 10⁻²	...	4.25	...
Merging	1	121	6.08 × 10⁻³	0.329	4.88	0.151
Minor	1	46	1.30 × 10⁻²	0.275	4.96	0.202
Major	1	75	1.86 × 10⁻³	0.145	4.81	0.241

Note. The R-S and F-V test results for each subgroup are relative to the relaxed clusters.


In summary, by correlating the scatter with the dynamical state of clusters using both merger tree and substructure analysis, we find a very weak trend that merging clusters tend to be cooler than relaxed clusters with similar masses. But this trend has a minor effect on the statistical properties of the scatter, i.e., the scatter for the merging clusters is neither biased nor more widely spread compared to the relaxed ones at z = 0 and z = 1 (see Table 3). This implies that the dynamical state is not the second most important factor contributing to the scatter, and that other sources still need to be found. However, the fact that the distributions of scatter for the merging and relaxed clusters are indistinguishable even out to higher redshift is good news for using cluster scaling relations in cosmology.

5. DISCUSSION

5.1. Effect of Dynamical State

In the previous section, we have shown that the dynamical state of clusters has very little influence on the overall scatter in the M₅₀₀–T_X,500 relation. When we look at the merging population, there is even a tendency for merging clusters to be cooler than relaxed ones of similar masses. Although this can be explained by incomplete virialization of clusters merging with a cooler clump, it is still somewhat contrary to our intuition that merger shocks can heat the ICM and raise the temperature of a merging cluster. We discuss the possible reasons for this result in the following.

One reason is that sometimes the merger shock is not captured by the projected R₅₀₀ aperture. At the beginning of mergers, the shocks often occur in the outskirts of clusters. So in order for the shock to be captured inside R₅₀₀, either the shock has to propagate into the R₅₀₀ region of the main cluster, or the two clusters have to merge roughly along the line of sight in order to affect T_X,500. The cluster history shown in Figure 12 is one example of such a case. At t = 1.3 Gyr, the substructure measures increase, indicating the start of the merger event. The maximum temperature in the cluster is increased by the merger shock, but T_X,500 is unaffected because the shock-heated gas lies beyond the projected aperture radius R₅₀₀. Even if the merger shock is within R₅₀₀, the spectroscopic temperature is not as sensitive to shocks as the emission-weighted temperature because when the shock and other cooler gas in the cluster are projected along the line of sight, the spectral fit tends to put more weight on the cooler gas (Mazzotta et al. 2004).

**Figure 12.** Example time history of a cluster that undergoes a major merger. The solid (dashed) line is M₅₀₀ (M_fof) in the first panel, and T₅₀₀ (T_max) in the second panel. The other panels show the evolution of the substructure measures.

Second, the duration of the temperature boost is typically only ∼0.5 Gyr (Ricker & Sarazin 2001; Poole et al. 2007), and hence only a fraction of the unrelaxed clusters are observed during the transient excursion period. Moreover, even for those clusters that have just undergone major mergers, the increases in mass and temperature are often comparable. For example, a 3:1 merger would have a mass jump of M_f/M_i ∼ 1.33 and a temperature jump of T_f/T_i ∼ 2 depending on the impact parameter of collision. Thus, clusters tend to evolve roughly parallel to the scaling relation, as also suggested by previous works (e.g., Poole et al. 2007).

Finally, merging clusters are the minority population compared to relaxed clusters. The fraction of clusters that had a merger within the past 3 Gyr is ∼23% at z = 0 and ∼54% at z = 1, while major mergers are rarer. All these effects combined are responsible for diluting the influence of mergers on the scatter and making their distributions indistinguishable from relaxed clusters.
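These fractions can be cross-checked against the cluster counts in Table 3, under the assumption that the N column there lists the same merging and relaxed populations referred to here. The variable names are illustrative only.

```python
# Cluster counts copied from the N column of Table 3.
counts = {
    0: {"relaxed": 478, "merging": 141},  # z = 0
    1: {"relaxed": 102, "merging": 121},  # z = 1
}

merging_fraction = {}
for z, n in counts.items():
    merging_fraction[z] = n["merging"] / (n["merging"] + n["relaxed"])
    print(f"z = {z}: merging fraction = {100.0 * merging_fraction[z]:.1f}%")
```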

5.2. Effectiveness of Substructure Indicators

Substructure measures, such as centroid offsets and power ratios, have often been used in earlier studies to identify unrelaxed clusters (e.g., O'Hara et al. 2006; Kay et al. 2007; Jeltema et al. 2008). Observationally, they are quantities that link most effectively to the dynamical state of clusters (Mohr et al. 1993; Buote & Tsai 1996). In simulations, they are easy to derive and to compare with observations. However, they still have some limitations, such as the projection effect.

In our study we have extended the analysis of the dynamical state of clusters by constructing cluster merging histories, because these provide more information about the true dynamics of clusters during mergers than morphology-based measures, which are subject to observational limitations. This is the first time that these two approaches have been compared to see how effectively the substructure measures can recover the true merging population.

Figure 13 shows the negative correlations between two of the substructure measures as seen in the x projection of the simulation box and the time since the last ⩽5:1 merger for z = 0. The filled and open circles are merging and relaxed clusters identified by the merger tree diagnostic, respectively. The two horizontal dashed lines mark the upper and lower 20% of all clusters that have the highest and lowest substructure values. We can see that, indeed, recently merged clusters tend to have more substructure than relaxed clusters. However, the merging and relaxed populations overlap over a wide range of substructure values because of the large variation in substructures even for clusters at the same dynamical state. Therefore, there is not a clean cut to separate these two populations using the substructure measures.

In order to facilitate future studies in this area, we compute the completeness and contamination of merging and relaxed clusters defined by the substructure measures, assuming those identified using the cluster merger histories represent the true populations. We refer by "completeness" to the fraction of merging/relaxed clusters found using substructure among the true merging/relaxed population, while "contamination" is the fraction of clusters that are detected as merging/relaxed but are actually relaxed/merging. In Table 4, we summarize the results for different substructure measures. In the definitions using substructures, clusters that lie above/below the 20% thresholds are defined as merging/relaxed. In order to see how much the results are affected by the projection effect, we list the values obtained using information from both one and all projections. In the latter case, we select clusters by comparing the maximum value among three projections to the selection thresholds. In other words, clusters are identified as merging if the most disturbed value among three projections is above the 20% threshold, and relaxed clusters must have their most disturbed value below the lower 20% threshold.
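The definitions above translate into a short sketch. The cluster identifiers, substructure values, and function names below are hypothetical; only the definitions of completeness and contamination, the 20% threshold, and the use of the maximum over three projections follow the text.

```python
def select_top_fraction(values, fraction=0.2):
    """IDs of the clusters in the top `fraction` of substructure values."""
    n_select = max(1, int(round(fraction * len(values))))
    ranked = sorted(values, key=values.get, reverse=True)
    return set(ranked[:n_select])


def completeness_contamination(selected, truth):
    """Completeness: fraction of the true population that is recovered.
    Contamination: fraction of the selected clusters not in the true population.
    """
    selected, truth = set(selected), set(truth)
    completeness = len(selected & truth) / len(truth)
    contamination = len(selected - truth) / len(selected)
    return completeness, contamination


# Hypothetical substructure values for 10 clusters in three projections.
projections = {
    0: (0.90, 0.30, 0.40), 1: (0.20, 0.10, 0.20), 2: (0.50, 0.40, 0.30),
    3: (0.10, 0.80, 0.20), 4: (0.10, 0.10, 0.10), 5: (0.30, 0.20, 0.10),
    6: (0.05, 0.10, 0.15), 7: (0.20, 0.25, 0.10), 8: (0.12, 0.05, 0.10),
    9: (0.30, 0.35, 0.20),
}
# Hypothetical "true" merging clusters, as a merger tree would identify them.
true_merging = {0, 1, 2}

# Use the most disturbed (maximum) value among the three projections.
max_values = {cid: max(vals) for cid, vals in projections.items()}
selected = select_top_fraction(max_values, fraction=0.2)
completeness, contamination = completeness_contamination(selected, true_merging)
print(sorted(selected), completeness, contamination)
```

Relaxed clusters can be selected in the same way by ranking in ascending order and comparing against the true relaxed population.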

Table 4. Completeness and Contamination of Merging and Relaxed Clusters Identified Using Power Ratios and Centroid Offsets

	P₂/P₀			P₃/P₀			P₍₁₎/P₀			w
	(1 Mpc)	(R₂₀₀)	(R₅₀₀)	(1 Mpc)	(R₂₀₀)	(R₅₀₀)	(1 Mpc)	(R₂₀₀)	(R₅₀₀)	(R₅₀₀)
Comp. of merging	39.6	38.2	31.3	36.8	34.7	28.5	22.2	21.5	23.6	47.9
Cont. of merging	54.0	55.7	63.7	57.3	59.7	66.9	74.2	75.0	72.6	44.3
Comp. of relaxed	22.8	23.2	21.8	22.2	23.4	21.6	19.3	19.3	18.0	23.9
Cont. of relaxed	12.8	11.2	16.8	15.2	10.4	17.6	26.4	26.4	31.2	8.94
	(P₂/P₀)_max			(P₃/P₀)_max			(P₍₁₎/P₀)_max			w_max
Comp. of merging	40.3	44.4	33.3	42.4	37.5	29.9	25.7	25.0	22.9	54.9
Cont. of merging	53.2	48.4	61.3	50.8	56.5	65.3	70.2	71.0	73.4	36.1
Comp. of relaxed	21.6	23.2	22.0	22.8	22.8	22.2	20.1	19.5	21.6	25.0
Cont. of relaxed	17.6	11.2	16.0	12.8	12.8	15.2	23.2	25.6	17.6	4.9

Notes. Clusters lying above/below the 20% thresholds are defined as merging/relaxed. For the upper half of the table, the selection is based on the values of substructure measures in one projection, while for the bottom half, clusters are found by comparing the maximum value among three projections to the thresholds. See the text for details on the definition of completeness and contamination.


By comparing the numbers in Table 4, we find that in general P₂/P₀ and P₃/P₀ give similar results, and P₍₁₎/P₀ is not as useful as the other two power ratios. The centroid offset, having the greatest completeness and the least contamination, is the most successful one among all measures. When different aperture sizes are compared for the power ratios, the results using aperture sizes of 1 Mpc and R₂₀₀ are similar, while using R₅₀₀ is generally a little worse than the others both in completeness and contamination. When all three projections are considered, merging and relaxed clusters are better distinguished for all measures, with the values of completeness and contamination changed by ∼10%–20%. The centroid offset, w, improves the most when all projections are used.

According to the above analysis, the centroid offset does a better job in distinguishing merging and relaxed clusters than the power ratios. This is probably because each individual power ratio is only sensitive to a certain type of substructure. They are more powerful when combined to distinguish between different morphological types (Buote & Tsai 1995, 1996). The centroid offset, on the other hand, is a more general feature of all disturbed clusters. However, all the substructure measures have limitations. Their effectiveness is influenced by the viewing projection. More importantly, their values have a large variation even for clusters at the same dynamical state, as shown in Figure 13. Thus, only ∼40% of the true merging clusters are detected, and ∼25% of the relaxed clusters. Among the detected clusters, ∼50% of the "merging" clusters are actually relaxed, and ∼15% of the "relaxed" clusters are actually merging. Therefore, although substructure measures are useful in distinguishing the dynamical state of clusters, caution is still required to interpret the results correctly.

5.3. Comparison with Previous Work

Several studies have explored the effect of clusters with different dynamical states on the M–T_X scaling relation. Most previous works used substructure measures such as the power ratios and the centroid offset to quantify the dynamical state. The main difference in our study is that we directly analyze cluster merging histories to identify the recently merged clusters, which provides another line of evidence for our results in addition to those derived from the substructure measures.

O'Hara et al. (2006) investigated the effect of mergers and core structure on the X-ray scaling relations for both observed and simulated clusters. For the observed sample, they found that cool core clusters and clusters with less substructure exhibit a larger amount of scatter. Their simulated clusters, on the other hand, have a tendency to have a larger amount of scatter for clusters with more substructure, though they argued that the evidence is weak. Since their simulations also do not include radiative cooling, we can compare directly with their results without worrying about other baryonic effects. We also find the same trend that merging clusters, which we have shown to have more substructure, have a larger amount of scatter. However, we further explore the origin of the intrinsic scatter and find a strong correlation with the halo concentration of clusters. We also show that the trend seen above is due to the fact that merging clusters have a larger variation in their concentrations.

Jeltema et al. (2008) studied the correlation between cluster substructures and cluster observables using hydrodynamical simulations with non-gravitational heating and cooling. Despite the difference of input baryonic physics in the simulations and the definition of T_X, they found no dependence on cluster substructures in the M–T_ew relation when the true mass is considered. Although we find negative correlations between the scatter and some of the substructure indicators, they are not highly significant. This lack of a significant correlation supports their result. However, they reported that there is a significant trend for the relaxed clusters to have lower temperatures for their masses in the M–T_ew relation measured within R₅₀₀, whereas we do not find any significant bias, and an opposite trend is found by Kravtsov et al. (2006), Nagai et al. (2007a), and Kay et al. (2007). Taking the average relations between T_ew and T_X for merging and relaxed clusters from Nagai et al. (2007b), Jeltema et al. (2008) argued that the discrepancy cannot be explained by using different temperature definitions. However, we find that the bias between T_ew and T_X is larger for merging clusters than relaxed ones (see Section 3.2), in the direction that can alleviate this discrepancy. Therefore, the difference between the conclusions reached by Jeltema et al. (2008) and the others regarding the offset of merging and relaxed clusters may be due to the use of T_ew instead of the spectroscopic temperature T_X.

On the other hand, Ventimiglia et al. (2008) have recently found significant negative correlations between the substructure measures and the scatter in the mass–temperature relation for their simulated clusters, both for the emission-weighted temperature T_ew and the spectroscopic-like temperature T_sl. They also found relative offsets between merging and relaxed clusters in the mean scaling relation, that is, merging clusters tend to be cooler than relaxed clusters of similar masses. As discussed in their paper, the difference between their conclusion and that of Jeltema et al. (2008) may come from different implementations of feedback mechanisms. Although the trends can be explained by incomplete relaxation of merging clusters, which we also observed, we do not find a significant separation in the normalization of the M–T_X relation between merging and relaxed populations, especially when the intrinsic scatter without cooling is largely contributed by the effect of halo concentration. Therefore, the differences are probably due mainly to radiative cooling, as is suggested by the fact that it is included in all the studies that observed the relative offset. Since current simulations with radiative cooling tend to produce relaxed clusters with steeper temperature profiles than real clusters, the average temperature of relaxed clusters can possibly be biased high. This can explain why they found relaxed clusters hotter than expected, while our results do not show any significant offset between relaxed and merging clusters in the M–T_X relation.

6. CONCLUSIONS

Galaxy clusters are invaluable cosmological probes. Accurate measurement of cluster masses is crucial and often relies on the mass-observable relations. However, to constrain the cosmological parameters to the few percent level, the systematics and scatter in these relations must be thoroughly understood. In this work, we investigate the sources of intrinsic scatter in the M–T_X relation using a hydrodynamics plus N-body simulation of galaxy clusters in a cosmological volume. In order to compare directly to observations without worrying about different definitions or systematics, we produced mock Chandra X-ray images using MARX and extracted the spectroscopic temperature T_X as observers do. Also, all the quantities in our analysis and discussions are measured within R₅₀₀, which is usually used for X-ray data. Radiative cooling and heating mechanisms are not included, since we would like to disentangle the scatter driven by the gravitational effects from other baryonic physics. We chose to focus on the M–T_X relation for several reasons. First, it is less sensitive to resolution and to cooling and heating mechanisms than other scaling relations, such as the L_X–T_X relation. Therefore, our results are still representative of reality even though the input physics is not complete. Second, the relative insensitivity to additional baryonic physics provides a window into better understanding of the physical origin of the scatter, despite our incomplete knowledge regarding the cooling and feedback mechanisms. Moreover, the intrinsic scatter in the M–T_X relation is among the smallest of all the observed scaling relations. If one can further reduce it based on the knowledge of its physical origin, the M–T_X relation will be extremely useful for cluster cosmology.

Our aim is to find out what determines the positions of clusters in the M₅₀₀–T_X,500 relation, in particular whether the intrinsic scatter is driven by recent merging activity or the overall assembly histories of clusters. We split our simulated cluster samples into merging and relaxed subgroups based on our merger tree analysis, and then we compare the distributions of the intrinsic scatter for individual subgroups. We also correlate the scatter with quantities that are related to the recent merging activity or cumulative cluster assembly histories, including the time since last merger, substructure measures, and the halo concentration. Here we summarize our findings.

We find a strong correlation between the scatter in the M₅₀₀–T_X,500 relation and the halo concentration. More concentrated clusters tend to lie below (cooler than) the mean relation, while puffier clusters tend to be hotter than expected for their masses. This is confirmed by the negative correlation between scatter and the formation lookback time of clusters, since it is well known that more concentrated clusters tend to form at earlier times. We showed that using this correlation, the scatter can be effectively reduced from 6.10% to 4.49%.

There is no bias in the M₅₀₀–T_X,500 relation between merging and relaxed clusters, but the amount of scatter for merging clusters is larger than that for the relaxed ones. This trend can be explained by the fact that merging clusters have larger variations in their concentrations than relaxed clusters.

When we correlate the scatter with the dynamical state of clusters, either the time since last merger or the substructure measures, there is a weak trend for recently merged clusters to be cooler, probably due to incomplete virialization of clusters that have just merged with a cool clump. However, statistically the influence of departure from hydrostatic equilibrium of merging clusters is negligible. Possible reasons are discussed in detail in Section 5.1.

There are significant deviations from lognormality of the distributions of scatter for our simulated clusters at both z = 0 and z = 1. This effect should be taken into account in future self-calibration studies to correctly interpret the obtained constraints on the cosmological parameters. Future simulation studies of larger volumes are needed in order to accurately characterize the distribution of scatter, including the tails of the distribution.

In conclusion, we find that when radiative cooling and feedback mechanisms are neglected, the intrinsic scatter in the M₅₀₀–T_X,500 relation is driven more by the variation in halo concentrations, or the overall assembly histories of clusters, than by recent merging events. Using an analytic approach, Balogh et al. (2006) investigated whether the amount of scatter in the observed M–T_X and M–L_X relations can be explained by the variations in halo concentrations or different entropy floors. Although they focused more on the scatter in the M–L_X relation and showed that it requires a wide range of entropy floors, a significant portion of the scatter in the M–T_X relation is determined by the range of halo concentrations predicted in their model. This is confirmed by our results, since we have explicitly shown the strong correlation between the M–T_X scatter and the halo concentration using a numerical simulation.

The lack of dependence of the scatter on the dynamical state of clusters is also seen by O'Hara et al. (2006) for observed clusters. They suggested that the scatter in the mass-observable relations is not dominated by recent mergers, but by the cooling-related core properties or probably the overall assembly histories of clusters. The latter relationship is indeed found in our simulation. As for the effects of radiative cooling, it has been shown to be a main source of intrinsic scatter in the L_X–T_X relation (Allen & Fabian 1998). Therefore, our next step is to include cooling and feedback mechanisms in the simulation to examine their individual influence on the scatter in the mass-observable relations.

Exploring the intrinsic scatter in the cluster scaling relations not only provides physical insights into the formation of galaxy clusters, but also has important implications for using clusters in cosmology. For example, the strong correlation with halo concentrations can be used for observed clusters to reduce the scatter in the scaling relations. Also, the weak influence of merging clusters is good news for cluster cosmology, because it implies that when deriving the scaling relations from the observed clusters it is unnecessary to worry much about whether to include the unrelaxed systems or not. This is good because it is much more difficult to separate out the unrelaxed systems at higher redshifts. We expect that detailed studies of the intrinsic scatter in the scaling relations, not only in X-rays but also at other wavelengths, will continue to yield invaluable information both for cluster physics and cluster cosmology.

H.Y.Y. wishes to thank J. Cohn for hospitality during a visit to Berkeley and J. Zuhone for useful conversations. P.M.S. acknowledges support from the DOE Computational Science Graduate Fellowship (DEFG02-97ER25308). We acknowledge support under a Presidential Early Career Award from the U.S. Department of Energy, Lawrence Livermore National Laboratory (contract B532720) and from NASA (grant NNX06AG57G). The work described here was carried out using the resources of the National Center for Supercomputing Applications (allocation MCA05S029) and the National Center for Computational Sciences at Oak Ridge National Laboratory (allocation AST010). FLASH was developed largely by the DOE-supported ASC/Alliances Center for Astrophysical Thermonuclear Flashes at the University of Chicago.
