Abstract

We present a new approach for modelling galaxy/halo bias that utilizes the full non-linear information contained in the moments of the matter density field, which we derive using a set of numerical simulations. Although our method is general, we perform a case study based on the local Eulerian bias scheme truncated to second order. Using 200 N-body simulations covering a total comoving volume of 675 h−3 Gpc3, we measure several two- and three-point statistics of the halo distribution to unprecedented accuracy. We use the bias model to fit the halo–halo power spectrum, the halo–matter cross-spectrum and the corresponding three bispectra for wavenumbers in the range 0.04 ≲ k ≲ 0.12 h Mpc−1. We find that the constraints on the bias parameters obtained using the full non-linear information differ significantly from those derived using standard perturbation theory at leading order. Hence, neglecting the full non-linear information leads to biased results for this particular scale range. We also test the validity of the second-order Eulerian local biasing scheme by comparing the parameter constraints derived from different statistics. Analysis of the halo–matter cross-correlation coefficients defined for the two- and three-point statistics reveals further inconsistencies contained in the second-order Eulerian bias scheme, suggesting it is too simple a model to describe halo bias with high accuracy.

1 INTRODUCTION

The clustering statistics of the galaxy distribution contain a wealth of information about the cosmological model. However, in the absence of a robust theory for galaxy formation, extracting this information can only be achieved in part. In practice, to do this requires us to assume a specific phenomenological relationship between the density field of galaxies and that of the underlying matter, more commonly referred to as galaxy bias. Whilst still incomplete, our leading theories of galaxy formation, do provide a great deal of insight about the distribution of galaxies. For instance they predict that galaxies should only reside in dark matter haloes and be strongly associated with the distribution of substructures (for a detailed review of galaxy formation see Mo, van den Bosch & White 2010). This greatly simplifies our ability to construct a phenomenological model for the galaxy distribution on large scales: it should be closely related to a weighted average of the dark matter halo overdensities (e.g. Smith, Scoccimarro & Sheth 2007).

There are a number of detailed analytical approaches for characterizing the bias of dark matter haloes with respect to the mass distribution. However, it has yet to be determined which model provides the most accurate description of galaxy bias. In the simplest method, the local Eulerian bias model (hereafter LEB), one assumes that the overdensities of the biased tracers can be written as some function of the matter density field at the same location. If both densities are smoothed over the patch scale R, then the biased field may be written as a Taylor-series expansion (Fry & Gaztanaga 1993). If one considers sufficiently large patches, then high-order corrections are guaranteed to be small and the series may be truncated after a finite number of terms.

Halo-clustering predictions of the LEB expressed in terms of standard perturbation theory (hereafter SPT, for a review see Bernardeau et al. 2002) have been examined in numerous works (Scoccimarro et al. 2001; Smith et al. 2007; Guo & Jing 2009; Manera & Gaztañaga 2011; Roth & Porciani 2011; Chuen Chan & Scoccimarro 2012; Pollack, Smith & Porciani 2012). One of the results to emerge from these studies is that, when the model is applied to halo counts within finite volumes of linear size R, the coefficients of the bias expansion show a running with the ‘cell’ size. However, halo-clustering statistics such as the n-point correlation functions (or the corresponding n-spectra) do not contain any smoothing scale and should not depend on R. There has been much debate in the literature on how to reconcile these seemingly contrasting results.

This has led some to discuss an ‘effective’ or ‘renormalized’ bias approach where the scale dependence of the bias coefficients is compensated by the contribution of small-scale perturbations in the matter density (Heavens, Matarrese & Verde 1998; McDonald 2006; Schmidt, Jeong & Desjacques 2013). Whilst such a scheme may be plausible (Jeong & Komatsu 2009; Smith, Hernández-Monteagudo & Seljak 2009), the development of a unique renormalization method is still ongoing, especially for dynamically evolved configurations in Eulerian space. On the other hand, it was recently proposed by Chuen Chan & Scoccimarro (2012) that the bias parameters obtained by counting haloes within cells of size R are only relevant for describing perturbations of wavenumber k ≃ 0.8/R in the halo distribution. While there is no challenge to their argument when analysing power spectrum data, it does present a complication when using higher order statistics such as the bispectrum. In order to interpret the galaxy bispectrum one would be required to compute bias coefficients separately for each configuration of wavevectors. This approach appears somewhat cumbersome to implement.

Currently, most observational analyses of galaxy clustering assume that galaxy bias can be described by the truncated LEB and that the statistical properties of the non-linear matter density field can be modelled using SPT. To leading order in the perturbations, this requires only one bias parameter for two-point statistics of the tracers and two parameters for three-point statistics. Present-day galaxy surveys, however, do not cover enough comoving volume to accurately sample the spatial scales at which tree-level results provide an accurate description of galaxy clustering. The presence of rare large-scale structures, for instance, significantly alters the measurements of three-point statistics (e.g. Nichol et al. 2006). On smaller scales, where data are more robust, dynamical non-linearities pose a serious challenge to the models. Adopting the simplified LEB+SPT model may therefore generate systematic errors and thus influence the characterization of the bias or the estimation of the cosmological parameters.

The LEB truncated to second order is the standard workhorse for studying three-point statistics of galaxy clustering. Its predictions to leading perturbative order have been used to interpret measurements from the two-degree field (2dF) galaxy redshift survey (Verde et al. 2002; Jing & Börner 2004; Wang et al. 2004; Gaztañaga et al. 2005), the Sloan Digital Sky Survey (SDSS; Kayo et al. 2004; Hikage et al. 2005; Pan & Szapudi 2005; Kulkarni et al. 2007; Nishimichi et al. 2007; McBride et al. 2011a,b; Marín 2011; Guo et al. 2014), and the WiggleZ Dark Energy Survey (Marín et al. 2013). In our previous study (Pollack et al. 2012), we demonstrated that, in order to robustly model three-point statistics with the LEB, one must necessarily have an accurate model for the clustering statistics of the non-linear matter density on the relevant scales. This is imperative to recover the correct values of the bias parameters in controlled numerical experiments. Therefore, it is not surprising that past investigations based on the LEB+SPT model reached inconsistent conclusions. For example, studying the galaxy bispectrum on scales 0.1 < k < 0.5 h Mpc−1, Verde et al. (2002) concluded that 2dF galaxies are unbiased tracers of the mass distribution. On the other hand, using the complete 2dF sample, Gaztañaga et al. (2005) found strong evidence for non-linear biasing from the analysis of the three-point correlation function with triangle configurations that probe separations between 9 and 36 h−1 Mpc (see also Jing & Börner 2004; Wang et al. 2004).

In this paper, we build upon our past experience and present a general method to model the clustering of biased tracers of the mass distribution on mildly non-linear scales k < 0.1 h Mpc−1. This is key to extend studies of galaxy clustering to smaller spatial separations where observational data are less uncertain. Our method relies on using N-body simulations to measure the relevant statistics for the clustering of the underlying mass distribution. Related approaches have been presented by Sigad, Branchini & Dekel (2000) and Szapudi & Pan (2004) for galaxy counts in cells (see also Pan & Szapudi 2005 for an application to correlation functions). We apply our general framework to the modelling of n-point clustering statistics of non-linear, Eulerian, locally biased tracers. In our framework, bias parameters run with the patch scale R. We address the running of the bias by treating the filter scale as a nuisance parameter to be marginalized over. The major advantage of our scheme is that we exactly recover the matter polyspectra used in the bias model at every order. The only truncation necessary in the model is the choice as to what level to truncate the bias expansion, and this may be selected by the data in a Bayesian model comparison. We test our modelling framework up to quadratic order in the local bias expansion (as commonly done in recent observational studies), for the power and bispectra of haloes and their cross-spectra with matter measured from a large ensemble (200 realizations) of measurements from a series of large Λ cold dark matter (ΛCDM) N-body simulations. This ensemble of simulations resolves the haloes that should host luminous red galaxies over a total comoving volume of 675 h−3 Gpc3, and so provides us with a very stringent statistical test ground for our model.

The sections are organized as follows. In Section 2 we set our mathematical notation and introduce the LEB. The numerical simulations used in this work are briefly described in Section 3 and used in Section 4 to measure several statistical quantities for the matter and halo distributions. In Section 5 we use Bayesian statistics to estimate the free parameters of the LEB and describe our main results. Finally, in Sections 6 and 7 we further discuss our findings and present our conclusions.

2 A NEW FRAMEWORK FOR MODELLING THE CLUSTERING OF BIASED TRACERS

2.1 General formalism

Consider some discrete tracers of the large-scale structure (dark matter haloes or galaxies) with mean density |$\bar{n}_{\rm h}$| and physical density |$\rho _{\rm h}({\boldsymbol x})=[1+\delta _{\rm h}({\boldsymbol x})]\,\bar{n}_{\rm h}$|⁠. We want to relate this random field to the underlying distribution of matter with local density |$\rho ({\boldsymbol x})=[1+\delta ({\boldsymbol x})]\,\bar{\rho }$|⁠. If we assume that the density contrast of the tracers averaged over some patch of linear size R, |$\delta _{\rm h}({\boldsymbol x}|R)$|⁠, is locally related to the density of matter in the same patch, then we may write
\begin{equation} \delta _{\rm h}({\boldsymbol x}|R)={\mathcal {F}}\left[\delta ({\boldsymbol x}|R)\right], \end{equation}
(1)
where |${\mathcal {F}}$| denotes a generic function |$\mathbb {R} \rightarrow \mathbb {R}$| and the symbols
\begin{equation} \delta _{\alpha }({\boldsymbol x}|R)\equiv \int {\rm d}^3\!{\boldsymbol y}\,W(|{\boldsymbol x}-{\boldsymbol y}|,R)\,\delta _{\alpha }({\boldsymbol y}) \end{equation}
(2)
(where α stands for haloes or matter) denote smoothed overdensity fields, W being a rotation-invariant filter function with size R.
Since we are dealing with smooth mathematical functions we may Taylor expand equation (1) to obtain (Fry & Gaztanaga 1993)
\begin{equation} \delta _{\rm h}({\boldsymbol x}|R) = \sum _{n=1}^{\infty } \frac{b_{n}}{n!}\left[ \delta ^n({\boldsymbol x}|R)-\langle \delta ^n({\boldsymbol x}|R)\rangle \right], \end{equation}
(3)
where the terms bn are the Eulerian bias coefficients of order n, which depend on both the smoothing scale and the exact definition of the tracers (e.g. halo mass, etc.). Note that the subtraction of the terms |$\langle \delta ^n({\boldsymbol x}|R)\rangle$| at each order ensures that |$\langle \delta _{\rm h}({\boldsymbol x}|R)\rangle =0$|⁠, where 〈…〉 denote an ensemble average. On Fourier transforming the above relation one finds, for |$|{\boldsymbol k}|\ne 0$|⁠,
\begin{equation} \tilde{\delta }_{\rm h}({\boldsymbol k}|R) = \sum _{n=1}^{\infty }\frac{b_{n}}{n!}\,\Delta ^{(n)}({\boldsymbol k}|R), \end{equation}
(4)
where |$\Delta ^{(n)}({\boldsymbol k}|R)\equiv \widetilde{\delta ^n}({\boldsymbol k}|R)$| can be written as
\begin{equation} \Delta ^{(n)}({\boldsymbol k}|R)\equiv (2 \pi )^3 \int \delta ^{D}({\boldsymbol k}-{\boldsymbol q}_{1\ldots n})\, \prod _{i=1}^{n} \,\tilde{\delta }({\boldsymbol q}_i|R)\,\frac{{\rm d}^3{\boldsymbol q}_i}{(2\pi )^3}.\ \end{equation}
(5)
In the last expression |$\delta ^{D}({\boldsymbol k})$| denotes the Dirac-delta distribution and we have made use of the compact notation |${\boldsymbol q}_{1\ldots n}={\boldsymbol q}_1+\ldots +{\boldsymbol q}_n$| and |$\tilde{\delta }({\boldsymbol q}|R)\equiv \tilde{\delta }({\boldsymbol q})\,\widetilde{W}(qR)$|⁠.
We now define the power spectrum of the biased tracers and their cross-spectrum with the matter in terms of the correlators:
\begin{equation} \langle \tilde{\delta }_{\alpha }({\boldsymbol k}_1|R)\,\tilde{\delta }_{\beta }({\boldsymbol k}_2|R)\rangle \equiv (2\pi )^3\,\delta ^{D}({\boldsymbol k}_{12})\,{\mathcal {P}}_{\alpha \beta }(k_1) . \end{equation}
(6)
Similarly, the corresponding bispectra can be defined as
\begin{eqnarray} &&\!\!\!\!&&{\langle \tilde{\delta }_{\alpha }({\boldsymbol k}_1|R)\,\tilde{\delta }_{\beta }({\boldsymbol k}_2|R)\,\tilde{\delta }_{\gamma }({\boldsymbol k}_3|R)\rangle} \nonumber \\ &&\qquad\equiv (2\pi )^3\,\delta ^{D}({\boldsymbol k}_{123})\,{\mathcal {B}}_{\alpha \beta \gamma }({\boldsymbol k}_1,{\boldsymbol k}_2), \end{eqnarray}
(7)
where we have suppressed the dependence of the bispectrum on the third wavevector, since the Dirac-delta distribution imposes |${\boldsymbol k}_3=-{\boldsymbol k}_{12}$|⁠. On inserting equation (4) into equation (6), we find
\begin{eqnarray} &&\!\!\!\!&&{\langle \tilde{\delta }_{\alpha }({\boldsymbol k}_1|R)\tilde{\delta }_{\beta }({\boldsymbol k}_2|R)\rangle} \nonumber \\ &&\qquad=\sum _{l,m=1}^{\infty }\frac{\Gamma ^{\alpha }_{l}}{l!}\frac{\Gamma ^{\beta }_{m}}{m!} \langle \Delta ^{(l)}({\boldsymbol k}_1|R)\,\Delta ^{(m)}({\boldsymbol k}_2|R)\rangle {}, \end{eqnarray}
(8)
with |$\Gamma _l^{\rm h}=b_l$| and |$\Gamma _l^{\rm m}=\delta _{l1}^{\rm K}$| (for haloes and matter, respectively), where |$\delta _{ij}^{\rm K}$| denotes the Kronecker-delta function. Similarly for equation (7) we have
\begin{eqnarray} \langle \tilde{\delta }_{\alpha }({\boldsymbol k}_1)\,\tilde{\delta }_{\beta }({\boldsymbol k}_2)\tilde{\delta }_{\gamma }({\boldsymbol k}_3)\rangle &=& \sum _{l,m,n=1}^{\infty }\frac{\Gamma ^{\alpha }_{l}}{l!}\frac{\Gamma ^{\beta }_{m}}{m!} \frac{\Gamma ^{\gamma }_{n}}{n!}\nonumber \\ &&\times \langle \Delta ^{(l)}({\boldsymbol k}_1|R)\,\Delta ^{(m)}({\boldsymbol k}_2|R)\,\Delta ^{(n)}({\boldsymbol k}_3|R)\rangle {}.\nonumber \\ \end{eqnarray}
(9)
It is convenient to introduce the functions |${\mathcal {P}}_{(l,m)}$| and |${\mathcal {B}}_{(l,m,n)}$| such that
\begin{equation} \langle \Delta ^{(l)}({\boldsymbol k}_1|R)\,\Delta ^{(m)}({\boldsymbol k}_2|R)\rangle = (2\pi )^3\,\delta ^{D}({\boldsymbol k}_{12})\,{\mathcal {P}}_{(l,m)}({\boldsymbol k}_1)\; \, \end{equation}
(10)
and
\begin{eqnarray} &&\!\!\!\!&&{\langle \Delta ^{(l)}({\boldsymbol k}_1|R)\,\Delta ^{(m)}({\boldsymbol k}_2|R)\,\Delta ^{(n)}({\boldsymbol k}_3|R)\rangle} \nonumber \\ &&\qquad=(2\pi )^3\,\delta ^{D}({\boldsymbol k}_{123})\,{\mathcal {B}}_{(l,m,n)}({\boldsymbol k}_1,{\boldsymbol k}_2). \end{eqnarray}
(11)
In simple words, |${\mathcal {P}}_{(l,m)}$| denotes the cross-power spectrum between the smoothed random fields |$[\delta ({\boldsymbol x}|R)]^l-\langle [\delta ({\boldsymbol x}| r)]^l\rangle$| and |$[\delta ({\boldsymbol x}|R)]^m-\langle [\delta ({\boldsymbol x}|R)]^m\rangle$|⁠, while |${\mathcal {B}}_{(l,m,n)}$| is the corresponding bispectrum. Thus for the halo and matter power and bispectra we have
\begin{eqnarray} {\mathcal {P}}_{\alpha \beta }({\boldsymbol k}_1) & = & \sum _{l,m=1}^{\infty }\frac{\Gamma ^{\alpha }_{l}}{l!}\frac{\Gamma ^{\beta }_{m}}{m!} \,{\mathcal {P}}_{(l,m)}({\boldsymbol k}_1), \end{eqnarray}
(12)
\begin{eqnarray} {\mathcal {B}}_{\alpha \beta \gamma }({\boldsymbol k}_1,{\boldsymbol k}_2) \!\!\! & = & \!\!\!\! \sum _{l,m,n=1}^{\infty }\frac{\Gamma ^{\alpha }_{l}}{l!}\frac{\Gamma ^{\beta }_{m}}{m!} \frac{\Gamma ^{\gamma }_{n}}{n!}\, {\mathcal {B}}_{(l,m,n)}({\boldsymbol k}_1,{\boldsymbol k}_2). \end{eqnarray}
(13)
The above sets of equations provide us with models for the power spectra and the bispectra of halo counts in cells of size R. However, what we really want to model is the halo two- and three-point functions, Pαβ and Bαβγ. We assume that these quantities can be approximately recovered by ‘de-smoothing’ |${\mathcal {P}}_{\alpha \beta }$| and |${\mathcal {B}}_{\alpha \beta \gamma }$| (Smith et al. 2007; Smith, Sheth & Scoccimarro 2008; Sefusatti 2009):
\begin{eqnarray} P_{\alpha \beta }({\boldsymbol k}_1) & = & \frac{{\mathcal {P}}_{\alpha \beta } ({\boldsymbol k}_1)}{W^2(k_1 R)}, \end{eqnarray}
(14)
\begin{eqnarray} B_{\alpha \beta \gamma }({\boldsymbol k}_1, {\boldsymbol k}_2, {\boldsymbol k}_3) & = & \frac{{\mathcal {B}}_{\alpha \beta \gamma } ({\boldsymbol k}_1, {\boldsymbol k}_2, {\boldsymbol k}_3)}{W(k_1 R) W(k_2 R) W(k_3 R)}. \end{eqnarray}
(15)
Note that when considering a model of halo bias beyond linear order this operation does not fully remove the dependence of the theory on R. In Section 5, we will use the models presented in equations (14) and (15) to fit simulation data. Nevertheless, our choice to ‘de-smooth’ the theoretical model is equivalent to analysing counts in cell data with a smoothed model. This is due to the fact that in Fourier-space the smoothing kernels can be treated as multiplicative factors, which means that if we factorize the expressions by dividing out the product of the window functions the relation between the model and the data still holds. Hence, fitting counts in cells data with a smoothed model is indifferent to analysing unsmoothed data with a ‘de-smoothed’ or factorized model.

The smoothing scale must therefore be considered as a free parameter of the model, and so it must be either determined by fitting a set of data or marginalized over.

In Appendices A1 and A2 we show how the terms |${\mathcal {P}}_{(l,m)}$| and |${\mathcal {B}}_{(l,m,n)}$| are related to the p-point matter spectra, where p = l + m or p = l + m + n, respectively. In Appendix A3 we prove that the functions |${\mathcal {P}}_{(l,m)}$| are totally symmetric in l and m. For lmn, the functions |${\mathcal {B}}_{(l,m,n)}({\boldsymbol k}_1,{\boldsymbol k}_2,{\boldsymbol k}_3)$| are not in general symmetric in l, m, and n, unless the wavevectors |${\boldsymbol k}_i$| are also exchanged, i.e. whilst |${\mathcal {B}}_{(l,m,n)}({\boldsymbol k}_1,{\boldsymbol k}_2,{\boldsymbol k}_3)={\mathcal {B}}_{(m,l,n)}({\boldsymbol k}_2,{\boldsymbol k}_1,{\boldsymbol k}_3)$|⁠, |${\mathcal {B}}_{(l,m,n)}({\boldsymbol k}_1,{\boldsymbol k}_2,{\boldsymbol k}_3)\ne {\mathcal {B}}_{(m,l,n)}({\boldsymbol k}_1,{\boldsymbol k}_2,{\boldsymbol k}_3)$|⁠. Note that in this study we choose to work with n-point spectra, |${\mathcal {P}}_{(\alpha _1\ldots \alpha _n)}$|⁠, that are symmetric to an exchange of their vectorial arguments, and we accomplish this through the symmetrization operation
\begin{equation} {\mathcal {P}}^{\rm (s)}_{(\alpha _1\ldots \alpha _n)} = \frac{\sum _{i_1,\ldots ,i_n}^{n} |\epsilon _{i_1\ldots i_n}|\,{\mathcal {P}}_{(\alpha _1\ldots \alpha _n)}({\boldsymbol k}_{i_1},\ldots ,{\boldsymbol k}_{i_n})}{\sum ^{n}_{i_1,\ldots ,i_n} |\epsilon _{i_1\ldots i_n}|}, \end{equation}
(16)
where |$\epsilon _{i_1\ldots i_n}$| denotes the n-dimensional generalization of the Levi-Civita symbol and we take its absolute value.

In previous studies, the functions |${\mathcal {P}}_{(l,m)}$| and |${\mathcal {B}}_{(l,m,n)}$| have been modelled through the use of a combination of perturbation theory and semi-empirical models. In Pollack et al. (2012) we recovered these functions exactly from an N-body simulation and demonstrated that they are essential to measure the bias parameters in an unbiased way. We will revisit these issues in Sections 4 and 5.

2.2 Case study: biasing to second order

As an example, let us evaluate the case when the bias is taken to second order and all higher order bias coefficients are vanishing. This is a widespread assumption often used to interpret observational data from massive redshift surveys (see Section 1 for a long list of references). We will consider a unique set of dark matter haloes. For the case where we have multiple halo bins (e.g. mass selected), the expressions are more cumbersome but no more complicated. Starting with the two-point statistics, one can formulate the halo auto- and cross-power spectra with the total mass up to second order in the LEB
\begin{equation} \mathcal {P}_{\rm hm}({\boldsymbol k}) = b_1 \mathcal {P}_{(1,1)}({\boldsymbol k}) + \frac{b_2}{2} {\mathcal {P}}_{(2,1)}({\boldsymbol k}), \end{equation}
(17)
\begin{equation} \mathcal {P}_{\rm hh}({\boldsymbol k}) = b_1^2 \mathcal {P}_{(1,1)}({\boldsymbol k}) + b_1 b_2 {\mathcal {P}}_{(2,1)}({\boldsymbol k}) + \frac{b_2^2}{4} \mathcal {P}_{(2,2)}({\boldsymbol k}), \end{equation}
(18)
where from Appendix A1, we see that
\begin{equation} \mathcal {P}_{(2,1)}({\boldsymbol k}) \equiv \int \frac{{\rm d}^3{\boldsymbol q}}{(2 \pi )^3} \,\mathcal {B}({\boldsymbol q},{\boldsymbol k}-{\boldsymbol q},-{\boldsymbol k}), \end{equation}
(19)
\begin{equation} \mathcal {P}_{(2,2)}({\boldsymbol k})\equiv \int \frac{{\rm d}^3{\boldsymbol q}}{(2 \pi )^3} \frac{{\rm d}^3\!{\boldsymbol w}}{(2 \pi )^3} \,\mathcal {P}_{4}({\boldsymbol q},{\boldsymbol k}- {\boldsymbol q},{\boldsymbol w},-{\boldsymbol k}-{\boldsymbol w}). \end{equation}
(20)
Note that the |$\mathcal {P}_{(l,m)}$| functions are (l + m − 2)-dimensional integrals over the smoothed matter correlators of order n = l + m, |$\langle \tilde{\delta }({\boldsymbol k}_1|R)\ldots \tilde{\delta }({\boldsymbol k}_n|R) \rangle =(2\pi )^3\,\delta ^D({\boldsymbol k}_{1\ldots n})\,\mathcal {P}_{n}({\boldsymbol k}_1,\cdots ,{\boldsymbol k}_n)$|⁠. These include connected and disconnected terms (see Appendix B).
For the three-point statistics, the symmetrized auto (halo) and cross (halo-mass) bispectra with respect to the matter, up to second order in the bias model, may be written as
\begin{equation} {\mathcal {B}}^{(\rm s)}_{\rm hmm} = b_1 {\mathcal {B}}^{(\rm s)}_{(1,1,1)} + \frac{b_2}{2}{\mathcal {B}}^{(\rm s)}_{(2,1,1)}, \end{equation}
(21)
\begin{equation} \mathcal {B}^{(\rm s)}_{\rm hhm} = b_1^2 {\mathcal {B}}^{(\rm s)}_{(1,1,1)} + b_1b_2{\mathcal {B}}^{(\rm s)}_{(2,1,1)} +\frac{b_2^2}{4}{\mathcal {B}}^{(\rm s)}_{(2,2,1)}, \end{equation}
(22)
\begin{eqnarray} {\mathcal {B}}^{\rm (s)}_{\rm hhh} & = & b_1^3 {\mathcal {B}}^{(\rm s)}_{(1,1,1)} + \frac{3b_1^2b_2}{2}{\mathcal {B}}^{(\rm s)}_{(2,1,1)} +\frac{3b_1b_2^2}{4}{\mathcal {B}}^{(\rm s)}_{(2,2,1)} +\frac{b_2^3}{8}{\mathcal {B}}^{(\rm s)}_{(2,2,2)},\nonumber \\ \end{eqnarray}
(23)
where for brevity we suppressed the dependence of the bispectra on |$({\boldsymbol k}_1,{\boldsymbol k}_2,{\boldsymbol k}_3)$|⁠. In Appendix A2, the |${\mathcal {B}}^{\rm (s)}_{l,m,n}$| functions are (l + m + n − 3)-dimensional integrals of the polyspectra of order l + m + n. Specifically
\begin{eqnarray} {\mathcal {B}}^{\rm (s)}_{(2,1,1)} \!\! & \equiv & \!\! \frac{1}{3}\int \frac{{\rm d}^3{\boldsymbol q}}{(2\pi )^3}\,\mathcal {P}_{4}({\boldsymbol q},{\boldsymbol k}_1-{\boldsymbol q},{\boldsymbol k}_2,{\boldsymbol k}_3) +2\,\rm cyc{}, \end{eqnarray}
(24)
\begin{eqnarray} {\mathcal {B}}^{\rm (s)}_{(2,2,1)} & \equiv & \frac{1}{3}\int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3} \frac{{\rm d}^3{\boldsymbol q}_2}{(2\pi )^3}{\mathcal {P}}_{\rm 5}({\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_1,{\boldsymbol q}_2,{\boldsymbol k}_2-{\boldsymbol q}_2,{\boldsymbol k}_3) \nonumber \\ & & +\, \,2\,\rm cyc, \end{eqnarray}
(25)
\begin{eqnarray} {\mathcal {B}}^{\rm (s)}_{(2,2,2)} & \equiv &\int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3} \ldots \frac{{\rm d}^3{\boldsymbol q}_3}{(2\pi )^3} \nonumber \\ & & \times \,\, {\mathcal {P}}_{6}({\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_1,{\boldsymbol q}_2,{\boldsymbol k}_2-{\boldsymbol q}_2,{\boldsymbol q}_3,{\boldsymbol k}_3-{\boldsymbol q}_3). \end{eqnarray}
(26)
In Section 4 we show how one may estimate |$\mathcal {P}_{(l,m)}$| and |${\mathcal {B}}^{\rm (s)}_{(l,m,n)}$| directly from an N-body simulation.

3 N-BODY SIMULATIONS

In order to test the LEB and also to determine the covariance matrices of the various spectra we have simulated 200 realizations of a flat ΛCDM cosmological model. The specific cosmological parameters that we have adopted are {σ8 = 0.8, Ωm = 0.25, Ωb = 0.04, h = 0.7, ns = 1.0} where σ8 is the variance of linear mass fluctuations in top-hat spheres of radius R = 8 h−1 Mpc; Ωm and Ωb are the matter and baryon density parameters; h is the dimensionless Hubble parameter in units of 100 km s−1 Mpc−1; and n is the power-law index of the primordial density power spectrum. Our adopted values were inspired by the results from the Wilkinson Microwave Anisotropy Probe experiment (Komatsu et al. 2009).

All of the N-body simulations were run using the publicly available Tree-PM code gadget-2 (Springel 2005). This code was used to follow with high accuracy the non-linear evolution under gravity of N = 7503 equal mass particles in a periodic comoving cube of length L = 1500 h−1 Mpc, giving a total sample volume of V = 675 h−3 Gpc3. Newtonian two-body forces were softened below scales lsoft = 60 kpc h−1. The transfer function for the simulations was generated using the publicly available cmbfast code (Seljak & Zaldarriaga 1996), with high sampling of the spatial frequencies on large scales. Initial conditions were laid down at redshift z = 49 using the serial version of the publicly available 2LPT code (Crocce, Pueblas & Scoccimarro 2006).

We use only the simulation outputs at redshift z = 0 for analysis and identify dark matter haloes using the code BFoF. This is a friends-of-friends algorithm (Davis et al. 1985), where we adopted a linking length corresponding to b = 0.2 times the mean interparticle spacing. The minimum number of particles an object must contain to be considered a bound halo was set to 20. This implies a minimum halo mass of Mmin = 1.11 × 1013h−1 M and a mean number density of |$\bar{n}_{\rm h}\approx 3.7\times 10^{-4}$|h3 Mpc−3. Further details regarding this set of N-body simulations can be found in Smith (2009) and Smith et al. (2012).

4 ESTIMATING THE SPECTRA

In this section we describe how we estimate all the halo and matter polyspectra that enter the second-order LEB from the N-body simulations at redshift z = 0.

4.1 The halo auto- and cross-power and bispectra

To begin, the halo and matter density fields are interpolated on to a cubical Cartesian mesh using the cloud-in-cell (CIC) algorithm. Throughout we use mesh sizes corresponding to Ncell = 10243. We then Fourier transform these grids using the fast Fourier transform technique and correct each mode for the CIC assignment. The three power spectra Pmm, Phm, Phh, and the four bispectra, Bmmm, |$B^{(\rm s)}_{\rm hmm}$|⁠, |$B^{(\rm s)}_{\rm hhm}$|⁠, |$B^{(\rm s)}_{\rm hhh}$|⁠, are then estimated using the expressions
\begin{eqnarray} \hat{P}^{\rm d}_{\alpha \beta }({\boldsymbol k}_1) \!\! & = & \!\! \frac{L^3}{N(k_i)}\sum _{i}^{N(k_i)}\delta _{\alpha }({\boldsymbol k}_i)\delta _{\beta }(-{\boldsymbol k}_i), \end{eqnarray}
(27)
\begin{eqnarray} \hat{B}^{\rm d}_{\alpha \beta \gamma }({\boldsymbol k}_1,{\boldsymbol k}_2,\theta _{12}) \!\! & = & \!\! \frac{1}{3}\frac{L^6}{N_{\rm tri}}\sum ^{N_{\rm tri}}_{\epsilon ({\boldsymbol k}_i,{\boldsymbol k}_j)} \delta _{\alpha }({\boldsymbol k}_i)\delta _{\beta }({\boldsymbol k}_j) \nonumber \\ & & \times \, \delta _{\gamma }(-{\boldsymbol k}_i-{\boldsymbol k}_j)+2\,{\rm cyc}, \end{eqnarray}
(28)
where N(ki) is the number of Fourier modes in a narrow shell centred on k1, |$\epsilon ({\boldsymbol k}_i,{\boldsymbol k}_j)$| represents the pair of vectors which lie in thin shells centred on k1 and k2, whose angular separation lies in the angular bin centred on θ12, and NtriNtri(ki, kj, θij) is the total number of triangles with this configuration in Fourier space. The superscript ‘d’ denotes that these are spectra of a discrete distribution of points (i.e. haloes) and must be corrected for shot noise. The forms of the Poissonian shot-noise corrections we adopt were presented in Pollack et al. (2012):
\begin{eqnarray} \hat{P}^{\rm shot}_{\alpha \beta }({\boldsymbol k}_1) \!\! & = & \!\! \frac{\delta ^{\rm K}_{\alpha \beta }}{\bar{n}_{\alpha }} \end{eqnarray}
(29)
\begin{eqnarray} \hat{B}^{\rm shot}_{\alpha \beta \gamma }({\boldsymbol k}_1,{\boldsymbol k}_2) \!\! & = & \frac{1}{3}\frac{\delta ^{\rm K}_{\alpha \beta }}{\bar{n}_{\alpha }} \left[P_{\beta \gamma }({\boldsymbol k}_1)+2\,{\rm cyc}\right]\nonumber \\ & & +\,\frac{1}{3}\frac{\delta ^{\rm K}_{\beta \gamma }}{\bar{n}_{\beta }} \left[P_{\gamma \alpha }({\boldsymbol k}_1)+2\,{\rm cyc}\right]\nonumber \\ & & +\,\frac{1}{3}\frac{\delta ^{\rm K}_{\gamma \alpha }}{\bar{n}_{\gamma }} \left[P_{\alpha \beta }({\boldsymbol k}_1)+2\,{\rm cyc}\right]+ \frac{\delta ^{\rm K}_{\alpha \beta }\delta ^{\rm K}_{\alpha \gamma }}{\bar{n}_{\alpha }^2}, \end{eqnarray}
(30)
where |$\bar{n}_{\alpha }$| denotes the mean number density of either the matter particles or the halo population.

Fig. 1 presents the various power and bispectra averaged over the 200 realizations with the corresponding standard errors on the mean. All spectra were corrected for shot noise using equations (29) and (30). The bispectra were measured for triangle configurations with fixed lengths k1 = 0.04 h Mpc−1 and k2 = 2k1, but with varying angle θ12. We adopt the convention θ12 = 0 for |${\boldsymbol k}_1$| and |${\boldsymbol k}_2$| parallel. In order to use the same range of wavenumbers, the power spectra were measured over the scale range 0.04 < k < 0.12  h Mpc−1.

Figure 1.

Power-spectra and bispectra measurements averaged over 200 ΛCDM N-body simulations at redshift z = 0. Left: power spectra as a function of wavenumber. The blue, orange, and red symbols denote Phh, Phm, and Pmm, respectively. Right: bispectra as a function of triangle configuration. The blue, orange, green, and red symbols represent Bhhh, Bhhm, Bhmm, andBmmm, respectively. In both panels, the error bars show the standard error on the mean. On the other hand, the black lines denote the posterior mean for the different statistics obtained by fitting the second-order LEB to the simulation data. The shaded grey areas (which are unnoticeably narrow for the power spectrum) indicate the predictions for the models that are located within one rms value of the posterior distribution around the mean (see Section 5.3 for more details).

4.2 Estimating P(l, m) and |${ B}^{(\rm s)}_{(l,m,n)}$|

The polyspectra P(l, m) and |$B^{(\rm s)}_{(l,m,n)}$| that enter the expressions for the halo power and bispectra in the LEB are affected by the non-linear evolution of the matter fluctuations. While these terms are usually approximated with perturbative techniques, we measure them directly from our N-body simulations. We do this as follows. First, we correct each Fourier mode of the mass-density field for convolution with the CIC grid. Then we multiply the result by a Gaussian smoothing function W(kR) = exp [ − (kR)2/2] and inverse transform back to real space. Next, we generate the fields |$\delta ^l({\boldsymbol x}|R)$| for the relevant values of l and re-transform them into Fourier space. We then deconvolve these fields for the original smoothing, which means simply multiplying each Fourier mode by W−1(kR). Finally, the required P(l, m) and |$B^{(\rm s)}_{(l,m,n)}$| terms, defined in terms of |$\Delta ^{(l)}({\boldsymbol k}|R)$| (see equation 5), can be estimated as follows:
\begin{equation} \hat{P}_{(l,m)}({\boldsymbol k}_1) = \frac{L^3}{N(k_i)}\sum _{i}^{N(k_i)}\Delta ^{(l)}({\boldsymbol k}_i|R)\Delta ^{(m)}(-{\boldsymbol k}_i|R) \end{equation}
(31)
and
\begin{eqnarray} &&\!\!\!\hat{B}^{(\rm s)}_{(l,m,n)}({\boldsymbol k}_1,{\boldsymbol k}_2,\theta _{12})=\frac{1}{3}\frac{L^6}{N_{\rm tri}}\sum ^{N_{\rm tri}}_{\epsilon ({\boldsymbol k}_i,{\boldsymbol k}_j)}\Delta ^{(l)}({\boldsymbol k}_i|R) \nonumber \\ &&\quad \times \,\Delta ^{(m)}({\boldsymbol k}_j|R) \Delta ^{(n)}(-{\boldsymbol k}_i-{\boldsymbol k}_j|R)+2\,{\rm cyc}. \end{eqnarray}
(32)
We note that the functions P(l, m) and |$B^{(\rm s)}_{(l,m,n)}$| slowly vary with R and so can be smoothly interpolated. Based on this knowledge, we measure the spectral functions over the range: R = [2, 18] h−1 Mpc, in increments of ▵R = 2 h−1 Mpc, but including an additional measurement at R = 7  h−1 Mpc. The lower limit was adopted because we do not wish to smooth below the Lagrangian size of haloes, which for our sample is of the order of ∼ 3-4 h− 1 Mpc. The upper bound of R = 18 h−1 Mpc we justify by noting that we do not want the largest k-mode entering our computations of the halo power and bispectra to be too heavily smoothed.

Before inspecting the functions P(l, m) and |$B^{(\rm s)}_{(l,m,n)}$|⁠, we first report the level of non-linearity present in the smoothed matter density field, |$\delta ({\boldsymbol x}|R)$|⁠. We quantify this by measuring the variance of the density perturbations, σ2(R) and the fraction of cells where the density contrast exceeds unity, f, as a function of the filter scale (see Table 1). Our results show that σ2(R) < 1 for R ≳ 4 h−1 Mpc. We therefore expect the quadratic bias model to be a poor description for smaller values of R. However, we note that the fraction of the cells with δ ≥ 1 is f ≲ 0.1 for all of the filter scales considered. Furthermore, in our previous work (Pollack et al. 2012), we evaluated the scatter plots of δh versus δ measured from our N-body simulations for different smoothing radii. We found that expressing δh as a polynomial function at second order in δ can describe reasonably well the mean trend of the scatter.

Table 1.

Level of non-linearity in the smoothed mass-density field at redshift z = 0. Column 1: filter scale, R; column 2: variance of density fluctuations, σ2(R); and column 3: volume fraction with |δ(R)| > 1, f.

R ( h−1 Mpc)σ2(R)100 × f
22.4410.0
40.718.4
60.386.0
70.284.9
80.223.8
100.152.2
120.101.1
140.080.5
160.060.2
180.050.1
R ( h−1 Mpc)σ2(R)100 × f
22.4410.0
40.718.4
60.386.0
70.284.9
80.223.8
100.152.2
120.101.1
140.080.5
160.060.2
180.050.1
Table 1.

Level of non-linearity in the smoothed mass-density field at redshift z = 0. Column 1: filter scale, R; column 2: variance of density fluctuations, σ2(R); and column 3: volume fraction with |δ(R)| > 1, f.

R ( h−1 Mpc)σ2(R)100 × f
22.4410.0
40.718.4
60.386.0
70.284.9
80.223.8
100.152.2
120.101.1
140.080.5
160.060.2
180.050.1
R ( h−1 Mpc)σ2(R)100 × f
22.4410.0
40.718.4
60.386.0
70.284.9
80.223.8
100.152.2
120.101.1
140.080.5
160.060.2
180.050.1

4.3 Results for P(l, m) and B(l, m, n)

Figs 2 and 3 show the results for the ensemble-averaged de-smoothed power and bispectra, P(l, m) and B(l, m, n), respectively. Focusing on the power spectrum, the panels show (from left to right) the matter power spectrum P = P(1,1) = Pmm followed by the terms P(2,1), and P(2,2).

Figure 2.

Measurements of the de-smoothed terms P(1,1), P(2,1), and P(2,2) averaged over 200 N-body simulations. We show results for a number of smoothing scales within the range 2 ≤ R ≤ 18 h−1 Mpc in comparison with our basic CIC grid (see the main text for more details). The error bars denote the standard error on the mean.

Figure 3.

Same as for Fig. 2 but for the bispectrum terms B(1,1,1), |$B^{(\rm s)}_{(2,1,1)}$|⁠, |$B^{(\rm s)}_{(2,2,1)}$|⁠, and |$B^{(\rm s)}_{(2,2,2)}$|⁠.

For the bispectra, the panels show: the matter bispectrum B = B(1,1,1) = Bmmm (top left), |$B_{(2,1,1)}^{\rm (s)}$| (top right), |$B_{(2,2,1)}^{\rm (s)}$| (bottom left) and |$B_{(2,2,2)}^{\rm (s)}$| (bottom right). We have restricted the triangle configurations to those which enter the auto- and cross-halo bispectra shown in Fig. 1. Each panel shows six sets of points with error bars which denote the results obtained for different smoothing scales. The red crosses denote the resulting polyspectra when no Gaussian smoothing (and de-smoothing) is applied on top of the CIC assignment.

Comparing the different panels reveals how the amplitudes of the de-smoothed quantities vary. Obviously, for the matter power and bispectra, P(1, 1) and B(1,1,1), all of the spectra overlap with the CIC result as the smoothing and the de-smoothing procedures perfectly cancel each other out. However, for the remaining P(l, m) and B(l,m,n) functions, the de-smoothed quantities vary with the scale R. In particular, as R decreases, the overall amplitude of the spectra increases due to the contributions of small-scale modes. For the largest smoothing scales, the configuration dependence of the spectra is also modified. In order to gain some insight into the origin of this behaviour, let us consider, for instance, the term
\begin{equation} P_{(2,1)}({\boldsymbol k})=\int \frac{{\rm d}^3{\boldsymbol q}}{(2\pi )^3}\,B({\boldsymbol q},{\boldsymbol k}-{\boldsymbol q},-{\boldsymbol k})\,\mathcal {W}({\boldsymbol q},{\boldsymbol k}-{\boldsymbol q}), \end{equation}
(33)
where |$\mathcal {W}$| is a generic weighting function defined as
\begin{equation} \mathcal {W}({\boldsymbol k}_1,{\boldsymbol k}_2)= \frac{W(k_1 R)\,W(k_2 R)}{W(|{\boldsymbol k}_1 + {\boldsymbol k}_2|R)}. \end{equation}
(34)
For Gaussian smoothing, the weighting function in equation (33) can be re-expressed as |$\mathcal {W}({\boldsymbol q},{\boldsymbol k}-{\boldsymbol q}) =\exp {\left[-R^2\left( q^2-kq\mu \right) \right]}$|⁠, with |$\mu ={\boldsymbol k}\cdot {\boldsymbol q}/(kq)$| the cosine of the angle between |${\boldsymbol k}$| and |${\boldsymbol q}$|⁠.

The contribution to the integral from all modes with qR ≫ 1 is exponentially suppressed (i.e. |$\mathcal {W}\ll 1$|⁠). However, at fixed k, |$\mathcal {W}$| assumes values larger than unity for μ > 0 and q < kμ (independently of R) and presents an absolute maximum for q = k/2 and μ = 1, where it takes the value |$\mathcal {W}_{\rm max}=\exp [(kR)^2/4]$|⁠. Note that, when kR ≪ 1, |$\mathcal {W}_{\rm max}\simeq 1+(kR)^2/4\simeq 1$| so that all configurations where |$\mathcal {W}>1$| receive nearly the same weight. In this case, the parameter R regulates how quickly the function |$\mathcal {W}$| drops when q moves away from the region where |$\mathcal {W}>1$|⁠. In other words, |$\mathcal {W}$| behaves nicely as a smoothing function. This is not true, however, when kR ≫ 1 and the value of |$\mathcal {W}_{\rm max}$| grows very large. In this case, P(2,1) receives dominant contributions from a narrow shell of modes located at qk/2 and μ ≲ 1. This effect is clearly seen in Fig. 2 for R = 18  h−1 Mpc, where the oversmoothing (i.e. the fact that kR is significantly larger than unity for k ∼ 0.1) leads to a change in shape for P(2,1) which is particularly evident for the configurations with the largest wavenumbers.

It is interesting to investigate why, for kR ≪ 1, the configuration dependence of P(2,1) changes very little with R and only the overall normalization appears to depend on the smoothing scale. If we assume that the amplitude of the bispectrum |$B({\boldsymbol k}_1,{\boldsymbol k}_2,-{\boldsymbol k}_1-{\boldsymbol k}_2)$| keeps nearly constant at all scales assuming a value ≃B0, equation (33) then gives
\begin{equation} P_{(2,1)}({\boldsymbol k})\simeq \frac{\pi ^{3/2} \exp {((k R)^2/4)}}{ R^3}\, B_0. \end{equation}
(35)
The first term on the right-hand side gives the q-space volume over which the bispectrum is averaged to get P(2,1). At fixed k, this expression diverges as R−3 when R → 0 and exponentially as R → ∞ while it shows broad minimum around kR ∼ 2.5.

Clearly, had we not smoothed the density field, the resulting P(l, m) and B(l, m, n) would be divergent in any ΛCDM cosmology.

4.4 Modelling P(2,1) and |$B^{\rm (s)}_{(2,1,1)}$| with SPT

In order to better understand what drives the amplitude and functional form of the P(l, m) and |$B^{\rm (s)}_{(l,m,n)}$| terms we have attempted to model their signal with SPT. For simplicity, we have focused on the lowest order non-trivial terms P(2, 1) and |$B^{\rm (s)}_{(2,1,1)}$|⁠.

To leading order in the perturbations, the matter bispectrum can be written as |$B({\boldsymbol k}_1,{\boldsymbol k}_2,{\boldsymbol k}_3)=2 \,F_2({\boldsymbol k}_1,{\boldsymbol k}_2)\,P_{(0)}(k_1)\,P_{(0)}(k_2)+ {\rm 2 \ cyc}$| with F2 the second-order SPT kernel (see Appendix B) and P(0) the linear power spectrum. In Fig. 4 (left-hand panel) we show the results obtained after inserting this expression into equation (33) in comparison with the P(2,1) measurements from the N-body simulations. The SPT-based model displays the same scaling behaviour with k and R as the data. However, for R > 6 h−1 Mpc the SPT predictions are accurate to better than 13 per cent, which is still not at the level of precision required for future galaxy clustering data sets; the deviations become larger with smaller R. It follows from the definition of the |$B^{\rm (s)}_{(2,1,1)}$| term that (see Appendix B for |$\mathcal {B}^{\rm (s)}_{(2,1,1)}$|⁠)
\begin{equation*} B^{\rm (s)}_{(2,1,1)}({\boldsymbol k}_1,{\boldsymbol k}_2,{\boldsymbol k}_3) = \nonumber \frac{2}{3}\left[ P({\boldsymbol k}_2) P({\boldsymbol k}_3)\mathcal {W}({\boldsymbol k}_2,{\boldsymbol k}_3) + 2 \ {\rm cyc} \right] \end{equation*}
\begin{eqnarray} &&\quad+\,\,\frac{1}{3}\int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3} T({\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_1,{\boldsymbol k}_2,{\boldsymbol k}_3)\mathcal {W}({\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_1) \nonumber \\ &&\quad +\,\,{2\ \rm cyc}, \end{eqnarray}
(36)
where T denotes the matter trispectrum (i.e. the connected part of the four-point correlator). The SPT contribution to lowest non-vanishing order is simply
\begin{eqnarray} B^{\rm (s)}_{(2,1,1)}({\boldsymbol k}_1,{\boldsymbol k}_2,{\boldsymbol k}_3) \simeq \frac{2}{3}\!\left[ P_{(0)}({\boldsymbol k}_2) P_{(0)}({\boldsymbol k}_3)\mathcal {W}({\boldsymbol k}_2,{\boldsymbol k}_3) + 2 \ {\rm cyc} \right].\nonumber \\ \end{eqnarray}
(37)
In the right-hand panel of Fig. 4 we show that this approximation (star-shaped points) strongly underestimates the outcome from the N-body simulations (solid symbols with error bars) and does not display the same scaling behaviour with k and R as the data. A common approach performed during observational data analysis is to substitute in place of the linear power spectrum, P(0), shown in equation (37), the fully non-linear power spectrum, P(1,1). We found that performing this substitution has little effect on the resulting amplitudes, remaining roughly equivalent as the lowest non-vanishing contributions. We then go one step further and compute the next-to-leading-order corrections to |$B^{\rm (s)}_{(2,1,1)}$| which are of sixth order in terms of the linear density field. This gives
\begin{eqnarray} &&\!\!\!\!&&{B^{\rm (s)}_{(2,1,1)} ({\boldsymbol k}_1,{\boldsymbol k}_2,{\boldsymbol k}_3)\simeq \frac{2}{3}\mathcal {W}({\boldsymbol k}_2,{\boldsymbol k}_3)\left[\frac{}{} P_{(0)}({\boldsymbol k}_2)P_{(0)}({\boldsymbol k}_3) \right.} \nonumber \\ &&\quad+\left. P_{(0)}({\boldsymbol k}_2) P_{(1\ell )}({\boldsymbol k}_3)\right. \nonumber + \left. \frac{}{}P_{(1\ell )}({\boldsymbol k}_2)P_{(0)}({\boldsymbol k}_3) \right]+ 2\ {\rm cyc} \nonumber \\ &&\quad +\,\,\frac{1}{3}\int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3}\mathcal {W}({\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_1) T_{(0)}({\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_1,{\boldsymbol k}_2,{\boldsymbol k}_3) \nonumber \\ &&\quad +\,\,{2\ \rm cyc}, \end{eqnarray}
(38)
where P(1ℓ) denotes the first loop correction to the power spectrum (i.e. PP(0) + P(1ℓ) + …) and the term T0 represents the tree-level contribution to the connected trispectrum. In Appendix B we provide the expressions needed for evaluating all these quantities, which are de-smoothed according to equation (15). Our final results are shown in Fig. 4 (solid lines). The SPT approximation shows the correct scaling with R, but for R > 2 h−1 Mpc it tends to overpredict the amplitude for collinear (i.e. θ ≃ 0 and θ ≃ π) configurations. For 4 < R ≤ 8 h−1 Mpc it also underpredicts the amplitude for triangles in which |${\boldsymbol k}_1$| and |${\boldsymbol k}_2$| are nearly perpendicular. However, as R increases the discrepancy lessens and at R = 10 h−1 Mpc SPT performs better. This suggests that using SPT to fit galaxy bispectra in the scale range 0.04 ≲ k ≲ 0.12 h Mpc−1 may possibly lead to seriously biased estimates for the parameters of the LEB.
Figure 4.

Left: measurements of the P(2,1) term from the simulations (points with error bars) are compared with the analytical predictions from leading-order SPT (solid lines) for different filter radii (from top to bottom: R = 2, 4, 6, 8, 10 h−1 Mpc). Right: same as in the left-hand panel but for |$B^{(\rm s)}_{(2,1,1)}$|⁠. The star-shaped points represent the contribution to |$B^{(\rm s)}_{(2,1,1)}$| from the disconnected parts of the fourth-order correlators at tree level in SPT (i.e. cyclical products of the linear power spectrum).

Nevertheless, whilst the analytic calculations of |${\mathcal {P}}_{(2,\! 1)}$| and |${\mathcal {B}}^{\rm (s)}_{(2,1,1)}$| are feasible, computing higher order terms becomes increasingly challenging. However, estimating these quantities from simulations is no more demanding than measuring the low-order terms and so our approach offers a distinct advantage over the classical SPT calculations.

5 ESTIMATION OF HALO BIAS

5.1 Bayesian parameter estimation

The second-order LEB contains three parameters: |$\boldsymbol \theta \equiv \lbrace b_1,b_2,R\rbrace$|⁠. In this section, we use Bayesian statistics to determine their values that best represent the halo power and bispectra extracted from our simulations. For simplicity, we assume that the cosmological parameters are perfectly known and that the measurement errors are Gaussian distributed, i.e.
\begin{eqnarray} {\mathcal {L}}({\boldsymbol x}|\boldsymbol \theta ) &=& (2\pi )^{-N/2}\,|{{\bf C}}|^{-1/2} {\rm e}^{ -\frac{1}{2}[({\boldsymbol x}-\mu (\boldsymbol \theta ))^{\rm T}{\bf C}^{-1}({\boldsymbol x}-\mu (\boldsymbol \theta ))]} \nonumber \\ &=& (2\pi )^{-N/2}\,|{{\bf C}}|^{-1/2} {\rm e}^{-\frac{\chi ^2({\boldsymbol x},\boldsymbol \theta )}{2}}, \end{eqnarray}
(39)
where |${\boldsymbol x}^{\rm T}$| is the N-dimensional vector containing the power spectra or bispectra for different configurations, |$\mu (\boldsymbol \theta )$| is the model prediction and |${\bf C}$| is the covariance matrix. In theory |${\bf C}$| is a model dependent quantity; however, owing to the technical challenge of estimating this matrix and its inverse, we have decided to determine |${\bf C}$| directly from the data.
Equation (39) gives the likelihood of the data given the model, but what we need in order to perform parameter estimation is the posterior probability of the model parameters given the data. This can be obtained using Bayes’ theorem
\begin{equation} P(\boldsymbol \theta |{\boldsymbol x})=\frac{\Pi (\boldsymbol \theta )\, {\mathcal {L}}({\boldsymbol x}|\boldsymbol \theta )}{p({\boldsymbol x})}, \end{equation}
(40)
where |$\Pi (\boldsymbol \theta )$| is the prior probability for the model parameter while the evidence,
\begin{equation} p({\boldsymbol x})\equiv \int \Pi (\boldsymbol \theta )\, {\mathcal {L}}({\boldsymbol x}|\boldsymbol \theta )\, {\rm d}^3\,\theta {}, \end{equation}
(41)
simply acts as a normalizing factor and does not influence the search for the best fit. In what follows we will always assume flat priors on the parameters, but bounded over a finite domain which is much more extended than the likelihood function. Moreover, b1 and R will always be assumed to be positive.

5.2 Covariance matrix estimation

The sample covariance matrix
\begin{equation} \widehat{{{\bf S}}}\equiv \frac{N}{N-1}\langle \Delta {\boldsymbol x}^{\rm T}\Delta {\boldsymbol x}\rangle _N; \hspace{14.22636pt}\Delta {\boldsymbol x}\equiv {\boldsymbol x}-\langle {\boldsymbol x}\rangle _N, \end{equation}
(42)
where 〈…〉N denotes the arithmetic mean over N independent measurements, and provides an unbiased estimator of the covariance matrix for the measurement errors.

However, this estimator is extremely unstable and inefficient. It generally provides matrices where the smallest eigenvalue is too small and the largest one is too big. Very large samples are thus needed to obtain accurate estimates of the covariance.

On using our ensemble of 200 simulations for both the power and the bispectra, we could measure the diagonal elements of the covariance with an accuracy of ∼10 per cent. On the other hand, the off-diagonal elements had a much smaller absolute value and were scattering around zero with errors of the order of ∼100 per cent. All this suggests that the covariance should be close to diagonal as expected for a Gaussian random field with infinitesimally narrow bins in k-space.

Due to these large uncertainties in the off-diagonal elements, we opted for implementing a shrinkage method to better estimate the covariance matrices of our power and bispectra. Shrinkage estimation is a variance reduction technique that shrinks an empirical estimation of the covariance like |$\widehat{{{\bf S}}}$| towards a theoretical model for how the covariance should be, represented by a structured matrix |${\bf T}$| (the target covariance). The shrinked estimator is given by the convex linear combination
\begin{equation} \widehat{{\bf C}}= \lambda {{\bf T}} + (1-\lambda ) \widehat{{\bf S}}, \end{equation}
(43)
where 0 < λ < 1 is the shrinkage intensity. This ensures the resulting covariance matrix to be positive definite even if |$\widehat{{\bf S}}$| is singular (because it is determined from |$N<{\mathrm{dim}}({\boldsymbol x})$| observations).
It has been demonstrated that shrinkage techniques provide a regularized estimate of the covariance |$\widehat{{\bf S}}$| which is both more accurate and statistically efficient than either of the individual estimators |$\widehat{{\bf S}}$| and |${\bf T}$|⁠, and they do so in a systematic way (Schäefer & Strimmer 2005). Without the need for specifying an underlying probability distribution, Ledoit & Wolf (2003) provided a theorem that determines the optimal value for λ through minimization of a quadratic loss function such as the mean-square error of the covariance matrix. This can be expressed in terms of the squared Frobenius norm
\begin{eqnarray} L(\lambda ) & = & \parallel \widehat{{\bf C}} - {\bf \Sigma }\parallel ^2_{F} \nonumber \\ & = & \parallel \lambda {\bf T} + (1 - \lambda ) \widehat{{\bf S}} - {\bf \Sigma }\parallel _{F} \nonumber \\ & = & \sum _{i,j=1}^{p}(\lambda t_{ij} + (1 - \lambda ) s_{ij} - \sigma _{ij})^2, \end{eqnarray}
(44)
which gives a measure of the distance between the true population covariance, Σ, and the inferred one, namely, |$\widehat{{\bf C}}$|⁠. The key is to select a suitable target, and we assume it to be a diagonal matrix with unequal variances coinciding with the sample variances:
\begin{equation} t_{ij} = \left\lbrace \begin{array}{@{}l@{\quad }l@{}}s_{ii}, & {\rm if}\quad {i = j} \\ 0, & {\rm if}\quad {i \ne j} \end{array}\right.. \end{equation}
(45)
Minimizing equation (44) gives the expression for the optimal shrinkage intensity:
\begin{equation} \lambda _{*} = \frac{\sum _{j>i} {\textrm {Var}}(s_{ij})}{\sum _{j>i}[\textrm {Var}(s_{ij})+ \sigma _{ij}^2]} = \frac{\sum _{j>i} {\textrm {Var}}(s_{ij})}{\sum _{j>i} \textrm {E}(s_{ij}^2)}, \end{equation}
(46)
where |${\textrm {E}}(\ldots )$| denotes the expectation value of a random variable. Following Schäefer & Strimmer (2005), we estimate the sampling variance of the elements of the sample covariance using
\begin{equation} \widehat{\textrm {Var}}(s_{ij})=\frac{N}{(N-1)^3}\sum _{j=1}^N \left(\Delta {\boldsymbol x}_j^{\rm T}\Delta {\boldsymbol x}_j-\langle \Delta {\boldsymbol x}^{\rm T}\Delta {\boldsymbol x}\rangle _N\right)^2. \end{equation}
(47)
However, while these authors approximate |${\mathrm{E}}(s_{ij}^2)$| in equation (46) with the square of the point estimate sij thus overestimating λ*, we adopt the square of the sample covariances |$s_{ij}^2$| as a proxy for |$\sigma _{ij}^2$| (e.g. Kwan 2011). In all cases, we found that the optimal shrinkage intensity was roughly |$\hat{\lambda }_{*} \sim 0.45$| for the power-spectra covariance and |$\hat{\lambda }_{*}\sim 0.23$| for the bispectra covariance, respectively. Note that the adopted algorithm only performs shrinkage of the off-diagonal elements of the covariance matrix.

5.3 Constraining the bias parameters: b1, b2 and R

We now determine the best-fitting model parameters for the various power and bispectra that we have estimated from the simulations within the scale range 0.04 < k < 0.12  h Mpc−1. We consider two second-order LEB models that differ in the polyspectra describing the non-linear matter distribution (see below for the details). In both cases, we map the likelihood function within a finite volume of the parameter space that we slice into a regular Cartesian mesh.

5.3.1 SPT tree-level model

The first model uses SPT at the lowest non-vanishing order to approximate the P(l, m) and B(l, m, n). This is what is most commonly done in the literature. For the power spectrum, the P(l, m) terms expressed at tree level of SPT are:
\begin{equation} P^{\rm tree}_{(1,1)} = P_{(0)}(k), \end{equation}
(48)
\begin{equation} P^{\rm tree}_{(i,j)} = 0 \ {\mathrm{for}}\ i+j>2, \end{equation}
(49)
where P(0)(k) denotes the linear matter power spectrum.

Thus, the bias relation is linear and carries no dependence on the filter scale, R, and on b2.

For the bispectrum, the evaluation of B(l, m, n) using only tree-level contributions gives
\begin{equation} B^{\rm (s), tree}_{(1,1,1)} = 2\,P_{(0)}(k_1)P_{(0)}(k_2) F_2({\boldsymbol k}_1,{\boldsymbol k}_2)+2\,{\rm cyc}, \end{equation}
(50)
\begin{equation} B^{\rm (s), tree}_{(2,1,1)} = \frac{2}{3}\,P_{(0)}(k_1)P_{(0)}(k_2)\mathcal {W}({\boldsymbol k}_1,{\boldsymbol k}_2)+2\,{\rm cyc}, \end{equation}
(51)
\begin{equation} B^{\rm (s), tree}_{(i,j,k)} = 0\ {\mathrm{for}} \ i+j+k>4, \end{equation}
(52)
where |$F_2({\boldsymbol k}_1,{\boldsymbol k}_2)$| is the second-order mode-coupling kernel function from SPT (e.g. Bernardeau et al. 2002).

5.3.2 Fully non-linear model

The second model considers the fully non-linear matter polyspectra extracted from the simulations. Note that, while evaluating the LEB when varying b1 and b2 at fixed R is a trivial exercise, varying R would, in theory, require recomputing all the relevant P(l,m) and B(l,m,n). However, as we mentioned earlier in our discussion of Fig. 3, these functions change smoothly with R. We therefore use a cubic-spline interpolation of log [P(l, m)] and log [B(l, m, n)] to model the R-dependence of the theory. This enabled us to map the likelihood function with arbitrary resolution.

A final comment is in order regarding the details of how the fit is performed. There is some arbitrariness in defining what exactly are the ‘observables’ and what is the ‘model’ in the simulations. For instance, we could have fit the outcome of each N-body simulation separately using the polyspectra extracted from the very same realization. While being a valid test of the LEB, this method would have not had much in common to actual galaxy redshift surveys (or even to the SPT model discussed above), where the underlying mass distribution is unknown and needs to be modelled independently. In fact, the presence of the same noise structure in the matter and halo power and bispectra would result in overfitting. There are a couple of alternative approaches one could follow to prevent this. The first is to generate smooth versions of the P(l,m) and B(l,m,n) terms by averaging over the entire ensemble of simulations. One can then use these ‘theoretical models’ to simultaneously fit the halo statistics extracted from all of our 200 independent realizations. The other alternative is to subdivide the total ensemble of simulations into two subsets, where one subset would be used to construct the smooth P(l, m) and B(l, m, n) terms by averaging over the total number of simulations in the subsample and the other subset would serve as the halo statistics to be analysed. The partition of the ensemble of realizations into two distinct subsets ensures that the ‘model’ and ‘data’ are indeed independent. Furthermore, one can exchange the roles of ‘model’ and ‘data’ for the two subsets and then sum the χ2s obtained from the two sets of analysis. We carried out both approaches however we only report the results from averaging over the 200 simulations, as the bias model constraints compared to the partitioning approach are in extremely good agreement.

5.4 Goodness of fit

In this section we use the classic χ2 goodness-of-fit test to quantify how well the second-order LEB fit our simulated data. We minimize the χ2 function over the parameter space using the simplex method. The best-fitting models determined this way basically coincide with those that minimize the χ2 function in the dense grid used for our Bayesian analysis. Since for all power and bispectra we have always used 20 bins in k or θ and the covariance matrices are full rank, the number of degrees of freedom totalled ν = 200 × 20 − 3 = 3997 for each fit.

5.4.1 Power spectra

The tree-level SPT models for the halo power spectra provide very poor fits to our data. The minimum χ2 value is much larger than the number of degrees of freedom, reaching |$\chi ^2_{\rm min}\simeq 7465$| for Phm and |$\chi ^2_{\rm min}\simeq 139\,821$| for Phh. These results may serve as indicators that halo biasing is non-linear and/or a result of the breakdown of linear SPT. To check the latter, we refit both spectra using equation (48) but after replacing |$P_{(1,1)}^{\rm tree}$| with the fully non-linear matter power spectrum P(1, 1). In this case, we acquire |$\chi ^2_{\rm min}\simeq 3903$| for Phm and |$\chi ^2_{\rm min}\simeq 4442$| for Phh. This significant improvement to the tree-level results demonstrates the need to model non-linearities in the matter distribution very accurately. Using the fully non-linear model with the additional free parameters b2 and R only slightly improves the goodness of fit for Phm, giving |$\chi ^2_{\rm min} \simeq 3901$|⁠. On the other hand, the improvement is marked for Phh for which we obtain |$\chi ^2_{\rm min} \simeq 3915$|⁠.

It is interesting to see how the |$\chi ^2_{\rm min}$| value changes when R is kept fixed. In this case, we find that all the fits to Phm are equally good. However, for Phh, the values of |$\chi ^2_{\rm min}$| undergo a sharp decrease (from |$3944\lesssim \chi ^2_{\rm min} \lesssim 3921$|⁠) for 2 < R ≲ 3.66  h−1 Mpc, then decrease slowly to the absolute minimum value at R ∼ 13.2  h−1 Mpc and finally begin to slowly rise again to our cutoff scale of R = 18  h−1 Mpc. Hence, it appears that there is a range of preferred smoothing scales that best fit the simulation data for Phh.

5.4.2 Bispectra

Turning now to the bispectra, we find that the fully non-linear model provides slightly better fits to the numerical data (⁠|$\chi ^2_{\rm min}\simeq 3906, 3908$| and 3913 for Bhmm, Bhhm, and Bhhh, respectively) than the tree-level model (⁠|$\chi ^2_{\rm min}\simeq 3923, 3922$| and 3925) which, however, already supplies |$\chi ^2_{\rm min}/\nu \lesssim 1$|⁠.

In all cases, if we keep R fixed and only consider two-parameter models, we find that the |$\chi ^2_{\rm min}$| value does not change much for 2 < R < 13  h−1 Mpc while it rapidly grows adopting larger smoothing scales. In terms of goodness of fit, the non-linear model for Bhmm outperforms the tree-level SPT model for all values of R. On the other hand, when Bhhm and Bhhh are considered, the non-linear model gives a better fit only for R ≲ 15  h−1 Mpc.

5.4.3 Posterior mean

In order to give a visual impression of the best-fitting models, in Fig. 1 we show the posterior mean (black line) and the posterior rms error (shaded grey region) for the halo power and bispectra resulting from our fits with the fully non-linear model in comparison with the simulation data. In all cases, the models agree with the simulations remarkably well. Note that the rms error on the best-fitting models for Phh and Phm is hardly visible on the scale of the plots.

5.5 Bias from the power spectrum

5.5.1 Effective bias

Due to its highly compressed ordinate, Fig. 1 gives the false impression that Phm and Phh are nicely described by rescaling the matter power spectrum with constant multiplicative factors ∼1.5 and 1.52, respectively. In order to examine the bias relation more closely as a function of scale, we introduce two effective bias coefficients by taking different ratios of the halo power spectra after1 averaging them over the 200 N-body simulations: bhm = 〈Phm〉/〈Pmm〉 and bhh = (〈Phh〉/〈Pmm〉)1/2. We present our results in Fig. 5. The solid points with error bars represent the effective biases estimated using the shot-noise corrected quantities of both the auto- and cross-halo power spectra. We compute the 1σ uncertainties via error propagation accounting for the covariance between the different observables. It can be seen that bhh and bhm do not perfectly match each other. On large scales (k < 0.06  h Mpc−1), bhh keeps roughly constant while it shows a significant scale dependence for k > 0.06  h Mpc−1, whereas bhm shows the opposite trend although its scale dependence is weaker for the large scales. At k ≃ 0.04  h Mpc−1, bhm and bhh assume very similar values. However, bhm > bhh for all wavenumbers. Our high-quality data also provide some hints for the presence of weak oscillatory features in the effective bias parameters on the scales of baryonic acoustic oscillations.

Figure 5.

The effective halo-bias parameters bhm = Phm/Pmm (orange symbols) and bhh = (Phh/Pmm)1/2 (blue symbols) extracted from our simulations as a function of the wavenumber. The black solid lines and shaded regions indicate the mean and the rms value of the effective bias obtained by averaging the predictions of the second-order LEB over the posterior probability of the model parameters.

Fig. 5 also tests how the fully non-linear second-order LEB model is able to reproduce the scale dependence of bhm and bhh in fine details. The black curves represent the posterior mean of the effective bias coefficients and the shaded grey regions denote the corresponding rms value of their posterior distribution. Although the models are not able to reproduce all the features which are present in the numerical data, they are in reasonable agreement with the simulations, especially for k > 0.08  h Mpc−1. Nevertheless, we see that for both bhm and bhh the power spectrum models actually are less accurate at small k (i.e. on the large scales) in the proximity of the point where the trend from constant-to-scale dependence (and vice versa) occurs. On these scales, the models systematically overpredict the effective biases and the largest discrepancy is of the order of ∼0.3 per cent.

5.5.2 Marginal credible regions

Now we compare the level of the consistency between the model-parameter constraints deriving from the fits to the halo power spectra, Phh and Phm. Fig. 6 shows (from left to right) the marginal posterior distributions for the parameter pairs b1b2, b1R and b2R of our fully non-linear model. The black and green contours denote the 68.3 and 95.4 per cent credible regions for the parameters of the LEB obtained from analysing Phh and Phm, respectively. The first apparent observation is that the contours of Phm and Phh span different regions of the parameter space: while the Phm data prefer b1 ≲ 1.5 and b2 ≳ 0, Phh favours b1 ≳ 1.5 combined with −0.15 ≲ b2 ≲ −0.2. In other words, the second-order LEB model provides a successful fit to Phh or Phm but requires two incompatible parameter sets. Improper modelling of the shot noise in Phh might be the primary cause of the inconsistency (e.g. Hamaus et al. 2010). Note, however, that the best-fitting values for b1 and b2 that we derive from Phh are in good agreement with the predictions of theories that follow the collapse of dark matter haloes (e.g. see equations 14 and 15 in Scoccimarro et al. 2001). It is also worth mentioning that, for Gaussian fluctuations in the matter density, the cross-spectrum of locally biased tracers is always exactly proportional to Pmm even though this is not apparent from the mathematical formulation of the LEB (Frusciante & Sheth 2012). The fact that our measurement of Phm needs b2 ≃ 0 might simply suggest that a similar relation holds true also in the presence of non-Gaussian perturbations (at least approximately, since bhm keeps nearly constant with k as shown in Fig. 5).

Figure 6.

Joint marginal probability distribution for the parameter pairs b1b2, b1R and b2R (from left to right) obtained using the fully non-linear model for Phh (black) and Phm (green). Contours correspond to the 68.3 and 95.4 per cent credible intervals.

5.6 Bias from the bispectrum

5.6.1 Effective bias

To investigate the bias relation as a function of scale using the halo bispectra, we define a set of coefficients by taking the following ratios: bhmm = 〈Bhmm200/〈Bmmm200, bhhm = (〈Bhhm200/〈Bmmm200)1/2, bhhh = (〈Bhhh200/〈Bmmm200)1/3. The results are shown in Fig. 7: all the effective bias coefficients present a characteristic configuration dependence and are in agreement within their 1σ uncertainties (although bhhh tends to assume slightly higher values for all triangle configurations). The posterior means of the effective bias coefficients from the fully non-linear models also closely match the data as expected from the χ2 test presented in Section 5.4.2. All this suggests that the second-order LEB provides a suitable description of the bias relation at the three-point level.

Figure 7.

As in Fig. 5 but for the effective bias parameters bhmm = Bhmm/Bmmm, bhhm = (Bhhm/Bmmm)1/2, and bhhh = (Bhhh/Bmmm)1/3.

5.6.2 Marginal credible regions

We now evaluate the consistency of the model-parameter constraints for the halo bispectra. Fig. 8 shows the marginal posterior distribution for the parameter pairs b1b2, b1R and b2R, respectively. Each panel refers to a particular bispectrum, as indicated from the label in the bottom-left corner. The black contours denote the 68.3 and 95.4 per cent credible regions for the parameters of the fully non-linear model. The red contours, instead, indicate the corresponding regions for the SPT tree-level model described in Section 5.3.1.

Figure 8.

Joint marginal probability distribution for the parameter pairs b1b2 (top), b1R (middle) and b2R (bottom) obtained fitting the data for Bhhh (left), Bhhm (centre), and Bhmm (right). Contours correspond to the 68.3 and 95.4 per cent credible intervals and refer to the full non-linear model (black) and to the approximation based on tree-level SPT (red).

The first thing that may be noticed is that the estimates for b1 and b2 from the tree-level and fully non-linear models are in disagreement: the tree-level constraints show a systematic shift, preferring lower b1 and slightly more negative b2 values. This implies that inferences made about the non-linearity of galaxy bias using the galaxy bispectrum and tree-level perturbation theory will be significantly biased. Note that this statement also applies to rather large scales k1 ≲ 0.12 h Mpc−1. If one uses triangle configurations on smaller scales (e.g. Verde et al. 2002) then the discrepancy becomes larger. Therefore, our programme to use N-body simulations for determining the matter terms in the bias relation is key to correctly estimate the bias (and thus the cosmological parameters) from forthcoming observational data.

The second important point to notice is that fits to Bhhh, Bhhm and Bhmm with the fully non-linear model give consistent constraints for b1, b2 and R. The precision with which we are able to determine the bias parameters increases as we go from Bhmm to Bhhh. This finding is consistent with our earlier results (Pollack et al. 2012).

Note that the best-fitting values for b1 appear relatively insensitive to variations in R. One has to consider rather large smoothing scales in order to see any changes in the best-fitting models. However, the best-fitting solutions for b2 are strongly degenerate with R: as R decreases, b2 becomes less negative and tends towards zero. This owes to the fact that, on changing R in the interval R ∈ [2, 13] h−1 Mpc, one can always find different combinations of b1 and b2 that fit the data with the same accuracy as previously described in Section 5.4. We can more directly understand the origin of the b2R degeneracy as follows. Let us consider the bias model for Bhmm since this only contains the terms |${\mathcal {B}}^{\rm (s)}_{(1,1,1)}$| and |${\mathcal {B}}^{\rm (s)}_{(2,1,1)}$|⁠. As shown in the top panels of Fig. 4, the de-smoothing of |${\mathcal {B}}^{\rm (s)}_{(1,1,1)}$| results in the matter bispectrum. However, the de-smoothing of |${\mathcal {B}}^{\rm (s)}_{(2,1,1)}$| results in a function that carries a dependence on R. If we take the ratio of |${\mathcal {B}}^{\rm (s)}_{(2,1,1)}$| defined for Ra with the same function but defined for Rb, then we will find something close to a constant for R = [4, 10] h−1 Mpc. Thus we can identify the degenerate combination:
\begin{equation} b_2^{\rm a} = \frac{{\mathcal {B}}^{\rm (s)}_{(2,1,1)}({\boldsymbol k}_1,{\boldsymbol k}_2,R_{\rm b})}{{\mathcal {B}}^{\rm (s)}_{(2,1,1)}({\boldsymbol k}_1,{\boldsymbol k}_2,R_{\rm a})}\, b_2^{\rm b} \approx A(R_{\rm a},R_{\rm b})\,b_2^{\rm b}, \end{equation}
(53)
where A(Ra, Rb) is a function that is independent of the triangle configuration. Hence, the value of b2 is correlated with the size of the smoothing scale, R.

5.7 Comparing all constraints

In Fig. 9 we present the marginal posterior probabilities for the single bias parameters extracted from the various probes that we have considered. The left-hand, central and middle columns show the results for b1, b2 and R, respectively. From top to bottom, the rows correspond to Phh, Phm, Bhhh, Bhhm, Bhmm, respectively. The black curves represent the results from the fully non-linear modelling, and the red curves show the results from the tree-level perturbation theory for the bispectra. The corresponding mean and rms values of the marginal probabilities for the full non-linear model are reported in Table 2.

Figure 9.

Marginal probability distributions for the single bias parameters b1 (left), b2 (centre) and R (right) obtained fitting various halo statistics (from top to bottom: Phh, Phm, Bhhh, Bhhm, Bhmm). Results obtained with the full non-linear model (black) are compared with those derived using tree-level SPT (red).

Table 2.

Posterior mean and rms error of the bias parameters b1, b2 and R obtained fitting various halo statistics with the full non-linear bias model.

Statistic|$b_1 \pm \sigma _{b_1}$||$b_2 \pm \sigma _{b_2}$|R ± σR
( h−1 Mpc)
Phh1.53 ± 0.02−0.18 ± 0.0212.0 ± 3.1
Phm1.48 ± 0.020.02 ± 0.0110.6 ± 4.1
Bhhh1.49 ± 0.03−0.18 ± 0.077.2 ± 2.6
Bhhm1.51 ± 0.03−0.26 ± 0.107.8 ± 2.8
Bhmm1.52 ± 0.05−0.31 ± 0.149.1 ± 3.1
Statistic|$b_1 \pm \sigma _{b_1}$||$b_2 \pm \sigma _{b_2}$|R ± σR
( h−1 Mpc)
Phh1.53 ± 0.02−0.18 ± 0.0212.0 ± 3.1
Phm1.48 ± 0.020.02 ± 0.0110.6 ± 4.1
Bhhh1.49 ± 0.03−0.18 ± 0.077.2 ± 2.6
Bhhm1.51 ± 0.03−0.26 ± 0.107.8 ± 2.8
Bhmm1.52 ± 0.05−0.31 ± 0.149.1 ± 3.1
Table 2.

Posterior mean and rms error of the bias parameters b1, b2 and R obtained fitting various halo statistics with the full non-linear bias model.

Statistic|$b_1 \pm \sigma _{b_1}$||$b_2 \pm \sigma _{b_2}$|R ± σR
( h−1 Mpc)
Phh1.53 ± 0.02−0.18 ± 0.0212.0 ± 3.1
Phm1.48 ± 0.020.02 ± 0.0110.6 ± 4.1
Bhhh1.49 ± 0.03−0.18 ± 0.077.2 ± 2.6
Bhhm1.51 ± 0.03−0.26 ± 0.107.8 ± 2.8
Bhmm1.52 ± 0.05−0.31 ± 0.149.1 ± 3.1
Statistic|$b_1 \pm \sigma _{b_1}$||$b_2 \pm \sigma _{b_2}$|R ± σR
( h−1 Mpc)
Phh1.53 ± 0.02−0.18 ± 0.0212.0 ± 3.1
Phm1.48 ± 0.020.02 ± 0.0110.6 ± 4.1
Bhhh1.49 ± 0.03−0.18 ± 0.077.2 ± 2.6
Bhhm1.51 ± 0.03−0.26 ± 0.107.8 ± 2.8
Bhmm1.52 ± 0.05−0.31 ± 0.149.1 ± 3.1

Considering the values for b1 from the bispectra, we see that, as noted earlier, the parameter constraints for the non-linear model are consistent with one another and are significantly different from the best-fitting b1 obtained from the tree-level expressions. On comparing the bispectra results with the power-spectra results we find reasonable consistency for the non-linear modelling, whereas for the tree-level bispectrum model, the results disagree at high significance (see also Pollack et al. 2012). However, the marginal distributions for b1 from Phh and Phm overlap very little. In fact, they exhibit opposite skewness although they are both narrow and located around b1 ≃ 1.5. The marginal distribution for b1 computed from Phm agrees remarkably well with the effective bias bhm = 1.503 ± 0.002. This is because the data require b2 ≃ 0 in this case. On the other hand, b1 > bhh = 1.49 ± 0.002 in the marginal distribution extracted from Phh which requires b2 < 0.

Examining the results of the fully non-linear model for b2, from the bispectra we find that the marginal posterior distributions are fairly broad and are peaked towards negative values (b2 ≃ −0.2 for Bhhh, b2 ≃ −0.3 for Bhhm and Bhmm). Overall, the various bispectra give consistent constraints. Note that the sharp cutoff in the marginal distributions at b2 ≃ 0 is due to the fact that our prior for R does not consider values R < 2 h−1 Mpc. Considering the results obtained using the tree-level SPT model, we see that the distributions for b1 and b2 shift towards different values (approximately the posterior mean of the bias parameters moves by Δb1 ≃ Δb2 ≃ −0.15). On comparing with the results obtained from the halo power spectra, we see that the marginal distribution for b2 extracted from Phh and Phm are narrowly peaked around b2 ∼ −0.18 and b2 ∼ 0.02, respectively.

We now turn to the question of whether there is a preferred smoothing scale for the haloes we have considered. On inspecting the bispectra, we see that the marginal distributions for R are reasonably consistent and display a broad peak between 5 and 12  h−1 Mpc. The power spectra, instead, tend to prefer slightly larger values of R: 10 < R < 15 h−1 Mpc for Phh and R > 5 h−1 Mpc for Phm, consistent with the behaviour of the goodness of fit reported in Section 5.4. In all cases, these optimal smoothing scales correspond to a few Lagrangian radii of the haloes. They are also comparable to (but a bit smaller than) the scales that we sample with the measurements of the power spectra and bispectra. Note that a sphere of radius ∼10 h−1 Mpc contains ∼1.5 haloes on average so that counts in cells of this extension are subject to sizable random fluctuations that create stochasticity in the bias relation.

5.8 Cross-correlation coefficients

There are three possible explanations as to why Phm, Phh and the bispectra show disagreement for the full non-linear model. One, we may require higher order terms in the bias expansion, e.g. b3, etc.; two, the LEB may be wrong; three, there may be uncorrelated stochasticity in the relation between halo overdensity and mass overdensity. We shall now explore this latter possibility.

A number of studies have demonstrated, using N-body simulations, that the relation between |$\delta _{\rm h}({\boldsymbol x}|R)$| against |$\delta ({\boldsymbol x}|R)$| contains scatter, and that this scatter depends on the scale which one uses to compute the density field (e.g. Dekel & Lahav 1999; Seljak & Warren 2004; Smith et al. 2007; Manera & Gaztañaga 2011; Roth & Porciani 2011; Chuen Chan & Scoccimarro 2012; Pollack et al. 2012). Another way to explore the stochasticity is through the cross-correlation coefficient between Fourier modes. For two-point statistics this can be defined (Dekel & Lahav 1999) as
\begin{equation} r \equiv \frac{\hat{P}_{\rm hm}}{\sqrt{\hat{P}_{\rm hh} \hat{P}_{\rm mm}}} = \frac{b_{\rm hm}}{b_{\rm hh}}. \end{equation}
(54)
If δh is a deterministic linear function of δ, then r = ±1. However, if there is uncorrelated random noise present, i.e. |$\delta ^{\rm h}({\boldsymbol x})=b\delta ({\boldsymbol x})+\epsilon ({\boldsymbol x})$|⁠, then the halo power spectrum would be Phh(k) = b2Pmm(k) + Pϵ(k), where Pϵ denotes the power spectrum of the noise distribution. This leads to
\begin{equation} r =\left(1+\frac{P_\epsilon }{P_{\rm mm}}\right)^{-1/2}<1. \end{equation}
(55)
We note that non-linearity in the bias relationship will also introduce deviations of r away from unity: consider the quadratic relation |$\delta _{\rm h}({\boldsymbol x})=b_1\delta ({\boldsymbol x})+b_2\,\delta ^2({\boldsymbol x})/2$|⁠, then one finds that the cross-correlation can be written as
\begin{eqnarray} r & = & \left[1+\frac{c_2}{2}\frac{\mathcal {P}_{(2,1)}}{\mathcal {P}_{(1,1)}}\right] \left[1+c_2\frac{\mathcal {P}_{(2,1)}}{\mathcal {P}_{(1,1)}} +\frac{c_2^2}{4} \frac{{\mathcal {P}}_{(2,2)}}{{\mathcal {P}}_{(1,1)}}\right]^{-1/2} \end{eqnarray}
(56)
\begin{eqnarray} & \approx & 1-\frac{c_2^2}{8} \frac{{\mathcal {P}}_{(2,2)}}{{\mathcal {P}}_{(1,1)}}, \end{eqnarray}
(57)
where c2b2/b1 and where the second equality follows for the case where |${\mathcal {P}}_{(2,1)}\ll {\mathcal {P}}_{(1,1)}$| and |${\mathcal {P}}_{(2,2)}\ll {\mathcal {P}}_{(1,1)}$|⁠.

In this case, we see that the cross-correlation function can be either greater or less than unity depending on the sign and magnitude of c2.

Fig. 10 shows the cross-correlation coefficient estimated from our ensemble of N-body simulations along with the standard errors on the mean. The open symbols show the result before we correct Phh for shot noise, the solid symbols show the result after the usual inverse number-density correction. We see that before correcting for the shot noise the function is less than 1 and decreases with scale. After the correction, r is brought within a few per cent from unity and is always larger than 1. Note that the difference from unity is very significant given the numbers of realizations and the comoving volume covered by our simulations.

Figure 10.

Left: linear cross-correlation coefficient between the fluctuations in the halo and matter density, r = bhm/bhh, for different wavenumbers. Closed and open symbols show the results obtained from the simulations when Phh is and is not corrected for shot noise, respectively. The black solid line and the shaded region around it indicate the mean and the rms value of the correlation coefficient obtained by averaging the predictions of the second-order LEB over the posterior probability of the model parameters derived from a joint fit to Phh and Phm. The solid curve, instead, shows the values of r that are computed using the means for Phh and Phm over the posterior distributions for the individual fits to Phh and Phm, respectively. Right: as in the left-hand panel but for the three-point coefficients r1 and r2 defined in equations (58) and (59). In this case, the shaded region is obtained averaging the model over the posterior distribution for the parameters derived from a joint fit to the relevant bispectra, while the solid curve uses the different means from the fits to the individual bispectra.

In order to derive r from the fully non-linear model, we jointly fit the numerical data for Phm and Phh. We acknowledge that utilizing 200 simulations prevents us from accurately estimating a 40 × 40 covariance matrix, in particular the cross-covariances between the different spectra. Therefore, we performed the joint fit in the following manner. In order to ensure that the different spectra can be treated as independent of each other, we generated two ensembles consisting of 100 simulations each to estimate a particular spectra. We then computed a 20 × 20 block covariance matrix, selecting every other bin from our auto- and cross-power spectrum estimates. The off-diagonal blocks of the covariance matrix were set equal to zero when analysing the auto- and cross-halo–matter power spectrum. The resulting best-fitting model (b1 ≃ 1.5, b2 ≃ −0.09, R ≃ 18) does not match to the data (⁠|$\chi ^2_{\rm min}\simeq 2242/1997$| with a contribution of 1170 coming from Phm) meaning that it is impossible to simultaneously fit Phh and Phm with the second-order LEB. Consequently, we find that the posterior mean of the cross-correlation coefficient, obtained by multiplying the likelihoods of the single fits to Phh and Phm, is always smaller than unity and does not provide a good description to the data (see the black line and the shaded region in Fig. 10). To investigate this further, we recompute r using the posterior means of Phh and Phm shown in the left-hand panel of Fig. 1. Inserting them in equation (54), we find excellent agreement with the data (see the blue line in Fig. 1). As shown previously, the best-fitting models for Phh and Phm prefer different values for b1 and b2 when analysed independently. The joint analysis of Phm and Phh, in this manner, shows more clearly the inconsistency obtained when using the second-order LEB as a model for halo biasing.

One can also define cross-correlation coefficients for higher order statistics. The second equality in equation (54) gives us a clear path to make this generalization. From the three-point effective bias coefficients we may form two independent ratios:
\begin{equation} r_1 \equiv \frac{b_{\rm hmm}}{b_{\rm hhh}} = \frac{B_{\rm hmm}}{B_{\rm mmm}^{2/3} B_{\rm hhh}^{1/3}} \end{equation}
(58)
\begin{equation} r_2 \equiv \frac{b_{\rm hhm}}{b_{\rm hhh}} = \frac{B_{\rm hhm}^{1/2}}{B_{\rm mmm}^{1/6} B_{\rm hhh}^{1/3}}, \end{equation}
(59)
where the dependence on the triangle configuration is understood. Note, that a third (dependent) ratio may be also computed: r3 = bhmm/bhhm = r1/r2. For a deterministic linear bias model with bias coefficient b, r1 = 1 and r2 = sgn(b). Once again, additional stochasticity or non-linear biasing will alter the cross-correlation coefficients.

In the right-hand panel of Fig. 10 we present the cross-correlation coefficients r1 and r2 extracted from the simulations as a function of the triangle configuration. Both functions are always a few per cent below unity even after shot-noise subtraction. In the same figure, we also plot the posterior mean and variance for the r coefficients obtained from joint fits to two bispectra (black line and shaded region) performed in the same manner as for the power spectrum. These results are very close to unity and do not adequately describe the simulated data. In fact, the joint fits prefer less negative values for b2 than the single fits (for example, the best simultaneous fit to Bhhh and Bhhm gives b1 ≃ 1.50, b2 ≃ −0.15 and R ≃ 5.5 with |$\chi ^2_{\rm min}/\nu \simeq 1954/1997$|⁠). On the other hand, if we compute r1 and r2 from the individual posterior means for Bhhh, Bhhm and Bhmm, we get results that are in good agreement with the data. This is somewhat puzzling as the fits to the various bispectra appear to give consistent bias parameters. However, in order to test how congruous the different fits really are, we derive models for one bispectrum type (say Bhhh) averaging over the joint posterior distribution for the bias parameters derived by fitting one of the other bispectra (Bhmm or Bhhm). An example is shown in Fig. 11: the fit based on Bhmm matches well the data for Bhhh for collinear triangles but systematically underestimates the halo bispectrum in all the other configurations. It is exactly in this more precise comparison that we see the failure of the non-linear local bias model when analysing the bispectra data.

Figure 11.

The halo bispectrum, Bhhh, measured from the simulations (solid symbols) is compared with the fully non-linear LEB adopting the parameters that best fit Bhmm. The line and shaded region show the mean and rms value of the LEB model for Bhhh averaged over the posterior distribution for b1, b2 and R coming from a fit to Bhmm. This shows that the parameter sets that nicely fit Bhmm (see Fig. 1) are not able to reproduce all the features seen in Bhhh.

6 DISCUSSION

Our high-precision measurements of the halo–halo and halo–matter spectra and bispectra enabled us to carry out a series of consistency tests of the second-order LEB. As seen in Fig. 9 and in Table 2, the marginal posterior distributions for b1 and b2 determined from Phh, Bhhh, Bhhm and Bhmm are all consistent with one another and, yet, are inconsistent with the constraints derived from the halo–matter cross-spectrum. The primary reason is that Phm requires a positive b2 that is close to 0, whereas the fits to the other spectra prefer a negative value for b2. In terms of statistical significance, the stronger discrepancy is with Phh as the posterior distributions for b2 extracted from all bispectra are rather broad. The incompatibility between the bias parameters obtained from Phh and Phm might indicate a breakdown in the modelling due to either incorrect shot-noise subtraction or incorrect parametrization of halo biasing. To better understand this issue, it is interesting to focus for a moment on to the shot-noise free spectra Phm and Bhmm. Comparing their mathematical expressions given in equations (17) and (21), we see that they have the same parametric form in terms of b1 and b2, it is only the non-linear matter terms which are different. Since we directly measure these terms from the simulations and shot noise does not play any role here, the fact that the model-parameter constraints from Phm and Bhmm are incompatible suggests that the LEB truncated to second order is incorrect or, at the very least, incomplete. The simplest improvement would be to consider higher order terms in the bias expansion given in equation (3). However, there are good reasons to believe that more sophisticated corrections are needed. Recent numerical work has provided strong evidence that dark matter haloes form out of linear density peaks (Ludlow & Porciani 2011). This suggests that the halo bias with respect to the matter fluctuations may actually be best understood as originating in Lagrangian space (Catelan et al. 1998; Catelan, Porciani & Kamionkowski 2000). However, even the simplest local Lagrangian biasing scheme generates a non-linear, non-local and stochastic scheme in Eulerian space (Catelan et al. 1998, 2000; Matsubara 2011) which can be parametrized in terms of the invariants of the deformation tensor (Catelan et al. 1998; Baldauf et al. 2012; Chan, Scoccimarro & Sheth 2012). Several terms should then be added to the bias expansion of the LEB and this might help bring the model-parameter constraints extracted from the different halo statistics into better agreement. We will revisit this issue in our future work.

7 CONCLUSIONS

The use of galaxy clustering to extract information on the cosmological parameters is currently limited to very large scales where both galaxy biasing and the process of structure formation are expected to be linear and thus simple to model. Although more precise data are already available on smaller scales, they are not usually considered to avoid daunting complications in the modelling that might introduce systematic effects in the results. Pursuing the goal of extending clustering studies to smaller scales, we propose to use N-body simulations to measure the relevant statistics for the matter distribution that enter any biasing scheme.

While our framework is explicitly general, as an example, we apply it to the Eulerian local bias model truncated to quadratic order. This scheme represents the minimal theoretical model for studying three-point statistics of the galaxy distribution on large spatial separations. Its predictions are easily computed to leading order in SPT and are commonly used to interpret observational results (Verde et al. 2002; Jing & Börner 2004; Kayo et al. 2004; Wang et al. 2004; Gaztañaga et al. 2005; Hikage et al. 2005; Pan & Szapudi 2005; Kulkarni et al. 2007; Nishimichi et al. 2007; McBride et al. 2011a,b; Marín 2011; Marín et al. 2013; Guo et al. 2014).

We use a set of 200 N-body simulations to study the clustering properties of dark matter haloes and their relation to the underlying matter distribution with unprecedented accuracy. Our halo catalogues cover a total comoving volume of 675 h−3 Gpc3, much larger than the effective volume of the SDSS Luminous Red Galaxy sample (0.26 h−3 Gpc3), the Baryon Oscillation Spectroscopic Survey and Baryon Acoustic Oscillation sample (2.4 h−3 Gpc3) and the planned spectroscopic survey of the Euclid satellite (19.7 h−3 Gpc3). We consider haloes with mass M > 1.11 × 1013h−1 M corresponding to a number density of 3.7 × 10−4h−3 Mpc3 so that the effective volume (i.e. the actual volume weighted by the factor |$\bar{n}\,P_{\rm hh}$|⁠) roughly coincides with the total volume for the wavenumbers analysed here (0.04 ≲ k ≲ 0.12 h Mpc−1) that match the observable scales of current and future surveys. All this allows us to measure the halo power spectrum to subpercent accuracy (better than 0.3 per cent at k ≃ 0.04 h Mpc−1) and the halo bispectrum to a few per cent accuracy.

We make a twofold use of our simulations: to measure the moments of the non-linear matter distribution on several scales (and compare them against SPT predictions) and to test how well the LEB truncated to quadratic order fits several statistics of the halo distribution. In particular, we consider the halo power spectrum, Phh, the halo-mass cross-spectrum, Phm, as well as all the possible bispectra Bhhh, Bhhm and Bhmm. Our main results can be summarized as follows.

  • In a ΛCDM model at z = 0, tree-level SPT does not accurately model non-linearities in the momenta of the matter distribution on spatial scales of the order of 10 − 30 h−1 Mpc.

  • The simple second-order LEB fits very well all halo spectra and bispectra when either N-body simulations or tree-level SPT are used in the modelling of the clustering amplitudes for the matter distribution. However, the bias parameters derived from the models based on SPT are heavily biased with respect to the case when non-linearities are accurately modelled. This might explain why studies that interpreted different statistics of the galaxy distribution based on SPT reached inconsistent conclusions regarding the non-linear bias of optically selected galaxies (e.g. Verde et al. 2002; Gaztañaga et al. 2005).

  • The LEB models applied to counts in cells requires an optimal smoothing scale of several  h−1 Mpc to match the halo statistics from the simulations. For our haloes, this corresponds to a few Lagrangian radii but is also of the same order of the spatial scales under analysis.

  • Comparing the parameter constraints for the fully non-linear LEB obtained from the different spectra, we find some inconsistencies. In particular, the non-linear bias parameter extracted from the cross-spectrum Phm is incompatible with the results from all the other statistics. The main difference is that Phm strongly favours a positive value for b2 that is very close to zero, whereas the posterior distributions derived from all other spectra prefer a negative b2 in the range −0.3 ≲ b2 ≲ −0.2. General agreement, instead, is found for the linear bias parameter, b1.

  • Non-trivial shot-noise corrections in Phh might be invoked to reconcile the bias parameters extracted from Phm and Phh. However this cannot explain the differences between the constraints from the shot-noise free statistics Phm and Bhmm. This suggests that further complexity should be added to second-order LEB in order to match all halo statistics.

  • Analysis of the cross-correlation coefficients defined for the two-point and three-point statistics reveal further subtle inconsistencies contained in the LEB truncated to second order, suggesting it is too simple a model to describe halo bias with high accuracy.

A final remark is in order. Our numerical study is based on simulations with a fixed background cosmology and focuses on retrieving the bias parameters when the cosmological parameters are perfectly known. However, this is not the case for actual galaxy surveys where bias and cosmology are generally estimated simultaneously. To transform our method into a resourceful tool for data analysis, we will need to explore how the shapes and amplitudes of the moments of the non-linear matter density field depend on the unknown cosmological parameters without having to run an exorbitant amount of N-body simulations (e.g. Angulo & White 2010) – a topic we shall explore in future work.

JEP thanks Xun Shi and Andrés Balaguera-Antolínez for useful discussions. We thank V. Springel for making public gadget-2 and for providing his B-FoF halo finder and R. Scoccimarro for making public his 2LPT code. JEP and CP were partially supported by funding provided through the SFB-Transregio 33 The Dark Universe by the Deutsche Forschungsgemeinschaft. RES acknowledges support from ERC Advanced grant 246797 GALFORMOD.

1

Very similar results are obtained if one averages the ratios instead of taking the ratio of the averages.

REFERENCES

Angulo
R. E.
White
S. D. M.
MNRAS
2010
, vol. 
405
 pg. 
143
 
Baldauf
T.
Seljak
U.
Desjacques
V.
McDonald
P.
Phys. Rev. D
2012
, vol. 
86
 pg. 
083540
 
Bernardeau
F.
Colombi
S.
Gaztañaga
E.
Scoccimarro
R.
Phys. Rep.
2002
, vol. 
367
 pg. 
1
 
Catelan
P.
Lucchin
F.
Matarrese
S.
Porciani
C.
MNRAS
1998
, vol. 
297
 pg. 
692
 
Catelan
P.
Porciani
C.
Kamionkowski
M.
MNRAS
2000
, vol. 
318
 pg. 
L39
 
Chan
K. C.
Scoccimarro
R.
Sheth
R. K.
Phys. Rev. D
2012
, vol. 
85
 pg. 
083509
 
Chuen Chan
K.
Scoccimarro
R.
Phys. Rev. D
2012
, vol. 
86
 pg. 
103519
 
Crocce
M.
Pueblas
S.
Scoccimarro
R.
MNRAS
2006
, vol. 
373
 pg. 
369
 
Davis
M.
Efstathiou
G.
Frenk
C. S.
White
S. D. M.
ApJ
1985
, vol. 
292
 pg. 
371
 
Dekel
A.
Lahav
O.
ApJ
1999
, vol. 
520
 pg. 
24
 
Frusciante
N.
Sheth
R. K.
J. Cosmol. Astropart. Phys.
2012
, vol. 
11
 pg. 
16
 
Fry
J. N.
Gaztanaga
E.
ApJ
1993
, vol. 
413
 pg. 
447
 
Gaztañaga
E.
Norberg
P.
Baugh
C. M.
Croton
D. J.
MNRAS
2005
, vol. 
364
 pg. 
620
 
Guo
H.
Jing
Y. P.
ApJ
2009
, vol. 
702
 pg. 
425
 
Guo
H.
Li
C.
Jing
Y. P.
Boerner
G.
ApJ
2014
, vol. 
780
 pg. 
139
 
Hamaus
N.
Seljak
U.
Desjacques
V.
Smith
R. E.
Baldauf
T.
Phys. Rev. D
2010
, vol. 
82
 pg. 
043515
 
Heavens
A. F.
Matarrese
S.
Verde
L.
MNRAS
1998
, vol. 
301
 pg. 
797
 
Hikage
C.
Matsubara
T.
Suto
Y.
Park
C.
Szalay
A. S.
Brinkmann
J.
PASJ
2005
, vol. 
57
 pg. 
709
 
Jeong
D.
Komatsu
E.
ApJ
2009
, vol. 
691
 pg. 
569
 
Jing
Y. P.
Börner
G.
ApJ
2004
, vol. 
607
 pg. 
140
 
Kayo
I.
, et al. 
PASJ
2004
, vol. 
56
 pg. 
415
 
Komatsu
E.
, et al. 
ApJS
2009
, vol. 
180
 pg. 
330
 
Kulkarni
G. V.
Nichol
R. C.
Sheth
R. K.
Seo
H.-J.
Eisenstein
D. J.
Gray
A.
MNRAS
2007
, vol. 
378
 pg. 
1196
 
Kwan
C. C. Y.
eJournal Spreadsheets Educ.
2011
, vol. 
4
 pg. 
6
 
Ledoit
O.
Wolf
M.
J. Empiri. Financ.
2003
, vol. 
10
 pg. 
603
 
Ludlow
A. D.
Porciani
C.
MNRAS
2011
, vol. 
413
 pg. 
1961
 
McBride
C. K.
Connolly
A. J.
Gardner
J. P.
Scranton
R.
Newman
J. A.
Scoccimarro
R.
Zehavi
I.
Schneider
D. P.
ApJ
2011a
, vol. 
726
 pg. 
13
 
McBride
C. K.
Connolly
A. J.
Gardner
J. P.
Scranton
R.
Scoccimarro
R.
Berlind
A. A.
Marín
F.
Schneider
D. P.
ApJ
2011b
, vol. 
739
 pg. 
85
 
McDonald
P.
Phys. Rev. D
2006
, vol. 
74
 pg. 
103512
 
Manera
M.
Gaztañaga
E.
MNRAS
2011
, vol. 
415
 pg. 
383
 
Marín
F.
ApJ
2011
, vol. 
737
 pg. 
97
 
Marín
F. A.
, et al. 
MNRAS
2013
, vol. 
432
 pg. 
2654
 
Matsubara
T.
Phys. Rev. D
2011
, vol. 
83
 pg. 
083518
 
Mo
H.
van den Bosch
F. C.
White
S.
Galaxy Formation and Evolution
2010
Cambridge
Cambridge Univ. Press
Nichol
R. C.
, et al. 
MNRAS
2006
, vol. 
368
 pg. 
1507
 
Nishimichi
T.
, et al. 
PASJ
2007
, vol. 
59
 pg. 
1049
 
Pan
J.
Szapudi
I.
MNRAS
2005
, vol. 
362
 pg. 
1363
 
Pollack
J. E.
Smith
R. E.
Porciani
C.
MNRAS
2012
, vol. 
420
 pg. 
3469
 
Roth
N.
Porciani
C.
MNRAS
2011
, vol. 
415
 pg. 
829
 
Schäefer
J.
Strimmer
K.
Stat. Appl. Genetics Mol. Biol.
2005
, vol. 
4
 pg. 
1544
 
Schmidt
F.
Jeong
D.
Desjacques
V.
Phys. Rev. D
2013
, vol. 
88
 pg. 
023515
 
Scoccimarro
R.
Sheth
R. K.
Hui
L.
Jain
B.
ApJ
2001
, vol. 
546
 pg. 
20
 
Sefusatti
E.
Phys. Rev. D
2009
, vol. 
80
 pg. 
123002
 
Seljak
U.
Warren
M. S.
MNRAS
2004
, vol. 
355
 pg. 
129
 
Seljak
U.
Zaldarriaga
M.
ApJ
1996
, vol. 
469
 pg. 
437
 
Sigad
Y.
Branchini
E.
Dekel
A.
ApJ
2000
, vol. 
540
 pg. 
62
 
Smith
R. E.
MNRAS
2009
, vol. 
400
 pg. 
85
 
Smith
R. E.
Scoccimarro
R.
Sheth
R. K.
Phys. Rev. D
2007
, vol. 
75
 pg. 
063512
 
Smith
R. E.
Sheth
R. K.
Scoccimarro
R.
Phys. Rev. D
2008
, vol. 
78
 pg. 
023523
 
Smith
R. E.
Hernández-Monteagudo
C.
Seljak
U.
Phys. Rev. D
2009
, vol. 
80
 pg. 
063528
 
Smith
R. E.
Reed
D. S.
Potter
D.
Marian
L.
Crocce
M.
Moore
B.
2012
 
preprint (arXiv:e-prints)
Springel
V.
MNRAS
2005
, vol. 
364
 pg. 
1105
 
Szapudi
I.
Pan
J.
ApJ
2004
, vol. 
602
 pg. 
26
 
Verde
L.
, et al. 
MNRAS
2002
, vol. 
335
 pg. 
432
 
Wang
Y.
Yang
X.
Mo
H. J.
van den Bosch
F. C.
Chu
Y.
MNRAS
2004
, vol. 
353
 pg. 
287
 

APPENDIX A: SPECTRAL RELATIONSHIPS

A1 Relationship between |${\mathcal {P}}_{(l,m)}$| and the n-point multispectra

We now derive the relation between the functions |${\mathcal {P}}_{(l,m)}$| and the multipoint matter spectra.

To begin, the functions |${\mathcal {P}}_{(l,m)}$| are defined
\begin{eqnarray} \left<\Delta ^{(l)}({\boldsymbol k}_1|R)\Delta ^{(m)}({\boldsymbol k}_2|R)\right> & \equiv & (2\pi )^3 \delta ^D({\boldsymbol k}_{1}+{\boldsymbol k}_{2}) {\mathcal {P}}_{(l,m)}({\boldsymbol k}_1) \end{eqnarray}
(A1)
\begin{eqnarray} &&= \int \prod _{i=1}^{l} \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_i}{(2\pi )^3}\right\rbrace \prod _{j=1}^{m} \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_j}{(2\pi )^3}\right\rbrace (2\pi )^3\delta ^{D}({\boldsymbol k}_1-{\boldsymbol q}_{1\ldots l}) (2\pi )^3\delta ^{D}({\boldsymbol k}_2-{\boldsymbol p}_{1\ldots m}) \nonumber \\ &&\times \, \left<\frac{}{} \tilde{\delta }({\boldsymbol q}_1|R)\ldots \tilde{\delta }({\boldsymbol q}_l|R) \tilde{\delta }({\boldsymbol p}_1|R)\ldots \tilde{\delta }({\boldsymbol p}_m|R) \right>. \end{eqnarray}
(A2)
The ensemble average of the l + m density fields may be evaluated to give
\begin{equation} \left< \tilde{\delta }({\boldsymbol q}_1|R)\ldots \tilde{\delta }({\boldsymbol q}_l|R) \tilde{\delta }({\boldsymbol p}_1|R)\ldots \tilde{\delta }({\boldsymbol p}_m|R) \right> \equiv (2\pi )^3 \delta ^D({\boldsymbol q}_{1\ldots l}+{\boldsymbol p}_{1\ldots m}){\mathcal {P}}_{(l+m)}({\boldsymbol q}_1,\ldots ,{\boldsymbol q}_l,{\boldsymbol p}_1,\ldots ,{\boldsymbol p}_m). \end{equation}
(A3)
On inserting the above definition into equation (A2) we find
\begin{eqnarray} \left<\Delta ^{(l)}({\boldsymbol k}_1|R)\Delta ^{(m)}({\boldsymbol k}_2|R)\right> & \equiv & \int \prod _{i=1}^{l} \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_i}{(2\pi )^3}\right\rbrace \prod _{j=1}^{m} \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_j}{(2\pi )^3}\right\rbrace (2\pi )^3\delta ^{D}({\boldsymbol k}_1-{\boldsymbol q}_{1\ldots l}) (2\pi )^3\delta ^{D}({\boldsymbol k}_2-{\boldsymbol p}_{1\ldots m})\nonumber \\ & & \times \, (2\pi )^3 \delta ^D({\boldsymbol q}_{1\ldots l}+{\boldsymbol p}_{1\ldots m}){\mathcal {P}}_{(l+m)}({\boldsymbol q}_1,\ldots ,{\boldsymbol q}_l,{\boldsymbol p}_1,\ldots ,{\boldsymbol p}_m). \end{eqnarray}
(A4)
Integrating over the first two Dirac-delta functions yields
\begin{eqnarray} \left<\Delta ^{(l)}({\boldsymbol k}_1|R)\Delta ^{(m)}({\boldsymbol k}_2|R)\right> & \equiv & (2\pi )^3 \delta ^D({\boldsymbol k}_{1}+{\boldsymbol k}_{2}) \int \prod _{i=1}^{l-1} \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_i}{(2\pi )^3}\right\rbrace \prod _{j=1}^{m-1} \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_j}{(2\pi )^3}\right\rbrace \nonumber \\ & & \times \,\, {\mathcal {P}}_{(l+m)}({\boldsymbol q}_1,\ldots ,{\boldsymbol q}_{l-1},{\boldsymbol k}_1-{\boldsymbol q}_{1\ldots (l-1)},{\boldsymbol p}_1,\ldots ,{\boldsymbol p}_{m-1}, {\boldsymbol k}_2-{\boldsymbol p}_{1\ldots (m-1)}). \end{eqnarray}
(A5)
Hence,
\begin{equation} {\mathcal {P}}_{(l,m)}({\boldsymbol k}_1) = \int \prod _{i=1}^{l-1} \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_i}{(2\pi )^3}\right\rbrace \prod _{j=1}^{m-1} \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_j}{(2\pi )^3}\right\rbrace {\mathcal {P}}_{(l+m)}({\boldsymbol q}_1,\ldots ,{\boldsymbol q}_{l-1},{\boldsymbol k}_1-{\boldsymbol q}_{1\ldots (l-1)},{\boldsymbol p}_1,\ldots ,{\boldsymbol p}_{m-1}, {\boldsymbol k}_2-{\boldsymbol p}_{1\ldots (m-1)}). \end{equation}
(A6)
Lastly, we may change integration variables in the following way: |$\tilde{{\boldsymbol q}}_{2}\rightarrow {\boldsymbol q}_2-{\boldsymbol q}_1$|⁠, |$\tilde{{\boldsymbol q}}_3\rightarrow {\boldsymbol q}_3-\tilde{{\boldsymbol q}}_2$|⁠, ..., upon which the above expression may be written as
\begin{equation} {\mathcal {P}}_{(l,m)}({\boldsymbol k}_1) = \int \prod _{i=1}^{l-1} \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_i}{(2\pi )^3}\right\rbrace \prod _{j=1}^{m-1} \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_j}{(2\pi )^3}\right\rbrace {\mathcal {P}}_{(l+m)}({\boldsymbol q}_1,{\boldsymbol q}_2-{\boldsymbol q}_1,\ldots ,{\boldsymbol k}_1-{\boldsymbol q}_{l-1},{\boldsymbol p}_1,{\boldsymbol p}_2-{\boldsymbol p}_1,\ldots ,{\boldsymbol k}_2-{\boldsymbol p}_{m-1}). \end{equation}
(A7)
Terms up to and including the quadspectrum may be written as
\begin{eqnarray} {\mathcal {P}}_{(1,1)}({\boldsymbol k}_1) & = & {\mathcal {P}}_{(2)}({\boldsymbol k}_1,{\boldsymbol k}_2) = {\mathcal {P}}({\boldsymbol k}_1) \end{eqnarray}
(A8)
\begin{eqnarray} {\mathcal {P}}_{(2,1)}({\boldsymbol k}_1) & = & \int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3} {\mathcal {P}}_{(3)}({\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_1,{\boldsymbol k}_2) = \int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3} {\mathcal {B}}({\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_1,{\boldsymbol k}_2) \end{eqnarray}
(A9)
\begin{eqnarray} {\mathcal {P}}_{(3,1)}({\boldsymbol k}_1) & = & \int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3}\frac{{\rm d}^3{\boldsymbol q}_2}{(2\pi )^3} {\mathcal {P}}_{(4)}({\boldsymbol q}_1,{\boldsymbol q}_2-{\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_2,{\boldsymbol k}_2) \end{eqnarray}
(A10)
\begin{eqnarray} {\mathcal {P}}_{(2,2)}({\boldsymbol k}_1) & = & \int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3}\frac{{\rm d}^3{\boldsymbol p}_1}{(2\pi )^3} {\mathcal {P}}_{(4)}({\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_1,{\boldsymbol p}_1,{\boldsymbol k}_2-{\boldsymbol p}_1) \end{eqnarray}
(A11)
\begin{eqnarray} {\mathcal {P}}_{(4,1)}({\boldsymbol k}_1) & = & \int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3}\frac{{\rm d}^3{\boldsymbol q}_2}{(2\pi )^3} \frac{{\rm d}^3{\boldsymbol q}_3}{(2\pi )^3} {\mathcal {P}}_{(5)}({\boldsymbol q}_1,{\boldsymbol q}_2-{\boldsymbol q}_1,{\boldsymbol q}_3-{\boldsymbol q}_2,{\boldsymbol k}_1-{\boldsymbol q}_3,{\boldsymbol k}_2) \end{eqnarray}
(A12)
\begin{eqnarray} {\mathcal {P}}_{(3,2)}({\boldsymbol k}_1) & = & \int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3}\frac{{\rm d}^3{\boldsymbol q}_2}{(2\pi )^3} \frac{{\rm d}^3{\boldsymbol p}_1}{(2\pi )^3} {\mathcal {P}}_{(5)}({\boldsymbol q}_1,{\boldsymbol q}_2-{\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_2,{\boldsymbol p}_1,{\boldsymbol k}_2-{\boldsymbol p}_1). \end{eqnarray}
(A13)

A2 Relationship between |${\mathcal {B}}_{(l,m,n)}$| and the n-point multispectra

In a similar fashion, we may now derive the relation between the functions |${\mathcal {B}}_{(l,m,n)}$| and the multipoint matter spectra.

To begin, the functions |${\mathcal {B}}_{(l,m,n)}$| are defined as
\begin{eqnarray} &&\!\!\!\!\left\langle\Delta ^{(l)}({\boldsymbol k}_1|R)\Delta ^{(m)}({\boldsymbol k}_2|R)\Delta ^{(n)}({\boldsymbol k}_3|R)\right\rangle \equiv \int \prod _{i=1}^{l} \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_i}{(2\pi )^3}\right\rbrace \prod _{j=1}^{m} \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_j}{(2\pi )^3}\right\rbrace \prod _{k=1}^{n} \left\lbrace \frac{{\rm d}^3{\boldsymbol s}_k}{(2\pi )^3}\right\rbrace (2\pi )^3\delta ^{D}({\boldsymbol k}_1-{\boldsymbol q}_{1\ldots l})(2\pi )^3 \nonumber \\ & &\quad \times \, \delta ^{D}({\boldsymbol k}_2-{\boldsymbol p}_{1\ldots m}) (2\pi )^3\delta ^{D}({\boldsymbol k}_3-{\boldsymbol s}_{1\ldots n}) \langle\frac{}{} \tilde{\delta }({\boldsymbol q}_1|R)\ldots \tilde{\delta }({\boldsymbol q}_l|R) \tilde{\delta }({\boldsymbol p}_1|R)\ldots \tilde{\delta }({\boldsymbol p}_m|R) \tilde{\delta }({\boldsymbol s}_1|R)\ldots \tilde{\delta }({\boldsymbol s}_n|R) \rangle . \end{eqnarray}
(A14)
The ensemble average of the l + m + n density fields may be evaluated to give
\begin{eqnarray} \left< \tilde{\delta }({\boldsymbol q}_1|R)\ldots \tilde{\delta }({\boldsymbol q}_l|R) \tilde{\delta }({\boldsymbol p}_1|R)\ldots \tilde{\delta }({\boldsymbol p}_m|R) \tilde{\delta }({\boldsymbol s}_1|R)\ldots \tilde{\delta }({\boldsymbol s}_n|R) \right> & \equiv & (2\pi )^3 \delta ^D({\boldsymbol q}_{1\ldots l}+{\boldsymbol p}_{1\ldots m}+{\boldsymbol s}_{1\ldots n}) \nonumber \\ & & \times \, {\mathcal {P}}_{(l+m+n)} ({\boldsymbol q}_1,\ldots ,{\boldsymbol q}_l,{\boldsymbol p}_1,\ldots ,{\boldsymbol p}_m,{\boldsymbol s}_1,\ldots ,{\boldsymbol s}_n). \end{eqnarray}
(A15)
On inserting the above definition into equation (A14) we find
\begin{eqnarray} \left<\Delta ^{(l)}({\boldsymbol k}_1|R)\Delta ^{(m)}({\boldsymbol k}_2|R)\Delta ^{(n)}({\boldsymbol k}_3|R)\right> & \equiv & \int \prod _{i=1}^{l} \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_i}{(2\pi )^3}\right\rbrace \prod _{j=1}^{m} \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_j}{(2\pi )^3}\right\rbrace \prod _{k=1}^{n} \left\lbrace \frac{{\rm d}^3{\boldsymbol s}_k}{(2\pi )^3}\right\rbrace (2\pi )^{12} \delta ^{D}({\boldsymbol k}_1-{\boldsymbol q}_{1\ldots l}) \delta ^{D}({\boldsymbol k}_2-{\boldsymbol p}_{1\ldots m}) \nonumber \\ & &\times \, \delta ^{D}({\boldsymbol k}_3-{\boldsymbol s}_{1\ldots n})\delta ^D({\boldsymbol q}_{1\ldots l}+{\boldsymbol p}_{1\ldots m}+{\boldsymbol s}_{1\ldots n}) {\mathcal {P}}_{(l+m+n)}({\boldsymbol q}_1,\ldots ,{\boldsymbol q}_l,{\boldsymbol p}_1,\ldots ,{\boldsymbol p}_m,{\boldsymbol s}_1,\ldots ,{\boldsymbol s}_n). \end{eqnarray}
(A16)
Integrating over the first three Dirac-delta functions gives
\begin{eqnarray} \left<\Delta ^{(l)}({\boldsymbol k}_1|R)\Delta ^{(m)}({\boldsymbol k}_2|R)\Delta ^{(n)}({\boldsymbol k}_3|R)\right> & \equiv & (2\pi )^{3}\delta ^D({\boldsymbol k}_1+{\boldsymbol k}_2+{\boldsymbol k}_3) \int \prod _{i=1}^{l} \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_i}{(2\pi )^3}\right\rbrace \prod _{j=1}^{m} \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_j}{(2\pi )^3}\right\rbrace \prod _{k=1}^{n} \left\lbrace \frac{{\rm d}^3{\boldsymbol s}_k}{(2\pi )^3}\right\rbrace \nonumber \\ & & \times \,{\mathcal {P}}_{(l+m+n)}({\boldsymbol q}_1,\ldots ,{\boldsymbol k}_1-{\boldsymbol q}_{1\ldots l-1},{\boldsymbol p}_1,\ldots ,{\boldsymbol k}_2-{\boldsymbol p}_{1\ldots m-1},{\boldsymbol s}_1,\ldots ,{\boldsymbol k}_3-{\boldsymbol s}_{1\ldots n-1}). \end{eqnarray}
(A17)
As for the power spectrum, we may change integration variables in the following way, |$\tilde{{\boldsymbol q}}_{2}\rightarrow {\boldsymbol q}_2-{\boldsymbol q}_1$|⁠, |$\tilde{{\boldsymbol q}}_3\rightarrow {\boldsymbol q}_3-\tilde{{\boldsymbol q}}_2$|⁠, .... After which we find
\begin{eqnarray} {\mathcal {B}}_{(l,m,n)}({\boldsymbol k}_1,{\boldsymbol k}_2,{\boldsymbol k}_3) & = & \int \prod _{i=1}^{l-1} \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_i}{(2\pi )^3}\right\rbrace \prod _{j=1}^{m-1} \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_j}{(2\pi )^3}\right\rbrace \prod _{k=1}^{n-1} \left\lbrace \frac{{\rm d}^3{\boldsymbol s}_k}{(2\pi )^3}\right\rbrace \nonumber \\ & & \times \,{\mathcal {P}}_{(l+m+n)}({\boldsymbol q}_1,{\boldsymbol q}_2-{\boldsymbol q}_1,\ldots ,{\boldsymbol k}_1-{\boldsymbol q}_{l-1},{\boldsymbol p}_1,{\boldsymbol p}_2-{\boldsymbol p}_1,\ldots ,{\boldsymbol k}_2-{\boldsymbol p}_{m-1}, {\boldsymbol s}_1,{\boldsymbol s}_2-{\boldsymbol s}_1,\ldots ,{\boldsymbol k}_3-{\boldsymbol s}_{n-1}). \end{eqnarray}
(A18)

A3 Proof of the symmetry of |${\mathcal {P}}_{(l,m)}$|

We now prove that the spectra |${\mathcal {P}}_{(l,m)}$| are symmetric in their indices m and l:
\begin{equation} {\mathcal {P}}_{(l,m)} = {\mathcal {P}}_{(m,l)} . \end{equation}
(A19)
Consider equation (A7), on relabelling the variables |${\boldsymbol p}_i={\boldsymbol q}_{l+i-1}$|⁠, and writing P(l + m) = P(m + l), we find
\begin{equation} {\mathcal {P}}_{(l,m)}({\boldsymbol k}_1) = \int \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3}\right\rbrace \ldots \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_{l+m-2}}{(2\pi )^3}\right\rbrace {\mathcal {P}}_{(m+l)}({\boldsymbol q}_1,\ldots ,{\boldsymbol k}_1-{\boldsymbol q}_{l-1},{\boldsymbol q}_l,\ldots ,{\boldsymbol k}_2-{\boldsymbol q}_{l+m-2}). \end{equation}
(A20)
On changing the integration variables to |${\boldsymbol p}_i=-{\boldsymbol q}_i$|⁠, we find
\begin{equation} {\mathcal {P}}_{(l,m)}({\boldsymbol k}_1) = \int \frac{{\rm d}^3{\boldsymbol p}_1}{(2\pi )^3}\ldots \frac{{\rm d}^3{\boldsymbol p}_{l+m-2}}{(2\pi )^3} {\mathcal {P}}_{(m+l)}(-{\boldsymbol p}_1,\ldots ,{\boldsymbol k}_1+{\boldsymbol p}_{l-1},-{\boldsymbol p}_l,\ldots ,{\boldsymbol k}_2+{\boldsymbol p}_{l+m-2}). \end{equation}
(A21)
Parity invariance of the n-point correlation functions means that |${\mathcal {P}}_{n}({\boldsymbol p}_1,\ldots ,{\boldsymbol p}_n)={\mathcal {P}}_{n}(-{\boldsymbol p}_1,\ldots ,-{\boldsymbol p}_n)$| (for a proof see Smith et al. 2008). Under parity invariance, we find
\begin{equation} {\mathcal {P}}_{(l,m)}({\boldsymbol k}_1) = \int \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_1}{(2\pi )^3}\right\rbrace \ldots \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_{l+m-2}}{(2\pi )^3}\right\rbrace {\mathcal {P}}_{(m+l)}({\boldsymbol p}_1,\ldots ,-{\boldsymbol k}_1-{\boldsymbol p}_{l-1},{\boldsymbol p}_l,\ldots ,-{\boldsymbol k}_2-{\boldsymbol p}_{l+m-2}). \end{equation}
(A22)
We may switch |${\boldsymbol k}_1=-{\boldsymbol k}_2$| and |${\boldsymbol k}_2=-{\boldsymbol k}_1$|⁠,
\begin{equation} {\mathcal {P}}_{(l,m)}({\boldsymbol k}_1) = \int \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_1}{(2\pi )^3}\right\rbrace \ldots \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_{l+m-2}}{(2\pi )^3}\right\rbrace {\mathcal {P}}_{(m+l)}({\boldsymbol p}_1,\ldots ,{\boldsymbol k}_2-{\boldsymbol p}_{l-1},{\boldsymbol p}_l,\ldots ,{\boldsymbol k}_1-{\boldsymbol p}_{l+m-2}). \end{equation}
(A23)
Next we may rearrange the arguments of the n-point spectra, since it is totally symmetric under exchange symmetry: |${\mathcal {P}}_{n}({\boldsymbol p}_1,\ldots ,{\boldsymbol p}_n)={\mathcal {P}}_{n}({\boldsymbol p}_i,\ldots ,{\boldsymbol p}_1,\ldots ,{\boldsymbol p}_n)$|⁠, whereupon
\begin{equation} {\mathcal {P}}_{(l,m)}({\boldsymbol k}_1) = \int \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_1}{(2\pi )^3}\right\rbrace \ldots \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_{l+m-2}}{(2\pi )^3}\right\rbrace {\mathcal {P}}_{(m+l)}({\boldsymbol p}_l,\ldots ,{\boldsymbol k}_1-{\boldsymbol p}_{l+m-2},{\boldsymbol p}_1,\ldots ,{\boldsymbol k}_2-{\boldsymbol p}_{l-1}). \end{equation}
(A24)
Finally on changing variables |${\boldsymbol p}_{l+i-1}={\boldsymbol q}_i$|⁠, we obtain
\begin{equation} {\mathcal {P}}_{(l,m)}({\boldsymbol k}_1) = \int \prod _{i=1}^{m-1} \left\lbrace \frac{{\rm d}^3{\boldsymbol q}_i}{(2\pi )^3}\right\rbrace \prod _{j=1}^{l-1} \left\lbrace \frac{{\rm d}^3{\boldsymbol p}_j}{(2\pi )^3}\right\rbrace {\mathcal {P}}_{(m+l)}({\boldsymbol q}_1,\ldots ,{\boldsymbol k}_1-{\boldsymbol q}_{m-1},{\boldsymbol p}_1,\ldots ,{\boldsymbol k}_2-{\boldsymbol p}_{l-1}) = {\mathcal {P}}_{(m,l)}({\boldsymbol k}_1), \end{equation}
(A25)
and this completes the proof of equation (A19).

APPENDIX B: MODELLING |$B^{\rm (s)}_{(2,1,1)}$| WITH PERTURBATION THEORY

In order to better understand what drives the amplitude and functional form of the |$B^{\rm (s)}_{(l,m,n)}$| we have attempted to model the signal with SPT techniques. Rather than modelling all of the spectra, we have focused on the lowest order non-trivial term B(2, 1, 1).

To begin, consider again equation (24), this came from
\begin{equation} \left<\Delta ^{(2)}({\boldsymbol k}_1|R)\Delta ^{(1)}({\boldsymbol k}_2|R)\Delta ^{(1)}({\boldsymbol k}_3|R)\right> = \int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3} \frac{{\rm d}^3{\boldsymbol q}_2}{(2\pi )^3} (2\pi )^3 \delta ^{D}({\boldsymbol k}_1-{\boldsymbol q}_1-{\boldsymbol q}_2) \left< \tilde{\delta }({\boldsymbol q}_1|R) \tilde{\delta }({\boldsymbol q}_2|R) \tilde{\delta }({\boldsymbol k}_2|R) \tilde{\delta }({\boldsymbol k}_3|R) \right>. \end{equation}
(B1)
The above ensemble averaged product can be broken into connected and disconnected terms. Hence,
\begin{eqnarray} \left<\tilde{\delta }({\boldsymbol q}_1|R) \ldots \tilde{\delta }({\boldsymbol q}_4|R) \right> & = & \left< \tilde{\delta }({\boldsymbol q}_1|R) \ldots \tilde{\delta }({\boldsymbol q}_4|R) \right>_c + \left<\tilde{\delta }({\boldsymbol q}_1|R)\tilde{\delta }({\boldsymbol q}_2|R)\right> \left<\tilde{\delta }({\boldsymbol q}_3|R)\tilde{\delta }({\boldsymbol q}_4|R)\right> \nonumber \\ & & + \left<\tilde{\delta }({\boldsymbol q}_1|R)\tilde{\delta }({\boldsymbol q}_3|R)\right> \left<\tilde{\delta }({\boldsymbol q}_2|R)\tilde{\delta }({\boldsymbol q}_4|R)\right> + \left<\tilde{\delta }({\boldsymbol q}_1|R)\tilde{\delta }({\boldsymbol q}_4|R)\right> \left<\tilde{\delta }({\boldsymbol q}_2|R)\tilde{\delta }({\boldsymbol q}_3|R)\right>. \end{eqnarray}
(B2)
If we consider the disconnected terms, these may be written in terms of power spectra (cf. equation 6). We note that the first disconnected term is vanishing unless |${\boldsymbol k}_1$| is the null vector. The remaining terms are
\begin{eqnarray} \left<\tilde{\delta }({\boldsymbol q}_1|R) \ldots \tilde{\delta }({\boldsymbol q}_4|R) \right> & = & (2\pi )^3\delta ^{D}({\boldsymbol q}_{1\ldots 4}) {\mathcal {T}}({\boldsymbol q}_1,{\boldsymbol q}_2,{\boldsymbol q}_3,{\boldsymbol q}_4)\nonumber \\ & & +\, (2\pi )^3{\mathcal {P}}(q_1) P(q_2) \left[ \delta ^{D}({\boldsymbol q}_1+{\boldsymbol q}_3)\delta ^{D}({\boldsymbol q}_2+{\boldsymbol q}_4)+ \delta ^{D}({\boldsymbol q}_1+{\boldsymbol q}_4)\delta ^{D}({\boldsymbol q}_2+{\boldsymbol q}_3) \right]. \end{eqnarray}
(B3)
On inserting the above expression into equation (B1) and computing the integral over |${\boldsymbol q}_2$|⁠, we find
\begin{equation} {\mathcal {B}}^{\rm (s)}_{(2,1,1)} = \frac{2}{3}\left[ {\mathcal {P}}({\boldsymbol k}_2){\mathcal {P}}({\boldsymbol k}_3) + 2\,{\rm cyc} \right] +\frac{1}{3}\int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3} \left[{\mathcal {T}}({\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_1,{\boldsymbol k}_2,{\boldsymbol k}_3) +{2\,\rm cyc}\right], \end{equation}
(B4)
where we have factored out the Dirac delta. The power spectrum and trispectrum may be evaluated using standard Eulerian perturbation theory (for a review see Bernardeau et al. 2002). The important results that we will need are that at one loop level we have
\begin{eqnarray} {\mathcal {B}}^{\rm (s)}_{(2,1,1)} & = & \frac{2}{3}\left[ {\mathcal {P}}_{(0)}({\boldsymbol k}_2){\mathcal {P}}_{(0)}({\boldsymbol k}_3) + 2\,{\rm cyc} \right] + \frac{2}{3}\left[ {\mathcal {P}}_{(0)}({\boldsymbol k}_2){\mathcal {P}}_{(1\ell )}({\boldsymbol k}_3) +{\mathcal {P}}_{(1\ell )}({\boldsymbol k}_2){\mathcal {P}}_{(0)}({\boldsymbol k}_3) + 2\,{\rm cyc} \right] \nonumber \\ & & +\,\frac{1}{3}\int \frac{{\rm d}^3{\boldsymbol q}_1}{(2\pi )^3} \left[{\mathcal {T}}({\boldsymbol q}_1,{\boldsymbol k}_1-{\boldsymbol q}_1,{\boldsymbol k}_2,{\boldsymbol k}_3) +{2\,\rm cyc}\right]. \end{eqnarray}
(B5)
The function |${\mathcal {P}}_{(0)}$| denotes the smoothed linear matter power spectrum and |${\mathcal {P}}_{(1\ell )}$| denotes the ‘1-loop’ correction. The 1-loop term may be written as
\begin{equation} \mathcal {P}_{1\ell }({\boldsymbol k}) = \mathcal {P}_{22}({\boldsymbol k}) + \mathcal {P}_{13}({\boldsymbol k}), \end{equation}
(B6)
where the loop integrals are
\begin{eqnarray} \mathcal {P}_{13}(k) & = & \frac{\mathcal {P}_{11}(k)k^3}{252(2\pi )^2}\int _{0}^{\infty } {\rm d}x\, x^2 \mathcal {P}_{11}(xk)\left\lbrace -42x^{2}+100-\frac{158}{x^{2}}+\frac{12}{x^{4}}+ \frac{3}{x}(1-x^2)^3(7x^2+2)\log \left[\frac{x+1}{|x-1|}\right]\right\rbrace , \end{eqnarray}
(B7)
\begin{eqnarray} \mathcal {P}_{22}(k) & = & 2 \int _0^{\infty } \frac{{\rm d}q}{(2\pi )^2} q^2 \mathcal {P}_{11}(q) \int _{-1}^{1} {\rm d}\mu \mathcal {P}_{11}(k\psi (x,\mu )) \left\lbrace \frac{5}{7}+\frac{1}{2}\frac{\mu -x}{\psi (x,\mu )} \left[\frac{x}{\psi (x,\mu )}+\frac{\psi (x,\mu )}{x}\right] +\frac{2}{7}\left[\frac{\mu -x}{\psi (x,\mu )}\right]^2\right\rbrace ^2, \end{eqnarray}
(B8)
where x = q/k and where ψ2(x, μ) = 1 + x2 − 2xμ.
The connected tree-level contribution to the trispectrum is given by
\begin{equation} \mathcal {T}({\boldsymbol q}_1,{\boldsymbol q}_2,{\boldsymbol q}_3,{\boldsymbol q}_4) = 4\mathcal {T}_{2211}({\boldsymbol q}_1,{\boldsymbol q}_2,{\boldsymbol q}_3,{\boldsymbol q}_4)+ 6\mathcal {T}_{3111}({\boldsymbol q}_1,{\boldsymbol q}_2,{\boldsymbol q}_3,{\boldsymbol q}_4), \end{equation}
(B9)
where the two types of term are
\begin{eqnarray} \mathcal {T}_{2211}({\boldsymbol q}_1,{\boldsymbol q}_2,{\boldsymbol q}_3,{\boldsymbol q}_4) & = & \mathcal {P}_1 \mathcal {P}_2 \left[\frac{}{} \mathcal {P}_{13} F_2( {\boldsymbol q}_1,- {\boldsymbol q}_{13}) F_2( {\boldsymbol q}_2, {\boldsymbol q}_{13}) + \mathcal {P}_{14} F_2( {\boldsymbol q}_1,- {\boldsymbol q}_{14}) F_2( {\boldsymbol q}_2, {\boldsymbol q}_{14}) \right] \nonumber \\ & & +\, \mathcal {P}_1 \mathcal {P}_3 \left[\frac{}{} \mathcal {P}_{12} F_2( {\boldsymbol q}_1,- {\boldsymbol q}_{12}) F_2( {\boldsymbol q}_3, {\boldsymbol q}_{12}) + \mathcal {P}_{14} F_2( {\boldsymbol q}_1,- {\boldsymbol q}_{14}) F_2( {\boldsymbol q}_3, {\boldsymbol q}_{14}) \right] \nonumber \\ & & +\, \mathcal {P}_1 \mathcal {P}_4 \left[\frac{}{} \mathcal {P}_{12} F_2( {\boldsymbol q}_1,- {\boldsymbol q}_{12}) F_2( {\boldsymbol q}_4, {\boldsymbol q}_{12}) + \mathcal {P}_{13} F_2( {\boldsymbol q}_1,- {\boldsymbol q}_{13}) F_2( {\boldsymbol q}_4, {\boldsymbol q}_{13}) \right] \nonumber \\ & & + \,\mathcal {P}_2 \mathcal {P}_3 \left[\frac{}{} \mathcal {P}_{21} F_2( {\boldsymbol q}_2,- {\boldsymbol q}_{21}) F_2( {\boldsymbol q}_3, {\boldsymbol q}_{21}) + \mathcal {P}_{24} F_2( {\boldsymbol q}_2,- {\boldsymbol q}_{24}) F_2( {\boldsymbol q}_3, {\boldsymbol q}_{24}) \right] \nonumber \\ & & + \,\mathcal {P}_2 \mathcal {P}_4 \left[\frac{}{} \mathcal {P}_{21} F_2( {\boldsymbol q}_2,- {\boldsymbol q}_{21}) F_2( {\boldsymbol q}_4, {\boldsymbol q}_{21}) + \mathcal {P}_{23} F_2( {\boldsymbol q}_2,- {\boldsymbol q}_{23}) F_2( {\boldsymbol q}_4, {\boldsymbol q}_{23}) \right] \nonumber \\ & & +\, \mathcal {P}_3 \mathcal {P}_4 \left[\frac{}{} \mathcal {P}_{31} F_2( {\boldsymbol q}_3,- {\boldsymbol q}_{31}) F_2( {\boldsymbol q}_4, {\boldsymbol q}_{31}) + \mathcal {P}_{32} F_2( {\boldsymbol q}_3,- {\boldsymbol q}_{32}) F_2( {\boldsymbol q}_4, {\boldsymbol q}_{32}) \right], \end{eqnarray}
(B10)
and
\begin{equation} \mathcal {T}_{3111}({\boldsymbol q}_1,{\boldsymbol q}_2,{\boldsymbol q}_3,{\boldsymbol q}_4) = F_3( {\boldsymbol q}_1, {\boldsymbol q}_2, {\boldsymbol q}_3) \mathcal {P}_1 \mathcal {P}_2 \mathcal {P}_3 + F_3( {\boldsymbol q}_2, {\boldsymbol q}_3, {\boldsymbol q}_4) \mathcal {P}_2 \mathcal {P}_3 \mathcal {P}_4 + F_3( {\boldsymbol q}_3, {\boldsymbol q}_4, {\boldsymbol q}_1) \mathcal {P}_3 \mathcal {P}_4 \mathcal {P}_1 + F_3( {\boldsymbol q}_4, {\boldsymbol q}_1, {\boldsymbol q}_2) \mathcal {P}_4 \mathcal {P}_1 \mathcal {P}_2, \end{equation}
(B11)
where |$\mathcal {P}_{i}\equiv \mathcal {P}_{\rm {lin}}({\boldsymbol q}_{i})$|⁠, |$\mathcal {P}_{ij}\equiv \mathcal {P}_{\rm {lin}}(|{\boldsymbol q}_{i}+ {\boldsymbol q}_{j}|)$| and |${\boldsymbol q}_{ij}\equiv {\boldsymbol q}_{i}+ {\boldsymbol q}_{j}$|⁠. The calculation of the second-order coupling functions is straightforward. The result is
\begin{eqnarray} F_2^{\rm (s)}({\boldsymbol q}_1,{\boldsymbol q}_2) & = & \frac{5}{14}\left[\alpha ({\boldsymbol q}_1,{\boldsymbol q}_2)+\alpha ({\boldsymbol q}_2,{\boldsymbol q}_1)\right] +\frac{2}{7}\beta ({\boldsymbol q}_1,{\boldsymbol q}_2), \end{eqnarray}
(B12)
\begin{eqnarray} G_2^{\rm (s)}({\boldsymbol q}_1,{\boldsymbol q}_2) & = & \frac{3}{14}\left[\alpha ({\boldsymbol q}_1,{\boldsymbol q}_2)+\alpha ({\boldsymbol q}_2,{\boldsymbol q}_1)\right] +\frac{4}{7}\beta ({\boldsymbol q}_1,{\boldsymbol q}_2), \end{eqnarray}
(B13)
\begin{eqnarray} F_3^{\rm (s)}({{\boldsymbol q}_1},{{\boldsymbol q}_2},{{\boldsymbol q}_3}) & = & \frac{7}{54} \left[\frac{}{}\alpha ({{\boldsymbol q}_1}, {\boldsymbol q}_{23})F_2^{\rm (s)}({{\boldsymbol q}_2},{{\boldsymbol q}_3})+ \alpha ({{\boldsymbol q}_2}, {\boldsymbol q}_{13})F_2^{\rm (s)}({{\boldsymbol q}_1},{{\boldsymbol q}_3}) +\alpha ({{\boldsymbol q}_3}, {\boldsymbol q}_{12})F_2^{\rm (s)}({{\boldsymbol q}_1},{{\boldsymbol q}_2})\right]\nonumber \\ & & +\, \frac{4}{54} \left[\frac{}{}\beta ({{\boldsymbol q}_1}, {\boldsymbol q}_{23})G_2^{\rm (s)}({{\boldsymbol q}_2},{{\boldsymbol q}_3})+ \beta ({{\boldsymbol q}_2}, {\boldsymbol q}_{13})G_2^{\rm (s)}({{\boldsymbol q}_1},{{\boldsymbol q}_3}) +\beta ({{\boldsymbol q}_3}, {\boldsymbol q}_{12})G_2^{\rm (s)}({{\boldsymbol q}_1},{{\boldsymbol q}_2})\right] \nonumber \\ & & +\,\frac{7}{54} \left[\frac{}{}\alpha ( {\boldsymbol q}_{12},{{\boldsymbol q}_3})G_2^{\rm (s)}({{\boldsymbol q}_1},{{\boldsymbol q}_2})+ \alpha ( {\boldsymbol q}_{13},{{\boldsymbol q}_2})G_2^{\rm (s)}({{\boldsymbol q}_1},{{\boldsymbol q}_3}) +\alpha ( {\boldsymbol q}_{23},{{\boldsymbol q}_1})G_2^{\rm (s)}({{\boldsymbol q}_2},{{\boldsymbol q}_3})\right], \end{eqnarray}
(B14)
where we introduced the two fundamental mode coupling functions
\begin{equation} \alpha ( {\boldsymbol q}_{1}, {\boldsymbol q}_{2}) = \frac{( {\boldsymbol q}_{1}+ {\boldsymbol q}_{2})\cdot {\boldsymbol q}_{1}}{{\boldsymbol q}_{1}^{2}}; \hspace{28.45274pt} \beta ({{\boldsymbol q}_1},{{\boldsymbol q}_2}) = \frac{|{{\boldsymbol q}_1}+{{\boldsymbol q}_2}|^2({{\boldsymbol q}_1}\cdot {{\boldsymbol q}_2})}{2{\boldsymbol q}_1^2 {\boldsymbol q}_2^2}. \end{equation}
(B15)
We then proceeded with integrating as in equation (24) taking care to cut the integral off at scales that exceed twice the filter radius.