Compressing the Cosmological Information in One-dimensional Correlations of the Lyman-α Forest

Published 2023 February 28 © 2023. The Author(s). Published by the American Astronomical Society.
Citation: Christian Pedersen et al 2023 ApJ 944 223, DOI 10.3847/1538-4357/acb433

0004-637X/944/2/223

Abstract

Observations of the Lyman-α forest from spectroscopic surveys such as the Baryon Oscillation Spectroscopic Survey or its extension, eBOSS, or the ongoing Dark Energy Spectroscopic Instrument (DESI) survey offer a unique window to study the growth of structure on megaparsec scales. Interpretation of these measurements is a complicated task, requiring hydrodynamical simulations to model and marginalize over the thermal and ionization state of the intergalactic medium. This complexity has limited the use of Lyα clustering measurements in joint cosmological analyses. In this work we show that the cosmological information content of the one-dimensional power spectrum (P1D) of the Lyman-α forest can be compressed into a simple two-parameter likelihood without any significant loss of constraining power. We simulate P1D measurements from DESI using hydrodynamical simulations and show that the compressed likelihood is model independent and lossless, recovering unbiased results even in the presence of massive neutrinos or running of the primordial power spectrum.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

The tightest constraints on cosmological parameters are obtained from the joint analysis of complementary probes, with different sensitivity to cosmological parameters. A common approach is to combine observations of the cosmic microwave background (CMB) with late-time probes of large-scale structure (LSS), such as galaxy clustering or weak lensing (Planck Collaboration et al. 2020; Alam et al. 2021; Abbott et al. 2022). An alternative probe of LSS is the Lyα forest, a series of absorption features in the spectra of z > 2 quasars, caused by intervening neutral hydrogen along the line of sight.

Cosmological analysis of the Lyα forest is driven by large spectroscopic surveys, such as the Baryon Oscillation Spectroscopic Survey (BOSS; Dawson et al. 2013) and its extension eBOSS (Dawson et al. 2016), which between 2009 and 2019 observed ∼200,000 Lyα forest quasars. In 2021, the Dark Energy Spectroscopic Instrument (DESI; DESI Collaboration et al. 2016) started a 5 yr program to survey a third of the sky and obtain spectra of ∼800,000 Lyα forest quasars. The main goal of these quasar surveys is to measure the three-dimensional (3D) correlations in the Lyα forest and to provide accurate measurements of the baryon acoustic oscillations feature to study the expansion of the universe (du Mas des Bourboux et al. 2020). The same data set, however, can be used to measure correlations along the line of sight, known as the one-dimensional flux power spectrum (P1D), a unique window to study the clustering of matter on megaparsec scales (Chabanier et al. 2019a).

Cosmological analyses of the P1D are particularly powerful in combination with CMB measurements due to the large "lever arm" between the two measurements, and these joint analyses have historically provided some of the tightest constraints on the sum of the neutrino masses, and on the shape of the primordial power spectrum of density fluctuations (Phillips et al. 2001; Spergel et al. 2003; Verde et al. 2003; Viel et al. 2004; Seljak et al. 2005, 2006; Bird et al. 2011; Palanque-Delabrouille et al. 2015a, 2015b, 2020).

Massive neutrinos are known to affect the growth of structure by suppressing the late-time clustering of matter on scales smaller than their free-streaming length (Lesgourgues & Pastor 2006). The P1D alone is unable to constrain neutrino masses due to parameter degeneracies (Pedersen et al. 2020), but when combined with the early-time, large-scale measurements from the CMB one can break these degeneracies. In the next few years, and in combination with CMB measurements, several LSS probes will be able to detect the impact of massive neutrinos, even if the sum of the masses is near the minimum of Σmν = 0.06 eV allowed by oscillation experiments (Font-Ribera et al. 2014).

At the same time, inflationary models generically predict that the primordial power spectrum of fluctuations should have small deviations from a power law, often parameterized as a running of the spectral index. Due to the wide lever arm between the large-scale fluctuations probed by Planck and the small scales accessed by the P1D, the Lyα forest is one of the most promising avenues toward tightening the constraints on inflationary models which produce a measurable running of the spectral index (Font-Ribera et al. 2014).

Unfortunately for cosmologists, the statistical properties of the Lyα forest also depend on the thermal and ionization history of the intergalactic medium (IGM; McQuinn 2016). This has two consequences that complicate P1D analyses. First, it means that we need to run expensive hydrodynamical simulations in order to make accurate predictions for a given model. Second, it means that we need to add multiple nuisance parameters in our cosmological inference, and to carefully marginalize over them to obtain robust cosmological constraints.

In the last few years, several groups have attempted to tackle the first problem, introducing new tools to emulate P1D for parameters that are not covered by the relatively small suite of simulations available (Bird et al. 2019; Rogers et al. 2019; Walther et al. 2019; Takhtaganov et al. 2021; Rogers & Peiris 2021a; Pedersen et al. 2021). In this publication, we will use the LaCE emulator presented in Pedersen et al. (2021), and focus on the second problem: the high dimensionality of the parameter space sampled, and the attractive possibility of dramatically reducing the dimensionality of the P1D likelihood into a small number of parameters describing the linear matter power spectrum, without introducing biases or losing relevant information.

The idea of compressing the P1D likelihood into a handful of parameters describing the linear power spectrum is not new. Indeed, the first cosmological studies of the Lyα forest focused on recovering the matter power spectrum (Croft et al. 1998, 2002; McDonald et al. 2000; Gnedin & Hamilton 2002), and the two-parameter (i.e., amplitude and slope) parameterization we focus on in this work was already used 20 yr ago (McDonald et al. 2000). However, most recent P1D analyses from the BOSS and eBOSS surveys have only presented their results in terms of direct fits to the traditional Lambda cold dark matter (ΛCDM) parameters (Borde et al. 2014; Palanque-Delabrouille et al. 2015a, 2015b, 2020), with strong dependence on the priors chosen. This has made it difficult for other groups to include these powerful results into combined cosmological analyses. If the Lyα forest constraints from P1D could be accurately and losslessly represented by just the amplitude and a local slope at a conveniently chosen pivot scale, it would significantly simplify the combination of Lyα forest measurements with other cosmological probes.

Motivated by the latest P1D measurements from eBOSS, the start of the DESI survey, and the recent developments in emulation techniques, in this paper we review the compression of the P1D likelihood. Note that similar discussions are also underway in the context of analysis of the galaxy power spectrum, in particular regarding the information content in measurements of redshift-space distortions (Hamann et al. 2010; d'Amico et al. 2020; Ivanov et al. 2020; Brieden et al. 2021).

We will start in Section 2 with a description of the simulated data, a summary of the emulator used, and the parameterization of the likelihood. In Section 3 we present cosmological constraints from simulated P1D data, and discuss the impact of priors and the model dependency of the results. In Section 4 we present joint fits when combining the P1D with an approximated CMB likelihood, and show that the P1D likelihood can be efficiently compressed into two parameters without any loss of information. Finally, in Section 5 we discuss our findings.

2. Methodology

We discuss here the Lyα forest P1D likelihood, including an overview of the emulator used to make theoretical predictions (based on Pedersen et al. 2021), a description of the mock data set, and a discussion of the parameterization of the likelihood.

2.1. Simulations

We begin by describing the simulations used in the analysis, which fall into two categories. First, a set of training simulations is used to construct the emulator. Second, a small number of test simulations are run to represent mock P1D measurements in a variety of different cosmologies; these are used to test and validate our analysis pipeline. Both the training and most of the test simulations were presented in Pedersen et al. (2021), where we also described and tested the emulation framework. Here we give an overview of the simulations, and refer the reader to Pedersen et al. (2021) for a more detailed description.

The simulations were run in MP-Gadget (Feng et al. 2018), a TreeSPH code based on Gadget-2 (Springel 2005). All simulation boxes had a size of L = 67.5 Mpc and 768³ gas and CDM particles. The initial conditions were generated with MP-GenIC at z = 99, with Fourier modes that had random phases but fixed initial amplitudes (Angulo & Pontzen 2016; Villaescusa-Navarro et al. 2018; Anderson et al. 2019; Pedersen et al. 2021). In order to further reduce cosmic variance, for each model (both in the training and test sets) we ran a pair of simulations with inverted phases, and each quantity estimated from the simulations is taken as the average of the pair.

We output 11 snapshots, equally spaced in redshift between z = 2 and z = 4.5. To produce mock Lyα forest spectra, we use fake_spectra (Bird 2017) to calculate a two-dimensional (2D) grid of 500² transmission skewers from each snapshot, with a line-of-sight resolution of 0.05 Mpc. The cosmological and astrophysical parameters used in both the training and test simulations are listed in Table 1.
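As a quick consistency check on the numbers quoted above, the snippet below derives the snapshot redshifts and the size of the skewer grid. It is only a sketch: the variable names are illustrative, and it reads the skewer grid as 500 × 500 lines of sight per snapshot.

```python
import numpy as np

# Values quoted in the text; variable names are illustrative.
box_size_mpc = 67.5      # simulation box side length
pixel_size_mpc = 0.05    # line-of-sight resolution of each skewer
n_side = 500             # skewers per side of the 2D grid (assumed 500 x 500)

# 11 outputs equally spaced in redshift between z = 2 and z = 4.5
snapshot_redshifts = np.linspace(2.0, 4.5, 11)

pixels_per_skewer = int(round(box_size_mpc / pixel_size_mpc))
skewers_per_snapshot = n_side ** 2

print(snapshot_redshifts)       # spacing of 0.25 in redshift
print(pixels_per_skewer)        # 1350
print(skewers_per_snapshot)     # 250000
```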

Table 1. Cosmological and Astrophysical Parameters for the Training and Test Simulations

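| Parameter | Training Set | Central | Neutrino | Running |
| --- | --- | --- | --- | --- |
| As (×10−9) | [1.35–2.71] | 2.006 | 2.251 | 2.114 |
| ns | [0.92–1.02] | 0.9676 | 0.9676 | 0.9280 |
| αs | 0.0 | 0.0 | 0.0 | 0.015 |
| Ωm | 0.316 | 0.316 | 0.324 | 0.316 |
| Σmν (eV) | 0.0 | 0.0 | 0.3 | 0.0 |
| ${{\rm{\Delta }}}_{p}^{2}(z=3)$ | [0.25–0.45] | 0.35 | | |
| np (z = 3) | [−2.35, −2.25] | −2.30 | | |
| zrei | [5.5–15] | 10.5 | | |
| HA | [0.5–1.5] | 1.0 | | |
| HS | [0.5–1.5] | 1.0 | | |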

Note. The limits of the Latin hypercube for the training simulations are shown in the left column, where only the primordial power spectrum and astrophysical parameters are varied. The primordial parameters As and ns here are defined at the CMB pivot scale of k = 0.05 Mpc−1. The Central, Neutrino, and Running simulations are constructed such that they have the same small-scale linear matter power spectrum (${{\rm{\Delta }}}_{p}^{2}$ and np ) at z = 3. For all simulations, we fix ωc = 0.12, ωb = 0.022, and h = 0.67.

2.2. Emulator

We provide a brief overview of the emulator parameters and framework. We use the LaCE framework presented in Pedersen et al. (2021), and refer the reader to this reference for a more complete description. LaCE uses a Gaussian process emulator to predict P1D as a function of six parameters: the dimensionless amplitude (${{\rm{\Delta }}}_{p}^{2}$) and slope (np ) of the linear power spectrum around a pivot scale of kp = 0.7 Mpc−1; the mean transmitted flux fraction (or mean flux, $\bar{F}$); a thermal broadening scale defined in comoving units (${\sigma }_{T}^{\mathrm{com}}$), set by the temperature of the gas at mean density; the slope of the temperature–density relation (γ); and the filtering length in inverse comoving units (${k}_{F}^{\mathrm{com}}$), a proxy for gas pressure. Note that while the P1D is naturally observed in velocity units, the above quantities are all defined in comoving units in the emulator. This choice ensures that the simulated P1D is estimated at a fixed set of wavenumbers for snapshots at all redshifts.

We train the emulator using 30 pairs of simulations, described in Table 1. These simulations explore different thermal and reionization histories by varying zrei, HA , and HS , which control the redshift of reionization and the heating rates of the gas as rescalings around a fiducial model from Haardt & Madau (2012; see Pedersen et al. 2021 for more details). The simulations have different amplitudes and slopes of the primordial power spectrum (As , ns ), but have the same value for the physical densities of CDM (ωc = Ωc h2) and baryons (ωb = Ωb h2), the same value of H0, and do not include massive neutrinos. All 11 snapshots from all 30 models are used simultaneously, for a total of 330 points in the training sample.
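To illustrate how such a training set might be laid out, the sketch below draws 30 points from a five-dimensional Latin hypercube with SciPy and counts the resulting training points. This is not the code used to design the simulations; in particular, taking the hypercube axes to be (Δp², np, zrei, HA, HS) with the limits from Table 1 is an assumption made here for illustration.

```python
import numpy as np
from scipy.stats import qmc

# Assumed hypercube axes and limits (from Table 1):
# Delta_p^2(z=3), n_p(z=3), z_rei, H_A, H_S
lower = [0.25, -2.35, 5.5, 0.5, 0.5]
upper = [0.45, -2.25, 15.0, 1.5, 1.5]

sampler = qmc.LatinHypercube(d=5)
unit_samples = sampler.random(n=30)                 # 30 models in the unit hypercube
training_params = qmc.scale(unit_samples, lower, upper)

n_snapshots = 11
n_training_points = training_params.shape[0] * n_snapshots

print(training_params.shape)    # (30, 5)
print(n_training_points)        # 330
```

Each row of `training_params` would correspond to one pair of phase-inverted simulations, and every snapshot of every model contributes one training point.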

In the implementation of the emulator presented in Pedersen et al. (2021), the Lyα P1D was emulated directly on a grid of comoving wavenumbers. Because of the limited box size of our simulations, the P1D measurements on large scales are affected by cosmic variance. Given that each simulation was run with the same random seed, there are random noise spikes in the power spectrum that align at the same comoving wavenumbers in each simulation used to train the emulator. When it came to testing the pipeline on simulated mock data with a different background evolution, we found that these noise spikes are interpreted by the emulator as sharp features in the power spectrum, artificially enhancing the emulator sensitivity to changes in cosmology. This is due to the fact that the likelihood evaluation is performed in velocity units, which require a conversion from comoving units using H(z). The pipeline would therefore try and find the H(z) that would align the noise spikes in the observed data with the noise spikes in the training set, with two consequences: when running on mock simulations with the same training seed, the pipeline is artificially sensitive to the conversion between comoving and velocity units, and therefore H(z), due to the presence of these sharp features; when running on mock simulations with a different random seed, the pipeline struggles to return the correct cosmology, as the likelihood maximization is dominated by the incentive to align the different noise features.

In order to remove residual noise in the measurements of P1D, we fit a fourth-order polynomial to the logarithm of P1D as a function of the logarithm of wavenumber, using scales k < 8 Mpc−1:

$$\mathrm{log}\,{P}_{1{\rm{D}}}(k)\approx \sum _{n=0}^{4}{c}_{n}{\left(\mathrm{log}\,k\right)}^{n}\qquad (1)$$

Instead of predicting P1D directly, the emulator now predicts the five coefficients, cn , of this polynomial, which can later be used to predict P1D on all scales. The use of fitting functions, such as polynomials or principal component analysis, is a standard practice to reduce noise in emulators, such as in the "Coyote" emulator (Lawrence et al. 2010), as well as in early analyses of P1D (McDonald et al. 2005).
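The smoothing step can be sketched with NumPy as follows. This is a minimal illustration rather than the LaCE implementation: the function names are hypothetical, natural logarithms are assumed, and the input P1D is a toy curve with added noise, not simulation output.

```python
import numpy as np

def fit_log_polynomial(k_mpc, p1d, k_max=8.0, degree=4):
    """Fit log P1D as a polynomial in log k, using only 0 < k < k_max."""
    mask = (k_mpc > 0) & (k_mpc < k_max)
    # np.polyfit returns the highest power first; reverse so that
    # coeffs[n] multiplies (log k)**n.
    return np.polyfit(np.log(k_mpc[mask]), np.log(p1d[mask]), degree)[::-1]

def eval_log_polynomial(coeffs, k_mpc):
    """Evaluate the smooth P1D(k) implied by the coefficients c_n."""
    log_k = np.log(k_mpc)
    return np.exp(sum(c * log_k**n for n, c in enumerate(coeffs)))

# Toy input: a smooth curve with 5% multiplicative noise (not simulation data).
rng = np.random.default_rng(1)
k = np.linspace(0.1, 10.0, 200)
noisy_p1d = 0.3 * k**-0.5 * (1.0 + 0.05 * rng.standard_normal(k.size))

c_n = fit_log_polynomial(k, noisy_p1d)        # five coefficients c_0 ... c_4
smooth_p1d = eval_log_polynomial(c_n, k)
```

Emulating the five numbers in `c_n` instead of the binned power keeps seed-dependent noise spikes out of the training targets.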

The variance on the emulated coefficients (${\sigma }_{{c}_{n}}^{2}$) can be used to obtain an estimate for the variance of the emulated P1D:

Equation (2)
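One simple way to turn coefficient variances into a P1D variance is first-order error propagation through the polynomial. The sketch below assumes natural logarithms and uncorrelated coefficient errors; both are assumptions made here, and the expression is not necessarily identical to Equation (2).

```python
import numpy as np

def p1d_variance(coeffs, coeff_var, k_mpc):
    """First-order variance of P1D(k) from independent coefficient variances.

    With log P = sum_n c_n (log k)^n, d(log P)/d(c_n) = (log k)^n, so
    var(log P) = sum_n var(c_n) (log k)^(2n) and var(P) ~ P^2 var(log P).
    """
    log_k = np.log(np.asarray(k_mpc, dtype=float))
    powers = np.arange(len(coeffs))
    basis = log_k[:, None] ** powers[None, :]          # shape (n_k, n_coeff)
    p1d = np.exp(basis @ np.asarray(coeffs, dtype=float))
    var_log_p1d = (basis**2) @ np.asarray(coeff_var, dtype=float)
    return p1d**2 * var_log_p1d

# Example: only the constant term is uncertain, so the fractional
# variance is the same at every wavenumber.
variance = p1d_variance([0.0, 0.0, 0.0, 0.0, 0.0],
                        [0.01, 0.0, 0.0, 0.0, 0.0],
                        [1.0, 2.0])
```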

We discuss the impact of cosmic variance on emulator predictions in Appendix B.

2.3. Mock Data

In order to test our analysis pipeline, we have generated three synthetic data sets (or mocks) for models that are not included in the training set of the emulator:

1. Central simulation. This is the simplest case, a simulation without massive neutrinos or running, with the same background expansion as was used in all training simulations, and a primordial power (As , ns ) corresponding to the center of the Latin hypercube used to set up the training set.
2. Neutrino simulation. A simulation with Σmν = 0.3 eV, where the cosmological constant (Λ) has been lowered to compensate for the increase in the total matter density. The amplitude of the primordial power is also ∼10% larger to compensate for the suppression of power caused by massive neutrinos. In Pedersen et al. (2021) we used this simulation to show that we could recover unbiased predictions in cosmologies with massive neutrinos, even when the emulator was trained exclusively with simulations with massless neutrinos.
3. Running simulation. A simulation with the same cosmology as the Central simulation, except that its primordial power spectrum has a nonzero running of αs = 0.015. The other parameters describing the primordial power (As , ns ) have been modified to compensate for the change in running and have the same linear power around the pivot scale used in the emulator (kp = 0.7 Mpc−1; see Table 1).

We start by running a pair of simulations (with inverted phases) for each of the three test models. From each of their 11 snapshots we measure P1D, in comoving units, and fit a fourth-order polynomial as described in Section 2.2 above. In order to roughly simulate the statistical power of DESI, we use a rescaled version of the Sloan Digital Sky Survey (SDSS) Data Release 14 (DR14) covariance matrix of Chabanier et al. (2019b), where all elements are divided by 5 to approximately take into account the difference in the number of spectra between SDSS DR14 and DESI. As is common in P1D measurements, the band powers presented in Chabanier et al. (2019b) are defined in velocity units. At each redshift we compute H(z)/(1 + z) using the simulation cosmology to translate these into wavenumbers in comoving units.
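The unit conversion and covariance rescaling described above can be sketched as follows. The Hubble function assumes a flat ΛCDM background with the Central values from Table 1 and neglects radiation, which is an approximation made here; the function names are illustrative.

```python
import numpy as np

def hubble_flat_lcdm(z, h=0.67, omega_m=0.316):
    """H(z) in km/s/Mpc for flat LCDM, neglecting radiation (approximation)."""
    return 100.0 * h * np.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)

def velocity_to_comoving_k(k_vel, z, **cosmo):
    """Convert a wavenumber from s/km to 1/Mpc: k_com = k_vel * H(z) / (1 + z)."""
    return k_vel * hubble_flat_lcdm(z, **cosmo) / (1.0 + z)

def rescale_covariance(cov, factor=5.0):
    """Divide every covariance element by a constant factor."""
    return np.asarray(cov, dtype=float) / factor

# The velocity-space pivot 0.009 s/km at z = 3 maps to roughly 0.69 1/Mpc,
# close to the emulator pivot of 0.7 1/Mpc.
k_com = velocity_to_comoving_k(0.009, z=3.0)
mock_cov = rescale_covariance(np.diag([1.0, 4.0]))
```

Because the conversion factor depends on H(z), the same velocity wavenumber corresponds to a different comoving wavenumber in each cosmology.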

2.4. Likelihood

We use a Gaussian likelihood, naturally decomposed into 11 independent sublikelihoods, one for each snapshot (redshift bin). The covariance matrix is the sum of the data covariance and an extra term describing the uncertainty in the emulator predictions, computed with Equation (2). The typical emulator uncertainty is smaller than 1% for models near the center of our training set, and it only has a minor impact on likelihood evaluations around the best-fit values of our analyses. However, it can be larger than 10% when evaluating the likelihood near the convex hull of our training sample.
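A minimal sketch of this likelihood is given below. It assumes the data, model, and both covariance terms are already expressed in the same units for each redshift bin; the function names and the input layout are illustrative.

```python
import numpy as np

def gaussian_log_like(data, model, data_cov, emu_cov):
    """Gaussian log-likelihood for one redshift bin.

    The total covariance is the sum of the data covariance and the
    emulator covariance. The log-determinant term is kept because the
    emulator term changes with the model being evaluated.
    """
    cov = np.asarray(data_cov) + np.asarray(emu_cov)
    resid = np.asarray(data) - np.asarray(model)
    chi2 = resid @ np.linalg.solve(cov, resid)
    _, log_det = np.linalg.slogdet(2.0 * np.pi * cov)
    return -0.5 * (chi2 + log_det)

def total_log_like(per_redshift_inputs):
    """Sum independent sub-likelihoods, one (data, model, data_cov, emu_cov) per bin."""
    return sum(gaussian_log_like(*args) for args in per_redshift_inputs)

# One-point toy example with unit total variance and a residual of 1.
example = gaussian_log_like([1.0], [0.0], [[0.5]], [[0.5]])
```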

The different subsections in Sections 3 and 4 use a different number of free cosmological parameters, including the amplitude (As ), slope (ns ), and running (αs ) of the primordial power spectrum at the usual CMB pivot scale of ks = 0.05 Mpc−1; the physical densities of baryons (ωb = Ωb h2) and of CDM (ωc = Ωc h2); the sum of the neutrino masses (Σmν ); the Hubble parameter H0; and the angular acoustic scale of the CMB (θMC).

We use four functions to describe the thermal and ionization history of the IGM: the effective optical depth as a function of redshift $\tau (z)=-\mathrm{log}\bar{F}(z)$, the thermal broadening scale (in kilometers per second) at mean densities ${\sigma }_{T}^{\mathrm{vel}}(z)$, the slope of the temperature–density relation γ(z), and the filtering/pressure scale ${k}_{F}^{\mathrm{vel}}(z)$ (in s km−1). Following Pedersen et al. (2021), we measure each of these functions from the Central simulation, and use two parameters, αX and βX , to describe a power-law rescaling for each of the four functions. Here the subscript X refers to one of the four IGM parameters. For instance, the thermal broadening scale, ${\sigma }_{T}^{\mathrm{vel}}(z)$, is parameterized as

Equation (3)

where ${\sigma }_{T}^{\mathrm{vel}}(z){| }_{\mathrm{cen}}$ is the thermal broadening scale in the Central simulation. Therefore, we use a total of eight nuisance parameters related to IGM physics.

There is no guarantee that this simple parameterization is accurate enough to do an analysis on real data, but it should be flexible enough to test the compression of the likelihood in a realistic setting. As described in Table 2, we use combined priors: each parameter is allowed to vary within a given range of values (top-hat prior) and an additional weak Gaussian prior is applied to all parameters; the actual prior is a product of the two.

Table 2. Priors Used for the Cosmological Parameters (Top), and for the Nuisance Parameters Describing the Thermal and Ionization History of the IGM (Bottom)

| Parameter | Range Allowed | Gaussian Prior |
| --- | --- | --- |
| As (×10−9) | [1.0–3.2] | ${ \mathcal N }(2.1,1.1)$ |
| ns | [0.89–1.05] | ${ \mathcal N }(0.965,0.08)$ |
| αs | [−0.8–0.8] | ${ \mathcal N }(0.0,0.8)$ |
| ωb | [0.018–0.026] | ${ \mathcal N }(0.022,0.004)$ |
| ωc | [0.10–0.14] | ${ \mathcal N }(0.12,0.02)$ |
| Σmν (eV) | [0.0–1.0] | ${ \mathcal N }(0.0,0.5)$ |
| H0 | [50–100] | ${ \mathcal N }(67.0,25.0)$ |
| θMC (×10−3) | [9.9–10.9] | ${ \mathcal N }(10.4,0.5)$ |
| aτ | [−0.1–0.1] | ${ \mathcal N }(0.0,0.05)$ |
| bτ | [−0.2–0.2] | ${ \mathcal N }(0.0,0.1)$ |
| ${a}_{{\sigma }_{T}}$ | [−0.4–0.4] | ${ \mathcal N }(0.0,0.2)$ |
| ${b}_{{\sigma }_{T}}$ | [−0.4–0.4] | ${ \mathcal N }(0.0,0.2)$ |
| aγ | [−0.2–0.2] | ${ \mathcal N }(0.0,0.1)$ |
| bγ | [−0.4–0.4] | ${ \mathcal N }(0.0,0.2)$ |
| akF | [−0.2–0.2] | ${ \mathcal N }(0.0,0.1)$ |
| bkF | [−0.4–0.4] | ${ \mathcal N }(0.0,0.2)$ |

Note. All parameters have a limited range of values allowed and a Gaussian prior.
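The combined prior can be written as a top-hat times a Gaussian for each parameter. The sketch below shows three entries from Table 2; it assumes the second argument of each Gaussian is a standard deviation, drops normalization constants, and uses illustrative dictionary keys.

```python
import numpy as np

# (lower limit, upper limit, Gaussian mean, Gaussian sigma), from Table 2.
# Treating the second Gaussian argument as a standard deviation is an assumption.
PRIORS = {
    "As_1e9": (1.0, 3.2, 2.1, 1.1),
    "ns": (0.89, 1.05, 0.965, 0.08),
    "mnu_eV": (0.0, 1.0, 0.0, 0.5),
}

def log_prior(params):
    """Unnormalized log-prior: -inf outside the allowed range, Gaussian inside."""
    total = 0.0
    for name, value in params.items():
        lower, upper, mean, sigma = PRIORS[name]
        if not lower <= value <= upper:
            return -np.inf
        total += -0.5 * ((value - mean) / sigma) ** 2
    return total
```

For example, `log_prior({"ns": 1.2})` returns `-inf` because the value lies outside the allowed range, while any value inside the range only receives the weak Gaussian penalty.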

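In the next sections we will discuss the constraints on two derived parameters that are able to capture most of the cosmological information in P1D: the (dimensionless) amplitude and slope of the linear power spectrum at a pivot point k⋆ = 0.009 s km−1 and redshift z⋆ = 3:

$${{\rm{\Delta }}}_{\star }^{2}=\frac{{k}_{\star }^{3}\,{P}_{L}({k}_{\star },{z}_{\star })}{2{\pi }^{2}}\qquad (4)$$

$${n}_{\star }={\left.\frac{d\,\mathrm{log}\,{P}_{L}(k,{z}_{\star })}{d\,\mathrm{log}\,k}\right|}_{k={k}_{\star }}\qquad (5)$$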

where PL (k, z) is the linear power spectrum in velocity units. It is important to highlight that these parameters are defined in velocity units, since P1D measurements are also presented in velocity units and parameters defined in comoving units would be model dependent.
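Given a tabulated linear power spectrum in velocity units, the two compressed parameters can be estimated numerically. The sketch below assumes the usual definitions, a dimensionless amplitude k³P/(2π²) and a logarithmic slope d ln P / d ln k, both evaluated at the pivot, and obtains them from a local quadratic fit in log-log space. The function name and the fitting window are illustrative choices.

```python
import numpy as np

def compressed_parameters(k_vel, p_lin_vel, k_star=0.009):
    """Estimate (Delta2_star, n_star) at the pivot k_star (in s/km).

    Assumed definitions: Delta2_star = k^3 P / (2 pi^2) and
    n_star = dlnP/dlnk, both at k = k_star.
    """
    x = np.log(np.asarray(k_vel)) - np.log(k_star)
    y = np.log(np.asarray(p_lin_vel))
    window = np.abs(x) < np.log(2.0)           # within a factor of 2 of the pivot
    c2, c1, c0 = np.polyfit(x[window], y[window], 2)
    delta2_star = k_star**3 * np.exp(c0) / (2.0 * np.pi**2)
    n_star = c1
    return delta2_star, n_star

# Toy power law (not a real cosmology): P = A * k**slope
k = np.logspace(-3, -1, 50)
amplitude, slope = 100.0, -2.3
delta2, n_eff = compressed_parameters(k, amplitude * k**slope)
```

For a pure power law the recovered slope equals the input slope, which gives a simple check of the implementation.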

Let us finish this section by summarizing the steps needed to make a likelihood evaluation:

1. Given a set of cosmological parameters, we use the Boltzmann solver camb (Lewis et al. 2000) to make predictions for PL (z, k) and H(z) at all redshifts and scales.
2. For each redshift zi in our mock P1D measurement, we compute the value of the amplitude (${{\rm{\Delta }}}_{p}^{2}$) and slope (np ) of the linear power, PL (zi , k), around the pivot point kp = 0.7 Mpc−1. These are two of the six parameters that will be passed to the emulator to get a prediction of P1D at zi .
3. The other four parameters ($\bar{F}$, ${\sigma }_{T}^{\mathrm{com}}$, γ, ${k}_{F}^{\mathrm{com}}$) are computed from the eight nuisance parameters and the four IGM-related functions measured from the Central simulation. For instance, we use Equation (3) to compute the thermal broadening scale (${\sigma }_{T}^{\mathrm{vel}}$) in velocity units at redshift zi , and the comoving scale passed to the emulator is ${\sigma }_{T}^{\mathrm{com}}={\sigma }_{T}^{\mathrm{vel}}(1+{z}_{i})/H({z}_{i})$.
4. For each redshift, we ask the emulator to predict the P1D corresponding to the six emulator parameters computed above. The emulator prediction is in comoving units, and we use H(zi ) to translate it to velocity units.
5. The emulator also returns an uncertainty associated with the prediction, which we add to the data covariance (after translating the emulator covariance to velocity units).
6. We use these ingredients to compute a Gaussian likelihood, and multiply it by the prior probability described above.
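The unit conversions in steps 3 to 5 above can be sketched as small helper functions. The names are hypothetical and do not correspond to the LaCE interface; the only physics used is the factor H(z)/(1 + z) that relates velocity and comoving separations.

```python
def dv_dx(hubble_z, z):
    """Conversion factor H(z)/(1+z), in km/s per comoving Mpc."""
    return hubble_z / (1.0 + z)

def igm_to_comoving(sigma_t_vel, k_f_vel, hubble_z, z):
    """Step 3: convert the IGM scales from velocity to comoving units.

    sigma_T is a length (km/s -> Mpc); k_F is an inverse length (s/km -> 1/Mpc).
    """
    factor = dv_dx(hubble_z, z)
    return sigma_t_vel / factor, k_f_vel * factor

def p1d_to_velocity(k_com, p1d_com, hubble_z, z):
    """Step 4: convert an emulated P1D from comoving to velocity units.

    Wavenumbers scale as 1/length, and P1D scales as a length.
    """
    factor = dv_dx(hubble_z, z)
    return k_com / factor, p1d_com * factor

def p1d_cov_to_velocity(cov_com, hubble_z, z):
    """Step 5: a covariance of P1D values picks up the factor squared."""
    return cov_com * dv_dx(hubble_z, z) ** 2
```

The dimensionless product k × P1D is unchanged by the conversion, which is a convenient sanity check when wiring the steps together.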

We use emcee (Foreman-Mackey et al. 2013) to run Markov Chain Monte Carlo (MCMC) chains, and GetDist (Lewis 2019) to make contour plots of the marginalized posteriors.

3. Cosmological Information in the Lyα P1D

In this section we follow the methodology described in Section 2 to fit cosmological parameters from a synthetic measurement of P1D. We refer to these as direct fits.

In Figure 1 we show the marginal constraints on cosmological parameters when analyzing mock data from the Central simulation. In the standard analysis (blue), we vary five cosmological parameters and eight nuisance parameters describing the IGM that are not shown. For comparison, the black lines show the constraints from the priors described in Table 2.

Figure 1. Direct fits to cosmological parameters from a mock P1D measurement from the Central simulation. We show the marginal posteriors on the five cosmological parameters that are being sampled. The top-right panel shows the marginal posteriors on the two derived parameters that will be used to compress the likelihood. Black lines correspond to running the analysis with only the prior, and dotted gray lines show the true values used to generate the mock. The blue contours show constraints from the P1D with all five cosmology parameters free. In the red contours, we show results where we use a template cosmology, and fix all cosmology parameters to the values in the Central simulation, except As and ns , which are kept free. We investigate the dependence of our posteriors on this choice of template in Figure 2. For concision, we omit contours for the IGM parameters.

It is clear that the Lyα P1D alone cannot measure these five cosmological parameters well, and that the results strongly depend on the choice of priors (the impact of the prior choice is discussed in Appendix A). For instance, the constraints on As are affected by the maximum value allowed by the prior, and its lower bound is a consequence of the prior on neutrino masses, Σmν , being positive.

In the top-right corner of Figure 1 we also show the marginal posteriors for the two derived parameters describing the linear power spectrum at z = 3 (Equations (4) and (5)). It is clear that adding P1D dramatically reduces the area of the prior contours. In the next sections we will refer to these as the compressed parameters, since they are able to compress most of the cosmological information contained in the Lyα P1D.

3.1. Fixed Template and Fiducial Cosmology

The red contours in Figure 1 show a simplified version of the analysis where only the primordial power parameters (As , ns ) and the eight IGM parameters are varied. In other words, we use a fixed template for the linear power, PL (z, k), and rescale it with these two parameters. This analysis is significantly faster than the standard analysis, since we only need to call camb a single time to compute the transfer function for the fiducial cosmology.
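The template rescaling can be illustrated in a few lines. With the transfer function held fixed, changing As multiplies the linear power by a constant, and changing ns tilts it around the primordial pivot scale. The function below is a sketch under that assumption, with illustrative argument names and the CMB pivot of 0.05 Mpc−1.

```python
import numpy as np

def rescale_template(k_mpc, p_template, as_new, ns_new, as_fid, ns_fid, k_s=0.05):
    """Rescale a fiducial linear power spectrum to new (As, ns).

    Assumes a fixed transfer function, so that
    P_new(k) = P_fid(k) * (As_new / As_fid) * (k / k_s)**(ns_new - ns_fid).
    """
    k_mpc = np.asarray(k_mpc, dtype=float)
    tilt = (k_mpc / k_s) ** (ns_new - ns_fid)
    return np.asarray(p_template, dtype=float) * (as_new / as_fid) * tilt

# Toy template (flat in k) to show the effect of each parameter.
k = np.array([0.05, 0.5, 5.0])
template = np.ones_like(k)
brighter = rescale_template(k, template, 2.2e-9, 0.965, 2.0e-9, 0.965)
tilted = rescale_template(k, template, 2.0e-9, 1.065, 2.0e-9, 0.965)
```

Since the Boltzmann solver is only needed for the template, each likelihood evaluation reduces to this multiplication plus the emulator call.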

The template analysis can be seen as an analysis with infinitely tight priors on the other cosmological parameters (ωc , H0, Σmν ). While the constraints on the traditional cosmological parameters (As , ns ) are strongly affected by this change in the priors, the constraints on the compressed parameters (${{\rm{\Delta }}}_{\star }^{2}$, n⋆; top-right panel) remain the same.

In this particular realization of the analysis, we have used a template computed with the same cosmology that was used to run the simulation. Even in the standard analysis (blue contours in Figure 1) we had to assume a value for the baryon density (ωb = 0.022). We will use the term fiducial cosmology to refer to the cosmological parameters that are being kept fixed in the analysis. Obviously in a real analysis the true cosmology is not known, and so we next test the effect of changing this fiducial cosmology on our results.

In the top panels of Figure 2 we redo the template analysis when using different fiducial cosmologies, with the wrong CDM density (in red) or the wrong sum of the neutrino masses (in blue). While there is a clear bias on the primordial power parameters (left), the compressed parameters are much less affected by the choice of fiducial cosmology.

Figure 2. Top panels show marginal constraints on the primordial power parameters (left) and on the compressed parameters (right), when analyzing the Central simulation with different fiducial cosmologies. The fiducial cosmology in the default analysis is the same one that was used to run the Central simulation, and stars mark the true value used in the simulation. When using a different fiducial cosmology, with an incorrect value of the CDM density (red) or neutrino masses (blue), we get biased constraints on primordial power parameters. On the other hand, the constraints on the compressed parameters are much less affected by the choice of fiducial cosmology. The bottom panels show equivalent constraints for the three test simulations, when analyzed with the Central cosmology as fiducial. Note that the Central (black) and Running (blue) simulations have the same values for the compressed parameters, but very different values for the primordial power spectrum (including a different value of the running αs ).

The bottom panels of the same figure show a template analysis for the three test simulations described in Table 1. In all three analyses we use the Central cosmology as our fiducial cosmology. As can be seen in the bottom-left panel, this results in biased posteriors for the primordial power parameters in the Neutrino and Running simulations (stars identify the true values used in each simulation). However, the marginal posteriors of the compressed parameters are again recovered successfully (bottom-right panel). These marginal posteriors in the bottom-right panel will be used in the next section.

4. Joint Analysis with the Cosmic Microwave Background

In the previous sections we discussed cosmological fits from the Lyα P1D alone, with only weak priors on cosmological parameters. We showed that we can measure the amplitude (${{\rm{\Delta }}}_{\star }^{2}$) and slope (n⋆) of the linear power spectrum around z = 3 and k⋆ = 0.009 s km−1 very well, and that the constraints on these compressed parameters are unbiased and do not depend on our choice of priors or fiducial cosmology.

In this section we discuss joint cosmological analysis with anisotropies in the CMB. CMB and P1D measurements are highly complementary, since together they cover a very wide range of scales and redshifts. This has made these joint analyses very popular in the past (Phillips et al. 2001; Spergel et al. 2003; Verde et al. 2003; Seljak et al. 2005, 2006; Palanque-Delabrouille et al. 2015a, 2015b, 2020), and they are forecast to provide some of the tightest constraints on the sum of the neutrino masses and on the running of the spectral index from future surveys (Font-Ribera et al. 2014).

Instead of using an actual CMB likelihood, for simplicity we use a Gaussian likelihood on the relevant cosmological parameters. The Gaussian likelihood uses a covariance matrix obtained from the official Planck chains. The center of the Gaussian has been set to the values used in the different test simulations described in Section 2. The approximated CMB likelihood is shown as solid black contours in Figure 3 (free neutrino mass) and Figure 4 (free running).
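Such an approximate CMB likelihood amounts to a multivariate Gaussian in parameter space. A minimal sketch follows, with a made-up two-parameter covariance standing in for the one estimated from the Planck chains; the function and argument names are illustrative.

```python
import numpy as np

def gaussian_cmb_log_like(theta, theta_mock, param_cov):
    """Gaussian log-likelihood in parameter space, up to an additive constant.

    theta_mock is the center (the values used in the mock simulation) and
    param_cov is the parameter covariance matrix.
    """
    diff = np.asarray(theta, dtype=float) - np.asarray(theta_mock, dtype=float)
    return -0.5 * diff @ np.linalg.solve(np.asarray(param_cov, dtype=float), diff)

# Placeholder numbers for illustration only (not Planck values).
center = [2.1, 0.965]
cov = np.eye(2)
at_center = gaussian_cmb_log_like(center, center, cov)
offset = gaussian_cmb_log_like([3.1, 2.965], center, cov)
```

The normalization is omitted because the covariance is fixed, so it does not affect the sampling.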

Figure 3. Cosmological constraints from CMB + Lyα P1D, for mock data from the Neutrino simulation. Blue contours use a direct P1D likelihood, while red contours use the marginal posterior on linear power parameters (${{\rm{\Delta }}}_{\star }^{2}$, n⋆). Black contours show the CMB-only results, with the gray dashed lines representing the values in the mock simulation. The primordial power is assumed to have no running in this analysis.

The results in Figure 3 are from a joint analysis of the CMB and our mock P1D from the Neutrino simulation, when varying six cosmological parameters (As , ns , ωb = Ωb h2, ωc = Ωc h2, Σmν , and θMC), with the priors described in Table 2. Even though we sample θMC, we plot the contours for H0, computed as a derived parameter.

The blue contours show a joint fit using the direct P1D likelihood, i.e., we have varied at the same time the cosmological parameters and the eight nuisance parameters that were also used in Section 3 to describe the uncertainties in the physics of the IGM.

The red contours, on the other hand, use the marginal posterior on the linear power parameters (${{\rm{\Delta }}}_{\star }^{2}$, n) obtained from the Lyα P1D alone. In more detail, to obtain the red contours we do the following:

  • 1.  
    Run a template fit to the Lyα P1D alone, varying eight IGM parameters and two cosmological parameters (As , ns ), as described in Section 3.
  • 2.  
    Use a kernel density estimator (from SciPy; Virtanen et al. 2020) to model the marginal posteriors on the two compressed parameters (${{\rm{\Delta }}}_{\star }^{2}$, n), shown in the bottom-right panel of Figure 2.
  • 3.  
    Run a joint analysis of the CMB and the marginal Lyα P1D posterior, varying six cosmological parameters (As , ns , ωb , ωc , Σmν , and θMC). It is important to note that in this last step one does not need to use an emulator or worry about the nuisance parameters describing the IGM; these have already been marginalized over in the previous steps.
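The three steps above can be sketched as follows. This is a minimal illustration, not the analysis code used in this work: the samples standing in for the chain from step 1 are drawn from a made-up Gaussian, and `cmb_log_like` and `to_compressed` are hypothetical callables (the latter would require a Boltzmann code to compute the linear power for a given cosmology). Only `scipy.stats.gaussian_kde` is taken from the text.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Stand-in for the chain from step 1: samples of the two compressed
# parameters (amplitude, slope). Values are illustrative only.
mean = np.array([0.35, -2.30])
cov = np.array([[4.0e-4, -2.0e-5],
                [-2.0e-5, 2.5e-5]])
samples = rng.multivariate_normal(mean, cov, size=5000)

# Step 2: gaussian_kde expects an array of shape (n_dims, n_samples).
kde = gaussian_kde(samples.T)

def compressed_log_like(delta2_star, n_star):
    """Log-density of the marginal P1D posterior at a single point."""
    return kde.logpdf(np.array([[delta2_star], [n_star]]))[0]

def joint_log_post(cosmo_params, cmb_log_like, to_compressed):
    """Step 3: sum of the CMB term and the compressed P1D term.

    cmb_log_like and to_compressed are hypothetical user-supplied callables.
    """
    delta2_star, n_star = to_compressed(cosmo_params)
    return cmb_log_like(cosmo_params) + compressed_log_like(delta2_star, n_star)
```

Note that `joint_log_post` never touches the IGM nuisance parameters or the emulator; all of that information is already encoded in the kernel density estimate.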

It is remarkable that in both cases we recover the true value for the sum of the neutrino masses, even though our emulator was constructed from simulations that assume massless neutrinos. It is also remarkable how similar the joint constraints are when using the direct (blue) and compressed (red) likelihoods. This implies that there is negligible loss of cosmological information when compressing the P1D into marginal constraints on the linear power spectrum.

In Figure 4 we present a similar analysis for the Running simulation, where we have assumed that the neutrinos are massless but where we have explored models with running of the spectral index αs . Here again we recover the right cosmology, and both approaches give very consistent results.

Figure 4. Same as Figure 3, except now using the mock data from the Running simulation. Neutrino masses are fixed to 0 in this analysis.

The effect on posteriors of including Lyα forest information is slightly different in Figures 3 and 4. While we do not include a figure for this, in the case of a simple flat ΛCDM model the constraints from the CMB alone on ${{\rm{\Delta }}}_{\star }^{2}$ and n are already very good, and the Lyα forest does not provide a significant improvement. It is in the analysis of extended models, such as the two we consider in this work, that the contribution from the Lyα forest becomes important. In the case of free neutrino mass, the improvement is mostly in ${{\rm{\Delta }}}_{\star }^{2}$. This is because free neutrino mass opens up a degeneracy in the amplitude of the late-time power spectrum obtained from the CMB-only analysis, and including Lyα forest information breaks this degeneracy. For the case of free αs , the information from a smaller pivot scale is essential in breaking the degeneracy between αs and n.

5. Discussion

In Section 3 we have shown that the Lyα P1D can robustly measure two parameters describing the amplitude and slope of the linear power spectrum at a central redshift z = 3, and around a pivot point k = 0.009 s km−1 defined in velocity units. We have shown that we recover unbiased results independent of the fiducial cosmology assumed in the fits, even when analyzing models that were not included in the training of our LaCE emulator.

In Section 4 we have shown that, in the context of joint analyses with CMB data, the cosmological information in the Lyα P1D can be captured with the marginalized posteriors of these two parameters. We have explicitly shown that this is the case for the two single-parameter extensions to the ΛCDM model where the P1D is forecasted to contribute the most (Font-Ribera et al. 2014): models with massive neutrinos (Figure 3) and models with running of the spectral index of primordial fluctuations (Figure 4).

The compression is successful as we are able to approximate the expansion and growth rates in the 2 < z < 5 regime using a fixed fiducial model, due to the fact that the universe is close to Einstein–de Sitter (EdS) in this regime. At significantly higher data precision, one would expect this approximation to break down, and therefore the compression to fail. In this case, including the extended parameters discussed in Appendix C might capture the missing information. However, the data covariance we use in this work will not be surpassed by any current or proposed experiment, so we leave quantifying the regime in which the compression fails to a future work.

Exotic cosmological models might require more complex implementations of the emulation and compression schemes discussed in this work. For instance, models with either warm or fuzzy dark matter predict that the linear power spectrum could be strongly suppressed on submegaparsec scales, and the Lyα forest has provided some of the tightest constraints on these models (Viel et al. 2013; Iršič et al. 2017b, 2017a; Murgia et al. 2018; Palanque-Delabrouille et al. 2020; Rogers & Peiris 2021b). In order to use the LaCE emulator in these studies, one would need to add extra emulator parameters describing the suppression of the linear power, and run extra simulations exploring them. Equivalently to ${{\rm{\Delta }}}_{\star }^{2}$ and n, one would need to define other compressed parameters to capture the relevant information present in the P1D likelihood. Since the P1D measurements are naturally carried out in velocity units, these extra parameters would also need to be defined in velocity units, otherwise the cutoff scale would depend on the assumed model of the expansion rate H(z).

In the next few years, DESI will measure the Lyα P1D with unprecedented accuracy, enabling very precise constraints on the linear power spectrum of matter fluctuations around z = 3. We expect that the compression scheme discussed here will significantly increase the impact of these measurements and simplify joint analyses with external data sets.

The authors thank Patrick McDonald, Pablo Lemos Portela, and the Lyα working group of DESI for useful discussions. C.P. acknowledges support by NASA ROSES grant No. 12-EUCLID12-0004. A.F.R. acknowledges support from the Spanish Ministry of Science and Innovation through the program Ramon y Cajal (RYC-2018-025210) and from the European Union's Horizon Europe research and innovation program (COSMO-LYA, grant agreement 101044612). IFAE is partially funded by the CERCA program of the Generalitat de Catalunya. This manuscript has been coauthored by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy, Office of Science, Office of High Energy Physics. This work was partially supported by the Visiting Scholars Award Program of the Universities Research Association, and by funding from the UCL Cosmoparticle Initiative. This work used computing facilities provided by the UCL Cosmoparticle Initiative. The simulations were run using the Cambridge Service for Data Driven Discovery (CSD3), part of which is operated by the University of Cambridge Research Computing on behalf of the STFC DiRAC HPC Facility (www.dirac.ac.uk). The DiRAC component of CSD3 was funded by BEIS capital funding via STFC capital grants ST/P002307/1 and ST/R002452/1 and STFC operations grant ST/R00689X/1. DiRAC is part of the National e-Infrastructure.

Appendix A: Impact of Prior Choice

The results presented in the main text used a Gaussian prior, described in Table 2. In Figure 5 we demonstrate that the marginalized posteriors for the compressed parameters are not affected by this prior.

Figure 5. Marginalized 1D and 2D posterior distributions on compressed parameters, corresponding to analyses of the Central mock data. In blue we show the constraints from a direct P1D analysis using the loose Gaussian priors, and in red we show the constraints from an equivalent template fit (fixed values for ωc , H0, and Σmν ); these 2D contours were already presented in the top-right panel of Figure 1. The black (green) dotted contours show the constraints from a direct (template) fit when not using any Gaussian prior, and demonstrate that the role of the Gaussian prior on the compressed constraints is very minor.

We start by showing the results from a direct analysis (blue contours) and a template analysis (red contours) that include the Gaussian prior; these are the contours already presented in the top-right panel of Figure 1. These can be compared, respectively, to the black and green dotted contours, where we have not included the Gaussian prior.

Appendix B: Impact of Cosmic Variance in the Emulator Predictions

In the main text we have analyzed simulations that had initial conditions generated with the same random phases as the simulations used to train the LaCE emulator. In order to study the impact of cosmic variance in the emulator predictions, in Figure 6 we show the results when analyzing a test simulation, diff seed (red contours), that has the same physics as the Central simulation (blue contours) but different random phases in the initial conditions.

Figure 6. Marginalized 1D and 2D posterior distributions on compressed parameters, corresponding to template fits to the Central mock data (in blue) discussed in Figure 1, and fits to mock data with different random phases (diff seed, in red). The solid lines show the constraints when using the polyfit framework used in the main text, where we emulate the coefficients of polynomial fits to P1D. The dotted lines, on the other hand, use the k_bin framework that was used in Pedersen et al. (2021), where we emulate the value of P1D on a grid of wavenumbers. While both frameworks give consistent results when analyzing the Central simulation, it is clear that the k_bin emulator gives biased results when analyzing simulations with different random phases (dotted red contours).

In the same figure we also compare the results when using two different implementations of the LaCE emulator: the polyfit framework (solid lines), used in the main text, emulates the value of the coefficients of polynomial fits describing the Lyα P1D (Equation (1)); the k_bin framework (dotted lines), used in Pedersen et al. (2021), directly emulates the value of Lyα P1D on a fine grid of wavenumbers.

It is clear that the k_bin emulator gives biased results, probably because it is trying to fit different noise spikes than the ones used in the training sample. On the other hand, the polyfit emulator is able to give unbiased results even when analyzing mock data with different cosmic variance.

Appendix C: Extended Compression Schemes

In Section 3 we have proposed to compress the cosmological information in P1D into two parameters describing the amplitude (${{\rm{\Delta }}}_{\star }^{2}$) and slope (n) of the linear power spectrum at z = 3, around a pivot point k = 0.009 s km−1. We have shown in Section 4 that this compression is lossless in the context of joint analyses with the CMB with free neutrino masses (Σmν ), or free running of the primordial power spectrum (αs ). In Section 5 we mentioned that one might need to add extra parameters describing the shape of the linear power at ${z}_{\star }$: for instance, a third parameter describing the curvature around the pivot point (McDonald et al. 2005), or a cutoff to describe the small-scale suppression in non-CDM models. In this appendix, instead, we discuss possible extensions to capture other cosmological information beyond the shape of the linear power at ${z}_{\star }$.

Measurements of the Lyα P1D typically cover a wide range of redshifts. For instance, Chabanier et al. (2019b) measured P1D from z = 2.2 to z = 4.6. It might seem surprising that we can capture all the cosmological information when parameterizing the linear power spectrum at a single redshift z = 3. Moreover, while the shape of the linear power spectrum is constant when described in comoving units, the same is not true when described in velocity units. Our pivot scale k corresponds to different comoving separations at different redshifts, and one could imagine measuring H(z)/(1 + z) from the redshift evolution of the shape of the linear power in velocity units.

In order to capture information from these two effects, we introduce two extra parameters. We parameterize the growth of structure around ${z}_{\star }$ with the logarithmic growth rate $f=f({z}_{\star })$, defined as usual:

Equation (C1)

with f = 1 in an EdS universe.

Similarly, we parameterize the evolution of the expansion rate around ${z}_{\star }$ in terms of $g=g({z}_{\star })$, defined as

Equation (C2)

such that g = 1 corresponds again to an EdS universe.
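To make these two parameters concrete, the sketch below evaluates them numerically for a flat ΛCDM background with matter and a cosmological constant only. Since the defining equations are not reproduced here, the forms used are assumptions: f = d ln D / d ln a (the usual logarithmic growth rate) and g = d ln H / d ln (1 + z)^{3/2}, chosen because both equal 1 in an EdS universe, as stated in the text. All function names and the value of Ωm are hypothetical.

```python
import numpy as np

def hubble_ratio(z, omega_m):
    """E(z) = H(z)/H0 for flat LCDM (matter + cosmological constant)."""
    return np.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))

def growth_factor(z, omega_m, n_steps=20001):
    """Unnormalized growth factor, D(a) ∝ H(a) * int_0^a da' / (a' H(a'))^3."""
    a = np.linspace(0.0, 1.0 / (1.0 + z), n_steps)
    # 1 / (a E(a))^3 rewritten so that it is finite at a = 0.
    integrand = a ** 1.5 / (omega_m + (1.0 - omega_m) * a ** 3) ** 1.5
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(a))
    return hubble_ratio(z, omega_m) * integral

def log_growth_rate(z, omega_m, h=1e-3):
    """Assumed definition: f = d ln D / d ln a (central finite difference)."""
    ln_a = -np.log1p(z)
    z_hi_a = np.expm1(-(ln_a + h))  # redshift at slightly larger scale factor
    z_lo_a = np.expm1(-(ln_a - h))  # redshift at slightly smaller scale factor
    d_ln_D = np.log(growth_factor(z_hi_a, omega_m)) - np.log(growth_factor(z_lo_a, omega_m))
    return d_ln_D / (2.0 * h)

def expansion_rate_slope(z, omega_m, h=1e-3):
    """Assumed definition: g = d ln H / d ln (1+z)^(3/2)."""
    ln_1pz = np.log1p(z)
    z_hi = np.expm1(ln_1pz + h)
    z_lo = np.expm1(ln_1pz - h)
    d_ln_H = np.log(hubble_ratio(z_hi, omega_m)) - np.log(hubble_ratio(z_lo, omega_m))
    return d_ln_H / (1.5 * 2.0 * h)

print(log_growth_rate(3.0, 0.3), expansion_rate_slope(3.0, 0.3))
```

Under these assumed definitions, setting `omega_m=1.0` reproduces the EdS limit f = g = 1, and for flat ΛCDM g reduces to the matter fraction at redshift z.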

C.1. Template Fits with f and g

Instead of looking at posteriors of f and g computed as derived parameters in fits for a particular model, we would like to directly sample these without assuming any cosmological model. For instance, in a ΛCDM universe, without curvature or massive neutrinos, both f and g would be just a function of Ωm . However, more exotic models could decouple the linear growth from the expansion of the universe, making these parameters independent.

Therefore, in this appendix we directly sample the four compressed parameters (${{\rm{\Delta }}}_{\star }^{2}$, n, f, and g) and the same eight nuisance parameters used in the main text to model the IGM. We use a uniform prior range of [0.24, 0.47], [−2.352, −2.25], [0.9, 1.0], and [0.9, 1.0] for each parameter, respectively. The procedure is described in Section C.2.

In Figure 7 we show constraints on compressed parameters, after marginalizing over the IGM. We use as a mock data set the Neutrino simulation, and show two sets of constraints. In red, we have fixed f and g to the values of the fiducial cosmology (f = 0.981, g = 0.968), whereas in blue they are left as free parameters. The dashed lines show the true values in the mock simulation (f = 0.989, g = 0.969). We note that f is very poorly constrained, implying that the P1D alone is not highly sensitive to the redshift evolution of the linear power spectrum. This result is consistent with the findings of McDonald et al. (2005), although we confirm that this is still the case when using high-precision data sets. The posterior for g is slightly better constrained, although it can only rule out very low values of g < 0.9. Additionally, there is very little effect on the posteriors for ${{\rm{\Delta }}}_{\star }^{2}$ and n when marginalizing over f and g when compared to fixing them.

Figure 7. Marginalized 1D and 2D posterior distributions on compressed parameters when analyzing the mock data set from the Neutrino simulation. In the red contours we fix f and g to the fiducial values described in the text. The dashed lines show the true values in the mock simulation, and the shaded areas of the 1D posteriors show the 68% credible region.

Note that the red contours were constructed assuming the wrong background cosmology (wrong values of f and g), but that the constraints on ${{\rm{\Delta }}}_{\star }^{2}$ and n are nevertheless unbiased.

C.2. Reconstructing the Linear Power Spectrum

Here we describe the procedure for mapping from a set of values for the compressed parameters (${{\rm{\Delta }}}_{\star }^{2}$, n, f, and g) to the 11 pairs of emulator parameters (${{\rm{\Delta }}}_{p}^{2}$, np ), one pair at each redshift, that are required to generate theoretical predictions for the P1D from the emulator. This is done using a fiducial cosmology, as outlined below. We will use k to refer to the (modulus of the) 3D wavenumbers in comoving coordinates, i.e., in inverse megaparsecs. We will use q to refer to the same wavenumber in velocity units, i.e., in s km−1. They are related by

Equation (C3)

M(z) will play an important role in this discussion.

We will use P(k) to refer to (3D) power spectra in comoving units, i.e., in units of cubic megaparsecs. We will use Q(q) to refer to (3D) power spectra in velocity units, i.e., with units of (km s−1)3. They are related by

Equation (C4)
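The unit conversion between (k, P) and (q, Q) can be sketched as below. The relations used, q = k / M(z) and Q(q) = M(z)^3 P(k) with M(z) = H(z)/(1 + z) in km s−1 Mpc−1, are assumptions that follow from dimensional analysis and from the appearance of H(z)/(1 + z) earlier in this appendix; they are not copied from the equations above. The flat ΛCDM background and the parameter values are illustrative only.

```python
import numpy as np

def M_of_z(z, H0=67.0, omega_m=0.32):
    """Assumed M(z) = H(z)/(1+z) in km/s/Mpc for flat LCDM (illustrative values)."""
    Hz = H0 * np.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)
    return Hz / (1.0 + z)

def comoving_to_velocity(k, P, z, **cosmo):
    """(k [1/Mpc], P [Mpc^3]) -> (q [s/km], Q [(km/s)^3])."""
    M = M_of_z(z, **cosmo)
    return k / M, P * M ** 3

def velocity_to_comoving(q, Q, z, **cosmo):
    """(q [s/km], Q [(km/s)^3]) -> (k [1/Mpc], P [Mpc^3])."""
    M = M_of_z(z, **cosmo)
    return q * M, Q / M ** 3

k = np.array([0.3, 0.7, 1.5])        # 1/Mpc, arbitrary test wavenumbers
P = np.array([900.0, 250.0, 60.0])   # Mpc^3, arbitrary test values
q, Q = comoving_to_velocity(k, P, z=3.0)
```

A useful sanity check on any such conversion is that the dimensionless power, k^3 P(k) / (2π^2), is unchanged when written as q^3 Q(q) / (2π^2).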

In our code we will use a fiducial cosmology as a reference, and parameterize our models as deviations from that cosmology. We will use either subscripts "0" or superscripts "0" to identify functions for the fiducial cosmology. We address changes to the shape of the power spectrum, as described by ${{\rm{\Delta }}}_{\star }^{2}$, n (and α) first, and then later address changes to the redshift evolution using f and g. We can now define the ratio of the linear power between any model and the fiducial one, at the central redshift ${z}_{\star }$, and in velocity units:

Equation (C5)

This will be another important function, tightly related to the linear power parameters that we will end up using.

We fit a second-order polynomial to the logarithm of the linear power spectrum at ${z}_{\star }$, in velocity units, around a pivot point ${q}_{\star }$. By default we use ${z}_{\star }=3$ and ${q}_{\star }=0.009$ s km−1, and we fit the polynomial in a range of wavenumbers defined as ${q}_{\star }/2\lt q\lt 2{q}_{\star }$: 19

Equation (C6)

or, equivalently,

Equation (C7)

n is the first log-derivative around ${q}_{\star }$, and α is the second log-derivative around the same point. Note that the polynomial fit, however, returns ($\mathrm{ln}A$, n, α/2). Finally, we define a dimensionless parameter describing the amplitude, ${{\rm{\Delta }}}_{\star }^{2}\,=A\,{q}_{\star }^{3}/(2{\pi }^{2})$. When reconstructing the linear power spectrum using a fiducial cosmology, we use differences in the shape parameters with respect to the fiducial ones:
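The fit described above can be sketched with `numpy.polyfit`, the routine named in the footnote. The functional form assumed here, ln Q(q) ≈ ln A + n ln(q/q⋆) + (α/2) ln²(q/q⋆), is inferred from the statement that the fit returns (ln A, n, α/2); the function name and the synthetic input are hypothetical.

```python
import numpy as np

def fit_linear_power_shape(q, Q, q_star=0.009):
    """Fit a second-order polynomial to ln Q versus ln(q/q_star).

    Returns (delta2_star, n, alpha), using q_star/2 < q < 2 q_star.
    """
    mask = (q > 0.5 * q_star) & (q < 2.0 * q_star)
    x = np.log(q[mask] / q_star)
    y = np.log(Q[mask])
    # numpy.polyfit returns the highest-order coefficient first,
    # i.e. (alpha/2, n, ln A) for the form assumed above.
    half_alpha, n, ln_A = np.polyfit(x, y, deg=2)
    delta2_star = np.exp(ln_A) * q_star ** 3 / (2.0 * np.pi ** 2)
    return delta2_star, n, 2.0 * half_alpha

# Synthetic input with known shape parameters (illustrative values only).
q_star = 0.009
q = np.logspace(-3, -1, 200)
A_true = 0.35 * 2.0 * np.pi ** 2 / q_star ** 3
x_all = np.log(q / q_star)
Q = A_true * np.exp(-2.3 * x_all + 0.5 * (-0.2) * x_all ** 2)

print(fit_linear_power_shape(q, Q, q_star))
```

Because the synthetic input is exactly quadratic in log space, the fit recovers the input amplitude, slope, and curvature to numerical precision; for a realistic linear power spectrum the result depends mildly on the fitting range.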

Equation (C8)

We are also concerned with reconstructing the linear power spectrum at redshifts other than ${z}_{\star }$. We ignore neutrinos for now, and work with just the CDM+baryon power spectrum. In this case, we can use the linear growth factor D(z), defined as

Equation (C9)

where, in general, we write ${y}_{\star }=y({z}_{\star })$ for any function y(z). We can write the power spectrum at an arbitrary redshift as a function of the fiducial one:

where for convenience we have defined two functions:

Equation (C10)

and

Equation (C11)

which describe differences in expansion rate and in linear growth, respectively.

Using the definition of g in Equation (C2), we approximate m(z) using the difference of g between the input and the fiducial cosmology as

Equation (C12)

or, equivalently,

Equation (C13)

Similarly, we approximate d(z) using the difference of f between the input and the fiducial cosmology as

Equation (C14)

or, equivalently,

Equation (C15)

With these equations, for a given set of (${{\rm{\Delta }}}_{\star }^{2}$, n, α, f, g), Q(z, q) can be estimated. We then use the approximation of m(z) to convert the velocity-unit power spectrum to a comoving power spectrum, and fit a polynomial over the range kp/2 < k < 2kp to obtain values for ${{\rm{\Delta }}}_{p}^{2}$ and np . Note that the emulator returns a P1D in comoving units. The final step is to convert this into velocity units, once again using the above approximation of m(z). This reconstruction process and the composite approximations have been compared against the true values generated in camb, and we verified that they are accurate to within the percent level across all redshifts and extended model spaces considered in this paper.

Footnotes

  • 8  

    This also makes the Lyα forest, especially at z > 5, a key probe of reionization, but we do not discuss this in this work.

  • 9  
  • 10  
  • 11  
  • 12  
  • 13  

    We use the Python implementation GPy (GPy 2012).

  • 14  

    This setting performed better than third- or fifth-order polynomials, but we did not explore other functions.

  • 15  

    A more detailed forecast should also take into account the differences in pixelization, spectral resolution, and signal-to-noise, but we leave this for future work.

  • 16  

    This pivot scale was found in McDonald et al. (2005) to be optimal for their data set, but it might be suboptimal for other surveys.

  • 17  

    This term is commonly used in redshift-space distortion analyses of galaxies to refer to analyses with fixed transfer functions (Alam et al. 2021).

  • 18  

    For chains with massive neutrinos, we have computed the covariance around its best-fit value (Σmν = 0) and not around its mean.

  • 19  

    We do the fit using numpy.polyfit.
