CALIBRATING PHOTOMETRIC REDSHIFT DISTRIBUTIONS WITH CROSS-CORRELATIONS

Published 2010 November 12 © 2010. The American Astronomical Society. All rights reserved.
, , Citation A. E. Schulz 2010 ApJ 724 1305 DOI 10.1088/0004-637X/724/2/1305

0004-637X/724/2/1305

ABSTRACT

The next generation of proposed galaxy surveys will increase the number of galaxies with photometric redshift identifications by two orders of magnitude, drastically expanding both the redshift range and detection threshold from the current state of the art. Obtaining spectra for a fair subsample of these new data could be cumbersome and expensive. However, adequate calibration of the true redshift distribution of galaxies is vital to tapping the potential of these surveys to illuminate the processes of galaxy evolution and to constrain the underlying cosmology and growth of structure. We examine here an alternative to direct spectroscopic follow-up: calibration of the redshift distribution of photometric galaxies via cross-correlation with an overlapping spectroscopic survey whose members trace the same density field. We review the theory, develop a pipeline to implement the method, apply it to mock data from N-body simulations, and examine the properties of this redshift distribution estimator. We demonstrate that the method is generally effective, but the estimator is weakened by two main factors. One is that the correlation function of the spectroscopic sample must be measured in many bins along the line of sight, which renders the measurement noisy and interferes with high-quality reconstruction of the photometric redshift distribution. Also, the method is not able to disentangle the photometric redshift distribution from redshift dependence in the bias of the photometric sample. We establish the impact of these factors using our mock catalogs. We conclude that it may still be necessary to spectroscopically follow up a fair subsample of the photometric survey data. Nonetheless, it is significant that the method has been successfully implemented on mock data, and with further refinement it may appreciably decrease the number of spectra that will be needed to calibrate future surveys.

Export citation and abstract BibTeX RIS

1. INTRODUCTION

It is essential to calibrate the true redshift distribution of galaxies in a photometric survey if the survey is to be utilized to its full potential. One application of survey data that requires a detailed understanding of the distribution of galaxies is weak gravitational lensing tomography. The shearing of the shapes of distant galaxies via weak gravitational lensing is a powerful cosmological probe that can be used to study the distribution of dark matter, the nature of dark energy, and the formation of large-scale structures in the universe, as well as fundamental properties of elementary particles and potential modifications to the general theory of relativity (recent studies include Hoekstra & Jain 2008; Thomas et al. 2009; Kilbinger et al. 2009; Ichiki et al. 2009). Cosmic shear measurements statistically examine minute distortions in the orientations of high-redshift galaxies, whose shapes have been sheared by intervening dark matter structures. Although weak gravitational lensing provides only an integrated measure of the intervening density field, using source populations at different redshifts permits some degree of three-dimensional (3D) reconstruction, known as tomography. The distortions are small (at the 1% level) and the intrinsic orientation of the source galaxies is unknown, thus large galaxy samples are required to map the density field and probe the growth of density fluctuations with precision. Existing cosmic shear measurements have already constrained the amplitude of the dark matter fluctuations at the 10% level (Heymans et al. 2004; Rhodes et al. 2004; Massey et al. 2005; Semboloni et al. 2006) and there are many exciting galaxy survey proposals that will increase the available number of source galaxies by two orders of magnitude including DES, DUNE, Euclid, LSST, PanStarrs, SNAP, and Vista.

Because these large galaxy surveys will have photometric rather than spectroscopic redshift identifications, the community has carefully attended to fine tuning the calibration of the photometric redshifts, minimizing biases and catastrophic errors (Mandelbaum et al. 2008; Lima et al. 2008; Freeman et al. 2009; Xia et al. 2009; Gerdes et al. 2010). Unlike experiments that use the galaxy positions to directly trace the underlying dark matter distribution, such as baryon acoustic oscillation studies, weak lensing analyses do not require a precise redshift identification for each individual source galaxy. It is sufficient to accurately determine the redshift distribution of the sources. However, lensing measurements are extremely sensitive both to error and bias in the source distribution Huterer (2002). Attaining an accurate source distribution will be crucial if weak lensing measurements are to be competitive with other cosmological probes in constraining the cosmological parameters.

Another example where calibration of the true distribution of galaxies may be essential is in using the abundances and clustering of different galaxy populations to connect galaxies at late times to their potential progenitors at early times (as in, e.g., Conroy & Wechsler 2009). Such studies also utilize the luminosity functions of galaxies in each redshift slice and sometimes divide these into different rest-frame color bins. To avoid potential systematics in inferences made about galaxy evolution, it will be necessary to know if some fraction of the population in a given photometric redshift bin is actually living at a different redshift, especially if there is an asymmetry in such errors that depends on color.

Recently, an alternative approach to attaining an accurate source redshift distribution has been proposed in Newman (2008). This method is similar to cross-correlation techniques used in Phillipps (1985) and Masjedi et al. (2006), and the idea has also been studied theoretically in Schneider et al. (2006) and Bernstein & Huterer (2010). A similar technique was used in Erben et al. (2009) to check the redshift distribution for interlopers. Similar in spirit, the analysis of Quadri & Williams (2009) uses close angular pairs of photometric galaxies to constrain the photometric errors without the use of a spectroscopic sample.

The cross-correlation method determines the photometric redshift distribution by utilizing the cross-correlation of the galaxies in the photometric sample with an overlapping spectroscopic sample that traces the same underlying density field. One advantage of this approach is that the spectroscopic sample used to calibrate the photometric redshift distribution can be comprised of bright rare objects such as quasars or luminous red galaxies (LRGs) whose spectra are relatively easy to obtain, and indeed may already exist in legacy data. Spectra could also be obtained for emission line galaxies (ELGs), which are easy to follow up but may not represent a fair subsample. Another advantage is that catastrophic redshift errors in the photometry do not systematically bias the redshift distribution estimate, they merely contribute to the noise.

The cross-correlation method makes use of two observables, the line-of-sight projected angular cross-correlation between the photometric and spectroscopic samples wps(θ) and the 3D autocorrelation function of the spectroscopic sample ξss(r). By postulating a simple proportionality between the autocorrelation function of the spectroscopic objects and the 3D cross-correlation function between the two samples ξps(r) ∝ ξss(r), it is potentially possible to infer a very accurate redshift distribution for the photometric sample. This assumption is guaranteed to be valid if the spectroscopic sample is a subsample of the photometric population, but may be problematic if the two sets of tracers have different bias functions with respect to the dark matter.

In this paper, we develop a pipeline to apply the cross-correlation method. In Section 2, we review the theory and explain how the method works. We highlight its strengths and examine potential drawbacks and systematic effects. As a proof of concept, in Section 3 we use the halo model (Cooray & Sheth 2002) to populate N-body simulations with mock photometric and spectroscopic galaxy data to quantify the properties of this redshift distribution estimator. We examine the extent to which different bias functions interfere with the reconstruction of the true distribution of photometric galaxies. In Section 4 we discuss inherent tradeoffs, outline outstanding theoretical questions, and draw our conclusions. We leave a more detailed discussion of error propagation to the Appendix.

2. THEORY

The goal of this paper is to demonstrate via numerical simulations that two spatially overlapping samples, one photometric and one spectroscopic, can be combined to infer the redshift distribution of the photometric sample to very high accuracy. The redshift distribution is

Equation (1)

where $\frac{dN_p}{dz\,d\Omega }$ is the number of photometric galaxies per unit redshift, per steradian, and the quantity in brackets is the total number of galaxies (per steradian) in the sample, ensuring that ϕp(z) integrates to one. If a survey is divided into a number of redshift bins zi, then ϕp(zi) gives the fraction of the total number of galaxies that live in the ith bin. Suppose we observe the angular cross-correlation function between all the photometric galaxies and the spectroscopic galaxies in a particular bin zi. This angular cross-correlation function is related to the photometric redshift distribution that we are attempting to calibrate:

Equation (2)

Here, ξps(r) is the 3D cross-correlation function between the entire photometric sample and the spectroscopic galaxies that live in bin i, which is not observable because the redshifts of the photo-z sample are not known to sufficient accuracy to measure it. The key assumption of the cross-correlation method is that ξps(r) ∝ ξss(r), where ξss(r), the 3D autocorrelation function of the spectroscopic calibrators, is observable. This is a reasonable assumption because on large (linear) scales, both ξps(r) and ξss(r) are related to the underlying dark matter power spectrum Δ2lin(k) as

Equation (3)

Equation (4)

where bs and bp are the linear biases of the spectroscopic and photometric samples (assumed to be scale independent) and j0 is a spherical Bessel function.

On linear scales, the relationship between the cross-correlation function and the (observable) autocorrelation function of the spectroscopic sample is given in Equation (5) (further corrections are required for translinear scales).

Equation (5)

Thus, we may write

Equation (6)

Since bs(z) can be fit with the spectroscopic data, this means that the relation can only be inverted to solve for the product bp(zp(z) in terms of observable quantities. If bp(z) were independent of redshift, it would scale out when the recovered ϕp(z) is normalized to 1. However, for a magnitude limited sample bp(z) is expected to change with redshift. The population of galaxies being examined at high redshift will consist of systematically brighter, rarer, more biased objects than those at low redshift, so the bias can be expected to exhibit redshift dependence in a way that will be difficult to reliably calibrate. This presents a degeneracy with the recovered redshift distribution. Even if the photometric survey were volume limited, evolution in the intrinsic luminosities and densities of these tracer objects can lead to redshift dependence in the bias.

This degeneracy cannot be resolved by measuring the angular correlation function of the photometric sample. We can express wpp in terms of ξss

Equation (7)

Changing variables to a central redshift and a Δz = z1z2, and taking note that ξss vanishes for large Δz, we find that

Equation (8)

In terms of known observables, this relation can be inverted to determine the product b2p(z2p(z). Unfortunately, this quantity has the same direction of degeneracy as the product bp(zp(z). Thus the observable wpp(θ) can only be used to improve the accuracy to which the product is determined, it cannot be used to break the degeneracy between the large-scale bias and the selection function. In order to obtain ϕp(z), it will be necessary either to appeal to some model for the redshift dependence of the bias or to find another observable that can be used to break the degeneracy.

This is significant because estimators of, e.g., the mean redshift of a sample will be affected by assuming a functional form for the bias that is incorrect. If we know we are recovering the product b(z)ϕ(z), then in our estimator of the mean redshift

Equation (9)

while the true $\bar{z}$ is

Equation (10)

It is not unreasonable to expect that the bias may not be a particularly smooth in its transitions if the sample of galaxies accessible to a photometric survey shifts abruptly as some redshift threshold is crossed. One example of a very rapidly changing bias function can be seen in Table 2 of Padmanabhan et al. (2009) where the bias of a sample of LRG galaxies is computed in several photometric bins. At a redshift of z ∼ 0.35, the bias jumps dramatically from 1.77 to 2.36 and back down again (somewhat more smoothly) to 1.9 at higher z. As a quick illustrative example, we take the two functional forms of ϕp(z) used later in this paper (see Section 3.1) and compute the error in $\bar{z}$ that occurs if we assume a smooth transition in the galaxy bias from 1.7 to 1.9 in best, but allow the true bias btrue to jump to 2.3 in between. We find that the fractional error in the mean redshift is 6% and 11% if the interval is 0 < z < 1 and the jump is placed at z = 0.8. If the jump is placed at z = 0.3, the fractional error is 0.05% and 0.2%, which is somewhat less significant, since there is substantially less volume at z = 0.3.

One principle goal of this paper is to examine the extent to which this impacts the cross-correlation method, particularly whether the systematic biases involved are substantial compared to the resolution and accuracy of the photometric distribution that is recovered.

3. TESTS WITH MOCK DATA

3.1. Simulations and Mock Galaxy Catalogs

We use the halo model to populate N-body cold dark matter simulations with mock galaxies to test the cross-correlation method. The simulations compute the evolution of large-scale structure in a periodic, cubical box of side 1 Gpc/h using a Tree-PM code White (2002). There are 10243 dark matter particles of mass 6 × 1010M/h. The randomly generated Gaussian initial conditions are evolved from a starting redshift of z = 75. The Plummer softening is 35 kpc/h (comoving). The simulated cosmology has the parameters: Ωm = 0.25, ΩΛ = 0.75, Ωbh2 = 0.0224, h = 0.72, n = 0.97, and σ8 = 0.8.

Halo catalogs are constructed from this simulation using a friends-of-friends algorithm (Davis et al. 1985) with a linking length of b = 0.168 in units of the mean inter-particle spacing. There are approximately 7.5 million halos of mass greater than 5 × 1011M/h resolved with ∼8 or more dark matter particles. The galaxy catalogs are constructed by populating these halos according to the halo-model prescription with a center-satellite split (Zehavi et al. 2005). It is assumed that very small halos host no galaxies. Halos that cross a mass threshold Mmin are assumed to host one central galaxy. Halos of much higher mass are assumed to have formed through mergers of smaller halos and will host both central and satellite objects. The mean number of galaxies in a halo of mass M (i.e., the Halo Occupation Distribution or HOD; Cooray & Sheth 2002) is given by

Equation (11)

The position of the central galaxy is assumed to be at the halo center defined by the position of the most gravitationally bound dark matter particle. The number of satellite galaxies in any particular halo is drawn from a Poisson distribution, and the satellites are assumed to trace the dark matter. The mock catalogs are constructed from a single simulation snapshot at redshift z = 0.1, rather than from a true light cone. The HOD also has no redshift dependence, so there is no clustering evolution of the galaxy samples in the mock catalogs unless it is intentionally introduced through the selection function, as described below.

We use this prescription to populate the simulation with several different galaxy samples. One population, used as the mock photometric catalog, has a relatively low value of Mmin = 5 × 1011M/h (this choice is driven by the smallest identifiable halos in our catalog). These are fairly common galaxies with large-scale bias bp ≈ 1.6 with respect to the dark matter. Although we know the true value of the redshifts of these objects from the simulations, we do not use any redshift information for the photometric sample when testing the reconstruction algorithm developed here. The other populations we have created are mock spectroscopic samples, with values of Mmin = 1 × 1012M/h and 7 × 1012M/h that generate much rarer mock galaxies with mean number densities of 1.5 × 10−3 and 1.2 × 10−4 h Mpc−1 (a factor of roughly two and 30 fewer galaxies than the photometric sample, before the selection function is applied). The two mock spectroscopic samples have higher biases bs ≈ 1.8 and bs ≈ 2.2 with respect to the underlying dark matter. For some tests of the cross-correlation method, we also use a fair subsample of 50% or 80% of the mock photometric population to define the spectroscopic sample. These mock galaxy samples are all somewhat sparser than those available in real galaxy surveys (our choices are driven by the resolution of our simulations). Thus, trends identified in this study, particularly those that render the cross-correlation method less effective due to noisy correlation function measurements of very sparse samples, may be less severe in real observations.

To test the cross-correlation method, we also need to impose both a selection function and an observation window on the photometric sample. We do not impose any selection function on the spectroscopic sample. For convenience, we choose to perform the reconstruction in bins of comoving distance χi rather than redshift bins zi, but the method can be used with either choice of bins. We place the observer in the center of one face of our simulation cube, and assume that (s)he observes a conical volume with a 12 deg opening angle from the center line (24 deg in total). The cone stretches the length of the box (1 Gpc/h) along the line of sight and has a diameter of roughly half of the box at the far end. We adopt a toy model for the selection function, the sum of two Gaussians, which defines the fraction of galaxies "detected" Nkeep(χ)/N(χ) at each of 70 slices in comoving distance χ through the box.

Equation (12)

We have normalized this quantity so that the maximum value is 1, and we refer to it as the "detection fraction" later in the paper. We present two models for comparison in this paper, [χ1, σ1, χ2, σ2] = [0, 0.15, 0.8, 0.16] and [0.3, 0.07, 0.7, 0.10]. In the limit of small galaxy numbers in the slice, it is significant that we apply the selection function before the geometry for reasons of cosmic variance. The shape of the final photometric redshift distribution is affected both by the selection function and the geometry of the mock observation. This is illustrated graphically in Figure 1, which shows the first of the two selection functions. Once we have mock photometric and spectroscopic samples in place, we compute in each bin χi the autocorrelation function of the spectroscopic sample ξss(r, χi), and the angular cross-correlation between the spectro-z objects in the ith bin and the entire photometric sample wps(θ, χi). We have used the algorithm of Landy & Szalay (1993) to compute the correlation functions. To compute wps(θ, χi), we opt to measure the 3D correlation function ξps(r, χi) and perform the integral of Equation (6) numerically, because the mock observation volume is small which renders direct calculation of wps(θ, χi) noisy. This will not be a problem for surveys that cover a large fraction of the sky.

Figure 1.

Figure 1. Final photometric redshift distribution is affected by the selection function and also the survey geometry.

Standard image High-resolution image

We are eager to test how sensitive the reconstruction of the photometric distribution function ϕp(χ) may be to redshift dependence in the bias bp that is not accounted for. We addressed this question analytically in Section 2, but it is useful to examine the issue with simulations to see if the systematic biases that occur are significant compared to the error bars on the reconstructed distribution. Redshift dependence in the bias can occur because of evolution in the underlying large-scale structure, the clustering properties of galaxies of a given type, and it can occur because the population of photometric objects detected by an instrument changes as a function of redshift, as expected in a magnitude limited sample. Our simulation volume is made of a single time-slice, so we cannot examine the effects of the first two mechanisms at present, but we expect the effects of the second mechanism to dominate the redshift dependence of the bias. We have devised a simple method inspired by the results of Vale & Ostriker (2006) and Conroy & Wechsler (2009) to introduce this type of bias evolution into our mock galaxy sample. Whereas before we applied the selection function in Equation (12) randomly to the galaxies in each slice, now we choose to keep galaxies preferentially that live in the largest halos. We assume that the brightest galaxies live in the biggest halos and that central galaxies are brighter than satellite galaxies at a given redshift. First, we rank order the halos in the slice. We determine the number of galaxies to be kept from Equation (12) and place one in the center of each halo beginning with the halo of highest mass. If the number of galaxies exceeds the number of halos in the slice, we begin placing satellite galaxies. We again start with the largest halo and continue populating each halo with satellites until we have placed all the galaxies. This procedure generates a mock catalog of photometric galaxies whose clustering strength will depend strongly on the selection function. The effect of this procedure on the large-scale bias is somewhat complicated. In the regime where only satellites in the lowest mass halos are being eliminated, if the detection fraction from Equation (12) is high the bias will be close to the bias of the whole population. However as the detection fraction gets lower, the sample will be more highly biased with respect to the dark matter because only satellites in the largest halos are being kept, thus weighting the largest halos more heavily than the smaller halos in the clustering measurement. However, if the detection fraction is so low that all of the satellites are eliminated and the population consists only of centrals, the large-scale bias will be lower than for the whole population, because more weight from the largest halos has been discarded than from the smaller halos. In this analysis we are principally in the second regime, which means the large-scale bias will tend to follow the shape of the selection function.

This is demonstrated in Figure 2, which plots the correlation function measurement of the mock photometric sample in each conical slice. The top panel shows the result if the galaxies are eliminated randomly. In this case the variation among the curves is due to cosmic variance (with more noise in the closest bins because of smaller volume), and the ratio bp/bs is independent of redshift. The bottom panel shows the results from the procedure we just described, and the ratio bp/bs will vary with redshift by construction. The inset shows the variation in the value of ξpp(r) along the line of sight (where we have picked a particular scale, indicated with a vertical line). On the top we show that compared to the last slice (marked with blue squares), the value of ξpp(r) varies by some tens of percent, and is a relatively flat function of the line-of-sight distance. The bottom inset on the other hand clearly reflects the shape of the redshift distribution function used to create the sample (plotted later in the bottom panel of Figure 4) and shows variations in the normalization of ξpp(r) of up to 150%.

Figure 2.

Figure 2. Three-dimensional correlation function of the photometric sample in each slice along the line of sight. The closest and farthest slice plotted are labeled with symbols. The inset shows the relative variation among the curves at a single scale. On the top there is much less variation among the curves than on the bottom. The inset on the bottom clearly reflects the shape of the selection function (and hence the bias) used to create the sample.

Standard image High-resolution image

We emphasize that we do not expect this method to quantitatively capture the redshift dependence of the bias in a real galaxy survey, but we expect it is qualitatively similar, and as long as the bias changes significantly with redshift, it will be useful in testing the magnitude of the effect on the reconstruction. We will refer to this method of applying the selection function as the ordered selection function (OSF) and the simple random method as the random selection function (RSF). By comparing the reconstruction in the two cases, we determine how important it is to account for redshift dependence in the bias.

3.2. Pipeline

Once the cross-correlation wps(θ, χi) and autocorrelation function ξss(r, χi) have been measured, a procedure is needed to perform the inversion of the integral of Equation (6) to obtain the line-of-sight distribution of the photometric sample in some bins χj and to obtain error bars on those measurements. For a single angular scale θ and bin i in spectroscopic redshift, we approximate the integral in Equation (6) as a discreet sum over N redshift bins (ignoring bias evolution for the moment):

Equation (13)

Equation (14)

The observation will be performed in multiple angular bins θ. The values of r in Xij will not be at the exact positions where we have made the measurement of ξss (in our case in 60 bins between 5 and 50 comoving Mpc/h). If the r falls within the domain of our data, we interpolate ξss with a spline, if it is larger than the biggest scale we measure, the value of ξss is extrapolated using a fit of the form r−2 to a subset of data points (larger than 15 Mpc/h) measured in the volume. If the value of r is less than the minimum value of r measured, we reject the entire θ bin, since we do not trust the method for ξss measured on very small scales. At a distance of 250 Mpc/h (for example), this results in rejecting θ values less than ∼1 deg.

Collecting all the observed wps values in a single vector w (one element for each unique combination of χi and θ), the solution for ϕpj) will be obtained by inverting relation:

Equation (15)

Here, X is a matrix whose elements are given by Equation (14) and whose dimension is

ϕ will be a vector of length N, and w will be a vector of length (# spec bins i* # theta bins). It is significant that measurements at different values of θ can mix in the inversion; since they are correlated it is important that they do so. We do not know the pre-factor bp/bs, but as long as it is assumed to be independent of redshift (as in the special case where the spectroscopic sample is a fair subsample of the photometric population) this is not relevant. The reconstructed ϕpj) will not be correctly normalized after the matrix inversion because there is no information in Equation (13) about the total number of galaxies in the photometric sample. The normalization of ϕpj) will be set by requiring that it integrate to 1. Later, we will discuss the consequences of violating the assumption that bp/bs is independent of redshift.

In principle, solving Equation (15) for ϕpj) is a simple matter of inverting a matrix, but in practice doing so is numerically unstable and furthermore errors in the observables w and X must be accounted for in the solution. Since w is a projection through the same dark matter structures traced by X, there will be non-negligible correlations between the errors in the two observables. To extract ϕp(χ), we write down the expression for the χ2 statistic (the first term of Equation (16)), and minimize it. Typically, we are attempting to reconstruct ϕpj) in as many bins N as possible. There is often insufficient information in the correlation functions to completely constrain ϕp in the chosen number of bins, thus it becomes necessary to stabilize against solutions that have highly anti-correlated adjacent bins. Since it is reasonable to assume that the true distribution will not be highly oscillatory, we adopt a smoothness prior to regularize the solution and add it to our expression for χ2 below:

Equation (16)

Here, Cϕ is the covariance matrix incorporating errors in both w and X, and the subscript denotes that it is an explicit function of ϕ. For purposes of the present analysis, however, we will assume that the spectroscopic correlation function is perfectly determined, and we will propagate errors from wps only, specifically C−1ϕ = C−1 = inverse covariance matrix of wps(θ, χi). This is not a good approximation as we will soon show (in fact it is unlikely that ξss will ever be better determined than wps for realistic scenario), and we refer the reader to the Appendix for a description of the full error propagation from C−1ϕ into errors on ϕpj). The reader should consider all errors reported on ϕpj) as lower limits on the total error.

The regularization scheme we have adopted is described in detail in Press (2002, Section 18.5). The intention is to add a term to χ2 that gets large when neighboring points have widely different values. Minimization of χ2 will then tend to solutions that do not have anti-correlated neighboring points. The matrix B is given by

Equation (17)

Note that B has one fewer row than column. The factor λ should be chosen such that the first and second terms in χ2 contribute roughly equal weight. This can be approximately arranged if λ is taken to be

Equation (18)

but in practice the weight in χ2 will also depend on the solution and the values of wps. We find that the quality of the reconstruction depends on how well the two terms in Equation (16) are balanced. Since this depends on the answer, we opt to refine the value of lambda and recompute the solution iteratively as follows.

  • 1.  
    Compute a tolerance parameter defined from the two terms in χ2 as tol = 1–(term 1/term 2).
  • 2.  
    If tol >0, λ is increased by a factor of 10. If tol <0, we decrease it by 10.
  • 3.  
    We recompute the solution and the tol.
  • 4.  
    If the absolute value of tol has decreased, we repeat the refinement of lambda, otherwise we exit and keep the previous value of the solution.

In practice, the procedure usually requires only one refinement of λ. We observe that changing the algorithm to have smaller steps in lambda (e.g., a factor of two rather than 10) does not improve the solution and occasionally oversmooths it. We emphasize that while we have identified an algorithm that works, we have not optimized the application of a smoothness prior. Since the solution is moderately sensitive to the smoothing, care should be taken to understand the properties of the smoothing before applying this technique to real data. Imposing a smoothness prior may be reasonable for a magnitude limited photometric sample, but may be problematic when attempting to calibrate a color-selected sample, which can exhibit jumps in the redshift distribution as features of the galaxy spectra pass in and out of the photometric bands. We have not studied the effects of different smoothing algorithms because it is likely that the need for a smoothness prior will be mitigated in future analyses by using the photometric probability distributions as an additional prior. Smooth portions of the photo-z prior will similarly penalize oscillatory solutions as the smoothness prior we have used in this work, while discontinuous behavior in the photometric priors reflects actual photo-z degeneracies and will guide the reconstruction toward solutions that resolve multi-modal behavior. Since we have no mock photometric redshift probabilities, examining that technique is beyond the scope of this paper, but will be the subject of further research.

To minimize χ2 we take the derivative with respect to ϕ and set it equal to zero. Taking C−1 to be independent of ϕ and noting that it is symmetric, we find

Equation (19)

Equation (20)

Equation (21)

The covariance matrix of the recovered ϕ is given by

Equation (22)

To recover the constant of proportionality, we will need to rescale the elements of the covariance matrix by the square of the factor used to rescale ϕ (since we renormalize such that ϕ(χ) integrates to 1). We caution that all matrix multiplications above are finite sum approximations to integral quantities, so care must be taken to ensure that phi and its error bars come out to scale. For clarity, let us refer to wps(zi) as a function of a continuous variable zi (which will label the spectroscopic bins). ξ(zi, χi) will be a function of both zi and another continuous variable χi (which will label the bins in which ϕ is reconstructed). We are considering a single value of θ, and z and χ can represent either redshifts or comoving distances, we simply use both letters to distinguish them. Ignoring regularization, the continuous expression for χ2 is then

Equation (23)

To minimize χ2 we must set the functional derivative δχ2/δϕ(χ) = 0. We leave the details to the reader, but note that in Equation (19), the factors of Δz that correspond to integrals over z cancel but the factor of Δχ does not.

3.3. Redshift Distribution Reconstruction

In this section, we present a series of reconstructions that examine various aspects of the problem and identify the potential difficulties in applying this method. We begin in Figure 3 with the simplest case and gradually add complexity. The heavy solid line shows the theoretical redshift distribution (selection function + geometry) that has been applied mock photometric catalog of galaxies with Mmin = 5 × 1011 in Equation (11). This redshift distribution corresponds to the selection function shown on the left of Figure 1. We have chosen galaxies randomly (the RSF method described in Section 3.1) in applying the selection function. The resulting catalog has 54,000 galaxies. We have measured the detection fraction in each slice for this particular realization and plot it with a thin wavy line in Figure 3. The spectroscopic calibrating sample is taken to be a different population of galaxies with Mmin = 7 × 1012 yielding around 5600 galaxies, i.e., a highly biased tracer of the density field compared to the photometric sample. We assume that both the observables are measured with perfect accuracy. Since we have assumed perfect knowledge of the spectroscopic correlation function, we opt to measure it in the entire light cone and use the result in each of the i spectroscopic bins ξss(r, χi) = ξwhole volume(r). This is only justified because there is no evolution in the light cone, and the calibrators are not affected by the selection function which changes along the line of sight. We perform the reconstruction by computing the matrices X, w, C−1, and B and using them in Equation (21). The result is plotted in Figure 3 with points. Notice that the reconstruction follows the values of this realization (the actual redshift distribution in this cone) rather than the theoretical redshift distribution used to create the sample.

Figure 3.

Figure 3. Photometric distribution ϕp(χ) and its reconstruction. The x-axis is the comoving line of sight from the observer, who is located at the origin. The thick solid line shows the theoretical value of the distribution, the thin wavy line shows the particular realization in this simulation volume, and the points show the reconstructed solution. The photometric sample is constructed using the RSF method, thus the ratio of the large-scale biases bp/bs is independent of redshift.

Standard image High-resolution image

Figure 4 reveals a more realistic picture. The lines and points in Figure 4 are all the same as in Figure 3. We now measure the autocorrelation of the spectroscopic sample ξss(r, χi) in each of the conic sections and use those measurements to perform the reconstruction. The top and bottom panels show two different choices of selection function, whose parameters are marked in the caption. The top is difficult to reconstruct because it peaks at χ = 0 where there is no volume in the mock observation, and the bottom is a challenge because the feature coincides with the bin spacing, which will be a problem to reconstruct because of the smoothness prior. We have drastically decreased the number of bins, so that there are a reasonable number of calibrators in most of the bins (a few hundred to $ {\cal O}(1000)$). The normalization criterion, which comes from the condition that ϕp integrate to 1, becomes more sensitive to noise in the reconstruction when the number of bins is decreased. In this plot and the plots that follow, we set the normalization by hand so that we can illustrate other points; we let the maximum value of the reconstruction (points) equal the maximum value of the theoretical input ϕ (thick solid line). While the reconstructions in Figure 4 show the correct trends, they are not sufficiently high quality to recover the bimodal behavior in the reconstructed selection function.

Figure 4.

Figure 4. Photometric distribution ϕp(χ) and its reconstruction. The top and bottom panels show two different selection function choices. Each is the sum of two Gaussians with [χ1, σ1, χ2, σ2] = [0.0, 0.15, 0.8, 0.16] on the top and [0.3, 0.07, 0.7, 0.10] on the bottom. The errors come from Poisson error in the cross-correlation measurement. The photometric sample is constructed using the RSF method, thus the ratio of the large-scale biases bp/bs is independent of redshift. The spectroscopic sample is made of rare objects with Mmin = 7 × 1012 in Equation (11). There are very few objects in each conic section, so the correlation functions are poorly measured. The smoothness prior is dominating the shape of the red points because the sparseness of the sampling is making it difficult for the signal to come through, thus the reconstruction is not capturing the bimodal behavior.

Standard image High-resolution image

We now briefly discuss the error bars in Figure 4 obtained from Equation (22). The Cov[ϕ] matrix is not diagonal; we conservatively report $1/\sqrt{\vphantom{A^A}\smash{\hbox{${{\rm Cov}^{-1}[\phi ]_{i,i}}$}}}$ as the error on the ith bin. These error bars represent a lower bound on the error because we have only propagated error from wps and not from ξss. The matrix C−1 in Equation (22) is the inverse covariance matrix of the cross-correlation measurement w, and with sufficient volume can be estimated by dividing the observation into a number of bins and bootstrapping. Here, however, we opt to follow Peebles (1980) and assume that on these scales the errors are dominated by Poisson noise. In this case, C−1 is diagonal and its elements are given by

Equation (24)

We use survey parameters appropriate to large upcoming missions. We take $\bar{n}_s=1\times 10^5$ galaxies per (Gpc/h)3, $\bar{\Sigma }_p=100$ photometric galaxies per square arcmin, and Ω = 20, 000 deg2 on the sky. Δχs is the width of the bin of spectroscopic data. The resulting error bars in Figure 4 and those that follow display a surprising trend. Even though the number of pairs is dramatically larger for bins at farther comoving distance, the reconstruction of the photometric distribution is less accurate in the most distant bins. The key to understanding this trend is that for a fixed angular scale θ, the physical scale being probed in distant bins is much larger than at nearby distances. Since the correlation function is a steeply dropping function of separation, the impact of shrinking values in X of Equation (22) dominates over the growing number of pairs in C−1. By measuring wps on smaller angular scales the errors in the farthest bins can be improved, but there is a limit to how far this can be pushed because measurements at small angular scales will send the near field into the trans-linear and one-halo regime, for which there are corrections to the underlying assumption that ξps ∝ ξss.

In Figure 5 we switch to a larger calibration sample with Mmin = 1 × 1012, which allows a much more accurate measurement of the spectroscopic correlation function. We see that the reconstruction now captures the bimodal behavior, but both the resolution (bin spacing) and errors are modest and the smoothing is still evident in the last bin. Notice that the error bars have increased; this is because this calibration sample has lower bias, so the elements in X in the covariance matrix of ϕ(χ) (Equation (22)) are all smaller. This is, however, an illusion. Had we propagated the error from ξss, it would contribute larger errors to Figure 4 than to Figure 5 because the measurement is noisier for the smaller sample. In moving to the larger sample, the number of spectra required has increased by an order of magnitude and is now the same size as the mock photometric catalog (after the selection function has been applied). In summary, comparison of Figures 3, 4, and 5 demonstrates that ξss(χ) must be reasonably well determined for the shape of the reconstructed photometric distribution to be well captured. It is worth noting that we have used no priors in the determination of ξss(χ), improving and applying theoretical priors may yield significantly better results with fewer spectra.

Figure 5.

Figure 5. Photometric distribution ϕp(χ) and its reconstruction. Now the spectroscopic population has Mmin = 1 × 1012 and is roughly the same size as the photometric one though it is more biased and comprised of different galaxies. Because the correlation function has been measured well, the reconstruction recaptures the bimodal behavior.

Standard image High-resolution image

The fact that the more numerous calibrators generated such improved results suggested that the noisiness of the spectroscopic correlation function is very likely driving the poorness of the reconstruction. One alternative way to avoid noisiness in the calibrating sample might be to assume a template for the spectroscopic correlation function and fit it to the noisy data. We attempted to improve the reconstruction with the rarer calibrators by simply fitting the spectroscopic observations with a two-parameter power law, and performing the reconstruction using the fin instead of interpolating the measured data. This alternate procedure appears to discard too much information contained in ξss(χ). The result was visually similar to Figure 4, namely, the bimodal behavior in the distribution is lost.

We add another layer of complexity in Figure 6. We apply our selection function in such a way that the bias of the remaining objects will exhibit redshift dependence (the OSF method described in Section 3.1). We continue using the less rare calibrating sample with halo model Mmin = 1 × 1012 for this illustrative proof of concept. Although it is not very statistically significant, the eye can pick out a trend in the reconstruction; the reconstructed distribution is suppressed in areas of low detection fraction. Since the bias is tracking the detection fraction, the areas of low detection are enhanced less than the areas of high detection, thus they appear suppressed when we normalize to the highest point. At this level of accuracy, it is unlikely that this systematic will dominate the error in tomographic analyses, however for much larger surveys it could be a significant concern.

Figure 6.

Figure 6. Photometric distribution ϕp(χ) and its reconstruction. Here the selection function applied to the photometric sample causes the bias to change as a function of comoving distance from the observer. The calibrators are the Mmin = 1 × 1012 sample, whose bias is independent of redshift. The result is a systematic shift of marginal significance in areas of low detection fraction.

Standard image High-resolution image

In Figure 7, we show how the situation is altered if a fair subsample of the photometric population is used instead of a rarer biased tracer population. This is a special case because the bias of the calibrators will have identical redshift dependence as the photometric sample (the ratio bp/bs = 1). We show two different subsamples in this plot, 50% and 80% of the photometric catalog, and test the reconstruction for the distribution on the right in Figures 4, 5, and 6. We see that for a survey of this volume, 50% is too small to capture the bimodal behavior in the reconstruction, but with 80%, the reconstruction works well. Indeed the systematic offset in Figure 6 is remedied and the reconstruction follows the realization to much better accuracy. Of course spectroscopic follow-up of such a large fraction of the sample is more than sufficient to calibrate the redshift distribution without using the cross-correlation technique (black stars). For larger surveys the fraction needed will be smaller than indicated here, because they will be able to measure the spectroscopic correlation functions with greater accuracy. For surveys large enough not to be limited by noisy spectroscopic correlation functions, it may be necessary to calibrate with a fair subsample of galaxies to avoid systematic bias in the redshift distribution that comes from redshift dependence in the bias of the photometric sample.

Figure 7.

Figure 7. A fair subsample is used to reconstruct ϕp(χ) instead of the Mmin = 1 × 1012 sample. The bias of the subsample has the same redshift dependence as the bias of the photometric sample. The reconstruction is studied only for the distribution on the right of Figures 4, 5, and 6. The top panel shows a subsample of 50%, which is not sufficient to capture the bimodal behavior due to poor statistics in the spectroscopic bins. The bottom shows 80%, which does well, and shows no systematic offset due to evolving bias. The black stars demonstrate that a fair subsample of this size is more than sufficient to characterize the redshift distribution without the need for the reconstruction technique.

Standard image High-resolution image

4. DISCUSSION

We have outlined the theory behind the cross-correlation method for calibrating the redshift distribution of objects with photometric redshifts and developed a pipeline than can be used to apply the method to survey data. We have created mock simulations to test the pipeline. We have succeeded in reconstructing the redshift distribution of the mock photometric galaxies using the angular cross-correlation of these galaxies with an overlapping spectroscopic sample (whose redshifts are known). We have not used any redshift information about the photometric sample. We have demonstrated the validity of the method. We have also identified the aspects that are likely to be the limiting factors in relying upon this method to provide accurate redshift distribution information. These limiting factors are (1) that the spectroscopic sample must be binned along the line of sight, causing their correlation functions to be noisy and interfere with the reconstruction, and (2) that the changes in the bias with redshift of the photometric sample cannot be disentangled from the redshift distribution that is reconstructed, which may necessitate the follow-up of a fair subsample of galaxies. Improved modeling of theoretical priors could yield large dividends if these factors can be mitigated.

The analysis has revealed a number of trade-offs that exist in application of this method. For a given set of calibrators, there is a trade-off between the resolution (bin spacing) of the redshift distribution reconstruction and the error bars on any individual point. Thus if a population of catastrophic outlier were discovered, for example, it would be interesting to investigate whether it is more important to know how many there are or at exactly which redshift they lie. We also find a trade-off between the number of calibrators and the quality of the reconstruction. As the number of available spectra are decreased, the spectroscopic redshift bins need to widen to maintain equivalent signal strength. This in turn affects how finely the reconstructed redshift distribution can be sampled. Bimodal behavior may be lost, and with insufficient priors (such as the smoothness prior we implemented here), false bimodal behavior may appear in the form of anti-correlated adjacent points. The survey geometry will also impact the requirements on the spectroscopic calibrators; at a fixed spectroscopic bin spacing, a pencil beam survey will require spectra for much fainter objects than a wide survey in order to achieve a sufficiently strong signal in the spectroscopic correlation function to accurately perform the reconstruction.

Widening the bins to compensate for fewer spectra also introduces another difficulty. Recall that the error in the angular cross-correlation function goes as $\delta w_{{\rm ps}} \sim 1/ \sqrt{n_{\rm pairs}}$. This means that

Equation (25)

For the purposes of this order of magnitude argument, suppose ϕp were constant, then to integrate to 1 requires ϕp(χ) ∝ 1/Δχrb. The subscript rb is used to indicate the width of the reconstruction bin (not the spectroscopic bins). When we remove it from the integral, we are left with an expression that is wp(rp):

Equation (26)

If the reconstruction bin is widened by a factor of two, the number density of photometric galaxies will have to be increased by a factor of four to preserve the same accuracy in the cross-correlation measurement. Therefore, in this method there is also a trade-off between the number of spectra that the survey can afford versus the number of photometric galaxies they can afford.

In this analysis we have not made any use of the photometric redshifts, which although not sufficiently accurate to determine the true redshift distribution of the population still contain significant information about it. Modern techniques such as in Gerdes et al. (2010) have made it possible to assign a probability distribution for the redshift of each individual galaxy in a photometric survey, rather than a single best estimate and error bar. We propose that combining these probabilities for all the galaxies in a given redshift bin constitutes a reasonably powerful prior that can take the place of the smoothing we have introduced in this analysis. This is fortunate because the solution is frequently wrecked by the smoothing criterion, although the analysis cannot be performed without it. A second improvement that should be made is to improve the modeling so that scales between 1and5 Mpc/h can be used. Since true physical correlations increase with decreasing separation, much of the available signal is as these smaller separations. Another detail that we leave for a future study is that the spectroscopic sample need not be comprised of a single rare population; it is quite conceivable to target certain regions of redshift space that are known to be problematic more heavily. It must also be possible to use less rare tracers in regions with smaller volume. We leave such optimization to the future.

There are a few important factors that we have neglected in this analysis. As pointed out by Bernstein & Huterer (2010), weak gravitational lensing will induce correlations between the positions of calibrator galaxies in the foreground with photometric galaxies in the background, and vice versa. This will need to be carefully controlled for the method to reliably calibrate redshift distributions. We also have not mentioned the complication of the integral constraint, which as discussed in, e.g., Huff et al. (2007) can lead to very significant errors when the volume and the scales in the correlation function are of comparable size. This may be as significant an issue as redshift dependence in the large-scale bias of the photometric population, though it may be mitigated if integral constraint errors are correlated between photometric and spectroscopic samples. Fortunately, improved estimators exist (in Padmanabhan et al. 2007, for example) and should certainly be incorporated into the pipeline.

Having now demonstrated that the method is viable, with further refinement the cross-correlation method applied in conjunction with direct follow-up surveys may significantly reduce the number of spectra that are required to calibrate photometric redshift distributions to the desired accuracy. There are a few other advantages that we have not touched upon in detail. As we have shown, the cross-correlation method can be used to calibrate the redshift distribution all the way along the line of sight, and as such is uniquely suited to detection and calibration of catastrophic redshift errors, even if these errors are so rare that they are missed by conventional follow-up. Also, the reconstruction should not be adversely affected by redshift deserts, the regions where no spectroscopic redshifts are available. We are optimistic that this technique will provide a useful complementary approach to conventional calibration techniques, and every effort should be made to refine the method further.

Many thanks to Martin White for the use of his simulations and for pointing the way out of a number of tight corners. Conversations and e-mails with Rachel Mandelbaum, Jeff Newman, Nikhil Padmanabhan, Doug Rudd, William Schulz, David Shih, and many others were also incredibly helpful. A.E.S. is supported by the Corning Glassworks Foundation Fellowship and NSF (AST-0807444) at the Institute of Advanced Study.

APPENDIX: ERROR PROPAGATION

The treatment in this paper proceeds by assuming that the correlation function of the spectroscopic calibrating sample is perfectly determined. However we show that since the tracers are rare, the measurements of ξss can be quite noisy due to binning finely along the line if sight. Error in the measurement of ξss may ultimately dominate the error budget, and it is therefore important to lay out the procedure for properly incorporating it. Ignoring the regularization term, the expression for χ2 is

Equation (A1)

Cϕ is the covariance matrix that includes errors in both w and X. The elements of the covariance matrix are

Equation (A2)

which depend explicitly on the solution ϕ. This renders the problem nonlinear and will require iterative numerical methods to minimize χ2. To cut down on the proliferation of indices, the expressions in C are written out below for the case of a single angular bin θ:

Equation (A3)

Equation (A4)

Equation (A5)

To generalize to multiple theta bins, let the index i in, e.g., epsilonwi) run over each unique combination of θχ, rather than just over χ. In the expressions for epsilonξi, χj), it is worth mentioning that in general i and j run over a different number of bins and k and l run over the number of reconstruction bins N.

It is often useful in numerical minimization to have an expression for the derivative of χ2. This is

Equation (A6)

To simplify the expression we have suppressed the indices, but note that the derivative of C is a rank 3 object, with two dimensions of the length of w and one dimension of the length of ϕ (i.e., N). Since C is a symmetric matrix, the derivative is given by

Equation (A7)
Please wait… references are loading.
10.1088/0004-637X/724/2/1305