PHOTOMETRIC SUPERNOVA COSMOLOGY WITH BEAMS AND SDSS-II

Renée Hlozek; Martin Kunz; Bruce Bassett; Mat Smith; James Newling; Melvin Varughese; Rick Kessler; Joseph P. Bernstein; Heather Campbell; Ben Dilday; Bridget Falck; Joshua Frieman; Steve Kuhlmann; Hubert Lampeitl; John Marriner; Robert C. Nichol; Adam G. Riess; Masao Sako; Donald P. Schneider

doi:10.1088/0004-637X/752/2/79

1. INTRODUCTION

The unexpected faintness of distant Type Ia supernovae (SNe Ia) was the key to the discovery of late-time cosmic acceleration (Riess et al. 1998; Perlmutter et al. 1999). A decade later, the discovery and analysis of large numbers of high-quality SNe Ia remain cornerstones of modern cosmology, with various surveys probing SN Ia over a huge range of distances, with a particular focus on understanding and removing potentially unaccounted-for systematic errors and sharpening them as standard candles (e.g., Hamuy et al. 1993, 1996a, 1996b, 2006; Riess et al. 1999, 2004; Tonry et al. 2003; Sollerman et al. 2005; Krisciunas 2008; Kessler et al. 2009a; Lampeitl et al. 2009; Holtzman et al. 2008; Freedman et al. 2009; Contreras et al. 2010; Li et al. 2000; Hicken et al. 2006; Kowalski et al. 2008; Jha et al. 2006). The current state-of-the-art is a heterogenous sample of hundreds of SNe, predominantly at intermediate redshifts, z < 1 (Sullivan et al. 2011; Krisciunas 2008; Amanullah et al. 2010; Kowalski et al. 2008; Hicken et al. 2009; Aldering et al. 2002), with a high-redshift, z > 1, sample from the Hubble Space Telescope (Riess et al. 2004) anchored with a low-redshift sample, z < 0.02 (Hamuy et al. 1996a; Riess et al. 1999; Krisciunas et al. 2001, 2004a, 2004b; Jha et al. 2006; Contreras et al. 2010).

The Sloan Digital Sky Survey II Supernova Survey (SDSS-II SN) data (Holtzman et al. 2008; Kessler et al. 2009a) fill in the "redshift gap" between 0.02 < z < 0.4. In these surveys, multi-band photometric light curves are very successfully used to estimate the probability that a candidate is an SN Ia as opposed to a core-collapse SN (Ibc or II) or other object, providing vital intelligence for the selection of likely SN Ia for spectroscopic follow-up (Johnson & Crotts 2006; Kuznetsova & Connolly 2006; Connolly & Connolly 2009; Poznanski et al. 2006; Rodney & Tonry 2009, 2010; Kim & Miquel 2007; Sako et al. 2008, 2011; Zentner & Bhattacharya 2009; Scolnic et al. 2009; Gong et al. 2010), if not currently used for making Hubble diagrams.

Future surveys such as the Dark Energy Survey¹⁹ (DES; Bernstein et al. 2009), Pan-STARRS²⁰ (Kaiser & Pan-STARRS Team 2005), and the Large Synoptic Survey Telescope²¹ (LSST, Abell et al. 2009) will vastly increase the numbers of detected SNe, perhaps by a factor of a thousand in the case of LSST. However, the quandary facing these surveys is how to make appropriate use of this surfeit of data given that spectroscopic confirmation will only be possible for a small fraction of the promising SN Ia candidates, varying somewhere between 0 and 20%. The wealth of current and future data is therefore driving us inexorably toward an era of purely photometric SN cosmology in which most of the cosmological constraints from the survey come from SNe with no spectroscopic information, except perhaps for a host redshift obtained with a multi-object spectrograph.

Photometric SN cosmology is not a task to be undertaken lightly. While multi-band photometric methods strive to reduce the amount of contamination of the SN Ia sample from interlopers to a minimum, there will always be some level of contamination—typically around the few percent level (Bazin et al. 2011; Abell et al. 2009; Sako et al. 2008, 2011)—and this biases the inferred cosmological constraints at an unacceptable level if one simply uses standard χ² inference techniques. In this standard paradigm, one is faced with a choice between inevitable contamination using all data, restricting the sample size to those SNe that can be followed up spectroscopically, wasting the available data at hand, or defining a smaller subset from the photometric candidates that has a high SN Ia purity (see Bernstein et al. 2011 for one such treatment). Fortunately, one can rigorously incorporate contamination effects into a Bayesian inference framework to yield unbiased cosmological results.

In this paper, we apply the resulting framework, Bayesian Estimation Applied to Multiple Species (BEAMS; Kunz et al. 2007), to purely photometric SN data with host galaxy redshift information. We test the algorithm against various simulations and describe potential challenges in future photometric analyses. In addition, we apply the algorithm to the photometric SDSS-II SN data sample with host galaxy redshifts. While still in its developing stages, photometric SN cosmology is a very promising approach to exploit the deluge of SN data expected in the next decade, and we show how BEAMS is one approach that is robust to general assumptions about the SN population.

2. THE BEAMS FRAMEWORK FOR PHOTOMETRIC SN COSMOLOGY

2.1. Basic Formalism

The current state-of-the-art is to restrict any contamination from non-SN Ia interlopers by only doing cosmology on those candidates that have been spectroscopically confirmed as being SN Ia through the identification of characteristic absorption lines, such as the Si ii λ6150 feature (Oke & Searle 1974; Kirshner et al. 1973). While this strategy is feasible for the current level of precision, using only spectroscopically confirmed SNe in the cosmology analysis from future large surveys, such as DES and LSST, will result in throwing away the great majority of interesting candidates.

BEAMS was developed to make the most of the upcoming large data sets (Kunz et al. 2007). It is a general Bayesian framework that allows use of all available candidates provided we have some indication of how likely they are to be an SN Ia. While BEAMS is a general method for estimating parameters from any type of data that may be contaminated, it is readily applied to the SN problem, where we wish to evaluate the posterior distribution $P(\boldsymbol{\theta }|\boldsymbol{d}),$ the probability of cosmological parameters, which we denote by $\boldsymbol{\theta } = (\theta _1, \theta _2,\ldots ,\theta _j, \theta _m)$ , given the SN data, expressed as a vector d = (d₁, d₂, ..., d_i, ..., d_N) of N measurements. In the SN case, for example, d consists of pairs of either the distance modulus μ and redshift, d_i = (μ_i, z_i), or apparent magnitude m and redshift, i.e., d = (m, z).

To apply BEAMS, we invoke a theoretical binary vector $\boldsymbol{\tau } = (\tau _1, \tau _2,\ldots , \tau _i,\ldots ,\tau _N)$ of length N, equal to the number of data points. The entries of this vector are one if the corresponding data point is an SN Ia and zero if the point is not (for e.g., Type Ibc, II or non-SN), that is, τ_i = 1(0) if the ith data point is (or is not) an SN Ia. While this represents the underlying "truth," we shall see later that we will use a proxy for the true type: the "probability" of being a Ia. In general, the class of "non-SN Ia" SNe can be subdivided into many subclasses, in which case $\boldsymbol{\tau }$ would not be a binary vector but would index the possible sub-classes. Here we consider only the simple binomial case.

Applying Bayes' theorem and marginalizing over all possible values of the vector $\boldsymbol{\tau } = \tau _1, \tau _2,\ldots ,\tau _N$ , the general posterior becomes (Kunz et al. 2007)

$\begin{equation} P(\boldsymbol{\theta }|\boldsymbol{d}) = \sum _\tau P(\boldsymbol{\theta },\boldsymbol{\tau }|\boldsymbol{d}) = \sum _\tau P(\boldsymbol{d}|\boldsymbol{\theta }, \boldsymbol{\tau })\frac{P(\boldsymbol{\theta },\boldsymbol{\tau })}{P(\boldsymbol{d})}, \end{equation} \tag{ 1 }$

where $P(\boldsymbol{\theta }, \boldsymbol{\tau })$ is the prior for the parameters and $P(\boldsymbol{d})$ is the usual evidence factor, which does not depend on the cosmic parameters.

Assuming that the data are uncorrelated, we then split the effective posterior into two parts for each ith data point:

$\begin{eqnarray} \left. P(\boldsymbol{d}|\boldsymbol{\theta },\boldsymbol{\tau })P(\boldsymbol{\tau })\right|_i &=& [P(d_i|\boldsymbol{\theta }, \tau _i = 1)P_i \nonumber \\ &&+ P(d_i|\boldsymbol{\theta }, \tau _i = 0)(1-P_i)], \end{eqnarray} \tag{ 2 }$

where P_i = P(τ_i = 1) is the probability that a given point is in fact an SN Ia, $P(d_i|\boldsymbol{\theta }, \tau _i = 1)$ is the likelihood of the SN Ia distribution, and $P(d_i|\boldsymbol{\theta }, \tau _i = 0)$ is the non-SN Ia likelihood. This assumption of uncorrelated data is crucial in separating the contributions of the SN Ia and non-SN Ia populations to the final posterior, and relaxing this assumption will require a more complex statistical description.

The SN Ia probabilities P_i can be determined through fitting light curves to standard SN Ia models such as SALT2 (Guy et al. 2007) or MLCS2k2 (Jha et al. 2007), which typically assume that the data belong to the SN Ia class and hence fit SN Ia light-curve templates to all the data points. The light-curve fitter returns the goodness of fit of the light curves normalized by the degrees of freedom (dof), which can be used as prior probability for BEAMS, denoted by P_fit:

$\begin{equation} P_i = P_{\mathrm{fit}} \propto \exp \big(\chi ^2_{lc}/\mathrm{dof}\big). \end{equation} \tag{ 3 }$

Another method of determining the SN Ia probability is through fitting multiple templates to the data and obtaining probabilities from the relative χ² of the fits from different SN templates (Ia, Ibc, II, etc.), which we will denote by P_typer, such as the PSNID typer (Sako et al. 2008, 2011). However, as we will describe below, we will apply a normalization to these probabilities determined from light-curve fits. The probability (3) represents how well a light curve fits the photometric magnitudes and does not tell you a priori how likely the data point is to be a Ia. For example, if both SN Ia and SN II templates fit the data equally well with a χ²_fit/dof = 0.95, then the relative probability of each type is P_i = 0.5, while the probability of being an SN Ia obtained only from the fit of the SN Ia light curve to the data is P_fit = 0.65. We add another dimension to the standard Hubble diagram as shown in Figure 1. Typically the probabilities are combined with additional selection cuts (Sako et al. 2011; Bernstein et al. 2011). Hence, in general, the conversion from a normalized χ²_lc to P_i can lead to skewed probabilities (see Newling et al. 2011a for a detailed investigation of potential bias from incorrect assumptions about the probabilities—we test for distance-modulus–probability correlations in Appendix A), as selection cuts, which are typically based on SN rate or intrinsic brightness, can introduce redshift-dependent selection bias. In addition, a data set may contain very different numbers of SN Ia and core-collapse SNe. For this reason we add a global parameter A that re-normalizes the relative probability (the Bayes factor) of being of SN Type Ia or not. We define B = P/(1 − P) and normalize the probabilities to $\tilde{B} = AB.$ The new probability, then, is given as

$\begin{equation} \tilde{P_{i}} = \frac{\tilde{B}}{1+\tilde{B}} = \frac{A P_{i}}{(1-P_{i}+AP_{i})}. \end{equation} \tag{ 4 }$

We impose positivity constraints on the parameter A (and sample A subject to a Jeffreys' prior, i.e., we sample uniformly in ln A such that A < 100) and estimate A at the same time as the other parameters. Equation (4) is monotonic in P and yields a normalized probability between zero and unity. In general, the parameter A will depend on the quality of the data, as this determines the accuracy of the typing procedure. This mapping provides an indication of whether or not the input probabilities (from the light-curve fitter, for example) are biased, as we expect the distribution to be peaked around one. The re-mapping of probabilities allows BEAMS to "correct" for bias in the input probabilities. The parameter A is necessary to debias the probabilities with respect to the overall SN Ia/non-SN Ia ratio of the whole sample even if the light-curve fitter gives a perfect "per SN" probability. We discuss this in detail in Section 4.4. In general, one might allow A to vary with redshift, or indeed with the light-curve model used, given the variety of assumptions made by the light-curve fitters. We leave these tests to future work, and in this analysis we consider only one global parameter A.

**Figure 1.** Three-dimensional photometric Hubble diagram for the SDSS-II data: residuals relative to the input cosmology for the SDSS-II SN photometric data with host spectroscopic redshifts (discussed in Section 3.3). SNe are measured in both redshift and distance modulus space, but the BEAMS algorithm includes probability information, adding a third dimension to standard SN cosmology. The SNe Ia clearly show up on the left at high probability, but with some small contamination that must be accounted for.
Download figure:
Standard image High-resolution image

In addition, it is important to note that in applying the BEAMS algorithm we do not assume a known population of SNe Ia (or non-SNe Ia), and hence no probabilities are set to zero or unity in the analysis, even if they have been spectroscopically confirmed, as the number of known SNe Ia will be much smaller than the total photometric sample in future surveys.

The total BEAMS posterior is then

$\begin{eqnarray} P(\boldsymbol{\theta }|\boldsymbol{d})& \propto & P(\boldsymbol{\theta }) \times \prod _{i=1}^N \big\lbrace P(d_i|\boldsymbol{\theta },\tau _i=1) P_{i}^{(A)} \nonumber \\ &&+ P(d_i|\boldsymbol{\theta },\tau _i=0) \big(1-P_{i}^{(A)} \big) \big\rbrace , \end{eqnarray} \tag{ 5 }$

where the parameter vector $\boldsymbol{\theta }$ now contains the cosmological parameters {H₀, Ω_m, Ω_Λ}, the probability parameter, A, and the extra parameters necessary to model the SN likelihoods as discussed below. $P(\boldsymbol{\theta })$ represents the prior probabilities of the parameters. If we are interested only in the cosmological parameters, then we marginalize over all the others. It is worth noting that as the probabilities P_i and distances d_i are (potentially) obtained using the same light-curve fits, these two quantities could be correlated. We search for correlations in the probabilities in Appendix A. However, allowing for the normalization factor A reduces the effect of this correlation, and one can also use different fitting techniques to obtain the probabilities (such as using SNANA for the distance modulus and the PSNID approach to obtain the probabilities P_i).

We now discuss in detail our choice of the SN Ia and non-SN Ia likelihoods.

2.2. The Likelihood Distribution for SN Ia

The SN Ia likelihood given the parameters $\boldsymbol{\theta }$ is modeled as a Gaussian for the observed distance modulus μ_i centered around the theoretical value $\mu (z,\boldsymbol{\theta })$ with a variance σ²_{tot, i}:

$\begin{equation} P(\mu _i|\boldsymbol{\theta }, \tau _i = 1) = \frac{1}{\sqrt{2\pi }\sigma _{\mathrm{tot},i}}\exp \left(-\frac{(\mu _i - \mu (z_i,\boldsymbol{\theta }))^2}{2 \sigma _{\mathrm{tot},i}^2}\right). \end{equation} \tag{ 6 }$

The distance modulus is related to the cosmological model via

$\begin{equation} \mu (z, {\boldsymbol{\theta }}) = 5\log d_L(z,{\boldsymbol{\theta }}) + 25, \end{equation} \tag{ 7 }$

where

$\begin{eqnarray} {d_L(z, {\boldsymbol{\theta }}) = \frac{c(1+z)}{\sqrt{\Omega _k}H_0}\nonumber \times }\sinh \left(\sqrt{\Omega _k}\int _0^z \frac{dz}{E(z)}\right)&& \nonumber \\ \end{eqnarray} \tag{ 8 }$

is the luminosity distance measured in Mpc, and the expansion rate is given by

$\begin{equation} E(z) = {\sqrt{\Omega _m(1+z)^3 + \Omega _k(1+z)^2 + \Omega _{\mathrm{DE}}f(z)}}. \end{equation} \tag{ 9 }$

The energy densities relative to flatness of matter (Ω_m), curvature (Ω_k), and dark energy (Ω_DE) obey the relation Ω_m + Ω_k + Ω_DE = 1. The distance modulus is defined as the difference between the absolute and apparent magnitudes of the SN, μ = m − M, with additional corrections made to the apparent magnitude for the correlations between brightness, color, and stretch and a K-correction term related to the difference between the observer and rest-frame filters, for example. The corrections are typically made within the model employed in a light-curve fitter, such as that for MLCS2k2.

The parameter H₀ is the Hubble constant, and the function f(z) = ρ_DE(z)/ρ_DE(0) describes the evolution of the dark energy density. While one of the ultimate goals in SN cosmology is to test for dynamics with redshift of the equation of state w, this relies on a sample at both high and low redshift to anchor the Hubble diagram and provide a long lever arm. In this work, we discuss how BEAMS improves constraints on parameters when including a photometric sample, and hence we do not include the low- or high-redshift samples in this case. For this reason we focus on the Ω_m, Ω_Λ combination of cosmological parameters and so will only consider ΛCDM models for which f(z) = 1. In principle we should also consider radiation, but its energy density is negligible at late times when we observe SNe.

In this application of BEAMS, we have assumed that the distance modulus μ is obtained directly from the light-curve fitter (such as is the case for fitters that use the MLCS2k2 light-curve model); however, this is not an implicit assumption. In the case of the SALT light-curve fitter, the distance modulus would be reconstructed using a framework such as that outlined in Marriner et al. (2011) before including in the BEAMS algorithm.

We model the error on the distance modulus of each SN as a sum in quadrature of several independent contributions,

$\begin{equation} \sigma _{\mathrm{tot},i}^2 = \sigma _{\mu ,i}^2 + \sigma _{\tau }^2 + \sigma _{\mu ,z}^2, \end{equation} \tag{ 10 }$

where σ_{μ, i} is the error obtained from fits to the SN light curve and σ_τ is the characteristic intrinsic dispersion of the SN population, which we add as an additional global parameter to the vector $\boldsymbol{\theta }$ with Jeffreys' prior. The constraints do not depend strongly on the prior used for the intrinsic dispersion. We have ignored any correlation between the SNe due to calibration, or other effects, such as those presented in detail in Conley et al. (2011) and March et al. (2011). Including correlations makes the BEAMS approach more complicated to apply; an approach to address such correlations will be presented in future work. The error term σ_{μ, z} converts the uncertainty in redshift due to measurement errors and peculiar velocities into an error in the distance of the SN as

$\begin{equation} \sigma _{\mu ,z} = \frac{5}{\ln (10)}\frac{1+z}{z(1+z/2)}\sqrt{\sigma _z^2 + (v_{{\rm pec}}/c)^2}, \end{equation} \tag{ 11 }$

with σ_z as redshift error and v_pec as the typical amplitude of the peculiar velocity of the SN, which we take as 300 km s⁻¹ (Lampeitl et al. 2009; Kessler et al. 2009a). Within the fitting procedure, the observed heliocentric redshift is used for K-corrections and anything related to light-curve fitting, and this is incorporated into the σ_{μ, i} error. The σ_z error in Equation (11) is only used for the Hubble diagram fit. While not including the effect of the uncertainty in redshift on the light-curve fit is logically incorrect (see Conley et al. 2011), our results are not affected by this omission since all the redshifts are determined spectroscopically with an error that is negligible for the purposes of this analysis. This will become much more important when using photometric redshift information for the SNe, as will be the case with LSST, for example.

In general, light-curve models such as SALT2 (Guy et al. 2007) or MLCS2k2 (Jha et al. 2007) are used to fit fluxes in various bands and time epochs to obtain a distance modulus. The two light-curve models are based on different approaches and hence make different assumptions about the underlying SN properties. In general, one might also include a systematic error due to differences in distance modulus from using different light-curve fitters as discussed in Kessler et al. (2009a). However, given that we are fitting the light curves using only the MLCS2k2 model in this analysis, and as we are interested in the relative improvement of constraints when applying BEAMS, we ignore this constant systematic error without loss of generality.

2.3. Forms of the Non-SN Ia Likelihood

The general form of the non-SN Ia likelihood will be complicated since there are several sub-populations. Given the limited number of non-SN Ia in the SDSS-II SN data set, however, we will model it with a single mean and a dispersion. If one chooses to describe a population using only a mean and a variance, statistically the least informative (maximum entropy) choice of the probability distribution function (pdf) in this case is also a Gaussian (Jaynes & Bretthorst 2003),

$\begin{equation} P(\mu _i|\boldsymbol{\theta }, \tau _i = 0) = \frac{1}{\sqrt{2\pi }s_{\mathrm{tot},i}}\exp \left(-\frac{(\mu _i - \eta (z_i,\boldsymbol{\theta }))^2}{2 s_{\mathrm{tot},i}^2}\right). \end{equation} \tag{ 12 }$

As we do not know the mean η and variance s²_{tot, i} of the non-SN Ia population, we describe them with additional parameters. We will keep the parameterization of the mean very general (see below), but for the variance we restrict ourselves to the same form as for the SNe Ia, Equation (10), but with a potentially different intrinsic dispersion s²_τ described by an independent parameter (again with a Jeffreys' prior). We assume that the measurement errors and the contribution from the peculiar velocities enter in the same way for SN Ia and other SNe and so keep these terms identical.

We do not know what to expect for the mean of the non-SN Ia pdf, and so we allow for a range of possibilities. As the brightness is linked to the luminosity distance through Equation (7), we describe the expected non-SN Ia distance modulus (as provided by the light-curve fitter) as a deviation from the theoretical value, $\eta (z,\boldsymbol{\theta })=\mu (z,\boldsymbol{\theta })\,{+}\,\Upsilon (z)$ , where we consider the following Taylor expansions of the difference as a function of redshift:

$\begin{eqnarray} \Upsilon (z) = \eta (z,\boldsymbol{\theta })-\mu (z,\boldsymbol{\theta }) &\propto :& \sum ^3_{i=0} (a_i z^i)/(1+dz){.} \end{eqnarray} \tag{ 13 }$

We consider the cases where we set different combinations of the parameters (a_i, d) to zero and employ a criterion based on model probability to decide which of these functions to use. We note that the explicit link of $\eta (z,\boldsymbol{\theta })$ to $\mu (z,\boldsymbol{\theta })$ carries a risk that the non-SN Ia likelihood can bias the posterior estimation of the cosmological parameters. For this reason we verify that the contours do not shift when we set directly $\eta (z,\boldsymbol{\theta })=\Upsilon (z)$ , although we will need a higher-order expansion in general (and of course the recovered parameters of the function ϒ(z) will change). In general, as long as the basis assumed has enough freedom to fit the deviation in distance modulus of the non-SN Ia population from the SN Ia model, the inferred cosmology will not be biased.

It is also worth noting that the distances for the non-SN Ia population are obtained assuming that the data are in fact from an SN Ia distribution (this is an implicit assumption within the MLCS2k2 fitting approach, for example). Thus, corrections have been applied to the distances within the fitting algorithm, and one would in general obtain a different value for the distance if one assumed instead that the data were drawn from the non-SN Ia population. For a cosmological analysis, we just marginalize over the values of the parameters in ϒ(z), but these parameters contain information on the distribution of non-SN Ia, and thus their posterior is of interest as well, allowing us to gain insight into the distribution characteristics on the non-SN Ia population at no additional "cost."

The simple binomial case considered here, where the non-SN Ia population consists of all types of core-collapse SNe, is probably too simplistic to accurately describe the distribution of non-SN Ia SNe. In general, one could include multiple populations, one for each SN type, which would yield a sum of Gaussian terms in the full posterior. In addition, the forms describing the distance modulus of the non-SN Ia population are chosen to minimize the cosmological information from the non-SNe Ia (we always test for a deviation from the cosmological distance modulus); however, the parameterization of the non-SN Ia distance modulus could be improved by investigating the distance modulus residuals from simulations, as the major contributions to the distance modulus residuals appear to be the core-collapse luminosity functions, along with the specific survey selection criteria and limiting magnitude (see Falck et al. 2010). While current SN samples do not include a large sample of non-SN Ia data to test for this, larger data sets (such as the data from the BOSS SN survey) will allow for a detailed analysis of the number (and form) of distributions describing the contaminant population.

2.4. Markov Chain Monte Carlo Methods

In this work, the BEAMS algorithm is implemented within a Markov Chain Monte Carlo (MCMC) framework, and the Metropolis-Hastings (Metropolis et al. 1953) acceptance criterion was used. We use the cosmological parameters

$\begin{equation} \lbrace \Omega _m, \Omega _\Lambda , H_0\rbrace \end{equation} \tag{ 14 }$

in the case of the χ² approach on the spectro and cut samples described below, and we add additional parameters

$\begin{equation} \lbrace A, \sigma _\tau , s_\tau , \boldsymbol{a}\rbrace \end{equation} \tag{ 15 }$

in the case of the BEAMS application. The parameters $\boldsymbol{a} = \lbrace a^0, a^1, a^2\rbrace ; d = a^3 = 0$ in Equation (15) are for the quadratic model; in the other models for ϒ(z) we adjust the parameters accordingly. The chains were in general run for around 100,000 steps per model; this was sufficient to ensure convergence. We test for convergence using the techniques described in Dunkley et al. (2005). We impose positivity priors on the energy densities of matter and dark energy and impose a flat prior on the Hubble parameter between 20 < H₀ < 100 km s⁻¹ Mpc⁻¹. The Hubble parameter is marginalized over, given that we do not know the intrinsic brightness of the SNe but through the distance modulus are only sensitive to the relative brightness of the SNe. We impose broad Gaussian priors on the parameters of the non-SN Ia likelihood function and step logarithmically in the probability normalization parameter A the intrinsic dispersion parameters of both the SN Ia and non-SN Ia distributions.

2.5. Comparison to Standard χ² Methods

The primary difference between BEAMS and current methods is that the latter either require that all data are spectroscopically confirmed or apply a range of quality cuts based on selection criteria. In this paper, we will compare the performance of BEAMS to these two approaches, by processing the data that pass the required selection criteria using the SN Ia likelihood, Equation (6). We will hereafter refer to this as the χ² approach.

We use the following samples:

1.
Spectro sample. The sample containing only spectroscopically confirmed SNe. In addition to spectroscopic confirmation, we will also apply a cut on the goodness-of-fit probability from the light-curve templates within the MLCS2k2 model, P_fit > 0.01, and a cut on the light-curve fitter parameter Δ > −0.4, where Δ is a parameter in the MLCS2k2 model describing the light-curve width–luminosity correlation. MLCS2k2 was trained using the range −0.4 < Δ < 1.7 (Jha et al. 2007); hence, we restrict the sample to Δ > −0.4, which is a typical cut in current SN surveys, and so we introduce the cut to provide comparison between data sets. We process this spectro sample using the χ² approach.
2.
Cut sample. This larger sample is selected by both removing 5σ outliers from a moving average fit to the Hubble diagram including both photometric and spectroscopically confirmed data and applying a cut to the sample, including only data with a high enough probability, P_typer > 0.9 (where the probability comes from a general SN typing procedure, such as PSNID, described in Sako et al. 2008, 2011). We choose to use the PSNID probabilities to make the probability cut on the sample (P_typer > 0.9); if the MLCS2k2 probabilities had themselves been used to make a cut sample, then objects would only be included if they had probabilities greater than, for example, P_fit > 0.9. In addition, we impose a cut on the goodness of fit of the light-curve data to the SN Ia typer (χ²_lc < 1.8), a cut on the goodness-of-fit probability from the light-curve templates within the MLCS2k2 model (P_fit > 0.01), and a cut on Δ > −0.4. In this cut sample case we then use the standard χ² cosmological fitting procedure on the sample and so set the SN Ia probability of all points to one.
3.
Photo sample. This sample is the one to which BEAMS will be applied and will include all the photometric data with host galaxy redshifts. As in the previous two cases, we include only data that have P_fit > 0.01, Δ > −0.4.

Note that the spectro sample will be included in all three samples described above. While the spectro and cut samples have by definition $P_\mathrm{\rm Ia} = 1$ (as they are analyzed in the χ² approach), we do not set the probabilities to unity when applying BEAMS to the full sample—the spectro subsample within the larger photo sample will be treated "blindly" by BEAMS. The spectro sample is the one most similar to current cosmological samples and will be used to check for consistency in the derived parameters between BEAMS applied to the photo sample and the χ² approach on the spectro sample.

3. DATA SETS

We apply BEAMS to three data sets. First, we generate an ideal simulated data set where the input SN Ia and non-SN Ia model for distance modulus are known and all data are simulated as Gaussian distributions around this model. The second level of simulations are generated from SNANA (Kessler et al. 2009b) as light curves and then fit using MLCS2k2 (Jha et al. 2007), based on an SDSS-II-like data set. The third level is the SDSS-II SN Survey photometric data with host-z from 2005 to 2008. The various data sets are described below.

3.1. Level I: Gaussian Simulations

To test the BEAMS algorithm explicitly, we need a completely controlled sample, where all variables (such as the non-SN Ia model, SN Ia probabilities) are directly known and where we can verify that the algorithm is able to recover them correctly. In addition, we use this data set to check that we recover the correct shape of the non-SN Ia distance modulus η(z) since the true $\eta (z, \boldsymbol{\theta })$ is known for this sample only. We simulate a population of 50,000 SNe, with redshifts drawn from a Gaussian distribution, $z \sim \mathcal {N}(0.3,0.15),$ and distance moduli drawn from a flat ΛCDM universe with $(\Omega _m, \Omega _\Lambda , H_0, w_0, w_a) = (0.3, 0.7, 70, -1, 0)$ . The non-SN Ia population includes a contribution to the distance modulus, $\eta (z,\boldsymbol{\theta }) = \mu (z,\boldsymbol{\theta }) + a^0+ a^1z + a^2z^2$ , where we choose (a⁰, a¹, a²) = (1.5, 1, −3). We assign P_Ia probabilities from a model dN/dP_Ia = A₁P_Ia + A₂P²_Ia, with A₁ = −0.9, A₂ = 1.9. We then assign the types from the two samples (of SNe Ia and non-SNe Ia), i.e, we choose a random number t and if t < P_Ia (i.e., the type also follows the same linear relationship as the probability) we take the data point to be a Ia, and if t > P_Ia we assign it as a non-SN Ia, until we run out of data points from either sample. This procedure reduces the sample size from 50,000 to 37,529, merely due to the fact that one continues to choose SNe until you run out of one population (which depends on the relative distribution chosen for the probabilities P_Ia).

We assign a "measurement error" to each distance modulus of σ_μ = 0.1; add an intrinsic error σ_τ = 0.16 and a peculiar velocity error based on Equation (10), with v_pec = 300 km s⁻¹. We then randomly scatter to the data points based on the total error bar. To mimic what happens in a light curve fitter, only the measurement error is recorded, however. When performing parameter estimation on the points we either add this measurement error in quadrature to the other terms whose amplitudes are fixed (in the case of the χ² approach), or we estimate the magnitudes of the intrinsic dispersion when we apply the BEAMS algorithm. We randomly choose 10% of the SN Ia data and assign spectro status; this represents the data that are followed up by large telescopes on the ground. This spectro sample is drawn so that we can compare the BEAMS-estimated result to the χ² approach on a smaller sample. The data are shown in Figure 2. In the BEAMS analysis we checked on a small number of simulated samples that the results obtained were unbiased—a full Monte Carlo simulation of bias is beyond the scope of this work.

**Figure 2.** Level I Gaussian data: 37,529 points simulated according to Gaussian distributions around a distance modulus in a flat ΛCDM model for the SN Ia population (25,000 points) and with extra terms up to quadratic order in redshift for the non-SN Ia population. The points are colored according to their simulated probabilities from blue (low probability, color online) to dark brown (high probability, color online).
Download figure:
Standard image High-resolution image

3.2. Level II: SNANA Simulations

The previous Gaussian simulation is generated in order to test the algorithm for any intrinsic biases in the analysis procedure. In order to apply BEAMS to a more realistic scenario, we use the Supernova Analysis package (SNANA; Kessler et al. 2009b) to simulate a mixed sample of SNe Ia and non-SN Ia contaminants and to include realistic survey characteristics. SNANA contains both a simulation module to generate light-curve data and a light-curve fitter that includes the MLCS2k2 model we used in this work (both SALT2 and MLCS2k2 are contained within the SNANA package). This sample provided a useful procedure to test the BEAMS algorithm, where the final distribution of distance moduli is not explicitly given, but rather arises from the generation of SN data from light-curve templates and the fitting of those templates with standard light-curve fitters. The simulation specifications were chosen based on the SDSS-II SN survey characteristics. A sample of 62,441 SNe were simulated between redshifts 0.02 < z < 0.5, assuming a ΛCDM $(\Omega _m, \Omega _\Lambda) = (0.3, 0.7)$ cosmology. The simulation was generated using the same characteristics as in the Supernova Photometric Classification Challenge (SNPCC; see Kessler et al. 2010a, 2010b for the simulation specifications), where the non-SN Ia simulation is based on 42 spec-confirmed non-SN Ia light curves. The cosmology cuts of P_fit > 0.01, Δ > −0.4 reduce the sample from 62,441 to 35,815. Most of the SNe that are cut from the sample are from the non-SN Ia sample; only 17% of the final sample are non-SN Ia. A large spectro sample of 13,826 SNe were flagged as spectroscopic within SNANA; however, we reduce this sample to a smaller spectro sample of 1467 SNe (roughly ≃ 10% of the full spectroscopic SN Ia sample). The simulation was generated using an efficiency correction based on the full sample of SNe Ia, including photometric and spectroscopic candidates. Hence, the smaller spectro sample was taken from the full set of all SNe Ia in the sample (i.e., it is not a subset of those flagged as spectroscopically confirmed) so as not to introduce an efficiency bias.

The data are corrected for a small redshift-dependent bias in the fitted distance modulus. This correction is determined by comparing the MLCS2k2 fitted modulus to the input distance modulus in 10 redshift bins and varies by up to 2%. The data are shown in Figure 3. The distance modulus residuals of the SNANA simulation are binned in Figure 4, illustrating that the non-SN Ia population is in general not merely a single Gaussian family. In this case the single Gaussian assumption in BEAMS will not be entirely accurate; however, this sub-structure is not yet observed in current data, which have not included large populations of non-SNe Ia, and hence we leave the multinomial description for future work (see Appendix B for a discussion of pitfalls in photometric cosmology and future outlook). In addition, as can be seen from the top panel of Figure 4, the SN Ia distance moduli are approximately Gaussian. One can check whether allowing for non-zero skewness and kurtosis through, for example, a saddlepoint distribution improves the fit of the data (particularly in the tail regions).

**Figure 3.** Level II SNANA simulations: 35,815 SNANA simulated data points generated from a ΛCDM concordance cosmology and fitted with efficiency corrections as discussed in the text, which satisfy the conditions P_fit > 0.01; Δ > −0.4. The data contain 30,623 SN Ia, and we define a smaller subset (1467 SNe Ia) as a *spectro* subsample, to mimic current data. Same as in Figure 2, the points are colored by probability from low (blue points, color online) to high (dark brown points, color online).
Download figure:
Standard image High-resolution image

**Figure 4.** Distance moduli of the Level II SNANA simulation: the top panel shows the 35,815 SNANA normed histogram of the distance modulus residuals (the difference of the distance modulus of each point relative to the input ΛCDM cosmology) for the SNe Ia only, with the Gaussian fit to the residuals. The skewness and kurtosis of this distribution are 0.46 and 0.11, respectively. The bottom panel shows the normed histogram for the Type 2 (SN Types IIn, IIP, IIL) and Type 3 (SN Types Ib, Ic) populations (the SNANA simulation yields populations generated according to many subtypes, as specified in Kessler et al. 2010a, 2010b; here, we consider two broader classes for visual purposes). The dashed line shows the Gaussian distribution fitted to the mean and standard deviation of all the non-SN Ia types combined. In this case, the simple sum of two Gaussian distributions, one for the SNe Ia and one for the non-SN Ia points, is no longer adequate in describing the simulated model. In particular (as can also be seen from Figure 2), the simulation predicts a population of non-SNe Ia brighter than the SN Ia population (negative Δμ), which is not seen in the SDSS-II SN sample, as stringent cuts are typically made to the data before obtaining a Hubble diagram.
Download figure:
Standard image High-resolution image

3.3. Level III: SDSS-II SN Photometric Data

The Sloan Digital Sky Survey Supernova Search operated for three, three-month long seasons during 2005–2007. We use the photometric SNe from all three seasons of the SDSS-II SN survey, which also had host galaxy redshifts from the SDSS survey. The analysis and cosmological interpretation of the first season of data (hereafter Fall 2005) are described in Frieman et al. (2008), Kessler et al. (2009a), Lampeitl et al. (2009), and Sollerman et al. (2009). The SDSS CCD camera is located on a 2.5 m telescope at the Apache Point Observatory in New Mexico. The camera operated in the five Sloan optical bands ugriz (Fukugita et al. 1996). The telescope made repeated drift scans of Stripe 82, a roughly 300 deg² region centered on the celestial equator in the Southern Galactic hemisphere, with a cadence of roughly four to five days, accounting for problems with weather and instrumentation.

The images were scanned and objects were flagged as candidate SNe (Sako et al. 2008). Candidate light curves were compared to a set of SN light-curve templates in the g, r, i bands (consisting of both core-collapse and SNe Ia) as a function of redshift, intrinsic luminosity, and extinction. Likely SN Ia candidates were preferentially followed up with spectroscopic observations of both the candidates and their host galaxies (where possible) on various larger telescopes (see Sako et al. 2008).

In addition to the spectroscopically confirmed SNe Ia discovered in the SDSS-II SN, many high-quality candidates without spectroscopic confirmation (i.e., only photometric observations were made of the SNe) but which, by chance, have a host galaxy spectroscopic redshift are present in the SDSS sample (see Figure 5).²²

**Figure 5.** Level III SDSS-II SN data: the photometric sample of the full three seasons of SDSS-II SN survey. The 792 points are all those with host galaxy spectroscopic redshifts. The sample includes 297 spectroscopically confirmed SNe and is color coded using probabilities from the PSNID typer (Sako et al. 2008, 2011) from low (blue, color online) to high (dark brown, color online).
Download figure:
Standard image High-resolution image

We include these SNe in both the cut sample and the spectro samples in the full photo sample but do not set the probabilities of these points to unity. These SNe are fit with the MLCS2k2 model (Jha et al. 2007) to obtain a distance modulus for each SN, assuming that the SN is a Type Ia, in the same way as the Level II SNANA simulations.

As outlined in Section 2.5, we impose the standard selection cuts on the probability of the fit to the MLCS2k2 light-curve template P_fit > 0.01 and Δ > −0.4 to all data and require that the data used have spectroscopic host galaxy redshift information. Applying these cuts to the full three-year data yields a photometric sample of 792 SNe, with a spectroscopic subsample of 297 SNe. The spectro sample consists of the objects that have been spectroscopically confirmed by other ground-based telescopes, while the cut sample consists of the data points that have a typer probability of P_typer > 0.9 and a goodness of fit to the light-curve templates within the PSNID typer (Sako et al. 2008, 2011), χ²_lc < 1.8.

4. APPLICATION OF BEAMS

4.1. Performance of BEAMS Comparisons across Data Sets

The BEAMS approach can be compared to standard χ² techniques, namely, the χ² approach applied to subsets of the data set resulting from cuts. For the Level I Gaussian simulation, we define the spectro sample as a randomly selected sample of 10% of the points we know to be of Type Ia. This is to match expected future efficiencies of spectroscopic confirmation; one will always be comparing the performance of BEAMS, which takes account of contamination within the algorithm, on a larger photometric sample against a χ² approach that does not directly control for contamination, on a smaller but more pure sample. We compare the constraints using the three level data sets and various approaches in Figure 6.

Table 1. Summary of Data Sets: The Redshift Distribution and Sample Sizes of the Data Sets Compared in Figure 6

Data Set	Level I	Level II	Level III
	Gaussian	SNANA	SDSS-II SN
	Sim	Sim	Data
Redshift range	(0.02, 0.9)	(0.02, 0.45)	(0.02, 0.45)
Total photo sample size	37529	35815	792
No of spectro points	2500	1467	297
No of cut points	7130	10967	191
No of Ia points	25000	30623	Unknown

Notes. The Level I Gaussian simulation and constraints are shown in the top row of Figure 6, the Level II SNANA simulation is shown in the middle panel, and the Level III SDSS-II SN data are shown in the bottom row of Figure 6, in which case the true numbers of SNe Ia in the sample are unknown.

Download table as: ASCII Typeset image

In each case, the BEAMS algorithm applied to the data gives the tightest constraints that are also consistent with the input cosmology (in the case of simulations) and the spectroscopic sample (in the case of the data). In the case of the Level I Gaussian simulation (since there is no light-curve fitting procedure in this simulation), the cut sample is taken to be all points with probability P_Ia > 0.9, while for the Level II SNANA simulation, this is taken as all points that satisfy the basic cuts (such as the cut on the Δ parameter), and which also satisfy P_Ia > 0.9 and the goodness-of-fit χ²_lc < 1.8. Note that the cut on the goodness of fit is not particularly conservative. In general, the more conservative the cut, the less biased the contours become. However, this is at the cost of the size of the contours, which increase, thereby losing the statistical power of the large sample. The curve corresponding to the spectroscopic subset we define as "unbiased" since they by definition are the contours that would result in a contemporary analysis.

A small (≃ 1σ) bias is visible in the recovered cosmology in the case of the Level II SNANA simulations. This is potentially due to a combination of factors. First, a single Gaussian is used to model the distance modulus of the non-SN Ia population, which we can see from Figure 4 is not an accurate description of the core-collapse population within the SNANA simulation. This same substructure is not seen in the data, and hence at present a single population is sufficient to model the contaminant population.

This assumption will be relaxed when applying BEAMS to the larger SN sample within the BOSS data, which is left to future work. In addition, an efficiency correction for Malmquist bias was made within the MLCS2k2 fitting procedure, based on the sample of SN Ia data. We do not expect the Malmquist bias of the SN Ia supernova sample to be the same as that of the non-SN Ia data; this issue will be addressed in detail in future work and is essential for future large photometric surveys. An additional source of bias could be due to incorrectly assuming a Gaussian likelihood for the SN Ia (and non-SN Ia) populations, as this would bias all cosmological analyses. We leave investigation of non-Gaussian likelihoods to future work.

As is shown in Figure 6, BEAMS recovers the input cosmology of the simulations and estimates parameters consistent with the spectro sample in the case of the Level III SDSS-II SN data. Moreover, the BEAMS contours are three times smaller than when using the spectro sample alone. In the Level II SNANA simulation the contours are ≃ 40% the size of the spectro sample, while in the case of ideal Level I Gaussian simulations, the BEAMS contours using all the points are ≃ 16% the size of the spectro sample. This highlights the potential of photometric SN cosmology to drastically reduce the size of error contours with larger samples while remaining unbiased relative to the "known" spectroscopic case. If one adds a low-z sample (such as the standard one used in the SDSS-II SN analysis presented in Kessler et al. 2009a), the relative shrinking is only slightly degraded: the BEAMS contours are 2.7 times smaller than the contours from the spectroscopic-only analysis, rather than three times smaller when using only the SDSS-II SN data. If one also adds in a high-z spectroscopic sample, the contours become of similar size whether or not one uses BEAMS; however, this is due to the fact that the sample is now dominated by the high-z arm of the Hubble diagram and due to the increase in the total number of spectroscopic SNe in the sample.

4.2. Scaling of Error Bars

As discussed in Kunz et al. (2007) for the one-dimensional case, the effective number of SNe that result when applying BEAMS scales as the number of spectroscopic SNe and the average probability of the data set multiplied by the remainder of the photometric sample, $\sigma \rightarrow \sigma /\sqrt{N_\mathrm{spec} + \langle P_{\mathrm Ia} \rangle N_\mathrm{photo}}.$ In the two-dimensional case, the square root would be removed as the area of the ellipse scales with the increase in the effective number of SNe. In our applications we have, however, not used the fact that we know that some points are confirmed as Type Ia. In other words, the probability of each data point was taken from the light-curve fitter and was not adjusted to one or zero depending on the known type. Hence, we expect the size of the contours in the i − j plane to scale as

$\begin{equation} C_{ij}^{1/2} \rightarrow \frac{C_{ij}^{1/2}}{\langle P_\mathrm{Ia}\rangle N_\mathrm{photo}}. \end{equation} \tag{ 16 }$

We compute the size of the error ellipse for various Level I simulations as a function of the size of the simulation, shown in Figure 7, for one particular model of the probabilities, and hence one value of 〈P〉. We impose a positivity prior on the densities, and hence the ellipses are not closed for smaller samples, as the large uncertainties mean that we probe the hard prior boundary. For large enough sample sizes the ellipse is closed and we observe that the error ellipses scale in area as ∝1/〈P_Ia〉N, which is consistent with earlier results (Kunz et al. 2007). In general then, one would obtain a different constant factor 〈P〉 in Figure 7 for different simulated probability distributions.

Large SN surveys not only will increase the total number of SN Ia candidates but will allow one to investigate systematics about the SN populations directly. The BEAMS algorithm is designed to include and adapt to information about the non-SN Ia population easily. By adapting the form of the non-SN Ia population and including more than one population group, one could use BEAMS to gain insight into the contaminant distribution. One caveat to this is that we consider here only the statistical error and ignore the systematic floors that exist to effects such as calibration of the distance modulus data, or possible redshift-dependent effects in the non-SN Ia population.

4.3. Constraining ϒ(z) Forms for the Non-SN Ia Population

4.3.1. Level I Gaussian Simulation

The Gaussian simulation described in Section 2.3 uses a quadratic model for the differences between the standard ΛCDM μ(z) and the non-SN Ia distance modulus. We test here that assuming a different functional form while performing parameter estimation does not significantly bias the inferred cosmology. We define the effective χ² as $-2\ln \mathcal {L},$ where the posterior $\mathcal {L}$ is a linear sum of the terms in Equation (5), and provide values relative to the simplest linear model for ϒ(z). In Figure 8, we show that BEAMS is reasonably insensitive to the assumed form of the non-SN Ia likelihood, provided that it is allowed enough freedom to capture the underlying model. A linear model fails to recover the correct cosmology, as it does not have enough freedom to recover the difference between the SN Ia and non-SN Ia distribution. It correspondingly has a very high χ² relative to the other approaches. The higher-order functions recover consistent cosmologies, and the χ² of these models improves by Δχ² < 0.5, even though the models have increased the number of parameters by one.

**Figure 8.** Different ϒ(z) distributions for the non-SN Ia likelihoods: 2σ constraints in the $\Omega _m,\Omega _\Lambda$ plane for different versions of the non-SN Ia distance modulus function, for the Level I Gaussian simulation (top panel), the Level II SNANA simulation (middle panel), and the Level III SDSS-II SN data (bottom panel). In the case of the Level I simulation, we simulated a quadratic model and ran BEAMS assuming a linear, quadratic, cubic, and Padé form for ϒ(z), as described in Section 2.3. For the other two cases the underlying distribution for the non-SN Ia distance modulus is not known analytically. For these cases, we test for the same models as for the Level I simulation; the legend is the same for all panels. In the Level I Gaussian simulation, the linear model does not have enough freedom to capture the non-SN Ia distribution (as expected, since the input model was a quadratic function). This behavior is also seen in the Level II SNANA simulations. The Level III SDSS data do not have a particular preference of the form used, as the number of SNe in the sample is not large enough to constrain the non-SN Ia population. The goodness of fit of the distributions to the data is summarized in Tables 2–4 for the Level I, II, and III cases, respectively.
Download figure:
Standard image High-resolution image

**Figure 8.** Different ϒ(z) distributions for the non-SN Ia likelihoods: 2σ constraints in the $\Omega _m,\Omega _\Lambda$ plane for different versions of the non-SN Ia distance modulus function, for the Level I Gaussian simulation (top panel), the Level II SNANA simulation (middle panel), and the Level III SDSS-II SN data (bottom panel). In the case of the Level I simulation, we simulated a quadratic model and ran BEAMS assuming a linear, quadratic, cubic, and Padé form for ϒ(z), as described in Section 2.3. For the other two cases the underlying distribution for the non-SN Ia distance modulus is not known analytically. For these cases, we test for the same models as for the Level I simulation; the legend is the same for all panels. In the Level I Gaussian simulation, the linear model does not have enough freedom to capture the non-SN Ia distribution (as expected, since the input model was a quadratic function). This behavior is also seen in the Level II SNANA simulations. The Level III SDSS data do not have a particular preference of the form used, as the number of SNe in the sample is not large enough to constrain the non-SN Ia population. The goodness of fit of the distributions to the data is summarized in Tables 2–4 for the Level I, II, and III cases, respectively.
Download figure:
Standard image High-resolution image

Table 2. Comparison of Non-SN Ia Likelihood Models for Level I Gaussian Simulation: χ² Values for the Fits Using Various Forms of the Non-SN Ia Likelihood for the Level I Simulations, Where the True Underlying Model Is a Quadratic

Model	Δχ²_eff^a	Parameters
ϒ(z) = az + c	0	2

ϒ(z) = az + bz² + c	−192.9	3

ϒ(z) = az + bz² + cz³ + d	−193.3	4

ϒ(z) = (az + bz² + c)/(1 + dz)	−193.4	4

Notes. The constraints on Ω_m, Ω_Λ are shown in Figure 8. ^aDifference in the effective χ² between a given model and the linear case, which has χ²_eff = 42, 526.2.

Download table as: ASCII Typeset image

Table 3. Comparison of Non-SN Ia Likelihood Models for Level II SNANA Simulation: χ² Values for the Fits Using Various Forms of the non-SN Ia Likelihood for the Level II Simulations, Where the True Underlying Model Is Unknown

Model	Δχ²_eff^a	Parameters
ϒ(z) = az + c	0	2

ϒ(z)^b = az + bz² + c	−135.1	3

ϒ(z) = az + bz² + cz³ + d	−287.2	4

ϒ(z) = (az + bz² + c)/(1 + dz)	−206.1	4

Notes. The constraints on Ω_m, Ω_Λ are consistent for all models of second order or higher. ^aDifference in the effective χ² between a given model and the linear case, which had χ²_eff = 10092.6. ^bIn the quadratic case, the best-fit values for the non-SN Ia distribution parameters were (a⁰, a¹, a², σ_τ, s_τ) = (1.99, −13.99, 20.32, 0.06, 0.93).

Download table as: ASCII Typeset image

Table 4. Comparison of Non-SN Ia Likelihood Models for Level III SDSS-II SN Data: χ² Values for the Fits Using Various Forms of the Non-SN Ia Likelihood for the SDSS-III Data

Model	Δχ²_eff^a	Parameters
ϒ(z) = a¹z + a⁰	0	2

Parameters (a¹, a⁰)	(− 5.3, 1.7)

ϒ(z) = a¹z + a²z² + a⁰	−1.3	3

Parameters (a¹, a², a⁰)	(− 9.69, 10.73, 2.1)

ϒ(z) = a¹z + a²z² + a³z³ + a⁰	−2.9	4

Parameters (a¹, a², a³, a⁰)	(0.59, −4.97, 9.98, 1.64)

ϒ(z) = (a¹z + a²z² + a⁰)/(1 + dz)	−0.2	4

Parameters (a¹, a², a⁰, d)	(− 33.3, 83.4, 1.43, −19)

Notes. While the χ² decreases as the number of parameters increases, it does not decrease significantly given the amount of freedom in the higher-order models. The constraints on Ω_m, Ω_Λ from these fits are shown in Figure 8. The parameter values are the mean of the one-dimensional likelihood for the model parameters. This form also appears to be consistent with the SNANA simulated data (see Figure 3). ^aDifference in the effective χ² between a given model and the linear case, which has χ²_eff = 1215.3.

Download table as: ASCII Typeset image

4.3.2. Level II: SNANA Simulations

In the case of the Level I Gaussian simulated data, the explicit form of the ϒ(z) was specified as quadratic and then various other forms for ϒ(z) were fit for within the BEAMS approach. In general, any model with enough freedom (of equal or higher order to the input model) managed to recover the input cosmology. We apply this test to the Level II SNANA simulations, where the data are generated from fitting generated light-curve data to templates. Naively, one might not expect a simple model to fit the data. In general, we find that while a cubic model performed the best at fitting the data (it had the lowest χ²_eff relative to the linear model), the inferred cosmology in the quadratic case is consistent with the input cosmology. Fitting the Level II SNANA simulated data with a linear model led to a large bias in the inferred cosmology, as shown in Figure 8.

4.3.3. Level III: SDSS-II Data

In the case of the Level III SDSS-II SN data, we shall let the data inform us of the best choice of model for the non-SN Ia distribution. In an observed sample of non-SN Ia data, the theoretical distance modulus depends on their apparent magnitudes, which in turn depend on redshift and survey limiting magnitude and the absolute magnitudes of the non-SN Ia population, which are drawn from an unknown luminosity function. In the large SN sample limit, we will learn about the distribution of those SNe that are not from the SN Ia distribution; however, we treat them here as nuisance parameters that we marginalize over, rather than using a "hard-coded" empirical relation. Table 4 shows the various forms of the non-SN Ia distribution considered and the χ² of the fit.

The data seem consistent with a quadratic model, and the constraints do not change significantly for any assumed form, as shown in Figure 8. It is clear from the limited amount of data in the current SDSS-II SN sample that a complicated form is unjustified at present; however, this will be tested as the number of SN candidates increases with the SDSS-II SN with host redshifts from BOSS (and future large photometric surveys such as LSST and DES).

4.4. BEAMS Posterior Probabilities

BEAMS uses probability information from photometric SN Ia candidates in the likelihood to determine cosmological parameters. In this section, however, we illustrate in addition that BEAMS can test whether a given set of probabilities are biased and can allow for uncertainty in the probabilities themselves while computing cosmological constraints.

4.4.1. Methodology

The probability P_i is a prior probability from the earlier fitting and mapping process on whether or not the specific data belong to the Type Ia population of SNe. Using accurate probabilities within the BEAMS framework leads to the greatest reduction in the size of the parameter contours, while controlling for bias, as is shown in Figure 6. But one might ask, can BEAMS itself recover probability information from the data? The answer, as we will discuss below, is yes. Indeed, BEAMS posterior probabilities can be used as a check for bias in the input probabilities of the data. As described in Kunz et al. (2007), however, one can promote P_i to a free variable, and its posterior distribution then contains information on how well it fits the SN Ia or then non-SN Ia class of SNe. If we leave just P_i for one i free and fix all other parameters, then the posterior becomes

$\begin{eqnarray} &&P(\boldsymbol{\theta },P_i|\boldsymbol{\mu }) \propto \left(\prod ^{N}_{j\ne i} P(\boldsymbol{\theta }|\mu _j)\right) \left\lbrace P(\mu _i|\boldsymbol{\theta }, \tau _i=1)P_i \right.\nonumber \\ \left.+ P(\mu _i|\boldsymbol{\theta }, \tau _i =0)(1-P_i)\right\rbrace ,&& \nonumber \\ \end{eqnarray} \tag{ 17 }$

where $P(\boldsymbol{\theta },P_i|\mu _j)$ is the posterior at the fixed parameter vector $\boldsymbol{\theta }$ containing both the cosmological parameters and the additional parameters describing the non-SN Ia population (Equations (14) and (15)) for all SNe except i. The above expression is just a straight line going from $\prod ^{N}_{j\ne i} P(\boldsymbol{\theta }|\mu _j)P(\mu _i|\boldsymbol{\theta }, \tau _i =0)$ at the intercept P_i = 0 to the value $\prod ^{N}_{j\ne i} P(\boldsymbol{\theta }|\mu _j)P(\mu _i|\boldsymbol{\theta }, \tau _i =1)$ at P_i = 1. In general, we do not fix the parameters but sample from the full posterior and then marginalize over everything except P_i. This results again in the posterior for P_i being a straight line.

To extract the model probabilities corresponding to SN i being of Type Ia or not, as opposed to the posterior distribution of the parameter P_i, we take recourse to the Savage–Dickey density ratio (Dickey 1971; G hat u nel & Dickey 1974): In nested models the relative model probability (in favor of the more simple model) is the ratio of the posterior divided by the prior at the nested point. The two models $\mathcal {M}_1$ = "Ia" and $\mathcal {M}_2$ = "not Ia" are not nested, but we can use a trick by extending our model space with a third model $\mathcal {M}_3$ = "P_i free." Then the two models are nested in that third model at the points P_i = 1 and P_i = 0, respectively. Therefore, the relative probabilities $B^{(i)}_{31}=P(\mathcal {M}_3)/P(\mathcal {M}_1)$ and $B^{(i)}_{32}=P(\mathcal {M}_3)/P(\mathcal {M}_2)$ for SN i can be extracted from an MCMC chain with free probability P_i, by looking at the end points of the normalized posterior for P_i, marginalized over all other parameters. Given the discussion above on the shape of the posterior of P_i, what we do in practice is to fit a straight line to the distribution of P_i values of an MCMC chain in which we left P_i free. The values at the end points give directly B₃₁ and B₃₂. The relative probability between models $\mathcal {M}_1$ and $\mathcal {M}_2$ is now simply B⁽ⁱ⁾₁₂ = B⁽ⁱ⁾₃₂/B⁽ⁱ⁾₃₁.

To which value should we set the probabilities that we keep fixed? A natural possibility would be to use the output of a prior typing stage, but this choice involves the risk that the prior probabilities could be biased. Instead, we could use P = 1/2 to convey the minimal amount of extra information. In this case we should also use the A parameter to allow for an automatic correction of different total numbers of SNe of different types. This choice has another advantage: as shown in Section IVA of Kunz et al. (2007), we get effectively P = 1/2 if we marginalize over a fully free P, so this is also the choice where we let all P_j values float freely and marginalize over all but P_i. For this reason we will use P = 1/2 together with a free global A in the remainder of this section.

4.4.2. Toy-model Illustration of Posterior Probabilities

Let us illustrate the meaning of the posterior probabilities that we expect to find if BEAMS works with a simple toy model: we assume that we are dealing with two populations (let us call them "red" and "blue") drawn from two normal distributions with means at ±θ and equal variances of σ² = 1 (see the top panel of Figure 9).

**Figure 9.** Posterior probabilities: the top panel provides an illustration of the two toy distributions, in the case of θ = 0.5, 1.0, 2.0 (left to right). The bottom panel shows the probability histogram density plots, or number of "red" points with a given probability, where dN^(r)(P) is given in Equation (22) for θ = 0.5, shown as the blue (color online) line that slopes to zero at high and low probability, 1, shown as the red (color online) curve that peaks at high probability, and 2, indicated by the yellow (color online) line that peaks at low probability.
Download figure:
Standard image High-resolution image

We use this toy model specifically to highlight the most important elements in estimating the posterior probabilities, in the case where the populations are similar (e.g., they have equal variances), and to highlight the necessity of the normalization/rate parameter A. We will also see that unbiased probabilities imply that a small peak at low probability for the SN Ia or high probability for the non-SN Ia is actually "right" and is what we should expect.

The equality of the variances of the two populations means that we are measuring the distance Δ = 2θ between the two mean values in units of the standard deviation. We also allow for different numbers of points drawn from the red/light and blue/dark Gaussians through a "rate parameter" ρ ∈ [0, 1] that gives the probability to draw a red point. If we draw N points in total, we will then have on average ρN red points and (1 − ρ)N blue points. The likelihood for a set of points {x_j}, with j running from 1 to N, is then

$\begin{equation} P(\lbrace x_j\rbrace |\theta) = \prod _{j=1}^N \frac{1}{\sqrt{2\pi }} \left(P e^{-\frac{1}{2}(\theta -x_j)^2} + (1-P) e^{-\frac{1}{2}(\theta +x_j)^2} \right) \end{equation} \tag{ 18 }$

for P = ρ.

To simplify the analysis, we assume that we are dealing with large samples so that θ is determined to high precision, with an error much smaller than σ. In this case (and since this is a toy model), we can take the parameter θ fixed. We also note that if we are running this in BEAMS with a true prior probability P = ρ, then we would find a normalization parameter A = 1, while for P = 1/2 we would obtain A = ρ/(1 − ρ), and we again assume that this parameter can be fixed to its true value. Then it is easy to see that if we leave the probability for point i, P_i, free, we find a Bayes factor

$\begin{equation} B=\frac{P(\lbrace x_j\rbrace |\theta , \tau _i=1)}{P(\lbrace x_j\rbrace |\theta ,\tau _i=0)} = \frac{e^{-\frac{1}{2}(\theta -x_i)^2}}{e^{-\frac{1}{2}(\theta +x_i)^2}} = e^{2\theta x_i}. \end{equation} \tag{ 19 }$

In other words, ln (B) = x_iΔ, just the value of the data point times the separation of the means. If the point is exactly in between the two distributions x_i = 0, then B = 1, i.e., its BEAMS posterior probability to be red or blue is equal. Note that the answer is independent of the value of ρ and this happens because we have removed any influence of the rate parameter A on the free P_i. This means that if we want to think of the BEAMS posterior probability as the probability to be red or blue, we should update the Bayes factor with A, i.e., use $\tilde{B} = B A$ , with an associated probability $P=\tilde{B}/(1+\tilde{B})$ . We also see that the probability to be red increases exponentially as x_i increases. As we will see below, this reflects the fact that the number of red points relative to the blue points increases in the same way. The rapidity of this increase is governed by the separation, Δ, of the two distributions.

What is the distribution of the posterior probabilities, i.e., the histogram of probability values, and what determines how well BEAMS does as a typer in this example? The number of red points in an interval [x, x + dx] is just given by the "red" probability distribution function at this value, times dx. To plot this function in terms of P, we also need

$\begin{eqnarray} x(P)&=&\frac{\ln (B)}{\Delta }=\frac{\ln (P/(1-P))}{\Delta } \end{eqnarray} \tag{ 20 }$

$\begin{eqnarray} \frac{\it dP}{dx} &=& {\Delta P}{(1-P)}. \end{eqnarray} \tag{ 21 }$

The probability histograms for the red (r) and blue (b) points, normalized to ρ and 1 − ρ, respectively, then are

$\begin{eqnarray} {\it dN}^{(r)}(P) &=& \frac{\rho }{\sqrt{2\pi }\Delta } \frac{\it dP}{P(1-P)} \nonumber \\ && \times \exp \left\lbrace -\frac{1}{2} \left(\frac{\ln [P/(1-P)]}{\Delta } - \theta \right)^2 \right\rbrace\quad\quad \end{eqnarray} \tag{ 22 }$

$\begin{eqnarray} {\it dN}^{(b)}(P) &=& \frac{1-\rho }{\sqrt{2\pi }\Delta } \frac{\it dP}{P(1-P)} \nonumber\\ && \times \exp \left\lbrace -\frac{1}{2} \left(\frac{\ln [P/(1-P)]}{\Delta } + \theta \right)^2 \right\rbrace.\quad\quad \end{eqnarray} \tag{ 23 }$

We plot dN^(r)/dP/ρ for θ = 0.5, 1, and 2 in the lower panel of Figure 9. We see how the values become more concentrated around P = 1 for larger separation of the distributions, i.e., BEAMS becomes a "better" typer. It is worth noting that apart from the factors of ρ, (1 − ρ), Equations (22) and (23) are exchanged when we replace P with 1 − P (i.e., the expressions are reversed for the blue population relative to the red population, and so the blue population will have a peak at P = 0 and the red population a peak at P = 1). But for very large separations there are also suddenly more SNe Ia at low P (yellow curve, color online). By the symmetry mentioned earlier, the SNe Ia will now also have a small peak at P = 0, and correspondingly we expect a peak in the non-SNe Ia at P = 1. The reason is that BEAMS does not try to be the best possible typer; instead, it respects the condition that the probabilities have to be unbiased, in the sense that

$\begin{equation} \frac{{\it dN}^{(r)}}{{\it dN}^{(b)}} = \left(\frac{P}{1-P} \right) \left(\frac{\rho }{1-\rho } \right) = {\it B A} = \tilde{B}. \end{equation} \tag{ 24 }$

Since BEAMS only uses the information coming from the distribution of the values, its power, as reflected in the distribution of probability values dN(P), is given by how strongly the distributions are separated. If they are identical (θ = 0), then BEAMS can only return P = 1/2, while for larger θ there is a stronger preference for one type over another. But given the two populations, we can in principle derive the probability histogram by just looking at the ratio of data points of either type at each point in data space; there is nothing else BEAMS can do. Also, in order for the probabilities to be unbiased (up to the rates that are taken into account by A), if there are, say, 200 red points in the P = 0.9 bin and only 10 in the P = 0.8 bin, then we need to find about two blue points in the P = 0.8 bin, but 20 in the P = 0.9 bin. Although this looks like a significant misclassification problem, it is just a reflection of Equation (24) and is actually the desired behavior.

4.4.3. Application to Level II

In order to check whether BEAMS is able to produce posterior probabilities with the expected properties, we ran it on the Level II SNANA simulation for constant P = 1/2 prior probabilities, allowing for a free A. We plot the two probability histograms in the upper panel of Figure 10, for probabilities that were updated with the posterior value of the rate parameter, A = 0.33. The cuts on the Δ parameter and removing all points with probability P_fit < 0.01 reduce the number of non-SNe Ia in the sample at low probability, while there is quite a spread in the probabilities of the SN Ia. The right-hand panel shows the BEAMS posterior probabilities—the normalization parameter A allows BEAMS to adjust the high-probability non-SN Ia to low probability and to sharpen the SN Ia probabilities around high probability.

**Figure 10.** Posterior probabilities within BEAMS for the Level II SNANA simulation. The top left panel shows the histogram of the MLCS2k2 fitter probabilities for the 35,815 SNe in the SNANA simulation, while the top right-hand panel shows the histogram for the same data, where the probabilities are taken as the BEAMS-posterior estimated probabilities obtained using the A parameter. In both cases, the probabilities are separated according to the true (input) type of the points, either SN Ia or non-SN Ia. There are few non-SN Ia points in the quality-controlled Level II SNANA simulation, but a reasonably small number of low-probability non-SN Ia, compared to the BEAMS posterior probabilities on the same data. This is illustrated in the bottom panel, where we plot the ratio of SN Ia to non-SN Ia in probability bins for the MLCS2k2 probabilities (black crosses) and BEAMS posterior probabilities (gray dots) compared to the theoretical expectation ln (B) = ln (P/(1 − P)) (orange curve and error bars, color online). For example, the dot for the 90% bin will lie on the theoretical curve if indeed 90% of the SNe with P_Ia ≈ 0.9 are of actually Type Ia, and the equivalent for the other bins. The fact that the histogram of BEAMS posterior probabilities (top right panel) shows that more non-SNe Ia are assigned low probability explains why there is less bias at low probability in the bottom panel.
Download figure:
Standard image High-resolution image

The bottom panel of the figure shows the ratio of the two histograms in the upper panel to test whether we recover Equation (24). Given that B is the ratio of the number of SN Ia to non-SN Ia points in each bin, we have that ln (B) = ln (N_Ia) − ln (N_nonIa), where the numbers are for each bin, and explicitly N_tot = N_Ia + N_nonIa per bin. We can determine the error on ln (B) using the following prescription. In each probability bin, we take the probability P and the total SN number N_tot to be fixed, in which case the SN types are distributed according to a binomial distribution, with the mean number of SNe Ia given by N_totP and variance $\sigma ^2_{N_\mathrm{Ia}}(P) = N_\mathrm{tot}P(1-P).$ Similarly, the non-SNe Ia have mean N_tot(1 − P) and the same variance. Using propagation of errors, we arrive at an expression for the variance on B = N_Ia/(N_tot − N_Ia):

$\begin{equation} \sigma ^2_B(P) = \frac{PN_\mathrm{tot}^2}{((1-P)N_\mathrm{tot})^3}. \end{equation} \tag{ 25 }$

Therefore, the error on ln (B) is given through

$\begin{equation} \sigma ^2_{\ln (B)}(P) = \sigma ^2_B(P)\left(\frac{d\ln (B)}{\it dB}\right)^2 = \frac{1}{P(1-P)N_\mathrm{tot}}. \end{equation} \tag{ 26 }$

The errors are shown around the theoretical model in Figure 10.

4.4.4. Approximate Methods

The procedure to extract the posterior probabilities as outlined above is rather slow, as we need to run a full MCMC analysis for each SN. This is only so because we evaluate the posterior for P_i given all other probabilities fixed to their mapped values. Leaving all the probabilities free would lead to a very high dimensional and complex posterior that would be very hard to sample from. However, since we are working at any rate in the limit of at most weak correlations between SNe, we can leave free a subset n of the P_i simultaneously.

It is better if n is much smaller than the total number of SNe, and not too large in any case, for example, n = 10. In addition, the more uncorrelated the P_i, the easier it is to sample from the total posterior. But in this way we speed the process up by a factor of n, making it more tractable for large sets. Additionally, since the runs are independent, they can be performed on a large computer in parallel, so that even large SN samples can be analyzed with the computational resources available to the typical astrophysics department (for example, we ran the Level II SNANA analysis above without this trick in just a day on the local cluster). A quicker method of determining the posterior probabilities is obtained by taking the ratio of the probability that a data point i is from the SN Ia type relative to the probability that it is a non-SN Ia. If, instead of marginalizing over all other parameters, we evaluate the posterior at the maximum likelihood point for the cosmological parameters, where $\boldsymbol{\theta} = \boldsymbol{\theta ^*},$ the ratio is simply given by

$\begin{equation} \frac{P(\mu _i|\boldsymbol{\theta ^*},\tau _i = 1)}{P(\mu _i|\boldsymbol{\theta ^*},\tau _i = 0)}. \end{equation} \tag{ 27 }$

We compare the approach to the computationally intensive one discussed above in Figure 11 as a function of true SN type for the Level II SNANA simulation. In general, the probabilities are consistent, especially in the case of the SNe Ia, with no SN Ia being given a high probability in the full approach and being given a low probability using the ratio of maximum likelihoods. The mean and standard deviation of the distribution of residuals between the two approaches are δP_Ia = 0.005 ± 0.015. As such, the approximate method provides a robust check of the full approach, as differences in the probabilities are mostly related to convergence properties of the full estimation. As might be expected, non-SNe Ia that are given a high probability using the full method are also given a high P_Ia using the approximate method.

5. TESTS AND CHECKS FOR BIAS

5.1. Dependence on Error Accuracy

The full error on the distance modulus is given in Equation (11)—where the error combines measurement error (from light-curve fitting), intrinsic dispersion (from the absolute magnitude distribution of the SNe), and peculiar velocity error. In general, the peculiar velocity error is degenerate with the non-SN Ia distribution characteristics in that the velocity error tends to increase the errors at low redshift. However, the intrinsic dispersion of the non-SN Ia effectively controls the spread in the distribution, which we know to increase at low redshift. Hence, fitting for the velocity and intrinsic dispersion together can lead to one being unconstrained. We test for the dependence of the cosmological results on this effect in the Level I Gaussian simulation only, where the input simulated data model is completely understood. In the cosmological analysis we set the peculiar velocity term to be set by v_pec = 300 km s⁻¹ (Lampeitl et al. 2009); however, doubling v_pec to 600 km s⁻¹ does not change the inferred cosmology. When we allow v_pec to be free, we find it unconstrained by the data, with the minimum value saturating the lower bound of the prior of log (v_pec/c) = −20, and the maximum given by v_pec < 1922 km s⁻¹.

5.2. Dependence on SN Ia/Non-SN Ia Rates

An additional complication to the probabilities is the dependence of the probabilities on redshift. This redshift dependence occurs as a result of the fact that the signal-to-noise ratio changes as a function of redshift and the effective rest-frame filters used to type SNe. The relative numbers of SNe at a given redshift depend on the various SN rates (or number of explosions per year per unit volume). In general, non-SN Ia rates are less certain than SN Ia rates, since SNe are mainly followed up in a cosmological survey if they already appear to be good SN Ia candidates. As a test of this dependence, we modify the probabilities in two ways: first, we scale the true probabilities as a function of redshift as

$\begin{equation} P_{{\rm Ia},z} = \mathrm{min}\left(P_{\mathrm{\rm Ia}}\frac{1+z}{1+z_{{\rm max}}},1\right), \end{equation} \tag{ 28 }$

which increases the probability of being SN Ia of data at higher redshift. In this case the fact that there is no redshift dependence in A itself introduces a slight bias in the inferred cosmology, as is shown in Figure 12, with the input cosmology only recovered at 2σ (the filled 1σ and 2σ contours from the linear unbiased case are shown for comparison). An alternative way of probing this dependence is by artificially changing the relative numbers of SNe Ia to non-SNe Ia in a given simulation. We do this by simply removing a subset of the SN Ia data (where the data are binned in 10 redshift bins) after assigning probabilities to ensure that we are effectively biasing the probabilities—or rather, that the probability of being a Type Ia at a given redshift will not reflect how many SNe Ia actually exist at that redshift. This case is also shown in Figure 12 for the Level I Gaussian simulation and the Level II SNANA simulation, where the contours are slightly larger than the standard case (given that data are removed) but are consistent with the input cosmology at 1σ.

**Figure 12.** Dependence of the probabilities on SN rate: the 2σ contours in the Ω_m, Ω_Λ plane when considering two different methods of introducing a redshift dependence on the probabilities for the Level I Gaussian simulation (top panel) and the Level II SNANA simulation (bottom panel). The brown (color online) contours show the case where the P_Ia probabilities were explicitly changed to depend on redshift through Equation (28). The black curves illustrate the case where the overall probabilities are left unchanged, but the number of SNe Ia relative to the non-SNe Ia is changed as a function of redshift.
Download figure:
Standard image High-resolution image

5.3. Dependence on Probability

The BEAMS algorithm naturally uses some indication of the probability of a data point to belong to the SN Ia population, whether it is some measure of the goodness of fit of the data to an SN Ia light-curve template or something more robust such as the relative probability that the point is an SN Ia compared to the probability of being of a different type. By including a normalization factor, we can correct for general biases in the probabilities of the SN Ia points. One might still question, however, how sensitive BEAMS is to the input probability of the objects. For the Level I Gaussian simulation, where we assign the probabilities, P_Ia, directly, we can change the relationship between the true underlying distribution of the types (i.e., the ratio of SNe Ia to non-SNe Ia in the sample) and the input probability value (the number we input into the BEAMS algorithm as the P_Ia). If the probabilities are unbiased, then the distribution of types should follow the probability distribution of the data; in other words, 60% of the points with P_Ia = 0.6 should be SNe Ia. This is the standard case. We then modify the probabilities by assigning a probability of P_Ia = 0.3 to all points (which we know will be biased since the mean probability of the sample is 0.667).

We compare the constraints in the two cases in Figure 13. If we ignore all probability information and set it to a (biased) value of P_Ia = 0.3, the probability information is essentially controlled by the normalization parameter. A tends to a value of 4.7, which when inserted into Equation (4) yields a "normalized" probability of P_Ia = 0.668. Hence, BEAMS uses the normalization parameter to remap the mean of the given probabilities to ones that have a mean that fits the true unbiased probabilities. In correcting for this effect, BEAMS manages to recover cosmological parameters consistent with the unbiased case.

For the Level II SNANA simulation, the true underlying probability distribution is more complicated, and so we test for dependence on probability in a different way. The SNANA simulation mimics real-life observations in that it treats the simulated light curves as "real" data and fits them in the same way one would fit and analyze current data. The main bias from this data set will be any bias introduced in the probabilities of the data to be of Type Ia, since we have no guarantee a priori that the probabilities will be unbiased. We thus fit for the cosmology assuming different proxies for the probability, either taking the probability from the light-curve fitter alone, P_fit, or setting the probabilities to an arbitrary value of P = 1/2. In the Level III SDSS-II SN data, we add in a case where an additional, typer probability P_typer (Sako et al. 2008, 2011) is used, which computes the relative goodness of fit of different SN Ia and non-SN Ia templates to the data. In the case of the Level III SDSS-II SN data, the average probability obtained using the PSNID fitting procedure is 〈P_typer〉 = 0.79. In this case the normalization parameter A peaked around 0.3, resulting in a normalized average probability of 〈P^(A)_{Ia, i}〉 = 0.525. In the case where the MLCS2k2 goodness-of-fit probabilities are used, the average probability of being an SN Ia is lower, 〈P_fit〉 = 0.41. In this case the A parameter is distributed around A = 25, leading to 〈P^(A)_{Ia, i}〉 = 0.95—BEAMS tries to increase the average probability of all points to be close to unity. Finally, when setting the probability to 0.5, A is centered around 2.5, leading to 〈P^(A)_{Ia, i}〉 = 0.71. Typing SNe effectively is an active area of research (Johnson & Crotts 2006; Kuznetsova & Connolly 2006; Connolly & Connolly 2009; Poznanski et al. 2006; Rodney & Tonry 2009, 2010; Sako et al. 2008, 2011; Scolnic et al. 2009; Newling et al. 2011b); indeed, a recent community-wide challenge provided a way of testing the ability of various approaches to type SN efficiently (see Kessler et al. 2010a, and references therein). With more data and improved algorithms, the probabilities used in photometric SN analysis will greatly improve. As Figure 14 illustrates, however, BEAMS can use the minimal amount of probability information and recover consistent results.

**Figure 14.** Using different probabilities for the Level II SNANA simulations and the Level III SDSS-II SN data. Top: the blue filled 1σ, 2σ contours (color online) show the constraints when setting the probabilities of all points to P = 0.5, while the orange dashed curves (color online) show the 2σ contours when using the goodness-of-fit probabilities from the MLCS2k2 fitter for the Level II SNANA simulation. Bottom: the solid 1σ, 2σ blue contours (color online) are from ignoring probability information and setting all the probabilities of the points to P_Ia = 0.5 for the Level III SDSS-II SN data. The light curves (2σ constraints) result when using the MLCS2k2 goodness-of-fit probability, which is un-normalized relative to the other types and is typically low for the sample. The dark purple contours (color online) are from using the probabilities for each point from the PSNID prescription (Sako et al. 2008, 2011), also at 2σ. In both cases, the effect is a shift of <1σ in the inferred cosmological contours.
Download figure:
Standard image High-resolution image

6. CONCLUSIONS AND OUTLOOK

BEAMS is a statistically robust method of parameter estimation in the presence of contamination. The key power of BEAMS is in the fact that it makes use of all available data, hence reducing the statistical error of the measurement, whether or not the purity of the sample can be guaranteed. Rather than discarding data, the probability that the data are "pure" is used as a weight in the full Bayesian posterior, reducing potential bias from the interloper distribution. We summarize the paper as follows:

1.
We tested the BEAMS algorithm on an ideal Gaussian simulation of 37,529 SNe, consisting of one population of non-SNe Ia and one SN Ia population. We showed that the area of the contours in the Ω_m, Ω_Λ plane, when using BEAMS, is six times smaller than when using only a small spectroscopic subsample of the data. In addition, we showed that the size of the error ellipse using BEAMS decreases as Equation (16).
2.
We tested the BEAMS algorithm on a more realistic simulated sample of 35,815 SNe obtained from light-curve fitting (Kessler et al. 2009b), which includes many more than two populations of non-SN Ia, and discussed the validity of the single non-SN Ia population assumed in this version of the BEAMS algorithm. A simple two-parameter model fails to completely describe the distribution; however, the constraints using BEAMS are not significantly biased.
3.
We applied BEAMS to the SDSS-II SN data set of 792 SNe, using photometric data points with host galaxy spectroscopic redshifts, and showed that the BEAMS contours are three times smaller than when using only the spectroscopically confirmed sample of 297 SNe Ia.
4.
In both the "realistic" and Gaussian simulations, we assume a variety of models for the distance modulus distribution of the non-SN Ia population and test for a dependence of the inferred cosmology on the assumed form of the distance modulus function. BEAMS requires a model with enough freedom to capture the behavior as a function of redshift: functions of quadratic and higher order are required to fit the Level II SNANA simulations, while no strong preference is seen for any particular model using the SDSS-II SN sample.
5.
We investigated possible biases introduced through incorrect probability or rate information, or error accuracy, showing that BEAMS can correct for the biases when suitable nuisance parameters were marginalized over.
6.
We discussed the ability of BEAMS to determine the posterior probability of a point based on its fit to the best-fit model. Posterior probabilities estimated through BEAMS more accurately model the relative probabilities of the SN types.

As mentioned above, we have restricted ourselves to the binomial case of an SN Ia population and one general core-collapse, or non-SN Ia, population. While this assumption is valid for the SDSS-II SN data, we see that a more complicated model, with at least two separate non-SN Ia Gaussians, is more appropriate for the Level II SNANA simulations (see Figure 4). This remains to be confirmed with large photometric data sets, such as BOSS. The BEAMS method can easily be extended to the multinomial case, as we learn more about the distributions of the contaminant populations—this will be performed in the upcoming BEAMS analysis of the SDSS SNe with host redshifts obtained through the BOSS survey. We list some general lessons for photometric SN cosmology with BEAMS in Appendix B.

In addition to the binomial approximation for the likelihoods, we have also assumed that host galaxy redshifts are known for all SNe. One can include photometric redshift error by looping N times, once per SN, and marginalizing over the redshift of the ith SN in each case (in a similar fashion as the method of obtaining the post facto estimation of the SN Ia probability of each SN in Section 4.4). The sensitivity to redshift error can also be included directly through fitting for cosmology directly in "light-curve" space (see, for example, March et al. 2011). However, we leave this general treatment of redshift uncertainty in BEAMS to future work.

In the specific case of the SDSS-II Supernova Survey, we will apply BEAMS to the larger SN sample with spectroscopic host galaxy redshifts from the BOSS survey. Not only will the sample size of data points with accurate host redshifts increase (thereby further reducing the cosmological contours from photometric data), but the larger sample of non-SNe Ia will allow one to easily calibrate the constraints obtained with the current application of BEAMS against the case where one only has photometric redshifts, which will be the case for LSST (Gong et al. 2010).

With the wealth of new photometric data awaiting SN cosmology, BEAMS provides a platform to learn more about the SN populations, while at the same time tackling the fundamental questions about the constituents of the universe.

We thank Michelle Knights for comments on the draft. R.H. thanks Jo Dunkley, Olaf Davis, David Marsh, Sarah Miller, and Joe Zuntz for useful discussions and thanks the Kavli Institute for Cosmological Physics, Chicago, the South African Astronomical Observatory, the University of Cape Town, and the University of Geneva for hospitality while this work was being completed. M.K. thanks AIMS for hospitality during part of the work. R.H. acknowledges funding from the Rhodes Trust and Christ Church. M.K. acknowledges funding by the Swiss NSF. B.B. acknowledges funding from the NRF and DST. Part of the numerical calculations for this paper were performed on the Andromeda cluster of the University of Geneva. M.V. was supported by National Research Foundation grant 78870.

Funding for the SDSS and SDSS-II has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, the U.S. Department of Energy, the National Aeronautics and Space Administration, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council for England. The SDSS Web site is http://www.sdss.org/. The SDSS is managed by the Astrophysical Research Consortium for the Participating Institutions. The Participating Institutions are the American Museum of Natural History, Astrophysical Institute Potsdam, University of Basel, University of Cambridge, Case Western Reserve University, University of Chicago, Drexel University, Fermilab, the Institute for Advanced Study, the Japan Participation Group, Johns Hopkins University, the Joint Institute for Nuclear Astrophysics, the Kavli Institute for Particle Astrophysics and Cosmology, the Korean Scientist Group, the Chinese Academy of Sciences (LAMOST), Los Alamos National Laboratory, the Max-Planck-Institute for Astronomy (MPIA), the Max-Planck-Institute for Astrophysics (MPA), New Mexico State University, Ohio State University, University of Pittsburgh, University of Portsmouth, Princeton University, the United States Naval Observatory, and the University of Washington.

APPENDIX A: PROBABILITY CORRELATIONS

As we highlighted in Section 4.4, the normalization parameter A is crucial to normalize the probabilities in order to avoid biases introduced by the fitting or typing procedure. Moreover, it provides a mechanism for removing the strong dependence on probability directly, by allowing BEAMS to adjust the probabilities according to the global fit to a distance modulus function. In Figure 15, we show that there are no strong correlations between the SN Ia probability determined from the SNANA fits to the light-curve models and the difference between the input cosmology and the inferred cosmological model. In general, there is more spread in the non-SN Ia population; however, there are some non-SN Ia points with high probability, and there are low-probability SN Ia data. We leave the investigation of different forms for A to future work.

**Figure 15.** Distance-modulus–probability correlations in the Level II SNANA simulations: while a correlation exists between the SN Ia probability (determined from the χ² fit to the light-curve models within SNANA) and SN type—"true" SNe Ia have a higher probability of being SN Ia—there is no significant correlation between the residual distance modulus of the SNANA simulation and the P_Ia probability.
Download figure:
Standard image High-resolution image

APPENDIX B: BEAMS TROUBLESHOOTING

Analysis of purely photometric data brings its own challenges to the fore. We briefly highlight some considerations when applying the algorithm to such data.

1.
In order for BEAMS to recover the correct cosmology, the freedom to capture the characteristics of the non-SN Ia (and indeed the SN Ia) distributions is required. In particular, we found that the error analysis has a significant impact on the inferred cosmology, both within BEAMS but equally for a basic χ² approach. Our addition of two different intrinsic dispersion terms for the SN Ia and non-SN Ia populations effectively changes the relative weighting of the populations in a consistent manner, while still taking into account the measurement error on each point, which may or may not be a function of type.
2.
When applying fitting procedures, such as MLCS2k2 to the data set, efficiency maps (to account for Malmquist bias, for example) should be carefully calibrated not to introduce redshift-dependent biases to the data set. Alternatively, the BEAMS likelihood can be adjusted from a standard Gaussian to a truncated or deformed Gaussian distribution to account for the selection bias in the survey. We leave this investigation for future work.
3.
As the number of observations of the contaminants increases, new forms of the non-SN Ia distance modulus function may be more strongly motivated by the data. While we have tested various forms for simulated SNANA data and for the SDSS-II survey data, these functions should be varied to allow the model enough freedom to capture the deviations from the standard SN Ia distance modulus relation. In addition, future data may indicate a preference for multiple populations, a feature that is easily included in BEAMS.

PHOTOMETRIC SUPERNOVA COSMOLOGY WITH BEAMS AND SDSS-II

Article metrics

Permissions

Author e-mails

Author affiliations

Dates

ABSTRACT

1. INTRODUCTION

2. THE BEAMS FRAMEWORK FOR PHOTOMETRIC SN COSMOLOGY

2.1. Basic Formalism

2.2. The Likelihood Distribution for SN Ia

2.3. Forms of the Non-SN Ia Likelihood

2.4. Markov Chain Monte Carlo Methods

2.5. Comparison to Standard χ² Methods

3. DATA SETS

3.1. Level I: Gaussian Simulations

3.2. Level II: SNANA Simulations

3.3. Level III: SDSS-II SN Photometric Data

4. APPLICATION OF BEAMS

4.1. Performance of BEAMS Comparisons across Data Sets

4.2. Scaling of Error Bars

4.3. Constraining ϒ(z) Forms for the Non-SN Ia Population

4.3.1. Level I Gaussian Simulation

4.3.2. Level II: SNANA Simulations

4.3.3. Level III: SDSS-II Data

4.4. BEAMS Posterior Probabilities

4.4.1. Methodology

4.4.2. Toy-model Illustration of Posterior Probabilities

4.4.3. Application to Level II

4.4.4. Approximate Methods

5. TESTS AND CHECKS FOR BIAS

5.1. Dependence on Error Accuracy

5.2. Dependence on SN Ia/Non-SN Ia Rates

5.3. Dependence on Probability

6. CONCLUSIONS AND OUTLOOK

APPENDIX A: PROBABILITY CORRELATIONS

APPENDIX B: BEAMS TROUBLESHOOTING

Footnotes

PHOTOMETRIC SUPERNOVA COSMOLOGY WITH BEAMS AND SDSS-II

Article metrics

Permissions

Share this article

Author e-mails

Author affiliations

Dates

ABSTRACT

1. INTRODUCTION

2. THE BEAMS FRAMEWORK FOR PHOTOMETRIC SN COSMOLOGY

2.1. Basic Formalism

2.2. The Likelihood Distribution for SN Ia

2.3. Forms of the Non-SN Ia Likelihood

2.4. Markov Chain Monte Carlo Methods

2.5. Comparison to Standard χ2 Methods

3. DATA SETS

3.1. Level I: Gaussian Simulations

3.2. Level II: SNANA Simulations

3.3. Level III: SDSS-II SN Photometric Data

4. APPLICATION OF BEAMS

4.1. Performance of BEAMS Comparisons across Data Sets

4.2. Scaling of Error Bars

4.3. Constraining ϒ(z) Forms for the Non-SN Ia Population

4.3.1. Level I Gaussian Simulation

4.3.2. Level II: SNANA Simulations

4.3.3. Level III: SDSS-II Data

4.4. BEAMS Posterior Probabilities

4.4.1. Methodology

4.4.2. Toy-model Illustration of Posterior Probabilities

4.4.3. Application to Level II

4.4.4. Approximate Methods

5. TESTS AND CHECKS FOR BIAS

5.1. Dependence on Error Accuracy

5.2. Dependence on SN Ia/Non-SN Ia Rates

5.3. Dependence on Probability

6. CONCLUSIONS AND OUTLOOK

APPENDIX A: PROBABILITY CORRELATIONS

APPENDIX B: BEAMS TROUBLESHOOTING

Footnotes

2.5. Comparison to Standard χ² Methods