Brought to you by:

The following article is Open access

New Mass Estimates for Massive Binary Systems: A Probabilistic Approach Using Polarimetric Radiative Transfer

, , , , , , and

Published 2022 May 5 © 2022. The Author(s). Published by the American Astronomical Society.
, , Citation Andrew G. Fullard et al 2022 ApJ 930 89 DOI 10.3847/1538-4357/ac589e

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

0004-637X/930/1/89

Abstract

Understanding the evolution of massive binary stars requires accurate estimates of their masses. This understanding is critically important because massive star evolution can potentially lead to gravitational-wave sources such as binary black holes or neutron stars. For Wolf–Rayet (WR) stars with optically thick stellar winds, their masses can only be determined with accurate inclination angle estimates from binary systems which have spectroscopic $M\sin i$ measurements. Orbitally phased polarization signals can encode the inclination angle of binary systems, where the WR winds act as scattering regions. We investigated four Wolf–Rayet + O star binary systems, WR 42, WR 79, WR 127, and WR 153, with publicly available phased polarization data to estimate their masses. To avoid the biases present in analytic models of polarization while retaining computational expediency, we used a Monte Carlo radiative-transfer model accurately emulated by a neural network. We used the emulated model to investigate the posterior distribution of the parameters of our four systems. Our mass estimates calculated from the estimated inclination angles put strong constraints on existing mass estimates for three of the systems, and disagree with the existing mass estimates for WR 153. We recommend a concerted effort to obtain polarization observations that can be used to estimate the masses of WR binary systems and increase our understanding of their evolutionary paths.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Massive stars enrich the interstellar medium with heavy elements via stellar winds and supernova explosions. Their high-mass remnants, such as neutron stars and black holes, have been detected with gravitational-wave observations (e.g., Abbott et al. 2016, 2017a). The evolutionary paths that massive stars take to reach different types of supernovae, or even core collapse without a luminous event, are strongly dependent on their masses and mass-loss rates (Puls et al. 2008; Langer 2012; Abbott et al. 2016; Woosley 2019).

A large fraction of sufficiently massive (18–80 M) individual stars are expected to evolve to the Wolf–Rayet (WR) stage (Sander et al. 2019), which is characterized by strong emission line spectra and high mass-loss rates (∼10−5 M yr−1; Crowther 2007). However, the optically thick winds of WR stars make it impossible to measure their masses using $\mathrm{log}g$. Fortunately, massive stars typically occur in binary systems (Mason et al. 2009; Sana et al. 2014) so it is possible to obtain mass estimates using orbital models. Massive binaries can evolve into compact binaries that emit gravitational waves (Abbott et al. 2017b). Aside from using binaries for mass estimation, binary interactions can have a strong effect on the pre-supernova masses and structure of massive stars (e.g., Laplace et al. 2021) and more than 50% of massive binaries interact during their lifetime (Sana et al. 2012, 2013). Mass transfer as part of these interactions reduces the mass at which stars can reach the WR stage (McClelland & Eldridge 2016). It is therefore important to constrain the masses of massive binary systems so that we can understand whether or not the WR components were formed as a result of binary interactions.

Only three massive Milky Way binaries with WR stars have masses derived using both spectroscopic and visual binary orbits: γ2 Velorum (Lamberts et al. 2017), WR 133 (Richardson et al. 2021), and WR 140 (Thomas et al. 2021). Fifteen Milky Way WR binary stars have mass estimates derived from spectroscopy, photometry, and polarimetry, although their uncertainties are not well constrained (Crowther 2007). Outside the Milky Way, some progress has been made to determine WR + O masses in the Magellanic Clouds using radiative-transfer models of WR and O star spectra (Shenar et al. 2016, 2019).

Measuring the masses of noneclipsing binary stars is a difficult task. The most robust method is to combine spectroscopic radial velocity orbital solutions that provide $M\sin i$ values with astrometric orbits to obtain the inclination angle of the system, and thus derive masses. Obtaining the inclination angle requires alternative solutions in the absence of a visual orbit. Methods that have been used to determine inclination angles without a visual orbit include photometric models that are reliant on wind eclipses (e.g., Lamontagne et al. 1996), stellar spectral synthesis models (e.g., Martins et al. 2005), models of colliding winds (e.g., Hill et al. 2000), and polarimetric models (e.g., Brown et al. 1978). These methods span a wide range of uncertainties, with the most precise being the photometric models (on the order of 50% formal uncertainty in WR mass; Lamontagne et al. 1996) and the least precise the models based on stellar spectral synthesis. The true uncertainties of the photometric models may be larger (see footnote 11 of Lamontagne et al. 1996). The stellar spectral synthesis method does not report uncertainties, but rather a likely mass range for the O star based on its spectral class, and that informs the inclination angle of the system to determine the WR mass in concert with previous inclination angle estimates (Vanbeveren et al. 1998, 2020). Furthermore, the methods do not consistently agree on the masses for a given WR + O system, exacerbated by the range of reported uncertainties (or lack thereof). For example, previous estimates for the mass of WR 79 from the photometric and polarimetric methods disagree by more than 3σ (Lamontagne et al. 1996). This motivates the more robust method that we describe in this paper.

The polarimetic method mentioned previously works by modeling the time-varying linear polarization signal that occurs when the hot winds of the WR star act as electron-scattering regions for photons produced both by the WR star and its companion (see Figure 1, showing an electron-scattering region and WR + O illumination sources). This produces a constant linear polarization from the WR star when the wind is asymmetric (Brown & McLean 1977). Illumination by the secondary star produces the time-varying linear polarization, which is partly dependent on the orbital inclination angle i (Brown et al. 1978), and this provides a way to measure inclination angles and thus masses. To recover accurate uncertainties for the inclination angle, we must determine the posterior distribution of the inclination angle through repeated model evaluations.

Figure 1.

Figure 1. Schematic diagram of the binary model. The diagram is presented such that the system is viewed at an inclination angle of 60° measured from the normal to the orbital plane. A definition of the model inclination angle i is shown in the bottom right. The stars are the labeled circles of radius r*; the dotted ellipse is the scattering region of electron scattering optical depth τ. The diagram is not to-scale (r*a). Other parameters are described in Section 3.1 and Table 2.

Standard image High-resolution image

There are two primary methods for modeling polarimetric signals: analytic methods and numerical radiative-transfer methods. Analytic models suffer from a variety of challenges, including a bias toward high inclination angles and limitations to optically thin scattering regions (τ < 0.01; Aspin et al. 1981; Brown et al. 1982; Simmons et al. 1982; Wolinski & Dolan 1994; Buchner 2019). In contrast, numerical radiative-transfer methods are not biased by inclination angle and can include the effects of high optical depths such as multiple scattering (e.g., Hoffman et al. 2003) though these simulations are computationally expensive. We have used an enhanced version of the Monte Carlo radiative-transfer (MCRT) code slip (Hoffman 2007; Huk 2017; Shrestha et al. 2018) to perform numerical simulations of binary stars in circular orbits, for cases where one star has a scattering region surrounding it. To obtain sufficient signal-to-noise for numerical polarimetric estimates, which are often much less than 1% of the total emitted intensity, slip requires large numbers of photons to be propagated through the simulation, slowing its evaluation times. To reconstruct the posterior distribution of the model parameters, we must evaluate the model many times. This motivates the adoption of acceleration methods.

One such method to accelerate models is that of emulators based on neural networks. Emulators relate the approximated model to its input parameters. Neural networks are capable of rapidly approximating any model (see, e.g., Cybenko 1989; Hornik et al. 1989). In our case, the trained neural network rapidly generates an approximated polarimetric slip model based on the input parameters. This means the emulator can be rapidly sampled to obtain many models and infer parameters using Bayesian statistics with maximum-likelihood testing for problems that lie within the range of the training data. This technique has been successfully applied to spectral models (Czekala et al. 2015; O'Brien et al. 2021; Kerzendorf et al. 2021).

In this paper, we use an enhanced version of the slip radiative-transfer code accelerated by an emulator to produce a robust probabilistic estimate of the inclination angles and thus masses of four WR + O binary systems in the Galaxy with existing polarimetric observations. In Section 2, we describe the polarimetric data and the WR + O systems. In Section 3, we describe our radiative-transfer model and emulator procedure. In Section 4, we present our results and discuss their implications. Appendix A describes the SLIP code in detail, and we validate our model in Appendix B. In Appendix C, we provide additional details about our emulator method. The full parameter spaces for our investigations of each system are presented in Appendix D.

2. Data

We investigated four WR + O binary systems with publicly available, phased polarization data. We sourced the data from St-Louis et al. (1987; WR 42 and WR 79) and St-Louis et al. (1988; WR 127 and WR 153). All of the objects were observed with the Minipol polarimeter at Las Campanas, Chile from 1985 to 1986 (Frecker & Serkowski 1976), using a blue filter with central wavelength 4700 Å and FWHM 1800 Å. Table 1 lists the objects and their estimated orbital characteristics. We refer to the systems via their WR catalog number as defined in Crowther (2015). 7

Table 1. WR + O System Spectral Types and Orbital Data

WRSpectral TypePeriod (d)Source i(°)SourceΩ(°)Source
42WC7 + O7V7.8912136–442, 363.1–72.86
79WC7 + O5-88.8911129–452, 3103.2–111.26
127WN5o + O8.5V9.5550255–90483.6–1037
153WN6o/CE + O3-66.6887365–78593.5–95.47

Note. Spectral types are taken from Crowther (2015).

References. 1: Hill et al. (2000); 2: Lamontagne et al. (1996); 3: Hill et al. (2002); 4: de La Chevrotière et al. (2011); 5: Demers et al. (2002); 6: St-Louis et al. (1987); 7: St-Louis et al. (1988).

Download table as:  ASCIITypeset image

These systems have similar orbital periods (∼1 week) and circularized orbits. They are split into two each of WC- and WN-type WR stars with O-type companions. We used the orbital phases calculated by St-Louis et al. (1987, 1988) despite the determination of newer ephemerides (e.g., Nazé et al. 2021), because any uncertainties in the ephemerides are greatly amplified by the large number of orbits of each system that have occurred since the polarimetric data were taken. The ephemerides in St-Louis et al. (1987, 1988) are closer in date to the observations, and thus the effects of uncertainties on the orbital phase are lowest.

3. Methods

For this investigation, we used an enhanced version of the MCRT code slip (Hoffman 2007; Huk 2017; Shrestha et al. 2018, 2021) that includes binary star capabilities (Fullard 2020). Appendix A gives more information about the design of slip. Using this code, we created a model system and trained a neural-network emulator in order to analyze the observations.

3.1. Model Binary System

Our model binary system is described using four parameters: Ifrac, τ, zell, and i. Figure 1 shows a schematic diagram of the model. Here, Ifrac is the fractional intensity of the central photon source (the WR star) relative to the total number of photons. Thus, the O star intensity is 1 − Ifrac. τ is the electron scattering optical depth of the scattering region at i = 90°. zell is the half-height of the ellipsoidal scattering region as measured normal to the orbital plane, in arbitrary units. xell and yell are the x and y extents of the scattering region and are held constant and equal to each other. If zell = xell = 1.5, the scattering region is spherical. If zellxell, the scattering region can be considered prolate or disk-like. i is the inclination (viewing) angle, measured from the normal to the orbital plane. The remaining parameters shown in Figure 1 are held constant and are described in Table 2. slip produces simulations of the fractional q and u polarization signal produced by the electron scattering in this model, normalized from the Stokes Q and U vectors as q = Q/I and u = U/I.

Table 2. Constant Model Parameters for the Emulator

ParameterValueDescription
r*4.65 × 10−4 Stellar radii
a 3.1Binary separation
xell, yell 1.5 x, y extent of the scattering region

Note. Lengths are displayed as relative values (see Figure 1).

Download table as:  ASCIITypeset image

The zell parameter should not be thought of as the true geometry of the scattering region. Rather, it is a measure of the asymmetry of the scattering region or the density distribution within it (e.g., a disk-like scattering region produces polarization signals similar to a spherical scattering region with equatorially enhanced density). We assume only electron scattering is an important contributor to the polarization signal, which is a reasonable assumption for the case of a highly ionized WR star wind. To reduce the number of parameters required for the model, we assume a constant electron density in the scattering region and a constant albedo of unity, as well as total ionization of the wind. The uniform density means that for scattering regions that are asymmetric, there is a nonlinear relationship between zell and τ as a function of i. The constant albedo of one means that the polarization signal is independent of wavelength.

We assume that the stars are sufficiently separated (parameter a) so that eclipse effects are negligible, which is appropriate for the range of expected inclination angles and periods of our chosen objects. Fox (1994) showed that occultation is only important in extremely close binary systems, where the separation of stars is less than 10 times the radius of the primary. In the case of WR 127, eclipsing inclination angles are unlikely as they would result in a mass for the system that is significantly lower than any other estimate (see comparisons and discussion in Section 4). An exploration of eclipse effects would require varying star-separation and star-size parameters, with a commensurate increase in the number of model runs required to train the emulator.

The sample space of our model, tabulated in Table 3, was chosen to represent the four WR + O binaries described in Section 2. The range of scattering region geometries encapsulated by zell transitions from disk-like (0.2 < zell < 1.5) through to prolate (1.5 < zell < 3.1) distributions. For the WR + O star binaries we have investigated in this paper, the scattering region approximates various geometries (or, equivalently, density distributions) of the WR star wind. The optical depth range covers scattering from optically thin material (the outer regions of WR star winds) to more optically thick material (closer to the location of emission lines that drive the wind; Sander et al. 2020). The inclination angle range is valid for Stokes Q and U polarization measurements, which can distinguish between above and below the orbital plane, or equivalently the orbital direction about the orbital plane normal vector.

Table 3. Sample Space of the Model

ParameterMinMaxDescription
Ifrac 0.01.0Fractional intensity of the WR star
τ 0.01.0Electron scattering optical depth
zell 0.23.1 z extent of the scattering region
i 0fdg0180fdg0Orbital inclination angle

Download table as:  ASCIITypeset image

3.2. Transforming to the Observational Frame

When a binary system is observed in polarized light, two additional factors arise that we do not model using slip. These are interstellar polarization (ISP) and the position angle of the binary system's orbit as projected onto the sky.

The ISP in the Milky Way at optical wavelengths is described by the empirical Serkowski law (Serkowski et al. 1975). We treat it as constant with time for our purposes, because the observations were taken over a timescale that is much shorter than the expected timescale of ISP change (∼30–50 Myr; Voshchinnikov & Hirashita 2014).

In our case of broadband filter polarimetry, we add ISP to the q and u models separately by calculating the qISP and uISP at the peak wavelength of each filter. This ISP estimate is therefore a flux-weighted ISP across the observed filter.

The observed position angle of the binary orbit manifests itself as a rotation of the observed polarization in the qu plane, and thus applying this rotation to the model moves it to the observational frame. The rotation is applied to the model using

Equation (1)

Equation (2)

where $p=\sqrt{{q}_{\mathrm{em}}^{2}+{u}_{\mathrm{em}}^{2}}$ (total polarization), θ is the position angle of the observation, and θR is the rotation angle of the model. Following Villar-Sbaffi et al. (2005), we recover the angle of the line of ascending nodes, Ω, using Ω = (θR + π)/2.

A final correction is needed to match the model's orbitally phased polarization signal to the observed phases. We project the model onto an orbital phase grid of 40 points, shifted to the center of each orbital phase bin that is output from slip. We linearly interpolate the model onto the observed phases so that the likelihood function (see Section 3.4.2) is valid for each observed phase ϕ.

3.3. Emulator Pipeline

Typical run times for our model in the sample space are around 2–4 hr (see Appendix C.1). This makes model evaluations intractable for producing probabilistic estimates of parameters. We accelerate the process with an emulator, which is a machine-learning method, in this case a neural network, that is trained to relate model parameters to the output of the model. Such emulators have been successfully applied to astrophysics problems (e.g., Type Ia supernovae; O'Brien et al. 2021). The emulator learns the relationship between model parameters and output by fitting to a parameter set and its associated model outputs (the training data). The user chooses the training data, and these constrain the region over which the emulator is valid.

The construction of an emulator thus requires training data but also a specified neural-network architecture. We trained the emulator on the model described in Section 3.1. The emulator training set is described in Appendix C.1 and the emulator architecture is described in Appendix C.2. The emulator increases the model evaluation speed by a factor of ∼108 over running slip for a typical model in the sample space in Table 3. The emulator representation of slip for the training set is accurate to within 5% of the slip model.

3.4. Parameter Inference

Two components are required for a rigorous parameter inference from our model: a prior distribution and a likelihood function. Obtaining the posterior distribution requires the selection of a prior probability distribution across the parameter space and a likelihood function to compare the data with the model. We describe the prior distributions in Section 3.4.1. Polarimetric models are produced by the emulator from samples drawn from the prior distribution. We calculate the likelihood of an individual prior input using a χ2 likelihood function, described in Section 3.4.2. We describe our exploration of the posterior via Monte Carlo sampling of the priors in Section 3.4.3.

3.4.1. Priors

We constructed the Bayesian prior of the parameters listed in Table 2 using information about the target object. We sampled the parameters Ifrac, τ, and zell uniformly over the entire sample space of the model, because we do not have existing constraints on these parameters that are narrower than the sample space. We chose a normal distribution for the prior of the inclination angle, i, with the standard deviation set as the half-range of previous inclination angle estimates for each system in Table 1, and the mean taken to be the center of the range. In this way, we incorporate existing information about the system, including photometric and spectroscopic measurements. Ifrac is allowed to vary over the entire emulator sample space.

For the transformation parameters described in Section 3.2, we sample qISP and uISP from Gaussian distributions with the peaks located at the estimated ISP from Fullard et al. (2020) for each target. We take σ values from the propagated uncertainty of qISP and uISP, using the uncertainties reported by Fullard et al. (2020). For WR 42, we refit the ISP following the methods in Fullard et al. (2020) without the intrinsic constant polarization component, because our model should produce intrinsic constant polarization by varying Ifrac, τ, and zell. The ISP values are presented in Table 4. The transformation parameter, θR , is taken as a uniform prior from −π/2 ≤ θR π/2 because existing estimates for the directly related orbital parameter, Ω, were produced with the biased analytic polarization model. Note that this specific range occurs because the polarization position angle is degenerate (repeating in π instead of 2π rotations in the qu plane).

Table 4. WR + O ISP Estimates

WR Number qISP (%) uISP (%)
42−0.085 ± 0.164−1.13 ± 0.179
79−0.284 ± 0.020−0.214 ± 0.025
127 0.618 ± 0.149 0.661 ± 0.140
153−0.182 ± 0.170 4.16 ± 0.018

Note. Calculated from Fullard et al. (2020), except for WR 42 (see text).

Download table as:  ASCIITypeset image

3.4.2. Likelihood

We formulated the likelihood function with a log-likelihood function

Equation (3)

where qem and uem are the predicted q and u Stokes vectors from the emulator using parameters t . qobs and uobs are the observed Stokes vectors, and σq and σu are the associated uncertainties of the observed Stokes vectors. The likelihood function is summed over ϕ, the orbital phase of the individual qobs and uobs data points. Stokes q and u are independent, hence we simply sum them as part of the logarithmic transformation of the likelihood function.

The transformation to the observational frame described in Section 3.2 replaces qem and uem in Equation (3) with transformed versions. The transformations are applied in the following order: interpolation to the observed phases, then rotation to the observed position angle, then addition of qISP and uISP to qem and uem.

3.4.3. Sampling

Following the methodology of O'Brien et al. (2021), we used the UltraNest 8 (Buchner 2021) nested sampling package to achieve robust sampling of the posterior distribution with the MLFriends algorithm (Buchner 2016, 2019). The posterior distribution was explored with 800 live points. It converged (based on the UltraNest measures) after ∼25,000 iterations and required ∼6 × 106 model evaluations for each target. The convergence of the sampling process typically took 2 hr.

4. Results and Discussion

We sampled the posterior distribution following the prescription described in Section 3.4.3, which produced a six-dimensional posterior distribution across the parameter space (one dimension for each parameter). Figure 2 shows the posterior distributions for the masses of the WR star components, derived from the inclination angles using ${M}_{\mathrm{WR}}{\sin }^{3}i$ measurements taken from Davis et al. (1981), Seggewiss (1974), de La Chevrotière et al. (2011), and Lamontagne et al. (1996). Existing mass estimates for the systems we investigated are based on spectral class (Shao & Li 2016; Vanbeveren et al. 2020), analysis of their polarimetric behavior (St-Louis et al. 1987, 1988), colliding-wind models (Hill et al. 2000), and photometric eclipse models (Lamontagne et al. 1996). Of these existing mass estimates, the spectral-class models do not provide uncertainties.

Figure 2.

Figure 2. WR 42, WR 79, WR 127, and WR 153 WR star mass posterior distributions, shown as solid curves. The gray shaded regions show the 90% credible interval of the distribution calculated using the highest-density interval algorithm. Vertical lines show previous mass estimates for the systems and are labeled with the method that was used to obtain the estimate. Spec. class: spectral-class mass estimates from Vanbeveren et al. (2020; WR 42, WR 79, WR 127) and Shao & Li (2016; WR 153). Arrows show lower limits. Phot: photometric mass estimates from Lamontagne et al. (1996). Pol: polarimetric estimates from St-Louis et al. (1987; WR 42, WR 79) and St-Louis et al. (1988; WR 127, WR 153). Wind model: mass estimates using a spectroscopic colliding-wind model from Hill et al. (2000).

Standard image High-resolution image

Figure 3 shows the posterior distributions of masses for the O star components. Table 5 shows the 90% credible interval(s) for the component masses of each object, and we discuss these results below. The credible intervals were calculated using the highest-density interval algorithm of the arviz Python package (Kumar et al. 2019). We display the full set of posterior distributions as corner plots in Appendix D.

Figure 3.

Figure 3. WR 42, WR 79, WR 127, and WR 153 O star mass posterior distributions, shown as solid curves. The gray shaded regions show the 90% credible interval of the distribution. Vertical lines show previous mass estimates for the systems and are labeled with the method that was used to obtain the estimate (see the caption to Figure 2).

Standard image High-resolution image

Table 5. Our New WR + O System Mass Estimates

WR Number MWR (M ) MO (M) $M{\sin }^{3}i$ Source
4216.2–44.027.4–74.61
79 (mode 1)8.3–12.822.5–34.82
79 (mode 2)15.3–62.841.8–170.92
79 (mode 3)89.5–152.2243.7–414.22
12711.1–102.519.8–183.13
15313.7–14.225.3–26.34

Note. Derived from our radiative-transfer models, including all modes for WR 79.

References. 1: Davis et al. (1981); 2: Seggewiss (1974); 3: de La Chevrotière et al. (2011); 4: Lamontagne et al. (1996).

Download table as:  ASCIITypeset image

4.1. WR 42

The WR mass 90% credible interval calculated from our method for WR 42 (16.2–44.0 M) agrees with existing mass estimates from all of the methods displayed in Figure 2, taking into account their uncertainties. For the previous polarimetric analysis (St-Louis et al. 1987), the estimate was 8–14 M; however, this did not fully take into account the bias in the analytical polarimetric model (Wolinski & Dolan 1994) and thus the true uncertainty is larger. The photometric method mass estimate (Lamontagne et al. 1996) did not provide WR mass uncertainties, though the reported inclination angle of 40fdg3 ± 2.9 produces a WR mass range of 11–16 M. It is important to note that the inclination angle uncertainty here is the "formal" uncertainty and the real values can be larger (see footnote 11 of Lamontagne et al. 1996). The wind model of Hill et al. (2000) produces a very uncertain mass estimate of 54 ± 43 M so we provide it for the sake of completeness. Our derived 90% credible interval for the O star mass agrees with the existing mass estimates from spectral-class models, encapsulating the Vanbeveren et al. (2020) estimated mass range of 27–37 M. The WR mass reported by Vanbeveren et al. (2020) is "adopted" as 16 M, based on the previous reported inclination angles for the system (and thus similarly uncertain as the polarimetric and photometric mass estimates).

It is interesting to note that the polarimetric measurement from St-Louis et al. (1987), produced using the same data, overlaps (within the St-Louis et al. 1987 polarimetric uncertainties) a small secondary peak in our posterior distribution. This may indicate a local minimum was found with the analytic model. If the uncertainty in $M{\sin }^{3}i$ reported by Davis et al. (1981) for WR 42 is used to calculate the mass range, this extends both the WR and O star mass 90% credible intervals to 25–48 M and 19–81 M, respectively.

Fullard et al. (2020) reported that the mean polarization signal of WR 42 had a large constant intrinsic component in both q and u. As noted in Section 3.4.1, we fit the ISP without the intrinsic polarization terms. As a result, the model predicts a strongly asymmetric WR wind that is more disk-like, or equivalently a symmetric WR wind with a strong density increase parallel to the orbital plane. This may explain the intrinsic polarization, and potentially represents the asymmetry produced by the colliding wind structures proposed by Hill et al. (2000).

4.2. WR 79

We find multiple peaks in the WR 79 mass posterior distribution. This creates a series of three modes with associated credible intervals, which we report from low to high mass in Table 5. The first credible interval (8.3–12.8 M) agrees with both the spectral-class-model mass lower limit of 10 M (Vanbeveren et al. 2020) and the photometric-model mass estimate of 9–13 M (Lamontagne et al. 1996) The second, more mathematically probable posterior peak and associated credible interval (15.3–62.8 M) matches the mass estimate from the colliding-wind model of Hill et al. (2000). However, we note that the wind-model mass estimate is very uncertain at 27 ± 17 M (Hill et al. 2000) and therefore it also overlaps with the first credible interval. The third credible interval (89.5–152.2 M) is very unlikely, and suggests masses well outside any previous model. None of the intervals agree with the previous polarimetric mass estimate that used the same data (St-Louis et al. 1987), confirming that the basic analytic model is unlikely to provide accurate results for binary systems with low inclination angles. Note that our model recovers an inclination angle greater than 90°, which is possible for polarimetry because the sign of Stokes U breaks the degeneracy present in spectroscopic orbital models. This inclination angle is equivalent to a reversed orbital motion compared to the model's motion in the inclination range 0° ≤ i ≤ 90°.

The physical implications of the multiple modes are of some interest. Scale renderings of the models are shown in Figure 4. The first inclination angle (mass) mode (top in Figure 4) corresponds to models with an extremely asymmetric WR wind, low electron scattering optical depth of τ < 0.1, and the majority of emission (∼95%) originating from the O star. The asymmetric wind can also be considered to be a strong polar density enhancement. However, such an enhancement is unlikely. Distortions to the WR wind caused by wind–wind collisions are observed to occur primarily in the plane of the orbit (e.g., Callingham et al. 2020) and models agree (e.g., Hill et al. 2000; Lamberts et al. 2017). Corotating interaction regions (see, e.g., Mullan 1984) are another potential source of wind asymmetries, but their effect on polarization is small unless they have an optical depth of order unity (Carlos-Leblanc et al. 2019), which would disagree with this first mass mode.

Figure 4.

Figure 4. (a) WR 79 representative model of the first posterior mode of mass 8.3–12.8 M at i = 144°. (b) WR 79 representative model of the second posterior mode of mass 15.3–62.8 M at i = 156°. Both figures are at orbital phase 0.5 and to scale. The triple lines on the scattering regions show the orbital planes, and the semitransparent fills represent their optical depths. The thick black lines are the projected O star orbits. The stars have relative emission equivalent to each model.

Standard image High-resolution image

The second mass mode (bottom in Figure 4) corresponds to models with a nearly spherical WR wind (with zell > xell, or equivalently a slight polar density enhancement), an electron scattering optical depth of τ ∼ 0.3, and the majority of emission (∼70%) originating from the WR star. The second WR mass mode at ∼27 M would further increase the similarities between WR 42 and WR 79, which are identical in WR type, similar in O star type, and with orbital periods only 1 day apart. The corresponding second mode O star mass of ∼60 M is compatible with the upper estimate for the O star from Vanbeveren et al. (2020). Another implication of the higher WR star mass is that the WR star would be more likely to have reached the WR stage without interacting with its companion, based on single-star evolutionary tracks (Sander et al. 2019).

4.3. WR 127

The posterior for the mass of the WR component of WR 127 (90% credible interval 11.1–102.5 M) is in agreement with Vanbeveren et al. (2020), since they provide only a lower limit. We also agree with both photometric and polarimetric estimates. However, our method provides an upper limit on the mass of 49.5 M. It also increases the mass of the lower limit to 11.1 M. The most probable mass is ∼14 M. The same is true for the O star, with its most probable mass near ∼25 M, at the upper end of the 17–24 M estimate given by Vanbeveren et al. (2020).

Our mass estimate also agrees with both the previous polarimetric (St-Louis et al. 1988) and photometric (Lamontagne et al. 1996) methods, which reported WR masses of 11–21 M and 12–16 M, respectively. Although these are lower uncertainty mass estimates than our presented values, the previous polarimetric estimate was published before the complete bias analysis of Wolinski & Dolan (1994) and thus the uncertainty is likely to be larger in reality. Furthermore, the photometric model assumes a spherical WR wind and formal uncertainties. This is unlikely for WR 127 given the evidence for a possible wind collision from X-rays (see Section 4.5 and Nazé et al. 2021).

4.4. WR 153

We find that WR 153 has a significantly lower mass (90% credible interval 13.7–14.2 M) than measured by photometric and spectral-class models (where we define "significant" as outside the 90% credible interval), with a low uncertainty compared to the other stars in the sample. The photometric-model inclination angle estimate is similarly certain, with a mass range of 14.5–14.7 M (Lamontagne et al. 1996), though the true uncertainty is likely to be larger. Similarly to Vanbeveren et al. (2020), Shao & Li (2016) simply present an adopted mass for the WR star of 15 M with no associated uncertainty. Our posterior distribution agrees with the analytic polarimetric estimate using the same data (St-Louis et al. 1988), which is unsurprising, given that the high inclination angle of the system reduces the bias of the analytic model. The O star mass is low for its spectral type, which is expected to be greater than 30 M (Vanbeveren et al. 1998). However, there is not currently a measure of its luminosity class, which makes this expectation very uncertain. Furthermore, measurement uncertainties for $M{\sin }^{3}i$ would broaden the range of masses, potentially including the mass estimate from Shao & Li (2016), Demers et al. (2002) produced an additional measurement of $M{\sin }^{3}i$, but it was only a lower limit.

4.5. Wind Asymmetry

Our findings indicate that all four systems we investigated have some form of asymmetry in their winds. It is unsurprising to find asymmetry in the winds of close massive stellar binaries because of the possibility of wind–wind interaction between the components. Hill et al. (2000) modeled the colliding winds in both WR 42 and WR 79, and found that optical line variations could be explained by the presence of conic wind collision regions, making it plausible that our detection of wind asymmetries in these systems is caused by wind–wind collision. Recent work by Nazé et al. (2021) has shown that WR 127 exhibits X-ray emission that indicates the presence of wind collision regions, though WR 42 was a nondetection and WR 153 showed only faint X-rays compared to the expectation for a wind collision. Thus, the possible wind asymmetry we find for WR 153 may not be caused by wind–wind collision. These results also cast some doubt on the presence of wind–wind collision regions in WR 42. Instead, the wind asymmetries could be caused by rotation of the WR star expanding the wind equatorially (Vink & Harries 2017) or optically thick corotating interaction regions producing areas of enhanced density in the wind (Carlos-Leblanc et al. 2019).

4.6. Other Model Parameters

Our posterior distribution reveals relationships between parameters. Across all models, we note relationships between τzell, Ifracτ, Ifraczell, and qISPuISP. In the case of τzell and Ifracτ, the two-dimensional posterior distributions show strong exponential-like relationships. The Ifraczell correlation is similar, but generally less strong. This is unsurprising, given that the probability of a photon scattering in a path length l is $1-\exp (-\tau )$, and τl. zell changes the path length and thus the required optical depth to reach a given polarization. Therefore, zell and τ are directly related. Then, a more asymmetric scattering region produces more polarization for a given Ifrac, which produces the Ifraczell relationship; the relationship between τzell follows from Ifraczell and τzell. Finally, the relationship qISPuISP occurs because the rotation of the model by θR transforms the polarization directly between q and u.

As part of the modeling process, we also recover ISP and Ω posterior distributions. Our ISP posteriors are strongly influenced by the priors, though all the systems show some deviation from the existing estimates in Fullard et al. (2020). We present the maximum-likelihood estimates in Table 6. The uncertainties for the distributions are displayed in Appendix D as part of the corner plots, split into upper and lower 34% quantiles.

Table 6. WR + O System ISP Maximum-likelihood Estimates from Our Analysis

WR Number qISP (%) uISP (%)
42−0.52−0.49
79−0.31−0.24
1270.690.68
153−0.044.15

Note. Uncertainties are displayed in Appendix D as part of the corner plots.

Download table as:  ASCIITypeset image

Figure 5 shows the posterior distribution of Ω for each system. The gray shaded regions show the 90% credible interval of the distributions. Our posterior distributions for Ω are in complete disagreement with the previous estimates presented in Table 1 for all but WR 127 (i.e., the previous estimates lie outside the plotted range of Ω in Figure 5). However, this is not surprising given that the sources of these estimates did not provide uncertainty estimates for Ω that take into account the bias on the analytic model that was used to derive Ω (Wolinski & Dolan 1994).

Figure 5.

Figure 5. Posterior distributions of the longitude of the ascending node, Ω, for WR 42, WR 79, WR 127, and WR 153, shown as solid lines. The shaded area in each panel shows the 90% credible interval.

Standard image High-resolution image

The final parameters we recover are Ifrac and τ, the fraction of photons arising from the WR star and the optical depth of the scattering region, respectively. Our method confirms that these parameters are related to each other and to zell. This means that the posterior distributions are difficult to interpret as standalone results. For example, in the case of WR 79, the Ifrac distribution peaks at zell ∼ 1.6 and τ ∼ 0.2. There are also peaks in the posteriors that correspond to values of τ ∼ 0, zell ∼ 3.0, and Ifrac ∼ 0. Although these two possible models are peaks in the posterior distributions, there is a range of possible models between these peaks because of the relationship between the parameters. If we disregard these relationships and focus on the median τ for each object, we typically find low optical depths, indicating that the broadband polarization arises in the outer regions of the WR star wind.

5. Conclusions

We have found probabilistic estimates for the mass of four WR + O binary systems, with a new method based on an emulated radiative-transfer model. Our results are different to previous estimates for WR 153, and we provide robust upper limits in all cases, as well as information about degeneracies in the model. Our disagreement with existing polarimetric estimates for these systems shows that care must be taken when drawing conclusions using masses derived from analytic polarimetric models.

We have shown that our method can break the orbital inclination degeneracy in $M{\sin }^{3}i$, but this still requires spectroscopic determination. Thus, we recommend improvements to spectroscopic measurements of $M{\sin }^{3}i$ to obtain more certain mass estimates for the four systems presented in this paper, and others like them. Phased polarimetric measurements are also a requirement for this method, and thus more accurate polarimetric observations are necessary to apply this method to more targets. We are obtaining accurate phased polarimetric observations of a larger sample of Milky Way WR + O systems (Fullard et al. 2018; Johnson et al. 2019), and will apply this method to them in the future. In principle, the model can also be applied to the 30% of Be stars that have companions (Oudmaijer & Parr 2010), because their disks produce polarization signals. Further improvements to our model could be made with the incorporation of elliptical orbits, to investigate systems such as WR 133 that already have polarimetric observations available.

With the advent of telescopes like the Polstar UV spectropolarimetric satellite (Scowen et al. 2021), it will be critical to have polarization models that can be explored with statistical approaches to understand the observations. UV observations of WR + O binary systems can provide new constraints for the priors presented here.

We thank Selma de Mink for her insightful comments about the content of this manuscript.

We gratefully thank Johannes Buchner for UltraNest assistance.

TACC is operated by the Extreme Science and Discovery Environment (XSEDE), which we access under allocation AST-120067 to J.H.

J.H. is grateful for funding from the National Science Foundation under award AST-1816944, and acknowledges that the University of Denver occupies land within the traditional territories of the Arapaho, Cheyenne, and Ute peoples.

R.I. acknowledges funding support for this research from a grant with the National Science Foundation, AST-2009412.

The data for the models in this article are available in a Zenodo repository at https://doi.org/10.5281/zenodo.6226123.

Software: arviz, jupyter (Thomas et al., 2016), lmfit, tensorflow, scikit-learn, blender (Blender Online Community 2021), corner (Foreman-Mackey 2016), ultranest.

Contributor roles. Conceptualization: Andrew Fullard, Wolfgang Kerzendorf. Data curation: Andrew Fullard. Formal Analysis: Andrew Fullard, Patrick van der Smagt. Investigation: Andrew Fullard, Patrick van der Smagt. Methodology: Andrew Fullard. Project administration: Wolfgang Kerzendorf. Resources: Institute for Cyber-Enabled Research at Michigan State University, Texas Advanced Computing Center. Software: Andrew Fullard, Jennifer Hoffman, Manisha Shrestha, Patrick van der Smagt, John O'Brien. Supervision: Wolfgang Kerzendorf. Visualization: Andrew Fullard. Writing original draft: Andrew Fullard. Writing review & editing: Andrew Fullard, John O'Brien, Patrick van der Smagt, Jennifer Hoffman, Richard Ignace, Wolfgang Kerzendorf, Manisha Shrestha.

Appendix A: SLIP Monte Carlo Radiative-transfer Code

slip is a Fortran + MPI code based on the MCRT method outlined in Whitney (2011).The simulation grid is a uniformly spaced spherical polar coordinate system in r, θ, and ϕ. Photon packets are emitted from user-specified locations in the grid and propagate through the user-defined scattering region. At each grid cell the optical depth is integrated until the photon scatters or exits the grid cell. The photons are collected as they exit the simulation limits and are binned into different observational directions in θ and ϕ, with uncertainties calculated via Poisson statistics. In this way, a single model can be viewed from multiple angles, reducing computation time. For additional details of the slip code, see Huk (2017) and Shrestha et al. (2018).

The original slip code included a single photon source at the center of the simulation grid. We used an enhanced version of the code that includes an additional photon source located at an arbitrary distance along the x-axis (θ = 90°, ϕ = 0°) from the central source (each source can be a finite size or a point). The number of photons emitted from each source is controlled as a fraction of the total photon count. We simulate circular orbits by simply moving the "observer" around the grid in the ϕ direction. In Appendix B, we provide validation of these new capabilities against the existing analytic model of Brown et al. (1978), while being aware of the inherent limitations described in Appendix B.

Appendix B: Validating the SLIP Binary Model

To test the validity of the enhanced slip code, we computed models with three circumstellar geometries that can be fit with the analytic model of Brown et al. (1978); that is, a spherical distribution of electron scattering region, a circular disk-like distribution in the orbital plane with an opening angle of 1fdg8 (a single θ angle bin width in height), and a prolate distribution. The prolate distribution is described by an ellipsoid with the major axis parallel to the z-axis, described by

Equation (B1)

where x, y, and z refer to the standard Cartesian coordinate system that slip transforms to its spherical coordinate system. The denominators are the values of xell, yell, and zell, the x, y, and z axes of the ellipsoid.

The finite nature of the slip grid means that the disk-like distribution is a single grid cell thick in the θ direction. All three geometries were centered on the source located at the origin. Both stellar sources emitted an equal proportion of the photons, and both were set as point sources so eclipse effects are not important and photons are not absorbed within either star. We set the optical depth of all these CSM distributions at θ = 90° to τ = 0.1 for 0 < ϕ < 2π to match the optically thin assumption of the analytic model (Carlos-Leblanc et al. 2019 found that this was the limit between optically thin and optically thick for their similar MCRT models). Note that the CSM density is set to be constant and proportional to the optical depth at θ = 90°, so the prolate model shows an increase in optical depth toward θ = 0° and 180° because of its increased radius toward those angles. Model calculations were made on the Stampede high-performance computing cluster at the Texas Advanced Computing Center, 9 with 8 × 109 photons per simulation. The output was binned to 23 inclination bins and 40 orbital phase bins.

We fit the analytic model to the numerical results using lmfit (Newville et al. 2014) to minimize χ2, and algebraically calculated the orbital parameters i and Ω from the Fourier coefficients following Drissen et al. (1986). The Fourier coefficients qi and ui are found by fitting the following equations:

Equation (B2)

Equation (B3)

where ϕ is the azimuthal coordinate and ϕ/2π is the equivalent orbital phase. The fits had median ${\chi }_{\nu }^{2}$ values of 3.30, 2.85, and 0.90 for the disk-like, spherical, and prolate density distributions, respectively (using ${\chi }_{\nu }^{2}={\chi }^{2}/(N-10)$, where N is the number of data points and 10 is the number of fit variables). The ${\chi }_{\nu }^{2}$ value of 0.9 suggests some overfitting or overestimate of uncertainties in the case of the disk-like density distribution.

The comparisons between the numerical and analytic models are shown in Figures 6, 7, and 8 for the disk-like, spherical, and prolate density distributions, respectively. In all Figures 68, the color of the points corresponds to the inclination angle of the slip model. The uncertainty of the polarization increases as the inclination angle approaches zero (i.e., perpendicular to the orbital plane) because the amplitude of polarization is directly proportional to the orbital inclination angle (Brown et al. 1978), and a lower amplitude increases the Monte Carlo noise in the simulation because fewer photons have a polarization signal. In Figure 6, the significantly higher uncertainty of the slip model arises due to the extremely thin disk-like scattering region. This reduces the polarization from the model by an order of magnitude relative to the spherical and prolate density distributions, and thus the uncertainty is proportionally larger.

Figure 6.

Figure 6. Comparison of slip results for a thin disk to the analytic model of Brown et al. (1978). The circles represent slip results for a range of inclination angles from 0° to 90° as displayed in the color bar. The data have been fit by the model of Brown et al. (1978; gray lines).

Standard image High-resolution image
Figure 7.

Figure 7. Comparison of slip results for a sphere to the analytic model of Brown et al. (1978). The circles represent slip results for a range of inclination angles from 0° to 90° as displayed in the color bar. The data have been fit by the model of Brown et al. (1978; gray line).

Standard image High-resolution image
Figure 8.

Figure 8. Comparison of slip results for a prolate ellipsoid (extended in the direction perpendicular to the orbital plane) to the analytic model of Brown et al. (1978). The circles represent slip results for a range of inclination angles from 0° to 90° as displayed in the color bar. The data have been fit by the model of Brown et al. (1978; gray line).

Standard image High-resolution image

The fit results for a selection of inclination angles are presented in Figure 9 for the thin disk, sphere, and prolate CSM models. As described by Wolinski & Dolan (1994), the analytic model requires extremely high precision measurements relative to the polarization variation amplitude to recover accurate derived parameters as the inclination angle decreases from 90° to 0°. Our results agree with those of Wolinski & Dolan (1994): despite the high accuracy of the numerical models, the truly low inclination angle numerical models are fitted with a biased higher inclination angle by the analytic model. The disk model is most strongly affected because it has the lowest amplitude of polarization variation of the models.

Figure 9.

Figure 9. Comparison of slip inclination for a thin disk (a), a sphere (b), and a prolate ellipsoid (c) distribution of scattering material to the analytic model of Brown et al. (1978). The points show the true slip model inclination vs. the calculated Brown et al. model inclination. The black lines show unity.

Standard image High-resolution image
Figure 10.

Figure 10. Corner plot of WR 42 showing posterior distributions for each parameter and their relationships.

Standard image High-resolution image
Figure 11.

Figure 11. Corner plot of WR 79 showing posterior distributions for each parameter and their relationships.

Standard image High-resolution image
Figure 12.

Figure 12. Corner plot of WR 127 showing posterior distributions for each parameter and their relationships.

Standard image High-resolution image
Figure 13.

Figure 13. Corner plot of WR 153 showing posterior distributions for each parameter and their relationships.

Standard image High-resolution image

Appendix C: Emulator Details

Here we present the details of our emulator, including its training set and architecture.

C.1. Training Set

The data set used to train the neural network was produced using 4517 runs of the modified slip MCRT code using uniform random sampling of the parameter space described in Table 3.

To create the training set, we used the MSU high-performance computing cluster provided by the Institute of Cyber-Enabled Research. Each run took up to 4 hr across eight CPUs with 2 × 108 photons per CPU. This photon count was chosen to produce uncertainties in q and u below ∼0.05% with a median uncertainty of ∼0.03%. The resulting training set including all inclination angle bins has size 406,530 × 40 because of the uniform sampling of i in each model and because each model has 40 total ϕ (phase) bins.

C.2. Emulator Architecture

The emulator is based on the proven DALEK emulator methods that have previously been applied to the tardis radiative-transfer code (Kerzendorf & Sim 2014; Kerzendorf et al. 2021). It was built using Tensorflow (Martín et al. 2015) and scikit-learn (Pedregosa et al. 2011) tools to provide the neural network and data preprocessing, respectively.

The neural-network architecture was investigated using a hyperparameter search of 2000 possible architectures on a cluster with 54 GPUs running Polyaxon, on-premise at the Volkswagen Group Machine Learning Research Lab. The resulting best architecture was chosen based on the total loss (here, using the mean-squared error) and the number of steps required to train. The neural network consists of five hidden layers of width 30, with softplus activation (where $\mathrm{softplus}(x)=\mathrm{log}[\exp x+1]$) and Glorot normal initialization (Glorot & Bengio 2010). No dropouts or layer normalization was deemed necessary by the hyperparameter search.

The data and model input parameters were scaled using the StandardScaler function of scikit-learn. The data were randomly split into 90%/10% train/validation sets. Optimal neural-network architectures were those that had a minimal error on the validation set. The training was performed on an nVidia RTX 2080Ti graphics card with 12 GB of VRAM, and took approximately 19 hr to reach 20,000 epochs. Of these 20,000 steps, the training result with the lowest total validation set loss was chosen to represent the best result.

Two emulators were produced from the training set, one for Stokes q and one for Stokes u. Validation of the emulator against random samples of the training set showed that the two emulators returned consistent model parameters independently of each other.

Appendix D: Corner Plots

The corner plots, in Figures 1013, show the six-dimensional posterior distributions of each parameter with two-dimensional posterior distributions for each pair of parameters. These distributions are available at Zenodo at https://doi.org/10.5281/zenodo.6226123.

Footnotes

Please wait… references are loading.
10.3847/1538-4357/ac589e