The following article is Open access

A Guide to Realistic Uncertainties on the Fundamental Properties of Solar-type Exoplanet Host Stars

Published 2022 March 3 © 2022. The Author(s). Published by the American Astronomical Society.
Citation: Jamie Tayar et al 2022 ApJ 927 31, DOI 10.3847/1538-4357/ac4bbc

0004-637X/927/1/31

Abstract

Our understanding of the properties and demographics of exoplanets critically relies on our ability to determine the fundamental properties of their host stars. The advent of Gaia and large spectroscopic surveys has now made it possible, in principle, to infer the properties of individual stars, including most exoplanet hosts, to very high precision. However, we show that, in practice, such analyses are limited by uncertainties in both the fundamental scale and our models of stellar evolution, even for stars similar to the Sun. For example, we show that current uncertainties on measured interferometric angular diameters and bolometric fluxes set a systematic uncertainty floor of ≈2.4% in temperature, ≈2.0% in luminosity, and ≈4.2% in radius. Comparisons between widely available model grids suggest uncertainties of order ≈5% in mass and ≈20% in age for main-sequence and subgiant stars. While the radius uncertainties are roughly constant over this range of stars, the model-dependent uncertainties are a complex function of luminosity, temperature, and metallicity. We provide open-source software for approximating these uncertainties for individual targets and discuss strategies for reducing these uncertainties in the future.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Answering questions about the formation, evolution, composition, and habitability of exoplanets requires large samples of precise measurements of planetary properties such as mass, radius, and age. Since most of these properties are measured relative to their host stars, such work also requires detailed stellar characterization. For example, the irradiation-dependent planet radius gap between super-Earth- and sub-Neptune-sized planets was only recently quantified because the feature is narrow enough to be visible only in samples with careful spectroscopic (Fulton et al. 2017), asteroseismic (Van Eylen et al. 2018), or astrometric (Berger et al. 2018; Hardegree-Ullman et al. 2020) characterization. Even so, the dominant mechanism causing this gap is still debated, with core-powered mass loss (Ginzburg et al. 2018) and photoevaporation (Owen & Wu 2017) both making predictions consistent with the observations. This debate can be quelled only by more precise stellar properties.

More generally, the availability of high-resolution spectroscopy and high-precision Gaia parallaxes has recently allowed claims of extremely precise exoplanet host star properties, with uncertainties on mass and radius approaching and sometimes reaching below 1%. While such small uncertainties would be a boon to exoplanet demographic studies, they raise questions about whether the fundamental systematic uncertainties have been fully considered. In general, neither stellar mass nor radius can be measured directly, which suggests that there is likely to be a floor in how precisely a star's mass and radius can be estimated.

Stellar radii, for example, are often inferred from a combination of parallaxes and either photometric or spectroscopic estimates of temperature and metallicity. Such estimates rely on bolometric corrections, reddening maps, and stellar atmosphere models, all of which have been shown to be uncertain (González Hernández & Bonifacio 2009; Torres et al. 2012; Casagrande et al. 2021). In addition, uncertainty in the fundamental temperature scale computed from stars with measured angular diameters adds further complexity.

Stellar mass estimates are often even less direct, with stellar models being used to infer mass from the inferred luminosity, effective temperature, and composition. Numerous model grids are publicly available to allow this (e.g., MESA Isochrones and Stellar Tracks, MIST, Choi et al. 2016; PARSEC, Bressan et al. 2012; Dartmouth Stellar Evolution Program, DSEP, Dotter et al. 2008; Yonsei-Yale, Spada et al. 2013; etc.), and a variety of tools have been developed to simplify the inference, such as isochrones (Morton 2015), isoclassify (Huber 2017; Berger et al. 2020), PARAM (da Silva et al. 2006; Rodrigues et al. 2014), exofast (Eastman et al. 2013, 2019), and kiauhoku (Claytor et al. 2020).

Previous work has shown that the choice of stellar modeling code has at most a very small effect on model predictions, as long as the exact same physics is used (Silva Aguirre et al. 2020). However, the generation of these grids of stellar models requires many physical choices to be made. Convection, for example, is an inherently three-dimensional process that must be parameterized into one dimension, and different model grids have made different choices of how to do this (Tayar et al. 2017). Similarly, choices about the atmosphere boundary condition (Choi et al. 2018b), composition (van Saders & Pinsonneault 2012; Capelo & Lopes 2020), rotation (van Saders & Pinsonneault 2013), opacities (Valle et al. 2013), overshoot (Pedersen et al. 2018), and so on can impact the models.

Since much of this physics is uncertain at a significant level, modelers often make perfectly reasonable but slightly different physical choices that result in offsets between different grids of stellar models. In the past, such offsets were substantially less important than the observational uncertainties. However, with the ability to calculate extremely precise luminosities using Gaia Data Release 2 (Lindegren et al. 2018) combined with the availability of high-resolution, high signal-to-noise spectra for estimating temperatures and metallicities, the observational uncertainties on estimated stellar parameters can now be comparable to or smaller than the systematic uncertainties coming from model physics or the fundamental temperature scale. Incorporating such systematic host star uncertainties has only been done heterogeneously in the literature, with some authors adding empirical or theoretical errors and others, particularly in the context of discovering and characterizing a single planetary system, leaving that discussion out entirely. We argue that realistic uncertainties on exoplanet host stars are critical to derive reliable uncertainties on the fundamental properties of exoplanets, especially when samples from different studies are combined (for example, through the NASA Exoplanet Archive; Akeson et al. 2013) to study exoplanet demographics or when interpreting observations of exoplanet atmospheres. Underestimated error bars or systematic shifts between systems can produce spurious trends between planets in different parts of parameter space, which are often characterized by different groups, or, more often, can obscure trends, gaps, and structures that would otherwise be visible in a holistic study of planetary systems.

In this paper, we aim to provide a guide to realistic uncertainties on the fundamental properties of exoplanet host stars by investigating sources of systematic errors on both observable quantities (such as temperature, radius, and luminosity) and those that are estimated from evolutionary models (such as mass and age). We note that there are additional complexities in low-mass (Kraus et al. 2011), massive (Holgado et al. 2020), and pre-main-sequence (Somers & Stassun 2017) stars that make them more challenging to characterize; thus, their error floor is likely higher. We therefore focus here only on solar-type stars (FGK main-sequence, subgiant, and lower giant branch stars), which in theory should be easier to work with but in practice still have significant uncertainty.

2. Uncertainties from Input Observables

2.1. Bolometric Fluxes and Luminosities

The most readily available observations for a given exoplanet host star are broadband photometry in optical and near-infrared wavelengths such as Tycho-2 (Høg et al. 2000), the Two Micron All Sky Survey (2MASS; Skrutskie et al. 2006), and Gaia (Evans et al. 2018) and a high-precision parallax from Gaia (Lindegren et al. 2018). The former is typically used to estimate the bolometric flux received on Earth (fbol) either by approximating the spectral energy distribution (SED) by integrating fluxes from broadband photometry in combination with model atmospheres (SED fitting; e.g., van Belle & von Braun 2009; Huber et al. 2012; Mann et al. 2013; Stassun & Torres 2016; Stevens et al. 2017) or by applying bolometric corrections derived from model atmospheres (e.g., Alonso et al. 2004; Torres 2010; Casagrande & VandenBerg 2018). The latter requires an estimate of Teff, which is typically obtained through calibrated color–Teff relations (Casagrande et al. 2011, 2021), while SED fitting typically simultaneously solves for Teff and fbol or uses external Teff estimates from high-resolution spectroscopy. The combination of fbol with the distance d estimated from the parallax (e.g., Bailer-Jones 2015) then allows the computation of the luminosity:

$$L = 4\pi d^{2} f_{\mathrm{bol}} \qquad (1)$$

Typical fractional distance uncertainties for exoplanet host stars in the Gaia era are ≪1% and therefore negligible. The error floor for fbol is set by the accuracy of photometric zero-points, which are typically known to 1%–2% for ground-based photometry (Mann et al. 2015; Casagrande & VandenBerg 2018; Maíz Apellániz & Weiler 2018) based on comparisons with space-based spectrophotometry from the Hubble Space Telescope Space Telescope Imaging Spectrograph (HST/STIS; Bohlin et al. 2014). Ground-based photometry obtained from different surveys can also exhibit substantial offsets at the ≳0.01 mag level, as shown, for example, in the various photometric surveys that have targeted the Kepler field (e.g., Greiss et al. 2012; Pinsonneault et al. 2012). Therefore, care should be taken when combining literature photometry without consideration of the accuracy of input photometry. Another source of uncertainty is the differences in predicted fluxes between stellar model atmospheres, which can reach up to ≈5% (e.g., Appendix A in Zinn et al. 2019).
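As a minimal sketch of how the luminosity and its fractional uncertainty follow from fbol and d, assuming the standard relation L = 4πd²fbol and independent errors on the two inputs, one could write the following. The function names, the cgs unit choices, and the example error values are illustrative assumptions, not part of the published software.

```python
import math

PC_CM = 3.0856775814913673e18   # one parsec in cm
LSUN_CGS = 3.828e33             # IAU nominal solar luminosity in erg/s

def luminosity_lsun(fbol, dist_pc):
    """Luminosity in solar units from fbol (erg/s/cm^2) and distance (pc)."""
    d_cm = dist_pc * PC_CM
    return 4.0 * math.pi * d_cm**2 * fbol / LSUN_CGS

def frac_lum_err(frac_fbol_err, frac_dist_err):
    """Fractional error on L, assuming independent errors (L scales as d^2 * fbol)."""
    return math.hypot(frac_fbol_err, 2.0 * frac_dist_err)

# Round trip: the flux a 1 Lsun star would deliver at 10 pc should give back 1 Lsun.
fbol_sun_10pc = LSUN_CGS / (4.0 * math.pi * (10.0 * PC_CM) ** 2)
lum_check = luminosity_lsun(fbol_sun_10pc, 10.0)

# Assumed example: 2.4% error on fbol and 0.2% error on distance.
lum_err = frac_lum_err(0.024, 0.002)
print(lum_check, lum_err)
```

With a distance error well below 1%, the luminosity error is essentially the fbol error, which is why the distance term is treated as negligible above.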

Bolometric flux measurements require corrections for interstellar reddening, which can be measured if supplementary information on Teff, $\log g$, and [Fe/H] is available from spectroscopy and/or asteroseismology to constrain the shape of the SED (Rodrigues et al. 2014; Huber et al. 2016). If only broadband photometry and parallaxes are available, reddening and Teff are degenerate, and unphysical extinction values may result from compensating for differences between models and observations that are not actually related to extinction. This can be partially avoided through the use of infrared photometry and three-dimensional extinction maps (e.g., Greene et al. 2016), although the latter may suffer from significant systematic errors, particularly for nearby stars (Godoy-Rivera et al. 2021). In summary, typical uncertainties due to reddening corrections are at the ≈0.02 mag level, comparable to photometric zero-point offsets (Huber et al. 2016).
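To see why a ≈0.02 mag reddening uncertainty is comparable to a 1%–2% zero-point error, it helps to convert magnitude offsets to fractional flux changes using the standard Pogson relation. The short sketch below does this; the function name is an illustrative choice.

```python
def mag_to_frac_flux(delta_mag):
    """Fractional flux change corresponding to a magnitude offset delta_mag."""
    return 10.0 ** (0.4 * delta_mag) - 1.0

# Offsets quoted in the text: survey-to-survey (0.01 mag) and reddening (0.02 mag).
for dm in (0.01, 0.02):
    print(f"{dm:.2f} mag -> {100.0 * mag_to_frac_flux(dm):.2f}% in flux")
```

An offset of 0.02 mag corresponds to a flux change of a little under 2%, and 0.01 mag to a little under 1%.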

Spots, active regions, and other processes can also cause variation at the level of a few percent in the stellar brightness in a particular band over time. While the total bolometric flux is only weakly sensitive to these effects from their impact on the stellar structural variables (Somers & Pinsonneault 2015), estimates of the bolometric flux from the brightness of a single band at a single time or multiple bands at different times could add additional scatter to the estimated flux.

Figure 1 illustrates the combined effects of these uncertainties by comparing fbol estimates from SED fits for exoplanet host stars from Stassun et al. (2017), which are commonly adopted as default stellar properties in the NASA Exoplanet Archive, to independent fbol measurements using SED fitting (Figure 1(a); McDonald et al. 2017), the infrared flux method (Figure 1(b); Casagrande et al. 2011), and interpolating K-band bolometric corrections from the MIST grid (Choi et al. 2016) as implemented in isoclassify (Figure 1(c); Huber et al. 2016; Berger et al. 2020). Note that fbol in Figure 1(c) was derived using the same Teff and extinction values and thus solely reflects systematic differences between photometric zero-points and model atmospheres. The typical scatter ranges between 2% and 4%, with median offsets of 0.9% ± 0.2%, 1.8% ± 0.2%, and 0.5% ± 0.2% and maximum offsets across 100 K temperature bins of 3.3% ± 1.1%, 3.8% ± 1.5%, and 4.0% ± 0.9%. Taking the average of these optimistic and conservative systematic error estimates, we conclude that contributions of photometric zero-points, model atmosphere grids, and reddening set a fundamental floor of 2.4% ± 0.6% on bolometric fluxes (and thus luminosities) for a given exoplanet host star.
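One simple reading of "taking the average" is a plain mean of the three median offsets (the optimistic estimates) and the three maximum binned offsets (the conservative estimates). The sketch below shows that this reading reproduces the quoted 2.4% floor. The exact averaging procedure, and how the ±0.6% uncertainty was obtained, are assumptions not spelled out in the text above.

```python
# Values quoted in the text, in percent.
median_offsets = [0.9, 1.8, 0.5]   # optimistic: median offsets per comparison
max_offsets = [3.3, 3.8, 4.0]      # conservative: maximum offsets across 100 K bins

all_offsets = median_offsets + max_offsets
fbol_floor = sum(all_offsets) / len(all_offsets)

print(f"mean systematic offset: {fbol_floor:.2f}%")
```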

Figure 1. Bolometric fluxes for exoplanet host stars derived using SED fitting from Stassun et al. (2017) compared to an alternative SED-fitting methodology (panel (a); McDonald et al. 2017), the infrared flux method (panel (b); Casagrande et al. 2011), and K-band bolometric corrections from the MIST grid (Choi et al. 2016) as implemented in isoclassify (panel (c); Huber et al. 2016). Gray points show individual stars, and red circles show binned averages in steps of 100 K. The typical random scatter and systematic offsets between methods as a function of Teff are between 2% and 4%, setting a fundamental limit on the uncertainty of Gaia-derived luminosities.

2.2. Angular Diameters and Effective Temperatures

The effective temperature of a star is defined through its bolometric flux and angular diameter θ:

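$$T_{\mathrm{eff}} = \left(\frac{4 f_{\mathrm{bol}}}{\sigma \theta^{2}}\right)^{1/4} \qquad (2)$$

where σ is the Stefan–Boltzmann constant.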

A fundamental Teff measurement thus requires measurements of both fbol and the angular diameter. Temperatures from high-resolution spectroscopy, color–Teff relations, or SED fitting have to be calibrated using stars with measured angular diameters. The internal consistency of measured angular diameters then sets a fundamental limit as to how well Teff (and, in combination with luminosity, radius) can be determined.

The most successful method to resolve the small angular sizes of stars is optical long-baseline interferometry using facilities such as the Center for High Angular Resolution (CHARA) Array (ten Brummelaar et al. 2005), the Navy Precision Optical Interferometer (NPOI; Armstrong et al. 2013), and the Very Large Telescope Interferometer (VLTI). A few hundred stars have measured diameters with uncertainties of a few percent (von Braun & Boyajian 2017). The accuracy of the angular diameter measurements relies on calibration to account for the temporal and spatial reduction of fringe visibilities due to atmospheric turbulence, with major sources of systematic errors including the estimates of calibrator diameters, underresolving target stars, and uncertainties in the adopted wavelength scale (e.g., van Belle & van Belle 2005). Recent diameter measurements using different instruments have shown significant systematic offsets (Karovicova et al. 2018; White et al. 2018), which are important because many indirect methods are calibrated on a small number of stars with published angular diameters (Casagrande et al. 2014, 2021). An example of this "calibration pyramid" is the infrared color–Teff relation by González Hernández & Bonifacio (2009), which is used to calibrate temperatures for hundreds of thousands of giants in large spectroscopic surveys such as APOGEE (Majewski et al. 2017) but hinges on a handful of measured angular diameters with unknown systematic errors.

Figure 2(a) compares angular diameter measurements from the CHARA array for stars with multiple published results in the literature (Berger et al. 2006; Baines et al. 2008, 2009, 2010; Boyajian et al. 2008, 2012a, 2012b, 2013; Akeson et al. 2009; Bazot et al. 2011; von Braun et al. 2011, 2014; Creevey et al. 2012, 2015; Ligi et al. 2012, 2016; Maestro et al. 2013; White et al. 2013, 2018; Challouf et al. 2014; Howard et al. 2014; Johnson et al. 2014; Kane et al. 2015; Karovicova et al. 2018). The comparison is grouped by the beam combiners used to obtain the measurements: CLASSIC (ten Brummelaar et al. 2005), MIRC (Monnier et al. 2006), VEGA (Mourard et al. 2006), and PAVO (Ireland et al. 2008). We observe individual systematic differences of up to ≈10%, which correlate with the instrument combination used for the measurement. Figure 2(b) compares a larger sample of measurements from the JMMC Measured Stellar Diameters Catalogue (Duvert 2016), showing a similar result and a trend of increasing systematic errors with decreasing angular size. The latter implies that the dominant source of systematic error is underresolving target stars, which, for a given angular size, is more severe for infrared than optical beam combiners, since infrared wavelengths yield a lower angular resolution at a given baseline. While recent comparisons from CHARA and VLTI have shown more promising agreement (Rains et al. 2020; O. Creevey et al. 2022, in preparation), it is clear that a larger number of diameter measurements of the same stars with different instruments are needed to pin down sources of systematic error in literature measurements, and that Teff calibrations require careful sample selection (Casagrande et al. 2014). We note that the vast majority of main-sequence stars have angular sizes <1 mas and thus are most affected by the biases described above.

Figure 2. Panel (a): ratio of interferometric angular diameters from the CHARA array for stars with multiple published measurements as a function of diameter. Colors and symbols compare different beam combiners used to obtain the measurements (see text for details). Only measurements with formal uncertainties <5% are shown. Panel (b): same as panel (a) but for angular diameters listed in the JMMC Measured Stellar Diameters Catalogue (Duvert 2016).

The systematic differences in Figure 2 set a fundamental limit on the accuracy of effective temperature scales and thus stellar radii derived from Gaia parallaxes. To illustrate this, the dark gray band in Figure 2 shows the required accuracy to reach a 1% calibration in Teff, which is smaller than the systematic differences in the measurements. The median absolute systematic offset over all instrument combinations in Figure 2(a) is 4% ± 1%, which corresponds to a systematic error of Teff of 2.0% ± 0.5%. The latter is a factor of ≈2–4 higher than typical Teff uncertainties quoted for recently discovered exoplanet host stars, particularly those discovered by TESS, in the literature (currently, about two-thirds of the planets in the NASA Exoplanet Archive have host star temperatures quoted to better than 2%).
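The step from a 4% offset in angular diameter to a 2% error in Teff follows from the definition Teff = (4 fbol / (σ θ²))^(1/4), in which Teff scales as θ^(−1/2) and fbol^(1/4). A minimal sketch of that propagation, assuming independent errors and cgs units, is given below. The function names are illustrative, and the solar values are used only as a sanity check.

```python
import math

SIGMA_SB = 5.670374419e-5                          # Stefan-Boltzmann constant, erg cm^-2 s^-1 K^-4
MAS_TO_RAD = math.pi / (180.0 * 3600.0 * 1000.0)   # milliarcseconds to radians

def teff_from_fbol_theta(fbol, theta_mas):
    """Effective temperature (K) from fbol (erg/s/cm^2) and angular diameter (mas)."""
    theta_rad = theta_mas * MAS_TO_RAD
    return (4.0 * fbol / (SIGMA_SB * theta_rad**2)) ** 0.25

def frac_teff_err(frac_fbol_err, frac_theta_err):
    """Fractional Teff error, assuming independent errors on fbol and theta."""
    return math.hypot(0.25 * frac_fbol_err, 0.5 * frac_theta_err)

# Sanity check with the Sun seen from 1 au (solar constant and angular diameter).
SOLAR_FLUX = 1.361e6                                           # erg/s/cm^2
theta_sun_mas = 2.0 * 6.957e10 / 1.495978707e13 / MAS_TO_RAD   # 2 Rsun / 1 au
teff_sun = teff_from_fbol_theta(SOLAR_FLUX, theta_sun_mas)

# A 4% angular diameter offset alone maps to a 2% error in Teff.
teff_floor = frac_teff_err(0.0, 0.04)
print(teff_sun, teff_floor)
```

Because the diameter enters with an exponent of 1/2 and the flux with 1/4, the angular diameter term dominates the Teff error budget for the values discussed here.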

The current sample of interferometric angular diameters thus sets a fundamental limit of 2.0% ± 0.5% on the effective temperature scale for solar-type stars (≈110 K at solar Teff), which does not include systematic uncertainties in common observational measurement techniques (e.g., Spina et al. 2020). We note in particular that the discussion above also does not include any added uncertainty from the inclusion of starspots, which change the temperature of individual regions of the photosphere. While an effective temperature is still formally defined in such cases (e.g., Somers & Pinsonneault 2015), observational estimates of the photospheric temperature can become wavelength-dependent and offset from that value by over 100 K (Gully-Santiago et al. 2017; Flores et al. 2020). In pre-main-sequence and other stars where the spot filling factor is significant (50%), spots can alter the radius, surface temperature, core temperature, and therefore nuclear reaction rates and evolutionary timescales (Somers et al. 2020). For solar-like stars with more modest filling factors (a few percent or less), the structural impacts of the spots are almost negligible, although they can still make the measurement of the temperature and luminosity more challenging. We do not discuss them further here, but observers should be aware of this additional complexity in estimating accurate effective temperatures for active stars and consider whether the use of multiband or multiepoch photometry or color tables, tracks, and isochrones that include the effects of spots or magnetic fields are more appropriate for their stars (e.g., Feiden & Chaboyer 2013; Somers et al. 2020).

2.3. Summary Recommendation

The comparisons discussed in this section demonstrate that the typical limits for measurements of bolometric fluxes and angular diameters, set by uncertainties in photometric zero-points, model atmospheres, extinction, and interferometric calibration, are currently 2.4% ± 0.6% and 4% ± 1%, respectively. This directly sets lower limits on the uncertainty of the "observed" fundamental properties of stellar luminosity (2.4% ± 0.6%) and effective temperature (2.0% ± 0.5%) and thus stellar radius (4.2% ± 0.9%). We recommend that these uncertainties be added in quadrature to the formal uncertainties for exoplanet host stars to account for methodology-specific differences, unless the uncertainties have already been estimated from multiple independent methods, the properties have been directly measured from space-based spectrophotometry or long-baseline interferometry, or the measurements are of solar twins and have been obtained differentially with respect to the Sun. We note that spectroscopic abundances, which are also used as input observables for evolutionary models and not explicitly discussed here, show similar method-specific differences that are typically larger than formal uncertainties (e.g., Torres et al. 2012; Hinkel et al. 2016).
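The radius floor can be recovered from the luminosity and temperature floors through the Stefan–Boltzmann scaling R ∝ L^(1/2) Teff^(−2), and the recommendation amounts to a quadrature sum with the formal error. The sketch below illustrates both steps. It treats the luminosity and temperature errors as independent, which is a simplifying assumption, and the function names and the 1% formal radius error are hypothetical.

```python
import math

# Systematic floors quoted in the text, as fractions.
LUM_FLOOR = 0.024
TEFF_FLOOR = 0.020

def radius_floor(frac_lum_err, frac_teff_err):
    """Fractional radius error for R ~ L^(1/2) * Teff^(-2), independent errors assumed."""
    return math.hypot(0.5 * frac_lum_err, 2.0 * frac_teff_err)

def add_floor(formal_frac_err, floor_frac_err):
    """Combine a formal fractional error with a systematic floor in quadrature."""
    return math.hypot(formal_frac_err, floor_frac_err)

r_floor = radius_floor(LUM_FLOOR, TEFF_FLOOR)

# Hypothetical star with a quoted 1% formal radius uncertainty.
r_total = add_floor(0.01, r_floor)
print(f"radius floor: {100 * r_floor:.1f}%, total: {100 * r_total:.1f}%")
```

For a star with a small formal error, the total is dominated by the floor, so quoting the formal value alone would understate the radius uncertainty by a large factor.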

3. Uncertainties from Model Grids

Estimates of stellar masses are often even less fundamental. In many cases, a combination of luminosity, temperature, and metallicity is compared to a grid of stellar models, and the best match is used to read off the likely mass and age of the star in question. However, we contend that the answer returned depends on the physical assumptions used to construct the stellar evolution model.

3.1. Model Physics

In order to estimate the theoretical uncertainties on estimates of stellar masses and ages as a function of luminosity, effective temperature, and composition, we compare the predictions of several widely used model grids. For each grid, we use models between 0.6 and 2.0 M⊙ at intervals of 0.1 M⊙. We also run the analysis at [Fe/H] = −1.0, −0.5, 0.0, and +0.5 as defined by each model. We list the different model grids used in this work and summarize the physics in Table 1. We recognize that there are many other grids of models available, but we believe that the grids considered are a representative sample of the choices of model physics and calibration commonly used for the characterization of solar-type stars. We note that this comparison excludes pre-main-sequence stars and M dwarfs, as both of these regimes have results that can be even more dependent on the assumed model physics, and the models can be significantly discrepant with the observed stellar properties.

Table 1. Summary of the Input Physics Used in Each Model

| Parameter | YREC | MIST | DSEP | GARSTEC |
| --- | --- | --- | --- | --- |
| Reference | This work | Choi et al. (2016) | Dotter et al. (2008) | Serenelli et al. (2013) |
| Atmosphere | Gray | Kurucz (1993) | PHOENIX (Hauschildt et al. 1999a, 1999b) | Gray |
| Convective overshoot | Step: 0.16Hp | Diffusive: 0.0160 (core) and 0.0174 (env) | Step: 0.2Hp | Diffusive: 0.02 |
| Diffusion | Yes | Main sequence only | Modified | Yes |
| Equation of state | OPAL+SCVH | OPAL+SCVH+MacDonald+HELM+PC | Ideal gas with Chaboyer & Kim (1995)+Irwin (2004) | Irwin (2004) |
| High-temperature opacities | OPAL | OPAL | OPAL | OPAL |
| Low-temperature opacities | Ferguson et al. (2005) | Ferguson et al. (2005) | Ferguson et al. (2005) | Ferguson et al. (2005) |
| Mixing length | Tayar et al. (2017) | 1.82 | 1.938 | 1.811 |
| Mixture and solar Z/X | Grevesse & Sauval (1998) | Asplund et al. (2009) protosolar | Grevesse & Sauval (1998) | Grevesse & Sauval (1998) |
| Nuclear reaction rates | Adelberger et al. (2011) | Cyburt et al. (2010) | Adelberger et al. (1998)+Imbriani et al. (2004)+Kunz et al. (2002)+Angulo et al. (1999) | Adelberger et al. (1998)+Angulo et al. (1999) |
| Rotation | Tayar & Pinsonneault (2018) | None | None | None |
| Weak screening | Salpeter (1954) | Alastuey & Jancovici (1978) | Salpeter (1954)+Graboske et al. (1973) | Salpeter (1954) |
| Solar X | 0.709452 | 0.7154 | 0.7071 | 0.7090 |
| Solar Y | 0.2725693 | 0.2703 | 0.27402 | 0.2716 |
| Solar Z | 0.0179492 | 0.0142 | 0.01885 | 0.0193 |
| ΔY/ΔZ | 1.3426 | 1.5 | 1.5327 | 1.194 |
| Surface (Z/X) | 0.0253 | 0.0173 | 0.0229 | 0.0245 |

3.1.1. Yale Rotating Evolution Code Models

The Yale Rotating Evolution Code (YREC; Pinsonneault et al. 1989) grid is the only grid used in this work that is presented here for the first time; it is an expansion of the grid presented in Tayar & Pinsonneault (2018). These models are unique in that they are not calibrated to the Sun but rather designed to replicate the observed properties of stars on the red giant branch as a function of metallicity (Tayar et al. 2017). These models use a Grevesse & Sauval (1998) chemical mixture with a helium enrichment of $\Delta Y/\Delta Z = 1.3426$ (Tayar et al. 2017). They have a convective step overshoot of 0.16Hp, calibrated on the luminosity of the secondary red clump (Tayar & Pinsonneault 2018), and rotational evolution that includes angular momentum loss according to the Pinsonneault, Matt, and Macgregor wind loss law (van Saders & Pinsonneault 2013), as described in Tayar & Pinsonneault (2018) for stars below the Kraft break, but they do not include rotational mixing. The models do include diffusion (Bahcall & Loeb 1990) as implemented for Somers & Pinsonneault (2016), use a Kurucz (1997) atmosphere boundary condition, and rely on OPAL (Iglesias & Rogers 1996) high-temperature opacities and Ferguson et al. (2005) low-temperature opacities. They also use the SCVH (Saumon et al. 1995) and OPAL (Rogers & Nayfonov 2002) equations of state.

3.1.2. MESA Isochrones and Stellar Tracks Models

The MIST (Choi et al. 2016) models are used here in the form of models rather than isochrones. Their creation and properties are described extensively in Choi et al. (2016). In brief, they are generated using the MESA stellar evolution code (Paxton et al. 2011, 2013, 2015, 2018, 2019) and are available online. The solar calibration for these models was chosen to minimize the combined offsets from solar values in log L, log R, surface composition, the base of the surface convection zone, and the sound speed at 4.57 Gyr. The composition is based on Asplund et al. (2009), and diffusion of helium and heavy elements is included for main-sequence stars following the Thoul et al. (1994) formalism, balanced by radiation turbulence (Morel & Thévenin 2002). Exponential overshoot is included for both convective cores and convective envelopes. The nonrotating version of the model grid was used for this analysis, chosen because most of the low-mass stars studied here are expected to be relatively slowly rotating, but a rotating grid is also available, and work is ongoing to update these grids, including the addition of more sophisticated rotation physics (Gossage et al. 2021).

3.1.3. Dartmouth Stellar Evolution Program

The DSEP models are used here as presented in Dotter et al. (2008) and available online. In these models, atomic diffusion and gravitational settling are included but are inhibited in the outer 0.01 M⊙, as described in Chaboyer et al. (2001). The boundary condition for these models is based on a grid of PHOENIX model atmospheres (Hauschildt et al. 1999a, 1999b) matched at the point where T = Teff. While models are available for a range of [α/Fe] values, we use only [α/Fe] = 0 models in this work.

3.1.4. Garching Stellar Evolution Code Models

The grid of models made using the Garching Stellar Evolution Code (GARSTEC; Weiss & Schlattl 2008) used here was first presented in Serenelli et al. (2013). The mixing length and reference composition were chosen to match the parameters of a solar model at the solar age, including the effects of diffusion. Convective overshooting is modeled diffusively following the prescriptions of Freytag et al. (1996), although limits are placed to prevent extensive overshooting in small convective cores (Magic et al. 2010). Updated versions of this grid have been used for Bayesian estimates of stellar parameters including asteroseismic inputs (e.g., Serenelli et al. 2017), but we use them here with only classical parameters.

3.2. Interpolation

We use the Python package kiauhoku (Claytor et al. 2020) to estimate the mass and age of stars given their metallicity, luminosity, and effective temperature for each grid of models. The packaged model grids for use with kiauhoku are available on Zenodo under an open-source Creative Commons Attribution license (Claytor & Tayar 2020). We interpolate to find the best-fit mass and age at 100 effective temperatures between 4000 and 8000 K, 100 luminosities between log(L/L⊙) = −1 and 1.5, and four different metallicities ([Fe/H] = −1.0, −0.5, 0.0, and 0.5).
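The set of fit points described above can be sketched as a simple Cartesian product. The snippet assumes even spacing in Teff and in log luminosity, which the text does not state explicitly, and the helper name is illustrative.

```python
def linspace(start, stop, num):
    """Return num evenly spaced values from start to stop, inclusive."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

teffs = linspace(4000.0, 8000.0, 100)    # effective temperatures in K
log_lums = linspace(-1.0, 1.5, 100)      # log10 of luminosity in solar units
fehs = [-1.0, -0.5, 0.0, 0.5]            # metallicities [Fe/H]

# Every (Teff, log L, [Fe/H]) combination that would be fit in each model grid.
fit_points = [(t, l, f) for f in fehs for l in log_lums for t in teffs]
print(len(fit_points))
```

This gives 40,000 fits per model grid, which is the scale that motivates avoiding a full MCMC run at every point.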

The kiauhoku package works by resampling evolution tracks to equivalent evolutionary phases (EEPs; Dotter 2016) and then interpolating stellar parameters given initial metallicity, initial mass, and EEP (see Claytor et al. 2020, for more details). While kiauhoku does have a built-in Markov Chain Monte Carlo (MCMC) method to estimate the parameters, a full MCMC run is extremely expensive given the number of grid points we wish to fit and unnecessary for our purposes. Instead, we use the StarGridInterpolator.gridsearch_fit method to fit the models to our temperature, luminosity, and metallicity grid points. For each grid, we search models with initial masses between 0.6 and 2.0 M⊙ and EEPs between the beginning of the main sequence (EEP 202) and the red giant branch bump (EEP 606). The gridsearch_fit algorithm iterates through a down-sampled model grid, using each starting point as an initial guess for a Nelder–Mead (Nelder & Mead 1965) optimizer until a user-specified loss threshold is reached. We used mean squared error loss to fit logarithmic values of Teff/K and L/L⊙, with a fit tolerance of $10^{-6}$. To reduce the dimensionality of the search space, we assume surface metallicity does not change appreciably over the duration of an evolution track, allowing us to use the track's initial metallicity as the de facto "current" surface metallicity. This is a very good approximation for most stars until the red giant branch, when the first dredge-up mixes core helium with envelope hydrogen, reducing the surface ratio of metals to hydrogen. The effects of this approximation are errors of a few percent in both fit mass and age, but since the model offsets on the red giant branch are already large, we ignore them. We also note that the overall shape of the offsets discussed in future sections is not particularly sensitive to the exact details of the fitting process.
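To make the loss function and the down-sampled search concrete, the sketch below implements a mean squared error in log Teff and log L and a coarse scan over initial mass and EEP. It does not call kiauhoku and it omits the Nelder–Mead refinement step. The `toy_track` function is a hypothetical stand-in for an interpolated model grid with made-up coefficients, not real stellar physics, and the grid spacing is an arbitrary choice.

```python
import math

def toy_track(mass, eep):
    """Hypothetical (log Teff, log L) for a given mass and EEP; coefficients are invented."""
    log_teff = math.log10(5772.0) + 0.6 * math.log10(mass) - 0.0002 * (eep - 202)
    log_lum = 4.0 * math.log10(mass) + 0.002 * (eep - 202)
    return log_teff, log_lum

def mse_loss(params, target_teff, target_lum):
    """Mean squared error between model and target in log10(Teff/K) and log10(L/Lsun)."""
    mass, eep = params
    log_teff, log_lum = toy_track(mass, eep)
    d_teff = log_teff - math.log10(target_teff)
    d_lum = log_lum - math.log10(target_lum)
    return 0.5 * (d_teff**2 + d_lum**2)

def coarse_seed(target_teff, target_lum, eep_step=4):
    """Scan a down-sampled (mass, EEP) grid and return the lowest-loss starting point."""
    masses = [round(0.6 + 0.1 * i, 1) for i in range(15)]   # 0.6 to 2.0 Msun
    eeps = range(202, 607, eep_step)                        # main sequence to RGB bump
    candidates = [(m, e) for m in masses for e in eeps]
    return min(candidates, key=lambda p: mse_loss(p, target_teff, target_lum))

# Build a target from a known toy star, then check that the scan recovers it.
true_log_teff, true_log_lum = toy_track(1.0, 350)
target_teff, target_lum = 10.0**true_log_teff, 10.0**true_log_lum
seed = coarse_seed(target_teff, target_lum)
print(seed, mse_loss(seed, target_teff, target_lum))
```

In a full fit, each seed would be handed to a local optimizer and the search would stop once the loss falls below the chosen tolerance; the coarse scan only ensures the optimizer starts near the right track.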

3.3. Current Offsets

As shown in Figure 3, the different models make different predictions for the evolution of solar-metallicity stars as a function of stellar mass. Zooming in on the solar-mass models, it is evident that different calibration choices for some models lead to predictions that do not match the IAU values for the Sun (5772 K) at the solar age (4.57 Gyr). This does not necessarily imply that something is wrong with the generation of the model grid. There are well-documented incompatibilities between recent spectroscopic solar abundance estimates (Asplund et al. 2009) and helioseismic results (e.g., Turck-Chièze et al. 2004; Buldgen et al. 2019), so modelers must either use an older abundance scale or accept inconsistencies with solar parameters. In other cases, the physics of the models may have been optimized for stars in other parts of the H-R diagram, or a slightly different solar temperature or age might have been used in the calibration.

Figure 3. Different models make different predictions. Top: predicted evolutionary tracks for solar-metallicity stars of different masses for four different model grids. Bottom: predictions for a solar-mass, solar-metallicity star. For each grid, the age of the Sun (4.57 Gyr; Bahcall et al. 1995) is highlighted for comparison to the IAU solar values (5772 K; Prša et al. 2016), represented by the solar symbol.

We show in Figure 4 that the physics choices lead to slight offsets in the locations of the model tracks not only in the solar case but across the H-R diagram. Models that use step overshoot versus exponential overshoot, for example, can have blue hooks with slightly different locations and shapes, which causes offsets in the masses inferred in that region. Similarly, the exact location of the giant branch can be changed quite a bit by a number of physical assumptions (e.g., Tayar et al. 2017; Choi et al. 2018b), causing significant offsets in evolved stars. Looking at these plots, it is also clear that there is no single grid that is uniquely offset from the rest. Each pair of model grids has regions where the two are more similar or more different, and it is the ensemble that likely indicates the true uncertainty in stellar evolution at this time. In the future, we hope that improvements to the models and their calibration can reduce these discrepancies (see Section 3.4.2), but such work is still to be done.

Figure 4. Maximal fractional offset in mass between model grids for stars at solar metallicity (top left: maximal difference; top right: MIST-YREC differences; bottom left: DSEP-YREC differences; bottom right: GARSTEC-YREC differences) as a function of temperature and luminosity. The offsets are largest on the giant branch and in the blue hook, where overshoot choices are very important.

For this reason, we argue that the maximal difference between grids of models should be considered an additional systematic error source for most applications. For example, many users want to estimate a stellar mass and age for a star given its measured properties, such as an inferred luminosity from Gaia and an estimated temperature and metallicity from spectroscopy or photometry. We show in Figure 5 that the maximal differences between models are a complicated function of luminosity, temperature, and metallicity. This means that even in the case of perfect measurements without uncertainties, the resulting estimated mass will depend on the model grid chosen. Specifically, we plot the maximum difference between any two grids of models at each point of interpolation and show that while offsets on the main sequence and subgiant branches are usually ≈5% in mass, systematic differences between models can be greater than 10%, particularly near the base of the giant branch. We also note that the estimated masses disagree more significantly in regions where the choice of overshoot is relevant. Around the blue hook, for example, the doubling back of the tracks, as well as the choices of the amount and type of overshoot, can impact the inferred mass by up to 7%, and for stars around 1.1 M⊙, where a small convective core has developed, similarly large offsets can exist.
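
As a rough illustration of the quantity plotted, the maximum difference between any two grids at each interpolation point is simply the largest minus the smallest value across grids at that point. The sketch below uses placeholder grid names and made-up masses (not values from this work), and it assumes the offset is normalized by the mean across grids, a choice the text does not specify.

```python
import numpy as np

# Placeholder masses (solar units) at three (Teff, L) points for four
# hypothetical grids; these numbers are illustrative, not model output.
grid_masses = {
    "grid_a": np.array([1.00, 1.20, 1.50]),
    "grid_b": np.array([1.02, 1.25, 1.40]),
    "grid_c": np.array([0.98, 1.22, 1.60]),
    "grid_d": np.array([1.00, 1.21, 1.50]),
}


def maximal_fractional_offset(values_by_grid):
    """Largest pairwise difference across grids at each point, over the mean.

    The largest pairwise difference equals max minus min across grids.
    Dividing by the mean is an assumed normalization.
    """
    stacked = np.stack(list(values_by_grid.values()))  # shape (n_grids, n_points)
    spread = stacked.max(axis=0) - stacked.min(axis=0)
    return spread / stacked.mean(axis=0)


offsets = maximal_fractional_offset(grid_masses)
for i, frac in enumerate(offsets):
    print(f"point {i}: maximal fractional offset = {frac:.3f}")
```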

Figure 5. Maximal fractional offset in mass between model grids for stars at different metallicities (top left: +0.5; top right: solar; bottom left: −0.5; bottom right: −1.0) as a function of temperature and luminosity. The offsets are largest (darker colors) for stars near the giant branch and ≈5% for most dwarfs and subgiants.

While the absolute errors in age follow a similar pattern, we show in Figure 6 that this is not the case for the fractional age uncertainties. When calculating ages, the difference between a zero-age main-sequence star in a model grid that tends to be hotter and a main-sequence turnoff star in a model grid that tends to be cooler is very small in mass but almost 100% in age. This means that the dominant age uncertainty for some stars may be the choice of model grid, and this uncertainty should not be ignored, particularly near the zero-age main sequence.

Figure 6. Maximal fractional offset in age between model grids for stars at different metallicities (top left: +0.5; top right: solar; bottom left: −0.5; bottom right: −1.0) as a function of temperature and luminosity. The offsets are largest (brighter colors) for stars near the zero-age main sequence and closer to ≈10% for most subgiant stars.

3.4. Mitigating Offsets

3.4.1. kiauhoku Tools

The discrepancies between stellar model grids demand an accounting of systematic uncertainties from grid to grid. To aid future efforts in this, we provide a Jupyter Notebook interface to kiauhoku. This notebook allows a user to input observables for an individual star and compute model mass and age offsets in the four currently implemented model grids without downloading anything. The notebook can be executed and edited on a remote computer through the use of services like Google Colab by importing the notebook using the GitHub URL or clicking the Colab badge in the notebook itself. While this makes computing systematic uncertainties for individual stars more convenient, for larger data sets, we recommend installing kiauhoku and using the Python interface.

3.4.2. Improving Stellar Models

While our previously described tool provides a method for accounting for uncertainties on stellar parameters due to uncertainties in stellar models, this is obviously not the optimal solution. Removing all uncertainties on stellar physics also seems unlikely to occur in the near future, especially since three-dimensional simulations of stellar interiors over cosmic time are still computationally unfeasible. However, we suggest that in the near future, it will be possible to at least calibrate the models that we do have for many interesting regions of parameter space. Asteroseismology, in particular, is allowing the estimation of the masses of tens of thousands of stars, which, when combined with spectroscopic characterization, can be used to check the calibration of stellar models (e.g., Tayar et al. 2017). Work in open clusters allows the checking of models as a function of age (e.g., Choi et al. 2018a; Sandquist et al. 2020). Finally, the careful characterization of double-lined eclipsing binaries (Claret & Torres 2019), or in some cases even single-lined eclipsing binaries (Stevens et al. 2018), can constrain mass and age simultaneously. While the accuracy and precision of all of these measurements are still being refined (see, e.g., Gaulme et al. 2016), they represent an exciting opportunity to substantially improve our ability to estimate the masses and ages of stars to high precision. Combining these samples to select a set of ∼100 of the best-characterized stars at a range of metallicities, temperatures, and luminosities could provide a set of benchmarks for modelers to validate new stellar evolution grids, analogous to unit tests in computer science, the Gaia benchmark stars in spectroscopy (Heiter et al. 2015; Jofré et al. 2018), or the way in which 16 Cyg has functioned for solar-like asteroseismologists.

4. Worked Examples

4.1. π Mensae

The host of the first planet discovered by TESS (Gandolfi et al. 2018; Huang et al. 2018) was π Mensae. Both discovery papers adopt effective temperatures with uncertainties of <1%, which individually differ by ≈3σ. They also adopt stellar radii and masses with uncertainties of 1%–2% and 3%–4%, respectively.

To estimate the expected systematic errors from models only (ignoring differences in effective temperatures), we adopt the derived properties from Huang et al. (2018) without observational uncertainties: Teff = 6037 K, L = 1.444 L⊙, and [Fe/H] = 0.08. Applying these values to the method described in Section 3 yields grid masses of 1.090, 1.099, 1.110, and 1.123 M⊙ for YREC, MIST, DSEP, and GARSTEC, respectively. All of these are consistent with the recently derived asteroseismic scaling relation mass from TESS 20 s cadence observations in Sectors 27 and 28 (1.145 ± 0.08 M⊙; D. Huber et al. 2022, in preparation), but their spread is comparable to the range established by the observational uncertainties quoted by Huang et al. (2018; 1.094 ± 0.039 M⊙), suggesting that in this case, the systematic errors are comparable to the random observational uncertainties. We also note that the range of model ages (3.50, 2.28, 2.33, and 1.73 Gyr for YREC, MIST, DSEP, and GARSTEC, respectively) spans a range similar to the observational uncertainties, quoted as ${2.98}_{-1.3}^{+1.4}$ Gyr. Thus, the systematic uncertainties should not be neglected in this regime and should be added in quadrature to the observational uncertainties.
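
The quadrature sum recommended here can be checked with a few lines of arithmetic on the masses quoted above. The sketch below assumes the full range across the four grids is used as the systematic term; a half-range or standard deviation would be an equally defensible convention and would give a smaller number.

```python
import math

# Grid masses quoted in the text for pi Mensae (solar masses).
grid_masses = {"YREC": 1.090, "MIST": 1.099, "DSEP": 1.110, "GARSTEC": 1.123}
obs_sigma = 0.039  # observational uncertainty quoted from Huang et al. (2018)

# Assumption: the full range across grids serves as the systematic term.
sys_sigma = max(grid_masses.values()) - min(grid_masses.values())

# Independent error terms add in quadrature.
total_sigma = math.hypot(obs_sigma, sys_sigma)

print(f"systematic: {sys_sigma:.3f} Msun, total: {total_sigma:.3f} Msun")
```

Under this convention the systematic term is about 0.033 M⊙ and the combined uncertainty about 0.051 M⊙, roughly 30% larger than the observational uncertainty alone.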

The astute reader will notice that the mass quoted for the DSEP grid here is not exactly the same as the mass quoted in Huang et al. (2018), which also used the DSEP grid. We expect that this difference is likely the result of exactly how the fit was done: what parameters were used, how the search proceeded, how the search function penalized deviations from the input parameters, etc. In particular, model searches of precisely characterized stars that overdetermine the stellar parameters can often produce results that require extra care to interpret, even though they could naively be expected to result in better fits.

4.2. TOI-197

The first example of an oscillating planet-hosting star with TESS (Huber et al. 2019) was TOI-197. The host star is near the end of the subgiant phase and has a well-constrained temperature (5080 ± 90 K), metallicity ([Fe/H] = −0.08 ± 0.08), and luminosity (5.15 ± 0.17 L⊙). Again, assuming the central values for each of these parameters and no observational uncertainties, the different model grids would have inferred masses of 1.252, 1.210, 1.137, and 1.194 M⊙ for YREC, MIST, DSEP, and GARSTEC, respectively, a spread that is comparable to the observational uncertainties on the quoted asteroseismic mass (1.212 ± 0.074 M⊙).

This star illustrates the challenges of estimating masses from stellar models as stars approach the red giant branch. Since the models are not substantially separated in temperature, precise mass estimates are impossible without exceptionally good observations or additional information. Given the full range of observational uncertainties, the MIST grid of models would have allowed for masses between 1.04 and 1.33 M⊙. This, therefore, is still a regime where the observational uncertainties for stars without asteroseismology dominate over the systematic differences between model grids. However, it must be noted that the 10% systematic uncertainty between model grids is not entirely negligible even in this challenging observational regime. The range of ages in this regime is also significant, with models giving ages of 4.9, 4.9, 6.6, and 5.6 Gyr for YREC, MIST, DSEP, and GARSTEC, respectively. This is an ≈30% systematic spread, which needs to be added in quadrature to the uncertainties arising from the errors on the measurements in order to accurately characterize the full uncertainty in our estimates of stellar ages.
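
The percentages quoted for TOI-197 can be reproduced from the grid values given above. The sketch below assumes the spread is defined as the full range across grids divided by their mean; the text does not state the normalization, so this is one plausible reading rather than the definitive one.

```python
from statistics import mean

# Grid results quoted in the text for TOI-197.
masses = {"YREC": 1.252, "MIST": 1.210, "DSEP": 1.137, "GARSTEC": 1.194}  # Msun
ages = {"YREC": 4.9, "MIST": 4.9, "DSEP": 6.6, "GARSTEC": 5.6}            # Gyr


def fractional_spread(values):
    """Full range across grids divided by the mean (assumed normalization)."""
    vals = list(values.values())
    return (max(vals) - min(vals)) / mean(vals)


mass_spread = fractional_spread(masses)
age_spread = fractional_spread(ages)

# Half-width of the MIST mass window allowed by the observational errors.
obs_half_width = (1.33 - 1.04) / 2

print(f"mass spread between grids: {100 * mass_spread:.1f}%")
print(f"age spread between grids:  {100 * age_spread:.1f}%")
print(f"observational half-width:  {obs_half_width:.3f} Msun")
```

With this definition the mass spread comes out just under 10% and the age spread close to 31%, in line with the figures in the text, while the observational window of ±0.145 M⊙ remains wider than the 0.115 M⊙ range between grids.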

5. Conclusions

The recent advent of high-precision astrometry from Gaia and large spectroscopic surveys has enabled precise measurements of single field star properties such as radius, mass, and age, which, in principle, allow exciting new explorations into the demographics, compositions, and atmospheres of the planets that orbit these stars. However, we have demonstrated here that caution is required when quoting very precise fundamental properties for stars and exoplanets, as systematic uncertainties can dominate the error budget for stellar properties. For example, the uncertainty on the fundamental temperature scale from interferometry, in combination with the uncertainties on flux scales, extinctions, and bolometric corrections, in most cases limits temperature estimates to ≈2%, luminosities to ≈2%, and radii to ≈4%, a factor of 2–4 higher than the typically quoted uncertainties in the recent literature.

Estimating stellar masses from stellar models also has significant uncertainty, as different grids of stellar models will disagree on the inferred mass and age at the few percent level, due to uncertainties on the physics of the stellar interior. We have shown that these offsets between models are luminosity-, temperature-, and metallicity-dependent and commonly on the order of ≈5% in mass and ≈20% in age, although they can be substantially larger. This uncertainty from the model choices can be as large as or larger than the uncertainties from observations in some cases, and as such, it should be considered in analyses. Most properly, this should be done by perturbing all of the uncertain physics and comparing to external checks to properly calibrate the models in the regions of interest. In practice, the precision and volume of data necessary to undertake such a study are only now becoming available. In the interim, we recommend using the range of results returned from various available model grids as a measure of the systematic uncertainty of a star's mass and age, and we have provided open-source software to estimate these values given the stellar parameters so that they can be added in quadrature to the observational uncertainties.

We note that the uncertainty estimates we provide here are a guideline for typical stars, and that it may be possible to do better in carefully studied individual cases or in stars with additional constraints. Future improvements in stellar model physics, as well as a larger number of dedicated fundamental measurements through interferometry and space-based spectrophotometry, will be required to reduce systematic errors in host star (and thus exoplanet) properties to the level of precision that current observational data sets enable. However, such careful work has the potential to change what we can discover about stars and their planets and is thus a worthwhile effort.

We thank Aaron Dotter and Aldo Serenelli for their assistance in using their grids and Gail Schaefer for help with compiling angular diameter measurements from the literature. We also acknowledge helpful discussions with Andrew Mann, Marc Pinsonneault, Jim Davenport, Ellie Abrahams, Gail Schaefer, and the online.tess.science eclipsing binary working group. J.T. acknowledges that support for this work was provided by NASA through NASA Hubble Fellowship grant No. 51424 awarded by the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., for NASA, under contract NAS5-26555 and by NASA award 80NSSC20K0056. The authors thank the Kavli Institute for Theoretical Physics for hosting the Exostar19 meeting and acknowledge that this research was supported in part by the National Science Foundation under grant No. NSF PHY-1748958. Z.R.C. acknowledges support from the TESS Guest Investigator Program (80NSSC18K18584) and the Heising Simons Foundation. D.H. acknowledges support from the Alfred P. Sloan Foundation, the National Aeronautics and Space Administration (80NSSC19K0597), and the National Science Foundation (AST-1717000). This research has made use of the NASA Exoplanet Archive, which is operated by the California Institute of Technology, under contract with the National Aeronautics and Space Administration under the Exoplanet Exploration Program.

Facility: Exoplanet Archive (Akeson et al. 2013).

Software: kiauhoku (Claytor et al. 2020), NumPy (Harris et al. 2020), Astropy (Astropy Collaboration et al. 2013, 2018), Matplotlib (Hunter 2007), SciPy (Virtanen et al. 2020).
