Uncertainty Quantification in Sunspot Counts

Published 2019 November 13. © 2019 The American Astronomical Society. All rights reserved.
Citation: Sophie Mathieu et al. 2019 ApJ 886 7. DOI: 10.3847/1538-4357/ab4990


Abstract

Observing and counting sunspots constitutes one of the longest-running scientific experiments, with first observations dating back to Galileo (around 1610). Today the sunspot number (SN) time series acts as a benchmark of solar activity in a large range of physical models. An appropriate statistical modeling, adapted to the time series' complex nature, is, however, still lacking. In this work, we provide the first comprehensive uncertainty quantification analysis of sunspot counts. We study three components: the number of sunspots (Ns), the number of sunspot groups (Ng), and the composite Nc, defined as ${N}_{c}:= {N}_{s}+10{N}_{g}$. Those are reported by a network of observatories around the world and are corrupted by errors of various types. We use a multiplicative framework to provide, for these three components, an estimation of their error distribution in various regimes (short-term, long-term, minima of solar activity). We also propose a robust estimator for the underlying solar signal and fit density distributions that take into account intrinsic characteristics such as overdispersion, excess of zeros, and multiple modes. The estimation of the solar signal underlying the composite Nc may be seen as a robust version of the International Sunspot Number (ISN), widely used as a proxy of solar activity. Therefore, our results on Nc may help characterize the uncertainty on ISN as well. Our results pave the way for a future monitoring of the observatories in quasi-real time, with the aim of alerting the observers when they start deviating from the network and preventing large drifts from occurring.


1. Introduction

1.1. The International Sunspot Number (ISN)

On white-light images, sunspots are visible as dark areas. They correspond to regions of locally enhanced magnetic field and act as an indicator of changing solar activity over time. They have been observed and counted since the invention of the telescope at the beginning of the seventeenth century. As such, the counting of sunspots constitutes one of the "longest-running scientific experiment[s]" (Owens 2013). In 1848, J. R. Wolf from Zürich Observatory created an index, denoted Nc, of solar activity by summing up the total number of sunspots Ns with 10 times the total number of sunspot groups Ng on a daily basis:

Equation (1): ${N}_{c}(t)={N}_{s}(t)+10\,{N}_{g}(t)$

Figure 1 displays smoothed averages of the median value of these three quantities across a set of 21 observatories (also called "stations") chosen for the present study (see Section 2). Modeling the statistics of Nc is far from trivial, as this quantity jumps from 0 to 11 when a sunspot appears on the Sun (Ns = 1, Ng = 1, and thus Nc = 11). By construction, each active region thus appears twice in Nc. The multiplication factor in Equation (1) was introduced by J. R. Wolf to put the number of groups on the same scale as the number of spots. Indeed, during this historical period, a group contained on average 10 spots (Izenman 1985). Note that in recent solar cycles the average number of spots per group is closer to six.

Figure 1. Time evolution during 1947–2013 of the median values across 21 observing stations (see Table 1) for the sunspot counts: (top) Ns, (center) Ng, and (bottom) Nc. The data are averaged over 81 days (black dotted line), 1 yr (red dashed line), and 2.5 yr (blue solid line).


The index Nc, or rather the formula behind it, forms the basis of the ISN. The ISN is distributed through the World Data Center Sunspots Index and Long-term Solar Observations (WDC-SILSO). The Nc values from each observing station in the SILSO network are collected and rescaled, i.e., multiplied by a factor k, to compensate for their differing observational qualities. The Nc values are then combined on a monthly basis to produce the ISN (Clette et al. 2007), which constitutes the international reference for modeling solar activity over the long term. Despite the fact that it is arguably the most intensely used time series in all of astrophysics (Hathaway 2010), its historical part suffers from a number of errors and inconsistencies that have been partly addressed by the recalibration of the ISN in 2015 (Clette & Lefèvre 2016). Even the most recent part (1981–now) lacks proper error modeling and uncertainty quantification; see Section 1.2 below.

1.2. ISN: Origin and Computation

In order to place our work in context, we first describe how the ISN is currently obtained at the WDC-SILSO center. Ns and Ng are entered through an interface (www.sidc.be/WOLF) and stored in a database. The main processing is described in Clette et al. (2007), and we summarize a more recent version of it in Figure 2. The processing uses a pilot station, here the Locarno station, as a reference. It compares the values obtained by a station i to the pilot station via a scaling factor ki, often referred to as the "k-coefficient":

Equation (2): ${k}_{i}(t)=\dfrac{\mathrm{pilot}(t)}{{Y}_{i}(t)},$

where Yi(t) is the composite index of station i, observed at time t (expressed in days), and pilot(t) is the value of the pilot station. The monthly scaling factors are computed from a sigma-clipping mean of Equation (2), i.e., values differing by more than two standard deviations from the mean are eliminated from the computation process. This processing still suffers from its historical heritage, summarized in Table 1 of Dudok de Wit et al. (2016). For example, this table shows that between 1926 and 1981 (when the sunspot collection center was in Zurich) there were several standard observers and no pilot station. As an index derived from count data, Nc (or Ns and Ng) does not necessarily follow a Gaussian distribution (Vigouroux & Delache 1994; Usoskin et al. 2003; Dudok de Wit et al. 2016). A processing based on sigma-clipping is thus not fully appropriate, but it is still undoubtedly better than what was done during the Zurich era. Finally, some steps in the processing date back to the mid-nineteenth century (when J. R. Wolf introduced the sunspot index) and were not upgraded when the collection and preservation center moved from Zurich to Brussels in 1981. There were two reasons for this: (1) the new curators of the ISN wanted to preserve the uniformity of the series, and (2) the numerical tools available at that time were limited.
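
To make the sigma-clipping step concrete, the short Python sketch below computes a two-sigma clipped mean of daily k-coefficients. The function name, the iteration scheme, and the illustrative numbers are ours; this is not the actual WDC-SILSO code.

```python
import numpy as np

def sigma_clipped_mean(values, n_sigma=2.0, max_iter=5):
    """Iteratively discard values more than n_sigma standard deviations
    from the mean, then return the mean of the surviving values."""
    x = np.asarray(values, dtype=float)
    x = x[np.isfinite(x)]              # ignore missing daily ratios
    for _ in range(max_iter):
        mu, sd = x.mean(), x.std()
        keep = np.abs(x - mu) <= n_sigma * sd
        if keep.all():
            break
        x = x[keep]
    return x.mean()

# Daily ratios k_i(t) = pilot(t) / Y_i(t) for one station over one month
# (purely illustrative numbers).
daily_k = [1.05, 0.98, 1.10, 1.02, 2.40, 1.07, 0.95, 1.01]
print(sigma_clipped_mean(daily_k))     # monthly scaling factor for the station
```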

Figure 2. Flowchart of the WDC-SILSO data import procedure, illustrating the succession of hierarchical tests applied to raw observing reports (adapted from Clette et al. 2007).


The WDC-SILSO team is currently working on improving the ISN computation and coordinates an important community effort to correct past errors. This effort includes, among others, work by a team from the International Space Science Institute (ISSI) on the recalibration of the SN, the organization of sunspot workshops, and editorial work for a Solar Physics topical issue on SN recalibration (Clette et al. 2016).

1.3. Previous Works on SN Uncertainty Quantification

Long-term analyses started with models of the shape of the sunspot number time series (Stewart & Panofsky 1938; Stewart & Eggleston 1940). They continued the work of M. Waldmeier (Waldmeier 1939), who tried to understand the solar cycle and predict upcoming cycles. Later on, Morfill et al. (1991) investigated the short-term dynamical properties of the SN series using a Poisson noise distribution superimposed on a mean cycle variation. Vigouroux & Delache (1994) also use a Poisson distribution to approximate the dispersion of daily values of the SN at different regimes of solar activity. Usoskin et al. (2003) develop a reconstruction method for sparse daily values of the SN and model the monthly number of groups corresponding to a certain level of daily values by a Poisson distribution. Schaefer (1997) emphasizes the need for error bars on the AAVSO sunspot series (Foster 1999), and more recent results in Dudok de Wit et al. (2016) present a first uncertainty analysis of the short-term error, through time domain errors and dispersion errors among observing stations, still assuming a Poisson distribution. The same work, however, uncovers the presence of overdispersion in the SN and approximates the SN by a mix of a Poisson and a Gaussian distribution in an additive framework. Although non-Poissonian, this additive model fails to capture some of the characteristics of sunspot data. Chang & Oh (2012), on the other hand, use a multiplicative model to simulate sunspot counts in view of assessing the dependency of correction factors on the solar cycle.

1.4. Motivation and Contribution

Our goal in this work is to go beyond the above-mentioned historical heritage by developing a comprehensive uncertainty quantification model for the count data Ns, Ng, and Nc. These quantities are subject to different types of errors and do not behave exactly like Poisson random variables: (1) they experience more dispersion than the Poisson distribution, (2) they are not independent from one day to another (since sunspots can last from several minutes to several months on the Sun), and (3) they exhibit a large number of zeros owing to periods of minimal solar activity.

Our contribution is twofold. First, we develop robust estimators for the physical solar signal, referred to as the "true" signal in this paper, underlying Ns, Ng, and Nc. We propose a model for their densities that takes into account characteristics such as overdispersion and a large number of zero counts. Our processing and estimators are robust to missing values and do not require filling in missing observations, contrary to previous studies.

Second, we propose an uncertainty model that is motivated by the first studies in Dudok de Wit et al. (2016) and that works within a multiplicative framework. Our model distinguishes three error types. The short-term error accounts for counting errors and variable seeing conditions from one station to another (e.g., weather, atmospheric turbulence), whereas the long-term error captures an overall bias in the number of spots (e.g., gradual ageing of the instrument or observer). Finally, a third error type aims at modeling inaccuracies occurring at solar minima and helps differentiate true from false zero counts. As an illustration, the short-term variations coming from the solar variability and the observational errors are clearly visible in Figure 1, superimposed on the quasi-periodic 11 yr solar cycle.

Looking ahead, our work paves the way for a more robust definition of the ISN. Indeed, the analysis of the different error types makes it possible to study the stability of the observatories involved in the computation of the ISN. Our study lays the groundwork for a future monitoring of all active stations within the SILSO network in quasi-real time, with the aim of (1) defining a stable reference of the network, (2) alerting the observers when they start deviating from the network, and (3) preventing large drifts from occurring. A new ISN could then be defined from a stable reference rather than a single pilot station and could benefit from the robust estimators and procedures (including the rescaling of observing stations; see Section 4) developed in this work.

Our paper is structured as follows. Section 2 introduces the data set considered. The uncertainty model is presented in Section 3, while Section 4 details the preprocessing of the data. Section 5 provides the estimators (or proxy) for the "true" solar signal underlying Ns, Ng, and Nc, as well as their densities. Finally, Section 6 displays our results on quantification of the different error types, as well as a first stability analysis that takes into account both short- and long-term variability.

2. Data

Similarly to what is done in Dudok de Wit et al. (2016), we study a subset of 21 stations, whose main characteristics are listed in Table 1. The period under study goes from 1947 January 1 until 2013 December 31. It ranges from the maximum of solar cycle (SC) 18 until the ascending phase of SC 24 and thus covers almost six solar cycles.

Table 1.  Main Characteristics of the Subset of Stations

ID | Name | Location | Prof. vs. Am. | Team vs. Indiv. | Observing Period | Level | % Obs. | % Obs. Period
A3 | Athens Obs. | Athens (Greece) | Prof. | team | 1949–1995 | 1.039 | 30.16 | 44.01
BN-S | WFS Obs.* | Berlin (Germany) | Am. | team | 1965–2013 | 1.179 | 23.50 | 32.74
CA | Catania Obs. | Catania (Italy) | Prof. | team | 1950–2019 | 1.039 | 61.87 | 64.80
CRA | Cragg† | Australia | Am. | indiv. | 1947–2009 | 0.904 | 72.43 | 77.44
FU | Fujimori | Nagano (Japan) | Am. | indiv. | 1968–2019 | 1.055 | 45.73 | 67.32
HD-S | Hedewig* | Germany | Am. | indiv. | 1967–2013 | 0.931 | 25.42 | 36.96
HU | Public Observatory | Hurbanovo (Slovakia) | Am. | team | 1969–2019 | 1.004 | 35.452 | 52.80
KH | KOERI | Kandilli (Turkey) | Prof. | team | 1967–2019 | 0.968 | 48.81 | 51.38
KOm | Koyama | Tokyo (Japan) | Am. | indiv. | 1947–1996 | 1.052 | 40.18 | 54.84
KS2 | Kislovodsk Mountain Obs. | Kislovodsk (Russia) | Prof. | ∼indiv. | 1954–2019 | 1.057 | 85.96 | 95.98
KZm | University of Graz | Kanzelhohe (Austria) | Prof. | team | 1944–2019 | 1.110 | 74.23 | 74.24
LFm | Luft | New York (USA) | Am. | indiv. | 1944–1988 | 0.985 | 34.06 | 54.68
LO | Specola Solare | Locarno (Switzerland) | Prof. | ∼indiv. | 1958–2019 | 1.260 | 68.27 | 81.68
MA | Manila Obs. | Manila (Philippines) | Prof. | team | 1971–1988 | 1.023 | 20.85 | 78.69
MO | Mochizuki (Urawa) | Saitama (Japan) | Am. | indiv. | 1978–2019 | 1.073 | 35.51 | 66.09
PO | Observatory | Potsdam (Germany) | Prof. | team | 1955–1999 | 0.991 | 22.12 | 29.73
QU | PAGASA Weather Bureau | Quezon (Philippines) | Prof. | ∼indiv. | 1957–2019 | 0.829 | 45.46 | 53.83
SC-S | Schulze* | Germany | Am. | indiv. | 1960–2007 | 0.943 | 23.32 | 33.16
SK | Skalnate Pleso Obs. | Vysoke Tatry (Slovakia) | Prof. | team | 1950–2012 | 0.992 | 37.95 | 40.75
SM | San Miguel Obs. | Buenos Aires (Argentina) | Prof. | team | 1967–2013 | 1.220 | 39.09 | 56.34
UC | USET | Uccle (Belgium) | Prof. | team | 1949–2019 | 0.991 | 57.00 | 59.64

Note. Main characteristics of the set of 21 stations used in this study: acronym (ID), last name of observer or name of station, location, type of observatory (professional versus amateur), type of observer (individual or team), observing period, average scaling factor with respect to the network over the studied period (level), and percentage of observations over the full period studied (% Obs.) and over the station's own observing period (% Obs. Period). Note that the ∗ and † symbols represent stations or observers from the SONNE and AAVSO networks, respectively. These are two distinct networks of observing stations that are not members of the WDC-SILSO network and hence are not used to produce the ISN.


Table 1 summarizes properties of the stations such as their location, name, type (amateur vs. professional, individual vs. team), observing period, percentage of missing values, and mean scaling factor (level) with respect to the network over the period studied. The procedure used to compute the mean scaling factors will be described in Section 4. These mean scaling factors may be viewed as an indication of the general level of counts recorded by the station as compared to the median of the network. Thus level = 1 corresponds to a station that observes the same number of spots as the median of the network. For example, Locarno (LO), with a level of 1.26, observes in general around 26% more spots than the network median.

The location of the observatories gives an indication of the weather conditions of the stations and might explain part of the missing values. Moreover, the type of observatory usually impacts the quality of observations and the length of observing periods: an individual might experience less short-term variability than a team (alternating the observer from one week to another), and/or amateurs may have shorter observing periods than professionals.

Our data set contains the daily number of spots Ns, groups Ng, and the composite Nc observed in each of the 21 stations. As Table 1 indicates, the data contain a substantial number of missing values, due to weather conditions preventing stations from observing, periods of instrument maintenance, definitive closures, or observatories that only began operating partway through the period studied.

3. Model

In this section, we present step by step our uncertainty model. It characterizes the observations of the stations (either for the number of spots Ns, groups Ng, or composite Nc) in a multiplicative framework and involves different types of observing errors, as well as a quantity generically denoted by s(t), for Ns, Ng, and Nc. s(t) is a latent variable representing the "true" solar signal, i.e., the actual number of spots Ns, groups Ng, or composite Nc lying on the Sun. It cannot be directly observed, as the counts of the stations are corrupted by different error sources. Our goal is to estimate the distribution of the "true" solar signal and of the errors degrading it. In particular, we are interested in the mean and the variance of these distributions, but also in higher-order moments since the estimated densities are far from Gaussian.

The mean of s(t), denoted by μs(t), will be estimated in Section 5 based on the entire network (to be robust against errors of an individual station) and will be used as a proxy for s(t) in the remaining part of the article. Since our model is multiplicative, a good estimation of μs(t) is the key to get access to the multiplicative errors; see Equation (6). Moreover, a precise estimation of the mean level of an individual station is required for future monitoring, and this depends on the accuracy of the estimation of s(t).

We use a model that is conditional on the latent s(t) and decomposes the observations along two regimes: when s(t) = 0 (solar minima) and when s(t) > 0 (outside periods of minima); see Section 3.1. This allows us to introduce, outside of minima, a model with short-term observing errors and long-term drifts; see Section 3.2. A specific error model is then developed for periods of solar minima in Section 3.3, and the complete model is shown in Section 3.4. Finally, Section 3.5 introduces the Hurdle model in order to fit distributions exhibiting an excess of zero values, as is the case here owing to the presence of solar minima and observing errors.

3.1. Conditional Model

The observed counts are studied in two distinct situations: when there are sunspots (s > 0) and when there are none (s = 0). This separation is motivated by the idea that the absence or presence of sunspots is governed by a series of phenomena involving complex dynamo processes in the solar interior, which can be modeled by a latent variable with two states. This analysis will lead to a better understanding of the observations and allows us to differentiate the "true" zeros of the counting process from the "false" zeros that occur when a station reports a zero sunspot count in the presence of one or more spots on the Sun.

Let Yi(t) represent either the number of spots, groups, or composite actually observed (raw, unprocessed data). The index 1 ≤ i ≤ N denotes the station, and 1 ≤ t ≤ T is the time. The probability density function (pdf) of $Y:= {Y}_{i}(t)$ may be decomposed as

Equation (3): ${f}_{Y}(y)=\underbrace{P(Y=0| s\gt 0)P(s\gt 0)}_{1}+\underbrace{P(Y=0| s=0)P(s=0)}_{2}+\underbrace{{f}_{Y| s\gt 0}(y){{\mathbb{1}}}_{\{y\gt 0\}}P(s\gt 0)}_{3}+\underbrace{{f}_{Y| s=0}(y){{\mathbb{1}}}_{\{y\gt 0\}}P(s=0)}_{4}$

Terms "1" and "3" in Equation (3) represent the short-term error in the presence of one or more sunspots. Term "1" reflects a situation where no sunspots are reported while there are actually some spots on the Sun ("false" zeros or observational errors due, e.g., to a bad seeing) and leads to an excess of zeros in short-term error distribution.

Term "2" captures the "true" zeros (no sunspot and no sunspot reported), while term "4" reflects a situation where the station reports some sunspots when there are no sunspot on the Sun. Term "4" is neglected outside of solar minima periods. Together, these two terms form the distribution of the error at minima, which has an excess of "true" zeros and a tail modeling the errors of the stations and the short-duration sunspots.

3.2. Short-term and Long-term Errors

Results in Dudok de Wit et al. (2016) evidence a short-term, rapidly evolving dispersion error across the stations that accounts for counting errors and variable seeing conditions. We define a similar term allowing a possible station dependence, and we denote it ${\epsilon }_{1}(i,t)$. Assuming ${\mathbb{E}}({\epsilon }_{1}(i,t))=0$, where ${\mathbb{E}}$ denotes the expectation, our interest lies in modeling its variance and its tail to study the short-term variability of the stations.

Next, we introduce ${\epsilon }_{2}(i,t)$ to handle station-specific long-term errors such as systematic biases in the sunspot counting process. We want to estimate its mean, μ2(i, t), and detect whether this mean experiences sudden jumps or drifts on longer timescales.

Both ${\epsilon }_{1}(i,t)$ and ${\epsilon }_{2}(i,t)$ are multiplicative errors, as an observer typically makes larger errors when s(t) is higher (Chang & Oh 2012). Assembling these two types of errors, we propose the following noise model outside of solar minima:

Equation (4): ${Y}_{i}(t)=s(t)\,\left({\epsilon }_{1}(i,t)+{\epsilon }_{2}(i,t)\right),\quad s(t)\gt 0.$

3.3. Errors at Solar Minima

Let ${\epsilon }_{3}$ denote the error occurring during minima of solar activity, when there exist extended periods with no or few sunspots. We assume the error ${\epsilon }_{3}$ to be significant when there are no sunspots (s(t) = 0) and negligible otherwise, so as not to interfere with the errors ${\epsilon }_{1}$ and ${\epsilon }_{2}$. ${\epsilon }_{3}$ captures effects such as short-duration sunspots and the nonsimultaneity of observations between the stations. At solar minima, the model becomes

Equation (5): ${Y}_{i}(t)={\epsilon }_{3}(i,t),\quad s(t)=0.$

3.4. General Model

Combining the three error types, we may write our uncertainty model in a compact and generic way as follows:

Equation (6): ${Y}_{i}(t)=s(t)\,\left({\epsilon }_{1}(i,t)+{\epsilon }_{2}(i,t)\right)+{\epsilon }_{3}(i,t).$

We assume the random variables (r.v.) ${\epsilon }_{1}$, ${\epsilon }_{2}$, and ${\epsilon }_{3}$ to be continuous, and the r.v. s, ${\epsilon }_{1}$, ${\epsilon }_{2}$, and ${\epsilon }_{3}$ to be jointly independent. Although the "true" number of counts s(t) is discrete, its product with a continuous r.v. (${\epsilon }_{1}+{\epsilon }_{2}$) remains continuous. This is consistent with the fact that, after the preprocessing, the observed data Yi(t) may be modeled by a continuous r.v.
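
To make the structure of Equation (6) concrete, the following sketch simulates observations from the multiplicative model. The distributions and parameter values chosen here for s(t), ${\epsilon }_{1}$, ${\epsilon }_{2}$, and ${\epsilon }_{3}$ are placeholders for illustration only, not the fitted ones of Sections 5 and 6.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000

# Toy "true" composite signal: zero at minima, Poisson-like otherwise.
s = np.where(rng.random(T) < 0.2, 0, rng.poisson(60, size=T))

# Multiplicative errors (illustrative distributions and parameters):
eps1 = rng.normal(0.0, 0.15, size=T)          # short-term error, E[eps1] = 0
eps2 = rng.normal(1.05, 0.02, size=T)         # long-term factor, mean = station bias
# Error at minima: mostly zero, occasionally a short-duration spot (11 = 1 spot + 1 group).
eps3 = np.where(rng.random(T) < 0.9, 0.0, 11 * rng.random(T))
eps3 = np.where(s > 0, 0.0, eps3)             # negligible outside minima

# General model, Equation (6): Y_i(t) = s(t) * (eps1 + eps2) + eps3
Y = s * (eps1 + eps2) + eps3
print(Y[:10])
```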

3.5. Excess of Zeros

All terms in Equation (6) exhibit an excess of zeros, that is, an unusual local peak in the density at zero, due to solar minima periods. As the solar minimum is an important part of a solar cycle, the zeros must be properly treated. Specific models such as the zero-altered (ZA) or the zero-inflated (ZI) two-part distributions may be used for this purpose (Zuur et al. 2009; Colin Cameron & Trivedi 2013). The main difference between the two models is that the ZI distribution allows the zeros to be generated by two different mechanisms, contrary to the ZA model, which treats all zeros in the same way.

As "true" and "false" zeros do not appear together in a single term of Equation (6), we find it appropriate to work with the ZA two-part model (also called the "Hurdle" model) and denote its density by f(x). In this model, the zero values are modeled by a Bernoulli distribution ${f}_{0}{(x)={b}^{1-x}(1-b)}^{x}$ of parameter b. Nonzero values follow a distribution described generically by f1(x), either another discrete distribution (in case of modeling the counts μs(t) in Section 5) or a continuous distribution for epsilon1 and epsilon3 in Section 6:

Equation (7): $f(x)=\begin{cases}b, & x=0\\ (1-b)\,{f}_{1}(x), & x\ne 0.\end{cases}$

The ZA distribution will be used to model the estimated densities of μs(t) in Section 5 and those of ${\epsilon }_{1}(i,t)$ and ${\epsilon }_{3}(i,t)$ in Section 6.
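
As an illustration of the hurdle density in Equation (7), the sketch below evaluates a zero-altered pmf whose nonzero part is a zero-truncated negative binomial; the truncation convention and the parameter values are our own illustrative choices.

```python
import numpy as np
from scipy import stats

def za_nbinom_pmf(x, b, r, p):
    """Zero-altered (hurdle) pmf: P(X=0) = b, and for x > 0 the mass (1 - b)
    is spread over a zero-truncated negative binomial NB(r, p)."""
    x = np.asarray(x)
    nb = stats.nbinom(r, p)
    trunc = nb.pmf(x) / (1.0 - nb.pmf(0))       # renormalize over x >= 1
    return np.where(x == 0, b, (1.0 - b) * trunc)

x = np.arange(0, 15)
pmf = za_nbinom_pmf(x, b=0.1, r=2.0, p=0.3)
print(pmf.sum())   # close to 1; only the small tail mass beyond x = 14 is missing
```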

4. Preprocessing

Due to the differing characteristics of the observing setups (telescope aperture, location, personal experience, etc.), each station has its own scaling. These differences mainly impact the count of small spots, which cannot be observed with low-resolution telescopes, and the splitting of complex groups, where the personal experience of the observer matters. A preprocessing is thus needed to rescale all stations to the same level when comparing stations on the short term and at solar minima. It is also required to compute a robust estimator of the solar signal based on the entire network. For the analysis of long-term errors, however, the preprocessing will not be applied, as it would suppress the long-term drifts that we want to detect. Our proposed preprocessing is robust to missing values and proceeds in two steps.

First, we compute the "timescale," that is, the duration of the period where the scaling factors are assumed to be constant. It is a trade-off between short periods and long periods: the former tends to standardize the observations of the stations, thereby suppressing any differences between the observers, whereas the latter may be too coarse to correct for important changes in observers and instruments. A statistically driven study based on the Kruskal–Wallis (KW) test (Kruskal & Wallis 1952) shows that the appropriate timescale varies with the stations and with the type of counts Ns, Ng, and Nc; see the Appendix for a full description of the test. This timescale may also evolve over time when a station is constant over several months before suddenly deviating from the network. However, to avoid introducing potential biases between the stations, we use the same timescale, generically denoted by τ, for all stations over the entire period studied. The selected values of τ are 8 months for Ns, 14 months for Ng, and 10 months for Nc. We note that that these periods are close to the 12-month period chosen by J. R. Wolf to compute the historical version of the scaling factors.

Second, having determined the timescale τ, we compute the scaling factors using ordinary least-squares (OLS) regression as follows. Recall that Yi(t) represents either the number of spots, groups, or the composite actually observed in a station i, 1 ≤ i ≤ N, at time t, 1 ≤ t ≤ T (daily values). For convenience, we re-index time by a pair of indices t = (t1, t2), where 1 ≤ t1 ≤ τ and 1 ≤ t2 ≤ T/τ. Thus, t1 corresponds to the index of an observation inside a block of length τ and t2 is the index of the block.

Let ${{\boldsymbol{Y}}}_{i,{t}_{2}}:= {[{Y}_{i}(({t}_{1},{t}_{2}))]}_{1\leqslant {t}_{1}\leqslant {\tau }^{\star }}$ denote the vector of the daily observations in station i on block t2 of length τ and ${{\boldsymbol{X}}}_{i,{t}_{2}}:= {[\mathop{\mathrm{med}}\limits_{1\leqslant i\leqslant N}{Y}_{i}(({t}_{1},{t}_{2}))]}_{1\leqslant {t}_{1}\leqslant {\tau }^{\star }}$ be the vector containing the daily values of the median of the network, also of length τ. The scaling factors are computed using the slope of the $\mathrm{OLS}({{\boldsymbol{Y}}}_{i,{t}_{2}}| {{\boldsymbol{X}}}_{i,{t}_{2}})$ regression:

Equation (8): ${\hat{\kappa }}_{i}({t}_{2})=\mathrm{slope}\left[\mathrm{OLS}\left({{\boldsymbol{Y}}}_{i,{t}_{2}}\,|\,{{\boldsymbol{X}}}_{i,{t}_{2}}\right)\right].$

The new definition of the scaling factors in Equation (8) is a robust version of a ratio between the observations of the stations and the median of the network. It is similar to the definition of the historical k in Equation (2), where the median of the network replaces the single pilot station as the reference level. The rescaled data, denoted Zi in the sequel, are defined as

${Z}_{i}(({t}_{1},{t}_{2})):= {Y}_{i}(({t}_{1},{t}_{2}))/{\hat{\kappa }}_{i}({t}_{2}),$
where the reference appears now in the denominator. In a sense, the ratio in Equation (2) is inverted in order to limit the problem of dividing by zero whenever the stations observe no spots. We explored other methods such as orthogonal regression (also called total least-squares) and $\mathrm{OLS}({{\boldsymbol{X}}}_{i,{t}_{2}}| {{\boldsymbol{Y}}}_{i,{t}_{2}})$ (Feigelson & Babu 1992). We choose the $\mathrm{OLS}({{\boldsymbol{Y}}}_{i,{t}_{2}}| {{\boldsymbol{X}}}_{i,{t}_{2}})$ method since it leads to the smallest Euclidean and total variation distances between the median of the network and the individual stations.
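
A schematic implementation of this preprocessing could look like the following sketch. The no-intercept OLS slope, the handling of missing days, and the assumption that T is a multiple of τ are simplifications made for illustration; the synthetic data stand in for real station counts.

```python
import numpy as np

def scaling_factors(Y, tau):
    """Y: array of shape (N stations, T days) with NaN for missing values.
    Returns kappa of shape (N, number of blocks): per-block OLS slope of the
    station counts against the daily median of the network (Equation (8))."""
    N, T = Y.shape
    n_blocks = T // tau                                # assumes T multiple of tau
    med = np.nanmedian(Y, axis=0)                      # network reference X
    kappa = np.full((N, n_blocks), np.nan)
    for b in range(n_blocks):
        sl = slice(b * tau, (b + 1) * tau)
        x, yb = med[sl], Y[:, sl]
        for i in range(N):
            ok = np.isfinite(x) & np.isfinite(yb[i])   # robust to missing days
            if ok.sum() > 2 and np.any(x[ok] > 0):
                # slope of the no-intercept regression Y ~ kappa * X
                kappa[i, b] = np.dot(x[ok], yb[i, ok]) / np.dot(x[ok], x[ok])
    return kappa

def rescale(Y, kappa, tau):
    """Rescaled data Z_i(t) = Y_i(t) / kappa_i(block containing t)."""
    reps = np.repeat(kappa, tau, axis=1)[:, :Y.shape[1]]
    return Y / reps

# Illustrative use on synthetic counts for 3 stations over 480 days.
rng = np.random.default_rng(1)
truth = rng.poisson(50, size=480).astype(float)
Y = np.vstack([lvl * truth + rng.normal(0, 5, 480) for lvl in (0.9, 1.0, 1.25)])
kappa = scaling_factors(Y, tau=240)
Z = rescale(Y, kappa, tau=240)
print(np.round(kappa, 2))
```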

5. Solar Signal Estimation

5.1. Choice of the Estimator

To use Equation (6), we need an estimate of a proxy for s(t). We choose this proxy to be the mean of s(t), denoted μs(t). We propose as a robust estimator for μs(t) a transformed version of the median of the network:

Equation (9): ${\hat{\mu }}_{s}(t)=T({M}_{t}),$

where ${M}_{t}=\mathop{\mathrm{med}}\limits_{1\leqslant i\leqslant N}{Z}_{i}(t)$ represents the median of the network and T denotes a transformation composed of an Anscombe transform and a Wiener filtering (Davenport & Root 1968). This filtering is applied in order to remove very high frequencies from the data, as they can lead to instabilities in the subsequent analysis. The generalized Anscombe transform stabilizes the variance (Murtagh et al. 1995; Makitalo & Foi 2013). It is written as

Equation (10): ${T}_{A}(x)=2\sqrt{\dfrac{x}{\alpha }+\dfrac{3}{8}}.$

It is commonly applied in the literature to Gaussianize near-Poissonian variables. It is needed here, as the Wiener filtering performs better on Gaussian data. Similarly to Dudok de Wit et al. (2016), pp. 14–15, we fix α = 4.2 in Equation (10). This is the optimal value found for the composite Nc. Before applying the Wiener filtering, missing values of the median of the network are imputed using the algorithm described in Dudok de Wit (2011). Only 49 values are imputed, which represents 0.2% of the total number of values on the period studied. The Wiener filtering is then applied to the transformed and complete set of median values and suppresses the highest frequencies of the signal. Finally, the imputed missing values are reset to NaN ("not a number") in ${\hat{\mu }}_{s}(t)$. The block diagram of the procedure is shown in Figure 3.

Figure 3. Block diagram of the T procedure defining the solar signal estimator ${\hat{\mu }}_{s}(t)$, where ${M}_{t}=\mathop{\mathrm{med}}\limits_{1\leqslant i\leqslant N}{Z}_{i}(t)$ is the median of the network. An Anscombe transform is first applied to the median, and missing values are imputed. Then, a fast Fourier transform (FFT) is used to convert the signal to a power spectrum in the frequency domain, followed by an attenuation from a Wiener filter. A step function cancels the amplitude of the frequencies corresponding to periods shorter than 7 days (low-pass filter). The threshold at 7 days is selected from Dudok de Wit et al. (2016; see Figure 5). It is the smallest visible timescale of the signal, corresponding to the weekly shift of some observatories. Finally, an inverse FFT and an inverse Anscombe transform are applied to the signal.

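
The sketch below reproduces the spirit of the T pipeline: variance stabilization, removal of Fourier modes with periods shorter than 7 days, and back-transformation. A hard spectral cutoff replaces the actual Wiener attenuation and linear interpolation replaces the imputation of Dudok de Wit (2011), so this is an illustrative approximation rather than the exact operator.

```python
import numpy as np

def smooth_median(M, alpha=4.2, cutoff_days=7.0):
    """Toy version of the T operator applied to the daily network median M."""
    M = np.asarray(M, dtype=float)
    missing = ~np.isfinite(M)
    t = np.arange(M.size)
    M_filled = M.copy()
    # crude gap filling (the paper uses the imputation of Dudok de Wit 2011)
    M_filled[missing] = np.interp(t[missing], t[~missing], M[~missing])

    # generalized Anscombe transform to stabilize the variance
    A = 2.0 * np.sqrt(M_filled / alpha + 3.0 / 8.0)

    # low-pass filter: cancel Fourier modes with periods shorter than 7 days
    F = np.fft.rfft(A)
    freqs = np.fft.rfftfreq(A.size, d=1.0)           # cycles per day
    F[freqs > 1.0 / cutoff_days] = 0.0
    A_smooth = np.fft.irfft(F, n=A.size)

    # inverse Anscombe transform, back to the counting scale
    mu_hat = alpha * ((A_smooth / 2.0) ** 2 - 3.0 / 8.0)
    mu_hat[missing] = np.nan                         # reset imputed days to NaN
    return mu_hat

rng = np.random.default_rng(2)
M = rng.poisson(80, size=365).astype(float)
M[50:55] = np.nan                                    # a few missing days
print(np.nanmean(smooth_median(M)))
```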

Among other tested estimators (based on the mean, the median of the network, or a subset of stations), with or without application of T, the estimator proposed in Equation (9) turns out to be the most robust to outliers.

5.2. Comparison with Space Data

To test the quality of our estimator ${\hat{\mu }}_{s}$, we compare it with a sunspot number extracted from satellite images of the Sun. We expect less variability when Ns, Ng, and Nc are retrieved from satellite images using automated algorithms, as the rules to count spots and groups are clearly defined. Nevertheless, the measurements are biased by these rules, and the most complex cases, e.g., at maxima, most often require either human intervention or a specific procedure in the algorithm. In any case, a measure of the "true" number of spots and groups does not exist.

For this comparison, we use the Sunspot Tracking And Recognition Algorithm (STARA) catalog (Watson & Fletcher 2010), which gathers observations from 1996 May to 2010 October (solar cycle 23). This number is extracted using an automated detection algorithm from the images obtained by the MDI instrument on board the Solar and Heliospheric Observatory. It has a lower scaling than our estimator for the number of spots, ${\hat{\mu }}_{{N}_{s}}$, as expected, since the definition of a spot in the detection algorithm excludes the pores (spots without penumbra).

We compare three quantities on the period where STARA data are available (1996–2010): Ns (STARA), ${\hat{\mu }}_{{N}_{s}}$, and Ns as recorded by the Locarno station. These are shown in Figure 4. We test the level of variability by computing the mean value of a moving standard deviation over a window of 81 days. It is equal to 14.07 for Ns (STARA) rescaled to ${\hat{\mu }}_{{N}_{s}}$ (5.38 for Ns (STARA) without scaling), against 15.68 for ${\hat{\mu }}_{{N}_{s}}$ and 27.13 for Locarno. As expected, Ns (STARA) experiences less variability than a single station, but its variability is comparable to that of our estimator.
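
The variability metric quoted above is straightforward to reproduce; a minimal sketch (assuming daily series already aligned on the same dates) is given below.

```python
import numpy as np

def mean_moving_std(x, window=81):
    """Mean of the moving standard deviation over `window` consecutive days."""
    windows = np.lib.stride_tricks.sliding_window_view(np.asarray(x, float), window)
    return windows.std(axis=1).mean()

# Illustrative daily series (replace with Ns from STARA, the estimator, or a station).
rng = np.random.default_rng(3)
daily_ns = rng.poisson(60, size=2000)
print(round(mean_moving_std(daily_ns), 2))
```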

Figure 4. Comparison between the SN obtained from STARA and that from our procedures, for the period 1996 May to 2010 October. The number of spots obtained from the STARA catalog is represented in blue, the actual (unprocessed) number of spots observed in Locarno (LO) is represented in yellow, and ${\widehat{\mu }}_{{N}_{s}}$ is plotted in orange. The three quantities shown are averaged over 81 days.


Nevertheless, satellite images of the Sun have only been available since 1980, and data extracted from those images cannot be traced back to the seventeenth century. Gathering space observations over several decades also requires the use of different satellites and instruments, as instruments age in space. These instruments need calibrations that introduce additional inaccuracies into the extracted numbers. We thus conclude that ${\widehat{\mu }}_{s}(t)$ is a more robust estimator of the solar activity, and it will be used as a proxy for s(t) in the sequel.

5.3. Solar Component for Ns and Ng

We present here the statistical modeling of the number of spots Ns and the number of groups Ng. We do this separately for each component since their physical origins are driven by different phenomena: the groups convey information about the dynamo-generated magnetic field in the solar interior, whereas the emergence of individual spots would rather come from fragmented surface flux and agglomeration of small magnetic fields (Thomas & Weiss 2008). Together, the analysis of the spots and groups helps us to better understand the composite Nc and the solar activity, which is not satisfactorily described by only one of the two numbers (Dudok de Wit et al. 2016). In the remainder of this paper, we define a specific notation for the generic ${\hat{\mu }}_{s}(t)$ from Equation (9): ${\hat{\mu }}_{{N}_{s}}(t)$ for the number of spots, ${\hat{\mu }}_{{N}_{g}}(t)$ for the number of groups, and ${\hat{\mu }}_{{N}_{c}}(t)$ for the composite.

The authors in Dudok de Wit et al. (2016) showed that the numbers of spots and groups experience more dispersion than actual Poisson variables. In order to estimate how far the distribution of the "true" s(t) departs from a Poisson distribution, we regress the conditional variance $\mathrm{Var}({Z}_{i}(t)| {\hat{\mu }}_{s}(t)=\mu )$ against the conditional mean ${\mathbb{E}}({Z}_{i}(t)| {\hat{\mu }}_{s}(t)=\mu )$ by OLS; see Figure 5. Whereas in a Poisson context the slope of the fit should be close to 1, for Ns > 10 our fit shows a slope of 1.5, with confidence interval (CI) CI95% = [1.48, 1.51]. This points to overdispersion and the need for a generalization of the Poisson pdf. By contrast, the same plot for Ng > 0 displays a slope of 0.96, with CI95% = [0.93, 0.99], confirming the validity of a Poisson process assumption. Note that values <11 are excluded from the fit of Ns, as they seem to indicate a different regime. This change in the alignment may indicate the presence of a multimodal distribution; see Figure 6.
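
A sketch of this overdispersion diagnostic follows: counts are binned by the rounded value of the proxy, the conditional mean and variance are computed in each bin, and a line is fitted in log–log space. The binning, the minimum bin size, and the synthetic data are illustrative choices.

```python
import numpy as np

def mean_variance_slope(Z, mu_hat, min_count=50, threshold=0):
    """Regress log conditional variance on log conditional mean.
    Z: (N stations, T) rescaled counts; mu_hat: (T,) proxy for the solar signal."""
    means, variances = [], []
    for m in np.unique(np.round(mu_hat)):
        if m <= threshold:
            continue                       # e.g., exclude Ns < 11 as in the text
        sel = np.round(mu_hat) == m
        vals = Z[:, sel]
        vals = vals[np.isfinite(vals)]
        if vals.size >= min_count:
            means.append(vals.mean())
            variances.append(vals.var())
    slope, intercept = np.polyfit(np.log(means), np.log(variances), 1)
    return slope, intercept                # slope ~ 1 for Poisson, > 1 if overdispersed

# Illustrative use with synthetic overdispersed counts.
rng = np.random.default_rng(4)
mu_hat = rng.integers(1, 100, size=5000).astype(float)
Z = rng.negative_binomial(5, 5 / (5 + mu_hat), size=(10, 5000)).astype(float)
print(mean_variance_slope(Z, mu_hat))
```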

Figure 5. Estimation of the conditional mean–variance relationship for Ns (left) and Ng (right). The red line is a linear fit of the points (shown on a log–log scale), starting at Ns > 9 and Ng > 0, respectively. In both plots, the legend shows the value of the fitted slope together with its confidence interval at 95%. The value of the intercept is −1.21 for Ng and −0.57 for Ns. The same fit starting at Ns > 0 (not shown here) leads to a slope of 1.25 and CI95% = [1.23, 1.28].

Figure 6. Left: histogram of ${\widehat{\mu }}_{{N}_{s}}(t)$ values, computed with a bandwidth (binning) equal to 3, and estimated density for nonzero values of ${\widehat{\mu }}_{{N}_{s}}(t)$ (shown by the black line). The complete density is modeled by a ZA mixture of generalized NBs. For the zero values, the MLE value of the Bernoulli parameter is equal to b = 0.1. For nonzero values, the MLE values of the parameters in Equation (12) are r1 = 1.25, p1 = 0.11, r2 = 2.39, p2 = 0.04, and w1 = 0.32. Right: histogram of ${\widehat{\mu }}_{{N}_{g}}(t)$ values, computed with a bandwidth equal to 1, and corresponding density fitted by MLE (shown in black line). The density is modeled by a mixture of an NB and a Poisson distribution as defined in Equation (13). The fitted parameter values are μ2 = 8.62, r1 = 1.65, p1 = 0.37, and w1 = 0.36.


Count data with overdispersion are widely modeled by the negative binomial (NB) distribution in the literature (Colin Cameron & Trivedi 2013; Rodriguez 2013) or by its generalization (Jain & Consul 1970):

Equation (11)

where r > 0, p ∈ (0,1), q = (1 − p) and Γ is the gamma function.

A visual inspection of the histogram of estimated values ${\widehat{\mu }}_{{N}_{s}}(t)$ in the left panel of Figure 6 reveals a local maximum in the distribution around 20–40. We refer to such local maxima as modes in the remainder of the article. The underlying density of ${\hat{\mu }}_{{N}_{s}}(t)$ may thus be multimodal, as suspected from the left panel of Figure 5. Such pdf's are classically modeled by a mixture model. As the density also shows a typical excess of zeros, it requires the use of the ZA distribution defined in Equation (7). We thus fit the complete pdf of the estimated number of spots, ${\hat{\mu }}_{{N}_{s}}(t)$, by a ZA mixture of generalized NB distributions. The density at zero, f0(x), is represented by a Bernoulli distribution, whereas the density outside zero, f1(x) in Equation (7), is identified with a mixture of NB distributions:

Equation (12): ${f}_{1}(x)={w}_{1}\,{g}_{1}(x;{r}_{1},{p}_{1})+(1-{w}_{1})\,{g}_{2}(x;{r}_{2},{p}_{2}),$

where g1, g2 are NB densities and w1 is the mixture weight.

Similarly, the histogram of ${\hat{\mu }}_{{N}_{g}}(t)$ exhibits a clear excess in the range 1–3 compared to a Poisson-like distribution centered between 5 and 8. The pdf of ${\widehat{\mu }}_{{N}_{g}}(t)$ thus shows two modes: one around 1–3 and one around 5–8. Such a pdf may be modeled by a mixture of NB and Poisson distributions:

Equation (13): ${f}_{1}(x)={w}_{1}\,{g}_{1}(x;{r}_{1},{p}_{1})+(1-{w}_{1})\,\dfrac{{\mu }_{2}^{x}{e}^{-{\mu }_{2}}}{x!},$

where μ2 > 0 and, as above, w1 is the mixture weight.

The fits of these parametric densities are shown in Figure 6 by a black line superimposed on the histograms. All the fits in this article are computed using maximum likelihood estimation (MLE). The nature of the different modes in the pdf of ${\hat{\mu }}_{{N}_{s}}$ and ${\hat{\mu }}_{{N}_{g}}$ will be discussed in Section 5.5.
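
A compact example of such an MLE fit, for a hurdle model whose nonzero part is a two-component NB mixture (Equations (7) and (12)), is sketched below with scipy. The starting values, bounds, and synthetic sample are illustrative; a real fit would use the ${\hat{\mu }}_{{N}_{s}}(t)$ values.

```python
import numpy as np
from scipy import stats, optimize

def fit_za_nb_mixture(x):
    """Fit a zero-altered mixture of two negative binomials by MLE.
    Returns (b, w1, r1, p1, r2, p2)."""
    x = np.asarray(x)
    b = np.mean(x == 0)                    # Bernoulli part: MLE is the zero fraction
    pos = x[x > 0]

    def nll(theta):
        w1, r1, p1, r2, p2 = theta
        def mix(v):
            return (w1 * stats.nbinom.pmf(v, r1, p1)
                    + (1.0 - w1) * stats.nbinom.pmf(v, r2, p2))
        # zero-truncated mixture density for the nonzero part of the hurdle model
        dens = mix(pos) / (1.0 - mix(0))
        return -np.sum(np.log(dens + 1e-300))

    res = optimize.minimize(
        nll,
        x0=[0.5, 2.0, 0.1, 3.0, 0.05],     # illustrative starting values
        bounds=[(0.01, 0.99), (0.1, 50), (0.001, 0.999), (0.1, 50), (0.001, 0.999)],
        method="L-BFGS-B",
    )
    return (b, *res.x)

# Illustrative use on synthetic counts mimicking mu_hat_Ns values.
rng = np.random.default_rng(5)
zeros = np.zeros(100)
comp1 = rng.negative_binomial(2, 0.10, size=400)
comp2 = rng.negative_binomial(3, 0.04, size=500)
sample = np.concatenate([zeros, comp1, comp2])
print(np.round(fit_za_nb_mixture(sample), 3))
```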

5.4. Solar Component for Nc

We now use Equation (9) to estimate ${\mu }_{{N}_{c}}$, the mean of the "true" composite Nc = Ns + 10Ng. Again looking at the conditional mean–variance relationship, we observe in the left panel of Figure 7 an overdispersion, with a slope of 1.29 and CI95% = [1.27, 1.31] for Nc > 20. As a compound of both quantities, Nc experiences less overdispersion than Ns and more than Ng. A visual inspection of the histogram of ${\hat{\mu }}_{{N}_{c}}(t)$ values in the right panel of Figure 7 indicates an excess of zeros and several modes, most probably coming from the modes observed in the pdf's of ${\hat{\mu }}_{{N}_{g}}$ and ${\hat{\mu }}_{{N}_{s}}$. We thus find it appropriate to approximate the density of ${\hat{\mu }}_{{N}_{c}}(t)$ by a ZA mixture of three NB distributions, where the density outside zero values, f1(x) in Equation (7), is identified with

Equation (14): ${f}_{1}(x)={\displaystyle \sum _{i=1}^{3}}{w}_{i}\,{g}_{i}(x;{r}_{i},{p}_{i}),$

where wi are the mixture weights and ${\sum }_{i=1}^{3}{w}_{i}=1$. The fit of f1 is represented in the right panel of Figure 7 by a black line.

Figure 7. Left: conditional mean–variance relationship for Nc, shown on a log–log scale. Right: histogram of ${\hat{\mu }}_{{N}_{c}}(t)$ values, with a binning equal to bw = 3. The estimated density outside of zero values is shown by a black line. It is modeled as a mixture of three NB distributions (see Equation (14)), with MLE parameter values equal to r1 = 3.18, p1 = 0.48, r2 = 4.02, p2 = 0.15, r3 = 3.05, p3 = 0.02, w1 = 0.08, and w2 = 0.19. The Bernoulli parameter of the density f0 at zero is equal to b = 0.07.


A statistical analysis (not presented here) shows that the distribution of the ISN is statistically close to the distribution of ${\hat{\mu }}_{{N}_{c}}$. The uncertainty analysis for ${\hat{\mu }}_{{N}_{c}}$, presented in the remainder of the article, thus remains valid for the ISN.

5.5. Conditional Correlation

Due to the physical nature of the data, the local maxima of the densities of ${\widehat{\mu }}_{{N}_{s}}$ and ${\widehat{\mu }}_{{N}_{g}}$ are not independent. We therefore look at the conditional correlation $\mathrm{Corr}({N}_{s},{N}_{g}| {\widehat{\mu }}_{{N}_{c}}=s)$ with the goal of better understanding the nature of the modes observed in these two densities, and thus also in the density of ${\hat{\mu }}_{{N}_{c}}$.

Figure 8(a) shows the conditional correlation for different values of ${\hat{\mu }}_{{N}_{c}}$ between 0 and 400. Note that even when ${\hat{\mu }}_{{N}_{c}}=s$, the value of the composite Ns + 10Ng for a particular station may be larger (resp. smaller) than s. Our analysis highlights three regimes of activity:

  • Minima: ${\hat{\mu }}_{{N}_{c}}\in [0,11]$. Here, the number of spots and groups oscillates between 0 and 1. As the number of spots equals exactly the number of groups, the correlation is high.
  • Medium activity: ${\hat{\mu }}_{{N}_{c}}\in [12,60]$. The correlation progressively decreases, because the number of spots increases faster than the number of groups and then stabilizes. This regime is characterized by the development of smaller spots without penumbra or with a small penumbra. Figure 8(b) shows the bivariate boxplot of Ns and Ng when ${\hat{\mu }}_{{N}_{c}}=40$. For Ng = 1 or Ng = 2, we observe values of Ns as high as 40. We clearly observe groups containing a large number of spots, as well as groups composed of fewer spots; the latter appear progressively as the penumbra grows and indicate a transition toward groups with fewer but larger spots. The effect of this transition from small to larger spots is observed in Figure 5 of Clette & Lefèvre (2016).
  • High activity: ${\hat{\mu }}_{{N}_{c}}\gt 60$. Figure 8(c) shows the bivariate boxplot of Ns and Ng when ${\hat{\mu }}_{{N}_{c}}=70$. The plot has a potato shape around (Ns = 30, Ng = 4). We now observe all kinds of groups. The correlation between groups and spots slightly increases as the number of groups begins to grow as well.

Figure 8. (a) Conditional correlation of Ns and Ng: $\mathrm{Corr}({N}_{s},{N}_{g}| {\hat{\mu }}_{{N}_{c}}=s)$ for $s\in [0,400]$. Bivariate boxplot (also called "bagplot") of Ns and Ng when (b) ${\hat{\mu }}_{{N}_{c}}=40$ and (c) ${\hat{\mu }}_{{N}_{c}}=70$. The white cross represents the depth median (Rousseeuw et al. 2012). The bag contains 50% of the observations, and it is represented by a polygon in red. The fence (not represented) is obtained by inflating the bag by a factor three. The observations that are outside of the bag but inside of the fence are indicated by a light-gray loop. Outliers are represented by a black star. The correlation is indicated by the orientation of the bag.


The medium and high regimes are reflected in the estimated densities of ${\widehat{\mu }}_{{N}_{s}}$ and ${\widehat{\mu }}_{{N}_{g}}$ in Figure 6. The first mode of ${\hat{\mu }}_{{N}_{g}}$, ranging from 1 to 3, corresponds to the medium regime, while the second mode, ranging from 5 to 8, reflects the high regime. The two distinct regimes provide another justification for the use of a multimodal distribution to characterize the pdf of ${\hat{\mu }}_{{N}_{g}}$. Similarly, there is also a mode in the distribution of ${\widehat{\mu }}_{{N}_{s}}$ around 20–40 that comes from the transition between the medium and the high regime. The mode is correctly represented by a mixture model. The study of the conditional correlation constitutes the first step toward retrieving the distribution of Nc from its composites Ns and Ng. However, this task is challenging and goes beyond the scope of the article because (1) the distributions of Ns and Ng are complex mixtures and (2) the number of spots is nontrivially correlated to the number of groups.
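
The conditional correlation of Figure 8(a) can be computed along the following lines; conditioning on rounded values of the proxy and the minimum number of pairs per bin are our own choices, and the synthetic data are only meant to exercise the function.

```python
import numpy as np

def conditional_correlation(ns, ng, mu_c, min_pairs=30):
    """Pearson correlation of (Ns, Ng) pairs, conditional on the rounded
    value of the composite proxy mu_hat_Nc."""
    out = {}
    for s in np.unique(np.round(mu_c)):
        sel = (np.round(mu_c) == s) & np.isfinite(ns) & np.isfinite(ng)
        if sel.sum() >= min_pairs and ns[sel].std() > 0 and ng[sel].std() > 0:
            out[s] = np.corrcoef(ns[sel], ng[sel])[0, 1]
    return out

# Illustrative synthetic example.
rng = np.random.default_rng(6)
ng = rng.poisson(4, size=20000).astype(float)
ns = rng.poisson(3 * ng + 1).astype(float)       # spots loosely tied to groups
mu_c = ns + 10 * ng
corr = conditional_correlation(ns, ng, mu_c)
print({k: round(v, 2) for k, v in list(corr.items())[:5]})
```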

6. Distribution of Errors

We are now in a position to analyze the error distribution in sunspot counts, that is, to model the distributions of ${\epsilon }_{1}$, ${\epsilon }_{2}$, and ${\epsilon }_{3}$. To do so, we separate minima from nonminima regimes. We also consider two timescales: short-term periods, that is, timescales smaller than one solar rotation (27 days), and long-term periods. Section 6.1 estimates the error at solar minima, i.e., when s(t) = 0. Section 6.2 analyzes the short-term variability of the preprocessed observations when s(t) > 0. For the study of the long-term error in Section 6.3, we use raw data that did not undergo any preprocessing, in order to be able to detect sudden jumps and/or large drifts in the time series. The correct timescale for the long-term period is also determined in this section, based on a statistically driven procedure. Finally, Section 6.4 compares the characteristics of the different stations based on the error analysis.

6.1. Error at Minima

The study of solar minima periods is complex, as the data show a large variability and dichotomy. Observed values of the error at minima, ${\epsilon }_{3}$, are defined as counts made by the stations when the proxy for s(t), defined in Equation (9), is equal to zero:

Equation (15): ${\hat{\epsilon }}_{3}(i,t)={Z}_{i}(t),\quad \text{for}\ t\ \text{such that}\ {\hat{\mu }}_{s}(t)=0,$

where Zi(t) corresponds to Ns, Ng, or Nc, and where the generic ${\widehat{\mu }}_{s}(t)$ has to be replaced by ${\widehat{\mu }}_{{N}_{s}}(t)$ for Ns, ${\widehat{\mu }}_{{N}_{g}}(t)$ for Ng, and ${\widehat{\mu }}_{{N}_{c}}(t)$ for Nc.

A visual inspection of the histogram of Ns (resp. Ng) in the left (resp. middle) panel of Figure 9 shows a large number of "true" zeros together with two modes around one and two. Similar modes occur around 11 and 22 in the distribution of Nc in the right panel of Figure 9, as expected. These modes represent short-duration sunspots. Due to the nonsimultaneity of the observations between stations, the proxy for s(t) might be equal to zero even if some spots appear shortly (from several minutes to several hours) on the Sun. These modes can be represented by a t-location-scale (t-LS) distribution, which is a generalization of the Student t-distribution. This distribution has three parameters to accommodate asymmetry and heavy tails: the location μ, scale σ > 0, and shape ν > 0 (see Taylor & Verbyla 2004; Evans et al. 2000). Its pdf is defined as

Equation (16): $f(x;\mu ,\sigma ,\nu )=\dfrac{{\rm{\Gamma }}\left(\tfrac{\nu +1}{2}\right)}{\sigma \sqrt{\nu \pi }\,{\rm{\Gamma }}\left(\tfrac{\nu }{2}\right)}{\left[1+\dfrac{1}{\nu }{\left(\dfrac{x-\mu }{\sigma }\right)}^{2}\right]}^{-\tfrac{\nu +1}{2}}.$

The large proportion of zero values for ${\hat{\epsilon }}_{3}$ requires the use of a ZA model as in Equation (7). We choose a ZA mixture of t-LS for the complete distribution of ${\hat{\epsilon }}_{3}$. The density outside of zero, f1(x) in Equation (7), is thus identified by such a mixture of t-LS distributions:

Equation (17): ${f}_{1}(x)={w}_{1}\,{f}_{t\mathrm{LS}}(x;{\mu }_{1},{\sigma }_{1},{\nu }_{1})+(1-{w}_{1})\,{f}_{t\mathrm{LS}}(x;{\mu }_{2},{\sigma }_{2},{\nu }_{2}),$

where, as before, w1 is the mixture weight. The histograms and fitted distributions for ${\widehat{\epsilon }}_{3}$ are shown in Figure 9. The visual closeness between the histogram and the fitted distribution was used as a criterion to select the best pdf among a few intuitive candidates, while the parameters of the distribution are estimated via MLE.
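
Because the t-location-scale density of Equation (16) is a Student t density with location and scale parameters, it can be evaluated and fitted directly with scipy, as sketched below; the mixture parameters echo the Ns fit reported in Figure 9, and the fitting example uses synthetic data.

```python
import numpy as np
from scipy import stats

def t_ls_pdf(x, mu, sigma, nu):
    """t-location-scale pdf: Student t with `nu` degrees of freedom,
    shifted by `mu` and scaled by `sigma`."""
    return stats.t.pdf(x, df=nu, loc=mu, scale=sigma)

def two_component_t_ls_pdf(x, w1, p1, p2):
    """Mixture of two t-location-scale densities as in Equation (17)."""
    return w1 * t_ls_pdf(x, *p1) + (1.0 - w1) * t_ls_pdf(x, *p2)

x = np.linspace(0, 5, 6)
# Parameters echoing the Ns fit at minima reported in Figure 9.
print(two_component_t_ls_pdf(x, w1=0.6, p1=(0.91, 0.14, 31.2), p2=(1.85, 0.71, 2.1)))

# Fitting a single t-LS component to data can be done directly with scipy:
sample = stats.t.rvs(df=3, loc=1.0, scale=0.2, size=2000, random_state=7)
nu_hat, mu_hat, sigma_hat = stats.t.fit(sample)
print(round(mu_hat, 2), round(sigma_hat, 2), round(nu_hat, 2))
```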

Figure 9. Truncated histograms of ${\hat{\epsilon }}_{3}$ for Ns (left), Ng (middle), and Nc (right). The solid line shows the fits using a ZA mixture of t-LS distributions, defined in Equation (17). The values of the Bernoulli parameter in Equation (7) are equal to b = 0.9 (left), b = 0.86 (middle), and b = 0.96 (right). They represent the proportion of "true" zeros. The parameter values for the t-LS fit are (left, for Ns) μ1 = 0.91, σ1 = 0.14, ν1 = 31.16, μ2 = 1.85, σ2 = 0.71, ν2 = 2.09, w1 = 0.6; (middle, for Ng) μ1 = 0.89, σ1 = 0.07, ν1 = 6.89, μ2 = 1.75, σ2 = 0.14, ν2 = 1.33, w1 = 0.09; and (right, for Nc) μ1 = 10.24, σ1 = 1.37, ν1 = 3.89, μ2 = 20.57, σ2 = 2.33, ν2 = 1.93, w1 = 0.08. The bin width (bw = 0.0917) is the same for the histograms of both Ns and Ng. It is related to the sample size and the data range of Ns by a simple rule proposed by Scott (1979). The bin width of the histogram of Nc (right) is equal to bw = 0.4192 and is also computed by Scott's rule. Note that the right panel is enlarged: the value at zero is 0.96 and not 0.03.


In Figure 9, where the error at minima is represented for all stations combined, outliers defined as ${\hat{\epsilon }}_{3}(i,t)\gt 2$ are not visible for Ng and Ns. A separate analysis (not presented here) shows that the percentage of outliers in each station is low (below 0.5% for Ns). Some stations also recorded a high maximal value at minima (e.g., a value of 35 was recorded in QU [Quezon, Philippines] for Ns). Such extreme values at minima may correspond to transcription errors; in the future, they could be checked before being encoded in the SILSO database.

6.2. Short-term Variability

When the proxy for s(t), defined in Equation (9), is different from zero, the short-term error $\widetilde{\epsilon }$ may be estimated using

Equation (18): $\hat{\widetilde{\epsilon }}(i,t)=\dfrac{{Z}_{i}(t)}{{\hat{\mu }}_{s}(t)},\quad \text{for}\ t\ \text{such that}\ {\hat{\mu }}_{s}(t)\gt 0.$

To select the best distribution, we proceed as follows. Different densities are fitted to the values of $\widehat{\widetilde{\epsilon }}$, outside of zero, using MLE. Then, the Akaike information criterion (AIC) is used to choose the best pdf, which in this case is a t-LS distribution.

As we observe an excess of zeros, we need a ZA t-LS distribution to represent the complete distribution of $\widehat{\widetilde{\epsilon }}$. Figure 10 shows the histogram and the fitted pdf of $\widehat{\widetilde{\epsilon }}$ outside of zero. For the latter, the mean is close to 1, indicating that on average the stations are aligned with ${\hat{\mu }}_{s}(t)$. The histogram exhibits a probability mass at zero representative of "false" zeros, that is, of stations that do not observe any sunspot when there are actually some on the Sun. The histogram also shows a tail on the right-hand side, caused by outliers. This asymmetry requires a t-LS rather than a Gaussian distribution to be fitted.
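
Model selection by AIC can be sketched as follows: a few candidate scipy distributions are fitted by MLE to the (nonzero) short-term errors and ranked by AIC. The candidate list and the synthetic heavy-tailed sample are illustrative.

```python
import numpy as np
from scipy import stats

def best_fit_by_aic(data, candidates=(stats.t, stats.norm, stats.gamma)):
    """Fit each candidate distribution by MLE and rank them by AIC."""
    results = []
    for dist in candidates:
        params = dist.fit(data)                        # MLE fit
        loglik = np.sum(dist.logpdf(data, *params))
        aic = 2 * len(params) - 2 * loglik
        results.append((aic, dist.name, params))
    return sorted(results)                             # smallest AIC first

# Illustrative: heavy-tailed "short-term error" sample centered near 1.
eps = stats.t.rvs(df=2.5, loc=1.0, scale=0.2, size=5000, random_state=8)
for aic, name, _ in best_fit_by_aic(eps):
    print(f"{name:6s} AIC = {aic:.1f}")
```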

Figure 10. Histograms of $\hat{\widetilde{\epsilon }}$ for Ns (left), Ng (middle), and Nc (right). The solid line shows the fits using a t-LS distribution defined in Equation (16). The values of the Bernoulli parameter in Equation (7) are equal to b = 0.04 (left), b = 0.02 (middle), and b = 0.06 (right). They represent the proportion of false "zeros," i.e., stations reporting no sunspot where there are some. The parameter values for the t-LS fit are (left, for Ns) μ = 1, σ = 0.26, ν = 2.8; (middle, for Ng) μ = 0.99, σ = 0.16, ν = 2.33; and (right, for Nc) μ = 1.01, σ = 0.17, ν = 2.12. The bin widths (bw) of the histograms are computed using Scott's rule. For Ns and Ng they are the same (bw = 0.0328 ) and for Nc it is equal to bw = 0.0433.


The violin plots of four different stations are shown in Figure 11 for the number of spots Ns, for which the differences between the stations are the most visible. The mean of the Locarno station (LO), the current reference of the network, is slightly higher than the three other means (and higher than the means of all other stations), at around 1.19. This results from its particular way of counting: large spots (with penumbra) are weighted more heavily than small spots without penumbra.

Figure 11. Truncated violin plots of the estimated short-term variability $\widehat{\widetilde{\epsilon }}$ for Ns in four stations (FU, LO, SM, and UC). A violin plot (Hintze & Nelson 1998) combines a vertical boxplot with a smoothed histogram represented symmetrically to the left and right of the box. The white dot in the center of the violin locates the mean of the distribution. The thick gray bar shows the interquartile range, and the thin gray bar depicts the interdecile range. The bin width (bw = 0.05) is the same for all stations and is computed with Scott's rule.


Another characteristic feature is how the error is distributed around the mean. A violin plot may be seen as a pdf with the x-axis of the density drawn along the vertical line of the boxplot. For example, the pdf of the short-term error of LO is concentrated around the mean, but the entire distribution is shifted upward, unlike the pdf of Uccle (UC), which has much lower values. UC is a professional observatory. Different observers record the number of spots, groups, and the composite from one week to the next. As their experience and methodology differ slightly, the rotation of observers probably increases the short-term variability of the station. A team of observers usually experiences more variability than a single person, such as FU (Fujimori, Japan), a station with remarkable short-term stability.

Similarly, the San Miguel (SM) station shows the typical shape of a professional observatory. On the other hand, the LO station shows an $\widehat{\widetilde{\epsilon }}$ distribution almost characteristic of a single observer, because until recently LO had one dominant main observer.

6.3. Long-term Variability

A generic estimator for the long-term error ${\epsilon }_{2}(i,t)$ may be defined by

Equation (19): ${\hat{\mu }}_{2}(i,t)=\dfrac{{Y}_{i}(t)}{{M}_{t}}\,\star \,\mathrm{MA}(L),$

where the ⋆ denotes the smoothing process, Yi(t) are the raw observations, and ${M}_{t}=\mathop{\mathrm{med}}\limits_{1\leqslant i\leqslant N}{Z}_{i}(t)$ is the median of the network. The T transform from Equation (9) is not required here, as we apply a moving average (MA) of length L defined below.

This length L should be larger than what is considered as short-term, that is, periods shorter than one solar rotation (27 days). Long-term, on the other hand, is usually defined as periods above 81 days (Dudok de Wit 2011), beyond which the effects of the solar rotation and of the sunspots' lifetimes are negligible. The midterm temporal regime corresponds to periods between 27 and 81 days. To select the long-term scale for a given station i, we make the assumption that, for all t belonging to a window of length L, we have

Equation (20): ${\mu }_{2}(i,t)={C}_{i},$

where Ci is a constant, which might differ from 1. Having Ci = 1 means that the station i is at the same level as the median of the network. We test different lengths L (L > 27 days) for the MA window and select the long-term regime as the shortest length for which the above assumption is valid. We thus consider the ${\hat{\mu }}_{2}(i,t)$ values of Equation (19) in sliding windows of length L over the total period (1947–2013). We apply a nonparametric equivalent of the t-test (the Wilcoxon rank sum test; Bridge & Sawilowsky 1999) to the ${\hat{\mu }}_{2}(i,t)$ values to test whether Equation (20) is verified within each window. Longer windows thus correspond to a stronger smoothing but also contain more values to test. As a result of this procedure, we define the long-term regime as all scales above 81 days, as we found this to be the shortest length such that the constant assumption on μ2(i, t) is not violated more than roughly 10% of the time. This ties in with what solar physicists consider as the long-term regime.
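
The sketch below combines the two ingredients of this subsection: a centered moving average of the station-to-median ratio as in Equation (19), and a rank-based check of the constancy assumption (20) in sliding windows. Since the exact way the test is applied within each window is not detailed here, comparing the two halves of each window is only one plausible reading, and the synthetic drifting station is illustrative.

```python
import numpy as np
from scipy import stats

def moving_average(x, L):
    """Centered moving average of length L that ignores NaN values."""
    kernel = np.ones(L) / L
    valid = np.isfinite(x).astype(float)
    num = np.convolve(np.nan_to_num(x), kernel, mode="same")
    den = np.convolve(valid, kernel, mode="same")
    return np.where(den > 0.5, num / np.maximum(den, 1e-12), np.nan)

def constancy_violation_rate(ratio, L=81, step=27, alpha=0.05):
    """Share of sliding windows of length L in which the two halves of the
    daily ratio differ significantly (Wilcoxon rank-sum test)."""
    rejections, total = 0, 0
    for start in range(0, ratio.size - L, step):
        w = ratio[start:start + L]
        w = w[np.isfinite(w)]
        if w.size < 20:
            continue
        half = w.size // 2
        _, pval = stats.ranksums(w[:half], w[half:])
        total += 1
        rejections += pval < alpha
    return rejections / max(total, 1)

# Illustrative synthetic station drifting slowly away from the network median.
rng = np.random.default_rng(9)
M = rng.poisson(60, size=3000).astype(float)                 # network median
Y_i = M * (1.0 + 1e-4 * np.arange(3000)) + rng.normal(0, 5, 3000)
ratio = np.where(M > 0, Y_i / M, np.nan)
mu2_hat = moving_average(ratio, L=365)                       # Equation (19), 1 yr MA
print(round(constancy_violation_rate(ratio, L=81), 2))
```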

Depending on our interest in detecting long-term drifts or jumps, different window lengths may be chosen in Equation (19) (some well above 81 days). Indeed, drifts require long smoothing periods (several months, or even years) to be observed, whereas jumps might be oversmoothed by such long smoothing and hence need a smaller MA window.

Figures 12(a)–(d) represent the long-term drifts associated with Ns in four stations starting from 1960. Figures 12(e)–(h) show the scaling factors κi(t2)s for the same stations used at short-term and minima regimes. We do not represent years before 1960 because FU and SM show too few observations in that period. FU and UC appear relatively stable, unlike stations LO and SM, which display severe drifts. Bias in the counting process is also larger during solar minima when there are short-duration sunspots. This effect is clearly visible in LO. Indeed, erroneous encoding of counts leads to much higher relative errors during minima than during the remaining part of the solar cycle. Some jumps are also visible on the graphs with the smallest MA length (81 days) in green. This scale is more appropriate to observe the jumps, while longer scales only highlight the long-term drifts of the stations.

Figure 12.

Figure 12. (a–d) Estimation of ${\hat{\mu }}_{2}(i,t)$ for Ns in four stations (SM, FU, LO, and UC). ${\hat{\mu }}_{2}(i,t)$ is computed with different MA window lengths: 81 days (green dotted line), 1 yr (red dashed line), and 2.5 yr (blue solid line). (e–h) Estimation of the scaling factors for Ns in the same stations. The κi(t2)s, with 1 ≤ t2 ≤ T/τ, are computed using the $\mathrm{OLS}({{\boldsymbol{Y}}}_{i,{t}_{2}}\,|\,{{\boldsymbol{X}}}_{\cdot ,{t}_{2}})$ regression of Equation (8) on blocks of τ = 81 days (green dotted line), 1 yr (red dashed line), and 2.5 yr (blue solid line). The solar cycle for Ns is represented in black at the bottom of the panels.


We emphasize here the strong link between the preprocessing and the long-term analysis. The scaling factors presented in Section 4 are indeed a rough estimate of the long-term error, inspired by the historical procedure of J. R. Wolf. This rough estimate is required to rescale the stations to the same level, and this rescaling is used to compute the median Mt of Equation (19). Contrary to the piecewise-constant κis computed in Section 4, the ${\hat{\mu }}_{2}(i,t)$s are smooth over time and hence better suited to a future monitoring of the stations.

6.4. Comparing Stations with Respect to Their Stability

In the previous sections, we presented separately the estimations of the short-term error $\hat{\widetilde{\epsilon }}(i,t)$, the long-term error ${\hat{\mu }}_{2}(i,t)$, and the error at minima ${\hat{\epsilon }}_{3}(i,t)$. All three types of errors are needed to assess the quality and stability of a station. It is more important for a station to have a low variability (a small interquantile range) than to be aligned with the mean of the network; indeed, as seen in Section 4, rescaling a station onto the mean of the network is easy.

Figure 13 provides a visual comparison of the long-term and short-term errors of each station. It shows the long-term versus short-term empirical interquantile range on a 2D plot and thus characterizes the stability of the stations outside of minima. Stations in red are teams of observers; they usually experience more short-term variability than an individual. MO (Mochizuki, Japan), FU, and KOm (Koyama, Japan) show low variability in both the short and the long term. They correspond to long-standing individual observers with stable observation practices. On the other hand, the LO station shows poor long-term stability, while its short-term variability is remarkably low for a professional observatory; as mentioned earlier, this is because it has one main observer. UC shows a large short-term variability (due to its many observers) but an interesting long-term stability, as already noticed in Figure 11. SM experiences the most severe long-term variability of the network; it also has a large short-term variability, characteristic of a team of observers. QU shows a large short-term variability together with a low long-term variability. Although it appears to be a single observer, the station moved from one place to another during the observing period, possibly with a change of instrument, which would impact the short-term variability. This surprising behavior will prompt SILSO to ask for more metadata.

Figure 13.

Figure 13. Scatter plot showing the interquantile range of the estimated short-term error $\hat{\widetilde{\epsilon }}(i,t)$ and the interquantile range of the estimated long-term error ${\hat{\mu }}_{2}(i,t)$, station by station. Stations in red represent the teams of observers; the others are single observers.

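A plot in the spirit of Figure 13 could be produced as follows, assuming eps_short and mu2_long are dictionaries mapping station codes to their daily short-term and long-term error series; the 25%–75% quantile levels are an assumption, since the text only refers to an interquantile range.

import numpy as np
import matplotlib.pyplot as plt

def interquantile_range(x, lo=25, hi=75):
    q_lo, q_hi = np.nanpercentile(np.asarray(x, dtype=float), [lo, hi])
    return q_hi - q_lo

def stability_plot(eps_short, mu2_long, teams=()):
    """Scatter the long-term vs. short-term interquantile range, station by station."""
    fig, ax = plt.subplots()
    for code in eps_short:
        x = interquantile_range(eps_short[code])
        y = interquantile_range(mu2_long[code])
        ax.scatter(x, y, color="red" if code in teams else "black")
        ax.annotate(code, (x, y))                     # label each point with the station code
    ax.set_xlabel("short-term interquantile range")
    ax.set_ylabel("long-term interquantile range")
    return fig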

7. Conclusion and Future Prospects

In this article, we propose the first comprehensive uncertainty model in a multiplicative framework for the counts of spots, groups, and the composite on the Sun. Our approach is robust to missing values and was applied to 66 yr of data (1947–2013). We presented several parametric models for the density of the "true" Ns, Ng, and Nc, as well as for the density of their error distributions at minima, in the short term, and in the long term. This error quantification allows us to propose a first classification of the 21 stations of our pool based on their stability. It shows that the observatories are affected differently by the various types of errors: some are stable with respect to the network in the short term but experience large drifts, and vice versa. The analysis highlights the hazards of using a single pilot station as the unique reference of the network.

We intend to use the error models presented in Section 6 for a parametric monitoring of all stations of the network, with a particular focus on new stations. Data from newborn observatories can be recorded for several months. Their distributions may then be compared to the density of the short-term error (or of the error at minima, if the period is a minimum) of the entire network obtained in this paper. If the new stations experience similar errors, they may be included in the network; otherwise, they might need to improve or correct their observing procedure before entering the SILSO network.
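The paper does not prescribe a particular test for this screening; one minimal option, sketched here under that assumption, is a two-sample Kolmogorov–Smirnov comparison between the candidate station's error sample and the pooled network sample, at an illustrative 5% level.

from scipy.stats import ks_2samp

def compatible_with_network(eps_candidate, eps_network, alpha=0.05):
    """True if the candidate's error distribution is statistically compatible
    with the network's error distribution (no rejection at level alpha)."""
    statistic, pvalue = ks_2samp(eps_candidate, eps_network)
    return pvalue >= alpha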

A nonparametric monitoring that aims to detect the long-term drifts of the stations in the network in quasi-real time is also under development. An example of a classical monitoring procedure is the CUSUM chart (Koshti 2011), frequently used to control production quality in industry. The chart accumulates in a statistic the deviations of the mean value above a reference level. If the statistic exceeds a predefined threshold, which depends on the standard deviation of the process, the process is declared out of control and an alert is raised. This simple method, based on the first two moments of the distribution, is obviously not adequate for controlling heavily skewed variables such as the number of spots. More complex methods, strongly dependent on the models of the data, need to be developed.
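For reference, a one-sided CUSUM recursion of the kind mentioned above can be written in a few lines; mu0, sigma, the allowance k, and the threshold h are generic control-chart parameters, and, as stressed in the text, this two-moment scheme is only a baseline that is not adequate for heavily skewed counts.

import numpy as np

def cusum_upper(x, mu0, sigma, k=0.5, h=5.0):
    """Return the times at which the upper CUSUM statistic exceeds h * sigma."""
    s, alarms = 0.0, []
    for t, value in enumerate(np.asarray(x, dtype=float)):
        s = max(0.0, s + (value - mu0) - k * sigma)   # accumulate deviations above mu0
        if s > h * sigma:
            alarms.append(t)                          # signal an out-of-control state
            s = 0.0                                   # restart the statistic after the alert
    return alarms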

The work presented here enhances our comprehension of the ISN and its error. It is part of a larger project that aims at improving the quality of the ISN. We started this project a few years ago, when a revised version of the ISN was published (Clette & Lefèvre 2016), and we will pursue it with the future monitoring of the stations, in order to provide the quality-control procedure that the ISN still lacks. As the ISN is used as a benchmark in several fields of astrophysics and space physics, this is a much-needed task.

This work benefited from enlightening discussions with T. Dudok de Wit. The first author gratefully acknowledges funding from the Belgian Federal Science Policy Office (BELSPO) through the BRAIN VAL-U-SUN project (BR/165/A3/VAL-U-SUN). We also thank the International Space Science Institute (ISSI-Bern) and the members of the "Recalibration of the sunspot Number Series" team for their support.

Appendix: Timescales of the Preprocessing

This appendix details the statistical procedure that selects the timescales of the preprocessing described in Section 4. It is composed of three steps. First, the daily scaling factors are computed using

$\hat{\kappa }_{i}(({t}_{1},{t}_{2}))=\dfrac{{Y}_{i}(({t}_{1},{t}_{2}))}{{M}_{({t}_{1},{t}_{2})}},$     (21)

where, as in Section 4, we rewrite the time index as a pair of indices 1 ≤ t1 ≤ 30 and 1 ≤ t2 ≤ T/30, corresponding, respectively, to the day and to the month of the observation.

Second, the nonparametric KW test (Kruskal & Wallis 1952) is applied to blocks of 30 factors, since the "k-coefficients" of Equation (2) are currently estimated on a monthly basis at the WDC-SILSO. Let ${{\boldsymbol{\kappa }}}_{i,{t}_{2}}={[{\kappa }_{i}(({t}_{1},{t}_{2}))]}_{1\leqslant {t}_{1}\leqslant 30}$ denote the vector of daily factors over 1 month. The test assesses whether the ${{\boldsymbol{\kappa }}}_{i,{t}_{2}}$s of consecutive months are statistically different. The procedure starts by comparing the distribution of the first month of the period studied, ${{\boldsymbol{\kappa }}}_{i,1}$, to the distribution of the second month, ${{\boldsymbol{\kappa }}}_{i,2}$. If the test shows that the two distributions are significantly different, the next two distributions, ${{\boldsymbol{\kappa }}}_{i,2}$ and ${{\boldsymbol{\kappa }}}_{i,3}$, are tested. Otherwise, the distribution of the first two months, $[{{\boldsymbol{\kappa }}}_{i,1}$ ${{\boldsymbol{\kappa }}}_{i,2}]$, is compared to the distribution of the third month, ${{\boldsymbol{\kappa }}}_{i,3}$. The algorithm iterates until the end of the period, for each station. Note that the KW test performs well when comparing two or more independent samples of unequal sizes, but the correlation of the data is neglected in this procedure: despite the presence of correlation between consecutive days, the correlation between consecutive months is low. The test thus provides a station-specific segmentation, shown in Figure 14 for Ns. The length of the segments indicates the number of consecutive blocks of scaling factors that come from the same distribution; we assume that the factors are constant within each segment.

Figure 14.

Figure 14. Bar chart representing the results of the KW test applied to the scaling factors for Ns. The x-axis represents the stations indexed from 1 to 21, and the y-axis shows the total period studied expressed in months (1 unit ≈ 30 days). The y-axis is not ordered in time, for readability purposes, but it is ordered with respect to the length of the segments. The colors of the chart correspond to the number of blocks that may be grouped into a single factor ("κ = 5" means that a single scaling factor may be computed for a period of 5 months).

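The segmentation step can be summarized by the following sketch, which assumes kappa_months is the list of monthly vectors ${{\boldsymbol{\kappa }}}_{i,{t}_{2}}$ (30 daily factors each) for one station and uses an illustrative 5% level for the KW test.

from scipy.stats import kruskal

def segment_scaling_factors(kappa_months, alpha=0.05):
    """Group consecutive months whose daily scaling factors share the same distribution;
    return the lengths (in months) of the resulting segments."""
    segments, current = [], [list(kappa_months[0])]
    for month in kappa_months[1:]:
        pooled = [value for block in current for value in block]
        statistic, pvalue = kruskal(pooled, list(month))
        if pvalue < alpha:            # distributions differ: close the current segment
            segments.append(len(current))
            current = [list(month)]
        else:                         # same distribution: extend the current segment
            current.append(list(month))
    segments.append(len(current))
    return segments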

In the last step, the global timescales for each index are defined from the segmentations of the individual stations. The length of the most frequent segment is first selected for each station. A global scale is then estimated, for Ns, Ng, and Nc, as the median across stations of these most frequent lengths.
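Under the assumption that the segment lengths of each station are available as lists of months (e.g., the output of the sketch above), this last aggregation step could look as follows; the names are illustrative.

import numpy as np
from collections import Counter

def global_timescale(segments_by_station):
    """Median, across stations, of each station's most frequent segment length (in months)."""
    modal_lengths = [Counter(lengths).most_common(1)[0][0]
                     for lengths in segments_by_station.values()]
    return float(np.median(modal_lengths))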
