Introduction

Synthesis reports like the Global Carbon Budget characterise changes in the ocean carbon sink by analysing the annual air–sea carbon flux1. However, year-to-year changes or the interannual variability of carbon fluxes may be driven by seasonal processes that are neglected in the annual average2. For most parts of the global ocean, the seasonal cycle is the most prominent mode of variability in air–sea carbon exchange3. As a result, seasonal mechanisms and their link to interannual variability have received considerable attention4,5,6,7. Despite this, inferring the interannual variability of the ocean carbon sink remains a major scientific challenge8. Future emissions will lead to seasonally divergent signals9,10, hence understanding long-term temporal changes of the ocean carbon sink hinges on resolving short-term seasonal signals that drive their variability4,11.

Seasonal changes in the carbon sink are largely driven by internal processes, such as a shift in temperature-dependent solubility or biological processes that reduce the mean partial pressure of carbon dioxide (pCO2) in the ocean and increase the amplitude of the annual cycle12. In contrast, interannual changes in the carbon sink are driven by external forcing, influenced by climate phenomena like El Niño Southern Oscillation, or in response to increasing atmospheric carbon concentrations13. On decadal timescales, feedbacks rein in the interannual variability so that its magnitude is much smaller than that of seasonal variability. Assessing the weaker interannual variability signal requires long-term measurements at the same location and season, an observational constraint seldom satisfied in the global ocean. Consequently, a combination of models and observation-based methods have been employed to understand the mechanisms that connect changes in the seasonal cycle to interannual variability and to acquire reliable state estimates of the oceanic carbon inventory.

Earth System Models (ESM) are internally consistent tools well suited to represent global annual carbon fluxes and their long-term trends. However, there remains a significant mismatch between ESMs and observation-based data on seasonal timescales and global trends14,15. Further, considerable differences exist even among models of the same generation and complexity in regional carbon flux estimates, with disagreements on the phasing of the seasonal cycle in the high latitudes11,16.

However, both models and observation-based products face limitations14,17,18,19. On the one hand, global ocean biogeochemical models that are forced by the atmosphere have been historically used to close the Global Carbon Budget14,20, but limited observational data makes it challenging to test and potentially reduce model biases and uncertainty when ocean and atmosphere models are coupled. On the other hand, observation-based carbon flux reconstructions rely on sparse measurements of pCO2 at the air–sea interface and some form of interpolation technique to statistically fill spatial and temporal gaps21,22,23,24,25,26. As a result, despite being measurement-driven and sourced from the same observational database, there is a lack of consistency across observation-based products14,18,27, and measurement uncertainties may propagate to the individual products.

In this study, we investigate the seasonality of carbon fluxes in context with their respective annual fluxes over three decades (1990–2020) for which observational constraints are available. We contrast two tools used to assess carbon fluxes—an ESM large ensemble—the Max Planck Institute Grand Ensemble (MPI-GE, n = 100) and an observation-based CO2 flux ensemble (using products from SOCOM (n = 6). Details in Data and Methods). The mean of the model ensemble represents the forced signal across members, and the spread between the individual members represents its internal variations. In contrast to the model, the differences between the observation-based products do not stem from differences in internal (or climatic) variations but rather from measurement uncertainties and random errors related to the extrapolation method. All observation-based products rely on significant averaging or interpolation using satellite-derived observations of sea surface temperature (SST), chlorophyll (Chl a) or climatologies of mixed layer depth (MLD) to make basin-scale estimates. Additionally, differences in the choice of the kinetic gas transfer formulation across the air–sea interface and the wind product contribute to the spread, though previous studies have shown that harmonising the pCO2 product ensemble using a common gas transfer formulation can significantly decrease the spread28. Here, we follow the approach of the Global Carbon Budget1 and consider the disharmonised gas transfer as part of the ensemble spread, and hence the overall uncertainty. Regardless of the phasing of internal (or climatic) variations, we treat both ensembles equally, not differentiating between them and how they were built. Contrasting the model and observation-based ensembles, we investigate how seasonal and annual carbon fluxes compare and ascertain if a link exists between seasonal variability that helps diagnose the interannual variability of air–sea carbon fluxes in either ensemble.

We focus on the North Atlantic basin and the Southern Ocean—two distinct ocean regions that show considerable carbon flux variability on seasonal through interannual timescales but differ in the level of convergence between model and observation-based data18,29,30,31. The first region, the North Atlantic basin (10–90°W, 15–60°N), has the most intense carbon sink per unit area32. It has historically been one of the most regularly observed ocean regions with year-round coverage and nearly equal distributions of seasonal measurements since the Ship of Opportunity programme started in the 1990s32,33. The observed variability in carbon uptake on interannual timescales and the mechanisms driving this variability can be explained across a broad range of ESMs so that there is overall good agreement between models and observation-based products on seasonal timescales34.

Competing mechanisms direct the phasing of the seasonal cycle of the carbon flux in the North Atlantic. The flux signal can be deconstructed into thermal and nonthermal components, allowing for a more mechanistic description of key driving processes. Temperature variations have a known effect (approximately a 4% increase per unit temperature increase) on the seasonal cycle of the carbon flux35. The onset of phytoplankton blooms in boreal spring as part of the ocean’s biological pump and the deepening of the mixed layer in boreal winter dampen or counteract thermal-driven variations. Externally forced thermal variations largely determine the subtropical seasonal cycle, while nonthermal variations drive the subpolar seasonal cycle3,10,31,36. As a result, the subtropical and subpolar North Atlantic have distinctly different seasonal patterns that are out of phase over the year. In the North Atlantic, particularly in the subpolar region, the North Atlantic Oscillation is considered the dominant mode of variability influencing SST patterns37; however, it is suggested that it only accounts for roughly 30% of the variability38.

The second region, the Southern Ocean (0–360°, 35–65°S), given its vast areal domain, accounts for 40% of the global carbon uptake30,39,40,41. It remains under-sampled to surface ocean pCO2 measurements and satellite-derived observations that suffer from seasonal cloud coverage, e.g., Chl (used to upscale pCO2 measurements) and wind (used to estimate the gas transfer velocity), that are leveraged to estimate the carbon flux. Except for the Drake Passage, where sea surface pCO2 is sampled in four locations parallel to the mean flow of the Antarctic Circumpolar Current42, year-round observations of pCO2 in the Southern Ocean are sparse23. Due to its remoteness and harsh weather conditions, observations remain limited to a few months of the year, mainly in the austral summer43. Additionally, in austral winter, when wind speeds and air–sea gas exchange rates are highest, first estimates of the carbon flux from biogeochemical Argo floats reveal a seasonal discrepancy compared to shipboard-based fluxes, which are yet to be resolved44,45,46.

In the Southern Ocean, interannual variability of carbon fluxes has been linked to the decadal variability of the Southern Annular Mode, changes in which have been used to explain the weakening of the carbon sink in the 1990s and subsequent reinvigoration in the 2000s5. Seasonal dynamics in the Southern Ocean are complex with contrasting extremes—with deep mixing and entrainment processes characterising winter fluxes with carbon outgassing into the atmosphere and net primary productivity characterising summer fluxes with carbon uptake by the ocean4. However, the Southern Ocean is meridionally heterogeneous in space, with large-scale drivers including wind stress, surface heating, and mesoscale ocean dynamics influencing carbon exchange. As a result, the physical processes and drivers contributing to the variability of the carbon flux in the Southern Ocean are still debated47,48, and their representation in models remains weakly constrained16.

Previous studies have demonstrated that the Max Planck Institute ESM (MPI-ESM) is in limited agreement with other ESMs on the drivers of the seasonal cycle of carbon fluxes in the Southern Ocean15,49. Most other models part of the fifth phase of the Coupled Model Intercomparison Project (CMIP550) show an exaggeration of the seasonal rates of change of SST in austral autumn and spring, which modifies the solubility of pCO2 and tips the control of the seasonal cycle toward SST. MPI-ESM compensates for the solubility bias with overly exaggerated primary productivity. Therefore, biologically driven changes in dissolved inorganic carbon (DIC) regulate the seasonal cycle of the flux49, leading to large discrepancies in the magnitude of seasonal fluxes between MPI-ESM and observation-based products. For this reason, the MPI-ESM is perceived as anomalous in the Southern Ocean. Nevertheless, MPI-ESM shows a coherent seasonal cycle and thus allows us to ascertain whether seasonal variability can help diagnose interannual variability in our study regions or if seasonal discrepancies between the two ensembles, if present, corrupt the signal.

By contrasting a model ensemble and an observation-based ensemble in two regions that span the range of observational coverage and process understanding, our study tests the credibility of model simulations, where across ocean basins, seasonal fluxes are inconsistently represented29,49. Further, we re-evaluate the robustness of seasonal observation-based data, which, so far, has been considered well-constrained16,17.

Results

How do carbon fluxes compare between ensembles?

In the North Atlantic basin and the Southern Ocean, annual carbon fluxes in the model ensemble lie within the range of values in the observation-based ensemble. However, discrepancies between the two ensembles are one to two times greater for the seasonal fluxes than the annual fluxes in the North Atlantic basin and three to four times greater in the Southern Ocean (Figs. 1, 2).

Fig. 1: Air-sea carbon fluxes in the North Atlantic basin (10–90°W, 15–60°N).
figure 1

Annual and seasonal (Pg C·year−1) carbon fluxes averaged over 1990–2020 in the observation-based (ae) and model (fj) ensemble mean with the corresponding kernel density estimation plots in the observation-based (red) and model (black) ensembles (ko). The red shadings represent individual products in the observation-based ensemble. Background colours indicate uptake by the ocean (blue) and outgassing into the atmosphere (orange).

Fig. 2: Air-sea carbon fluxes in the Southern Ocean (0–360°, 35–65°S).
figure 2

Annual and seasonal (Pg C·year−1) carbon fluxes averaged over 1990–2020 in the observation-based (ae) and model (fj) ensemble mean with the corresponding kernel density estimation plots in the observation-based (red) and model (black) ensembles (ko). The red shadings represent individual products in the observation-based ensemble. Background colours indicate uptake by the ocean (blue) and outgassing into the atmosphere (orange).

For the North Atlantic basin, the model and observation-based ensembles generally agree on carbon uptake and outgassing regions and the seasonality of fluxes. This agreement indicates that the model captures key processes despite differences in magnitude and spatial inconsistencies in their representation of seasonal fluxes (Fig. 1a–j). In the observation-based ensemble, the subpolar North Atlantic supports a robust carbon sink in all seasons except winter, where deep convection can lead to outgassing (Fig. 1b–e). In spring and summer, phytoplankton blooms boost the intense carbon sink51, atop a background state of surface waters travelling poleward that cool and subduct, creating a conduit for carbon to invade the ocean’s interior16. The model ensemble fails to capture the outgassing associated with deep wintertime convection but represents the subpolar North Atlantic’s carbon sink well in other seasons and the annual mean (Fig. 1g–j).

In the observation-based ensemble, the subtropical North Atlantic highlights a contemporary carbon sink region in response to increasing atmospheric carbon forcing31 (Fig. 1a). In contrast, the model ensemble replicates the effect of temperature variations that drive the seasonality in the subtropical North Atlantic but shows larger outgassing than shown by the observation-based ensemble. Seasonal biological production has a much more substantial impact in the subtropics than most models indicate, which is widely thought to cause the lower seasonal amplitude in observations versus models52 (Fig. 1f).

Integrated over the subpolar and subtropical regions of the North Atlantic basin, the model ensemble simulates enhanced uptake in half the year (boreal winter and spring) and enhanced outgassing in the other half (boreal summer and autumn) compared to the observation-based ensemble (Fig. 1k–o). The observation-based ensemble shows uptake in summer and autumn, so not only does the model ensemble show enhanced outgassing, but it also misses the direction of the net flux. The regional and seasonal discrepancies between the ensembles compensate in the annual mean, as evidenced by the density distributions of the annual flux, where the annual flux values of the model ensemble are within the distribution of the observation-based ensemble (Fig. 1k). However, at a seasonal timescale, the distributions of the fluxes barely overlap, except in boreal winter. Further, it is noteworthy that agreement among the observation-based products is limited on an annual and seasonal timescale for the North Atlantic basin despite pooling from the same observational dataset (SOCAT39) and being a consistently well-observed ocean region. The disagreement in the observation-based products is consistent with larger differences observed in higher latitudes of the North Atlantic, likely linked to the complex interplay of physical and biological processes, which can be highly variable in the North Atlantic subregions53, in addition to the effect of diverse choices of observational driver data used in the individual products.

For the Southern Ocean, integrated annual fluxes are close between the model and observation-based ensembles. Still, they disagree substantially on the magnitude and spatial structure of seasonal fluxes (Fig. 2a, f). While the kinetic transfer across the air–sea interface contributes to the mismatch between the two ensembles, we find that a large part of the observed difference in the air–sea carbon exchange stems from their different sea surface pCO2 seasonalities (Supplementary Fig. 1). In the observation-based ensemble, the Southern Ocean is a consistent and moderate CO2 sink in all seasons (Fig. 2b–e), with outgassing focused south of the polar front, where upwelling of old DIC-rich waters raises the surface ocean pCO2, except in the summer when biological productivity linked to the retreat of marginal ice favours carbon uptake. Contrarily, the model ensemble has stronger seasonal signals, simulating a very strong sink for half the year (austral spring and summer) and a moderate source in the other half (austral autumn and winter) (Fig. 2g–j). Previous studies link the spring and summer seasonal discrepancies in MPI-ESM with its deep entrainment bias, favouring biological production and enhancing uptake16.

Like the North Atlantic basin, discrepancies between the ensembles in the Southern Ocean compensate in the annual mean, despite no consistency between the model and observation-based ensembles on seasonal timescales (Fig. 2k–o). While there is a lack of skill in representing key processes that remain poorly understood and biased in the model, the incongruence between the ensembles in the Southern Ocean can also be attributed to the limitations of the observation-based ensemble. In particular, the seemingly strong agreement among the observation-based products could result from limited temporally biased data that reinforce weak signals over periods without observations. Therefore, even though the annual flux might suggest coherence between the model and observation-based ensembles, they compare poorly on the seasonal timescale, necessitating cautious assessment and interpretation of the annual carbon flux in the Southern Ocean in both the model and observation-based products.

Linking seasonal and interannual carbon flux variability

In a large model ensemble like the MPI-GE, the ensemble mean can be used to study the forced response54. However, the mean of the ensemble is limited in quantifying variability since averaging across the ensemble space suppresses the internal variability. However, each model ensemble member in MPI-GE has different variability since it is initialised with distinct starting conditions. We leverage this feature of the grand ensemble members (n = 100) to explore the distribution of variability across seasonal and interannual timescales (Fig. 3).

Fig. 3: Interannual variability (IAV) of air-sea carbon fluxes.
figure 3

IAV [Pg C·year−1] of annual and seasonal carbon fluxes in observation-based products (red dots) overlaid on a box plot of IAV in model members for (a) the North Atlantic basin and (b) the Southern Ocean. The centre line, edges, and whiskers of the box are the median, upper and lower quartiles, and the range of values in the model ensemble space. The black crosses represent the IAV of the model ensemble mean, which is lower than that of the members since averaging dampens variability, and the large red circle is the IAV of the observation-based ensemble mean.

Boreal winter and spring have the largest interannual variability in the North Atlantic basin in both ensembles and the largest range across individual members (Fig. 3a). The underestimation of variability in model members compared to observation-based products is not a new finding55 and is possibly due to the representation of wintertime convective mixing and biological productivity in the subpolar region. The variability in both ensembles in boreal summer and autumn is low and is dominated by temperature-driven carbon exchange in the subtropical region. The narrow bounds in the model members hint that this process is relatively stable and well-captured in the MPI-ESM. In the North Atlantic basin, both ensembles agree on the seasons that contribute the most to interannual variability, with winter and spring fluxes showing the highest variability.

However, both ensembles diverge in their representation of interannual variability in the Southern Ocean, with a much greater magnitude of interannual variability than the North Atlantic basin across temporal scales (Fig. 3b). No season appears to dominate the variability in the observation-based products with nearly constant interannual variability across seasons. Observational coverage is biased to austral spring and summer when conditions are favourable, so the lack of variability in other seasons in the observation-based products could be an artefact of limited sampling and not a robust signal. In contrast, the model members capture a range of possible states, with the greatest variability in austral summer and spring when observations can be considered reliable. Both ensembles agree in the data-scarce seasons of austral autumn and winter that would be considered the least robust in the observation-based products. Current limitations in validating either the model or the observation-based ensemble make it challenging to deem this a trustworthy signal. Therefore, while seasonal variability can be useful in diagnosing interannual variability in the North Atlantic basin, it is currently of limited use in the Southern Ocean.

Discussion

Process understanding of ocean carbon dynamics on seasonal and interannual timescales varies across the oceans due to heterogeneous observations and complex underlying mechanisms. However, there is substantially improved historical data coverage and research in the Northern Hemisphere18,39. Therefore, seasonal fluxes are expected to be better represented in the North Atlantic basin than in the Southern Ocean. However, both the subtropical and the subpolar regions suffer from limitations in the model representation, failing to capture key seasonal processes linked to biology and mixing, respectively.

Nevertheless, both model and observation-based ensembles agree on the seasons that contribute the most to the interannual variability in the North Atlantic basin. Boreal winter and spring fluxes dominate interannual variability, in agreement with the current understanding of the role of MLD and SST anomalies56. Given the coherence between the ensembles, seasonal variability can be useful in diagnosing interannual variability in the North Atlantic and can provide direction to focus future observational campaigns.

In the Southern Ocean, the phasing of the seasonal cycle matches between the ensembles (most outgassing in austral winter and most uptake in austral summer, despite the seasonal cycle amplitude being larger in the model ensemble than the observation-based ensemble. However, both ensembles diverge in their representation of interannual variability. No season appears to dominate the variability in the observation-based products, with nearly constant interannual variability across seasons. This result is not particularly surprising given the lack of long-term observations in the Southern Ocean, except in the Drake Passage, and agrees with the findings of studies adding biogeochemical floats44.

In contrast, the model members show stronger seasonal signals, with the greatest variability in austral summer and spring. Furthermore, model members cover a range of variability exceeding that of observation-based products, suggesting that observations in the Southern Ocean may be too limited to observe any robust signals in the carbon sink variability on seasonal timescales. That is, until autonomous measurement efforts, such as biogeochemical Argo floats, are deployed long enough to fill this gap44,45,46. Provided the data sparsity in the Southern Ocean, the lack of a dominant season driving the interannual variability could also result from observation-based products extrapolating summer signals, when observations are available, across the year. Without verifying such seasonal aliasing in the observation-based products and the lack of coherence across the model and observation-based ensemble, seasonal variability is currently of limited use in diagnosing interannual variability in the Southern Ocean, with a need for increased observational coverage throughout the year.

We compared annual and seasonal fluxes between an ESM large ensemble and an observation-based ensemble. Both ensembles show coherence in the magnitude and the large-scale spatial patterns of the annual carbon flux in the North Atlantic basin and the Southern Ocean. However, there are much larger discrepancies in seasonal fluxes than in annual fluxes between the two ensembles, one to two times greater in the North Atlantic basin and three to four times greater in the Southern Ocean. These seasonal discrepancies between the ensembles fortuitously compensate for the annual mean, which obscures large seasonal differences, particularly in the Southern Ocean. Therefore, considering only annual mean fluxes when comparing models and observation-based products misses a significant aspect of the comparison, which is only evident at a seasonal timescale.

We conducted this comparison with only one Earth System Model to reduce the uncertainties introduced in analysing a multi-model ensemble with varying physics, biogeochemistry, and configurations. Other ESMs (e.g., CESM-LE) show closer correspondence with observation-based products in the Southern Ocean (Supplementary Figs. 24). However, given the lack of a strong observational constraint in the Southern Ocean, it remains questionable whether the current representation of seasonal fluxes in any observation-based product can serve as a tool to benchmark the seasonal fluxes in any ESM43. Nevertheless, our analysis reveals that where necessary observational constraints have been built in the past, e.g., in the North Atlantic basin, a solid understanding of the seasonal variability can aid in understanding the interannual variability of carbon fluxes.

Long-term observation strategies can help improve process understanding and close measurement gaps in regions like the Southern Ocean, where models and observations disagree on seasonal carbon fluxes. However, maintaining existing networks can still provide insight into improving relatively well-understood regions like the North Atlantic basin, where fluxes remain prone to observational and model uncertainty31. However, data scarcity and its limiting distribution are shortcomings that will not be resolved globally in the short term. In the interim, models offer the ability to study regions where observations are inherently limiting40. Therefore, recognising the limitations and advantages of both models and observation-based products is of fundamental importance while interpreting the evolution of the ocean carbon sink.

Methods

Observation-based data products

We use a set of regularly updated observation-based products where the air–sea carbon flux is computed (1990–2020). The products used in this study, LSCE-FFNN21, CSIR-ML626, MPI-SOMFFN57, Jena-MLS23, JMA-MLR25, and NIES-NN24 were initially collected by the Surface Ocean pCO2 Mapping (SOCOM17) intercomparison project and later extended, for example, within the Global Carbon Budget20. The observation-based products are interpolated to the same monthly 1°×1° grid and averaged by treating individual products as an ensemble. All pCO2-based products use measurements from the 2020 release of the Surface Ocean CO2 Atlas (SOCAT39) database and a data-based interpolation approach to fill measurement gaps. Most interpolation techniques rely on machine-learning approaches that reconstruct the non-linear relationship between well-observed driver quantities (e.g., sea surface temperature (SST), sea surface salinity (SSS), sea surface height (SSH), mixed layer depth (MLD), chlorophyll a (Chl a), etc.) and sea surface pCO2 measurements. The JENA-MLS method is one exception, which uses a mixed layer scheme and is not dependent on auxiliary driver data. The products make subjective choices for selecting the gas transfer formulation and wind product and use a quadratic wind-based gas transfer relationship to estimate the flux (more details in Table A3, GCP 20221 and refs. 21,23,24,25,26,57).

LSCE-FFNN is an ensemble of neural network models trained on 100 subsampled pCO2 datasets. It uses SSS, SST, SSH, MLD, atmospheric CO2 mole fraction, Chl a, pCO2 climatology, latitude, and longitude as predictors. CSIR-ML6 is an ensemble average of six machine-learning estimates of ocean pCO2 using two cluster-regression approaches (a) K-mean clustering and b) Fay and McKinley’s (2014) CO2 biomes) and three regression algorithms (a) gradient-boosted decision trees, b) a feed-forward neural network, and c) support vector regression), the product of which results in six estimates, with an ensemble mean and associated spread. MPI-SOMFFN uses a two-step neural network method clustering the ocean into 16 biogeochemical provinces using a self-organising map. The method leverages a non-linear relationship between pCO2 measurements and environmental predictor data (SST, SSS, MLD, Chl a, atmospheric CO2) using a feed-forward network. The established relationship is then used to fill existing data gaps. NIES-NN is a feed-forward neural network that uses SST, SSS, Chl a, MLD and monthly SST anomalies as predictor data. JMA-MLD uses fields of total alkalinity estimated using a multiple linear regression method based on data from GLODAP58 and satellite data of SST, SSS, SSH, MLD and Chl a to evaluate DIC in the ocean surface. A more detailed description of the individual products is available in their respective core reference, and their intercomparison in part can be found in Rödenbeck et al. (2015) and Fay and Gregor (2021).

Max Planck Institute for Meteorology Grand Ensemble

We contrast the observation-based ensemble mean to the Max Planck Institute for Meteorology (MPI-M) Grand Ensemble (MPI-GE59) mean over its 100 members, based on the Max Planck Institute Earth System Model (MPI-ESM) version 1.1 in low-resolution configuration. The ocean (MPIOM60) and atmosphere (ECHAM661) components are coupled daily without flux corrections62. The carbon flux is computed using a gas exchange parameterisation (based on 10 m above sea level wind speed) and the partial pressure difference at the air–sea interface63. MPI-GE is a large ensemble of a single state-of-the-art comprehensive climate model with ensemble members generated in 1850 from different starting conditions randomly selected from a preindustrial control simulation. The historical simulation (1850–2005) is not constrained by observations of the state of the atmosphere and ocean, except for forcing with greenhouse gases and volcanic aerosols, according to CMIP550. For 2005–2020, we used a moderate radiative forcing scenario (RCP 4.5) to extend the MPI-GE until 2020; however, one could also choose a more extreme forcing pathway (e.g., RCP 8.5). The results are insensitive to the RCP choice since MPI-GE does not diverge significantly over the study period (Supplementary Tables 1, 2) but reflects changes between the two periods in both ensembles (Supplementary Tables 36). Using a large ensemble, we study the forced response of carbon fluxes due to anthropogenic warming separate from internal variability54 and analyse the variability across the model ensemble space.

Methods

Annual and seasonal air-sea carbon flux

Mean annual air–sea carbon fluxes for both ensembles are computed from monthly values. We follow the convention that carbon uptake by the ocean (carbon outgassing into the atmosphere) is negative (positive). We define four distinct seasons as 3-month averages. For the North Atlantic basin (10–90°W, 15–60°N), we designate winter (DJF), spring (MAM), summer (JJA) and autumn (SON). These seasons are antiphase in the Southern Ocean (0–360°, 35–65°S), where we exclude the marginal ice zone by limiting the geographical scope at 65°S and use the 35°S latitude to mark the boundary of the subtropical front.

Probability density estimation

For each year in the study period, we plot the kernel density estimate for the annual and seasonal integrated flux for the ensembles. The distributions of the individual observation-based products show the variability among the products. The closer the values are clustered, the less variable they are in time.

Interannual variability

Summed over the study region, we obtain a time series with the mean annual and seasonal magnitude of the air-sea carbon flux. After averaging over all grid points, we calculate the standard deviation about the mean of a particular season over 1990–2020 to determine which portion of the annual cycle has the largest interannual variability.