Accounting for two-billion tons of stabilized soil carbon

https://doi.org/10.1016/j.scitotenv.2019.134615

Highlights

  • Environmental heterogeneity is a large source of soil carbon variability.

  • Strategic feature selection is used to identify critical, regional-scale relationships.

  • Precipitation, nitrogen, and soil moisture have the strongest association with topsoil C.

  • Parent material and elevation have the strongest association with subsoil C.

  • Approximately 2.6 Pg C are stored in the upper 1 m of production forestland soil.

Abstract

The pedosphere is the largest terrestrial reservoir of organic carbon, yet soil-carbon variability and its representation in Earth system models remain a large source of uncertainty for carbon-cycle science and climate projections. Much of this uncertainty is attributed to local and regional-scale variability. Because environmental determinants are scale-dependent, predicting this variation can be challenging if variable selection is based solely on a priori assumptions. Data mining can optimize predictive modeling by allowing machine-learning algorithms to discover complex patterns in large datasets that might otherwise go unnoticed, thus increasing the potential for knowledge discovery. In this analysis, we identify important, regional-scale determinants for top- and subsoil-carbon stabilization in production forestland across the southeastern US. Specifically, we apply recursive feature elimination to a large suite of socio-environmental data to strategically select a parsimonious, yet highly predictive covariate set. The procedure considers progressively smaller covariate sets (or features): the estimator is first trained on the full set to obtain feature importance, the least important features are pruned, and the procedure is repeated until a desired number of covariates remains. We show that although carbon ranges from 0.3 to 8.2 kg m−2 in the topsoil (0 to 20 cm), and from 0.4 to 17.6 kg m−2 in the subsoil (20 to 100 cm), this variability is predictably distributed with precipitation, soil moisture, nitrogen and sand content, gamma ray emissions, mean annual minimum temperature, and elevation. From our spatial predictions, we estimate that 2.6 Pg of soil carbon is currently stabilized in the upper 100 cm of production forestland, which covers 34.7 million ha in the southeastern US.
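As a quick consistency check on the figures above, the regional stock and the forest area together imply a mean areal carbon density. The sketch below is illustrative arithmetic only; the function and constant names are hypothetical and are not taken from the study.

```python
# Unit conversions (standard SI relationships).
PG_TO_KG = 1e12   # 1 Pg = 1e15 g = 1e12 kg
HA_TO_M2 = 1e4    # 1 ha = 10,000 m^2


def mean_stock_density(stock_pg, area_million_ha):
    """Return the mean carbon density in kg per square metre.

    Hypothetical helper: divides a regional stock (Pg) by an area
    given in millions of hectares.
    """
    stock_kg = stock_pg * PG_TO_KG
    area_m2 = area_million_ha * 1e6 * HA_TO_M2
    return stock_kg / area_m2


# Figures quoted in the abstract: 2.6 Pg C over 34.7 million ha.
density = mean_stock_density(2.6, 34.7)
print(round(density, 2))  # 7.49
```

The result, roughly 7.5 kg m−2 over the upper 100 cm, lies within the range spanned by the reported topsoil and subsoil values combined.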

Introduction

Soil is a critical component of the global climate system, serving as both a sink and a source of CO2 by actively exchanging carbon with the atmosphere (Davidson, 2016, Luo et al., 2015). Resource management, carbon cycle projections, and policy all rely on accurate representations of soil carbon, yet global estimates remain highly uncertain despite comprehensive efforts to quantify this large reservoir (Gianelle et al., 2010). Depending on the depth modeled, published estimates range from 863 to over 3800 Pg C (Batjes, 2014, Eglin et al., 2010, Sanderman et al., 2017, Watson et al., 2000). Much of the uncertainty is attributed to the vertical and horizontal variability of soil, the scale of analysis, and the subsequent loss of information that occurs during spatial aggregation (Jobbágy and Jackson, 2000, Ross et al., 2013, Xiong et al., 2015), which is particularly problematic when aggregating non-linear data over large areas (Easterling, 1997).

Because soil carbon stabilization is governed by a multitude of non-linear relationships, the strength of relationships can vary with the scale of analysis (Miller et al., 2015, Xiong et al., 2016). Across larger spatial scales, for example, carbon sequestration varies with plant productivity, which in turn is affected by atmospheric CO2 (Roy et al., 2016), growing season length (Hilton et al., 2017), and resource availability (Eskelinen and Harrison, 2015). However, soil carbon responds to socio-environmental conditions that can vary dramatically at different temporal scales and across regional and sub-regional scales. Factors affecting soil carbon persistence include temperature, precipitation, and acidity (Chen et al., 2018, Schmidt et al., 2011), as well as management (Noormets et al., 2015), and disturbance from land use change (Ross et al., 2016, Xiong et al., 2014b), fire (Godwin et al., 2017), and erosion (Pimentel, 2006). Characterizing these factors at regional scales may be required to upscale soil carbon to global estimates and to refine our understanding of soil carbon stabilization (Mulder et al., 2016).

A recent US Department of Agriculture funded Coordinated Agricultural Project referred to as PINEMAP (Pine Integrated Network: Education, Mitigation, and Adaptation Project) addressed this issue by establishing a monitoring network across the southeastern US to refine our understanding of carbon storage and dynamics in managed forests at the regional scale (Will et al., 2015). Forests cover 99 million hectares of land in the southeast and account for almost one third of all forested lands in the conterminous US (Oswalt et al., 2014). These forests are an economically important resource, providing approximately 60% and 16% of the US and global industrial wood supply by volume, respectively (Oswalt et al., 2014). They are ecologically important as well, having sequestered enough aboveground carbon (176 Tg C yr−1) to mitigate 42% of the region's CO2 emissions between 2000 and 2005 (Lu et al., 2015). About one third of the region's forests are pinelands, of which 19% are managed pine plantations (Wear and Greis, 2013). The dominant species, loblolly pine (Pinus taeda L.), accounts for more than two thirds of all planted tree species in the region (Wear and Greis, 2013).

Intense silvicultural production cycles in this region are a large source of land-cover change, which subsequently affects the region’s carbon cycle. An accurate estimate of soil-carbon distribution in southeastern production forestland is therefore a critical step towards further resolving carbon-cycle science in this region and identifying factors that potentially affect soil carbon at the global scale. We hypothesize that, by identifying important regional-scale associations, our models will provide improved estimates of soil carbon stock compared with those derived from global models. In this analysis, we develop a data-driven approach to model topsoil (0 to 20 cm) and subsoil (20 to 100 cm) carbon, based on a regional compilation (N = 2,564) of soil samples collected from PINEMAP research sites. Variable selection is performed by applying recursive feature elimination to a comprehensive set (N = 73) of environmental predictors to identify a parsimonious covariate set (N = 5) for each depth interval. These covariate sets are then used with the random forest (RF) algorithm to produce soil carbon prediction maps for the top- and subsoil depth intervals.
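The variable-selection step described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (generated with make_regression), the hyperparameters are arbitrary, and only the counts of 73 candidate predictors and 5 retained covariates mirror the text.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE

# Synthetic stand-in for the covariate table (assumption: the real
# predictors and soil carbon values are not reproduced here).
X, y = make_regression(
    n_samples=200, n_features=73, n_informative=5,
    noise=5.0, random_state=0,
)

# Recursive feature elimination: fit the estimator, rank features by
# importance, drop the least important one, and repeat until 5 remain.
estimator = RandomForestRegressor(n_estimators=25, random_state=0)
selector = RFE(estimator, n_features_to_select=5, step=1)
selector.fit(X, y)

# Column indices of the retained covariates.
selected = np.flatnonzero(selector.support_)
print("Selected feature indices:", selected)

# Refit a random forest on the reduced covariate set for prediction.
final_model = RandomForestRegressor(n_estimators=200, random_state=0)
final_model.fit(X[:, selected], y)
predictions = final_model.predict(X[:, selected])
```

In practice, one such selection would be run per depth interval, and the fitted model would be applied to gridded covariates to produce a prediction map.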

Study area

Our random forest models were trained on data collected from the PINEMAP Tier 2 network, which consisted of 106 research sites with 2 to 3 replicates (on average) at each site, for a total of 322 plots (Fig. 1). Tier 2 research sites were chosen to capture the region-wide variation in soil, landscape positions, and climate that characterize the native geographic range of loblolly pine.

Climate of the study area is classified as a warm and humid temperate region with hot summers (Kottek et al.,

Results

The concentration of soil carbon (%) generally declines with depth, and soil bulk density increases with depth (Fig. 3). However, the vertical and horizontal distribution of measured soil carbon is highly variable, both within and between PINEMAP research sites. A considerable amount of the observed variation is attributed to extreme, but infrequent values (Table 2).
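Carbon concentration and bulk density are commonly combined into an areal carbon stock for a depth interval. The sketch below uses the standard conversion (concentration times bulk density times layer thickness) and ignores coarse-fragment corrections; it is an assumption about how such stocks are typically computed rather than the study's documented procedure, and the function name and layer values are hypothetical.

```python
def carbon_stock_kg_m2(carbon_pct, bulk_density_g_cm3, thickness_cm):
    """Return the carbon stock of one soil layer in kg per square metre.

    Hypothetical helper: no correction for coarse fragments is applied.
    """
    carbon_fraction = carbon_pct / 100.0            # % -> mass fraction
    bulk_density_kg_m3 = bulk_density_g_cm3 * 1000  # g cm^-3 -> kg m^-3
    thickness_m = thickness_cm / 100.0              # cm -> m
    return carbon_fraction * bulk_density_kg_m3 * thickness_m


# Made-up layers: (carbon %, bulk density g cm^-3, thickness cm).
# Concentration declines and bulk density increases with depth.
layers = [
    (1.0, 1.3, 20),   # topsoil, 0 to 20 cm
    (0.3, 1.5, 80),   # subsoil, 20 to 100 cm
]

stocks = [carbon_stock_kg_m2(c, bd, t) for c, bd, t in layers]
total = sum(stocks)
print([round(s, 2) for s in stocks], round(total, 2))
```

Because the subsoil layer is four times thicker, it can hold more carbon per unit area than the topsoil even at a much lower concentration.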

Carbon contents across USDA soil taxonomy at the suborder level also exhibit a considerable amount of variability, with median

Discussion

This study integrates data mining with extensive field sampling to express the region-wide variation of soil carbon in pine plantations across the southeastern US. We identify a parsimonious, yet highly predictive covariate set by utilizing strategic feature selection. For our topsoil model, precipitation, nitrogen, and 40K contributed most to predicting carbon variability. Examination of the accumulated local effects (ALE) plots indicates that precipitation and 40K have non-linear relationships with topsoil carbon

Conclusions

We demonstrate the application of strategic feature selection to identify covariates that are important for soil-carbon stabilization across a large and highly variable region. We opted for a parsimonious covariate set (N = 5) to improve model interpretability while avoiding the “curse of dimensionality”. Mean annual precipitation and gamma-ray emissions of 40K have non-linear associations with topsoil carbon, while sand content, nitrogen, and soil moisture show strong, positive associations.
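The ALE plots mentioned in the Discussion summarize how a model's prediction changes as one covariate varies. The sketch below is a simplified first-order accumulated local effects calculation under stated assumptions: the function ale_1d, the toy model, and the synthetic data are hypothetical and stand in for the fitted random forest and the real covariates.

```python
import numpy as np


def ale_1d(predict, X, feature, n_bins=10):
    """Simplified first-order ALE curve for one feature (hypothetical helper).

    Returns the upper bin edges and the centred accumulated effects.
    """
    x = X[:, feature]
    # Quantile-based bin edges so each bin holds a similar number of points.
    edges = np.unique(np.quantile(x, np.linspace(0, 1, n_bins + 1)))
    n = len(edges) - 1
    # Bin index of every observation, clipped so the maximum falls in the last bin.
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n - 1)

    effects = np.zeros(n)
    counts = np.zeros(n)
    for k in range(n):
        mask = idx == k
        if not mask.any():
            continue
        lower, upper = X[mask].copy(), X[mask].copy()
        lower[:, feature] = edges[k]
        upper[:, feature] = edges[k + 1]
        # Average local change in prediction across the bin.
        effects[k] = np.mean(predict(upper) - predict(lower))
        counts[k] = mask.sum()

    ale = np.cumsum(effects)                 # accumulate local effects
    ale -= np.average(ale, weights=counts)   # centre around zero
    return edges[1:], ale


# Toy model: non-linear in feature 0, linear (slope 2) in feature 1.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 1, size=(500, 3))


def toy_model(X):
    return np.sin(2 * np.pi * X[:, 0]) + 2.0 * X[:, 1]


grid, ale = ale_1d(toy_model, X_demo, feature=1, n_bins=10)
```

For the linear feature the curve rises at a constant slope, whereas applying the same function to feature 0 would trace the sine shape, which is the kind of non-linear pattern such plots are used to reveal.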

Declaration of Competing Interest

The authors declare that there is no conflict of interest regarding the publication of this article.

Acknowledgements

The Pine Integrated Network: Education, Mitigation, and Adaptation project (PINEMAP) was a Coordinated Agricultural Project funded by the USDA National Institute of Food and Agriculture [Award #2011-68002-30185]. We would like to thank all PINEMAP team members for their contributions, with a special thanks to Marshall A. Laviner, Madison K. Akers, Joshua Cucinella, Tom Fox, Risa Patarasuk, and Beijing Cao for their contributions to this research.

References (59)

  • X. Xiong et al. Assessing uncertainty in soil organic carbon modeling across a highly heterogeneous landscape. Geoderma (2015)
  • X. Xiong et al. Holistic environmental soil-landscape modeling of soil organic carbon. Environ. Model. Softw. (2014)
  • X. Xiong et al. Interaction effects of climate and land use/land cover change on soil organic carbon sequestration. Sci. Total Environ. (2014)
  • J.T. Abatzoglou. Development of gridded surface meteorological data for ecological applications and modelling. Int. J. Climatol. (2013)
  • N.H. Batjes. Total carbon and nitrogen in the soils of the world. Eur. J. Soil Sci. (2014)
  • R. Bellman. Curse of dimensionality (1961)
  • L. Breiman. Random forests. Mach. Learn. (2001)
  • S. Chen et al. Plant diversity enhances productivity and soil carbon storage. Proc. Natl. Acad. Sci. (2018)
  • E.A. Davidson. Projections of the soil-carbon deficit. Nature (2016)
  • P. Domingos. A few useful things to know about machine learning. Commun. ACM (2012)
  • Duval, J.S., Carson, J.M., Holman, P.B., Darnley, A.G., 2005. Terrestrial radioactivity and gamma-ray exposure in the...
  • T. Eglin et al. Historical and future perspectives of global soil carbon response to climate and land-use changes. Tellus B (2010)
  • A. Eskelinen et al. Resource colimitation governs plant community responses to altered precipitation. Proc. Natl. Acad. Sci. (2015)
  • D. Gianelle et al. Cataloguing soil carbon stocks. Science (2010)
  • D.R. Godwin et al. Effects of fire frequency and soil temperature on soil CO2 efflux rates in old-field pine-grassland forests. Forests (2017)
  • Y.N. Gonzalez et al. A billion tons of unaccounted for carbon in the southeastern United States. Geophys. Res. Lett. (2018)
  • S. Grunwald et al. Digital soil mapping and modeling at continental scales: finding solutions for global issues. Soil Sci. Soc. Am. J. (2011)
  • S.G. Haile et al. Contribution of trees to carbon storage in soils of silvopastoral systems in Florida, USA. Glob. Change Biol. (2010)
  • T. Hengl et al. SoilGrids250m: Global gridded soil information based on machine learning. PLOS One (2017)