Selecting optimal calibration samples using proximal sensing EM induction and γ-ray spectrometry data: An application to managing lime and magnesium in sugarcane growing soil

https://doi.org/10.1016/j.jenvman.2021.113357

Highlights

  • Hybrid model outperforms ML and OK.

  • No silver bullet is found among sampling designs.

  • 180 calibration samples are required to predict topsoil Ca and Mg.

  • FSCS enables 120 samples to predict subsoil Ca and Mg.

  • DSM achieves a potential ~18 % and 86 % decrease in lime and Mg fertiliser costs, respectively.

Abstract

Calcium (Ca) and magnesium (Mg) are essential for growth of sugarcane leaves and roots, and for respiration and nitrogen metabolism, respectively. To help farmers decide suitable application rates of lime and Mg fertiliser, the Australian sugarcane industry established the Six-Easy-Steps nutrient management guidelines based on topsoil (0–0.3 m) Ca (cmol(+) kg−1) and Mg (cmol(+) kg−1). Given the heterogeneous nature of soil, digital soil mapping (DSM) methods can be employed to determine precise application rates. In this study, we examine statistical models (i.e., ordinary kriging [OK], linear mixed model [LMM], quantile regression forests [QRF], support vector machine [SVM], and Cubist regression kriging [CubistRK]) to predict topsoil and subsoil (0.6–0.9 m) Ca and Mg, employing digital data in combination (i.e., proximal sensing electromagnetic induction (EMI) [e.g., 1mPcon, 1mHcon, etc.], gamma-ray [γ-ray] spectrometry [i.e., TC, K, U and Th] and digital elevation model [DEM] derivatives). We also investigate various sampling designs (i.e., spatial coverage [SCS], feature space coverage [FSCS], conditioned Latin hypercube [cLHS] and simple random sampling [SRS]) and calibration sample sizes (i.e., n = 180, 150, 120, 90, 60 and 30). The predictions are assessed using Lin's concordance correlation coefficient (LCCC) and ratio of performance to interquartile distance (RPIQ) with an independent validation dataset (i.e., n = 40). The best results were for prediction of subsoil Mg, utilising CubistRK (LCCC = 0.82) with the largest calibration sample size (n = 180), followed by LMM (0.79), SVM (0.76), QRF (0.70) and OK (0.65). A similar pattern generally held for topsoil and subsoil Ca. We also conclude that no single sampling design was universally better, and 180 samples were necessary for predicting topsoil Ca and Mg with moderate agreement (0.65 < LCCC < 0.80).
However, with FSCS, a minimum of 120 samples was enough to calibrate a CubistRK model and achieve substantial (LCCC > 0.80) agreement for predicting subsoil Ca and Mg. With respect to soil use and management according to the Six-Easy-Steps, the sandy soils in the north and south require large application rates of lime (3.5 t/ha) and Mg (125 kg/ha), respectively. Conversely, varying rates of lime (2.0, 1.5 and 1 t/ha) and Mg (50 kg/ha) were recommended where Vertosols were previously mapped.

Introduction

Sugarcane requires large amounts of calcium (Ca) because it is essential for the development of leaves and roots and for strengthening cell walls. Magnesium (Mg) is important for photosynthesis and movement of phosphorus, and is required for respiration and nitrogen metabolism and assimilation. Deficiency in Ca can lead to leaves showing pale green/yellow mottling and primary shoot death (White and Broadley, 2003), while Mg deficiency may cause leaf chlorosis and retardation (Cakmak and Kirkby, 2008). Therefore, Ca and Mg fertilisation is necessary to maintain sugarcane productivity. However, Ca fertilisation that exceeds nutritional needs may cause soil pH to rise, which restricts zinc and copper uptake (Dontsova and Norton, 2002), whereas excess Mg could interfere with potassium uptake and make soil hard, reducing water infiltration and drainage (Zhang and Norton, 2002). To improve Ca and Mg management, the sugarcane industry formulated the Six-Easy-Steps nutrient management guidelines (Schroeder et al., 2010), which aim to determine suitable fertiliser rates (i.e., lime and Mg) based on topsoil (0–0.3 m) Ca and Mg (cmol(+) kg−1). For example, Table 1 shows that small (<0.2) Ca requires a lime application rate of 4 t/ha, whereas large (>1.6) Ca requires 1 t/ha. Similarly, the guidelines indicate that for small (<0.05) and large (>0.2) Mg, the rates should be 150 and 50 kg/ha, respectively. Therefore, knowledge about Ca and Mg is necessary to enable site-specific fertilisation to address infertility and mitigate environmental impacts.
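The guideline lookup described above can be sketched as a simple threshold rule. The Python sketch below encodes only the two end classes quoted in the text for each nutrient; the intermediate classes belong to Table 1, which is not reproduced here, so the functions return None for them. The function names are hypothetical, and Table 1 may subdivide the large class further.

```python
def lime_rate_t_per_ha(topsoil_ca):
    """Lime rate (t/ha) for the two end classes quoted in the text.

    topsoil_ca is topsoil (0-0.3 m) Ca in cmol(+)/kg.
    Returns None for intermediate classes (see Table 1 of the paper).
    """
    if topsoil_ca < 0.2:   # small Ca
        return 4.0
    if topsoil_ca > 1.6:   # large Ca
        return 1.0
    return None            # intermediate classes are not given in the text


def mg_rate_kg_per_ha(topsoil_mg):
    """Mg rate (kg/ha) for the two end classes quoted in the text.

    topsoil_mg is topsoil (0-0.3 m) Mg in cmol(+)/kg.
    Returns None for intermediate classes (see Table 1 of the paper).
    """
    if topsoil_mg < 0.05:  # small Mg
        return 150.0
    if topsoil_mg > 0.2:   # large Mg
        return 50.0
    return None            # intermediate classes are not given in the text
```

Applied cell by cell to a predicted Ca or Mg map, a rule of this kind turns a DSM into a rate map once the full set of classes from Table 1 is filled in.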

While geostatistical models (e.g., ordinary kriging) have been implemented to map topsoil Ca and Mg (Behera and Shukla, 2015), digital soil mapping (DSM) approaches are increasingly being applied to predict Ca and Mg from limited soil data by using models to couple these data with easier-to-acquire digital data (Wang et al., 2020). In terms of digital data, proximal sensing electromagnetic induction (EMI) and/or gamma-ray (γ-ray) spectrometry could be employed. For instance, in Australia, Holland et al. (2017) quantified the distribution of cation exchange capacity using γ-ray data in cropping lands of the Glenelg-Hopkins catchment. More recently, Li et al. (2019) produced a DSM of topsoil (0–0.15 m) Ca and Mg in a sugarcane field in Burdekin by combining γ-ray and EMI data. With respect to models, linear (e.g., linear mixed model [LMM]) and geostatistical (e.g., ordinary kriging [OK] and regression kriging [RK]) methods are commonly implemented (Keskin and Grunwald, 2018). When non-linear or multi-dimensional hierarchical relationships exist, machine learning (ML) techniques, including quantile regression forests (QRF) (Vaysse and Lagacherie, 2017), Cubist (Ma et al., 2017) and support vector machine (SVM) (Liao et al., 2014), and hybrid models that combine ML with geostatistical methods (e.g., CubistRK), may produce better prediction accuracy (Ma et al., 2017).
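As a sketch of the hybrid idea, regression kriging is commonly written as a trend prediction plus kriged residuals. The notation below is generic and is not taken from the paper: s_0 is a prediction location, s_i are the n calibration sites, \hat{m} is the fitted trend model (Cubist in the case of CubistRK), and \lambda_i are kriging weights derived from the variogram of the residuals.

```latex
\hat{z}(s_0) = \hat{m}(s_0) + \sum_{i=1}^{n} \lambda_i \, e(s_i),
\qquad
e(s_i) = z(s_i) - \hat{m}(s_i)
```

When the residuals show no spatial autocorrelation, the second term contributes little and the prediction reduces to the trend model alone.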

To fit these statistical models, sampling designs and number of calibration samples should also be considered (Brus, 2019). Regarding design of sampling, spatial coverage (SCS), feature space coverage (FSCS), conditioned Latin hypercube (cLHS) and simple random (SRS) are common (Brus, 2019). For example, Schmidt et al. (2014) evaluated several sampling designs, concluding cLHS with only 20 samples returned the most accurate predictions for topsoil (0–0.3 m) pH, soil organic carbon and particle fractions across a grassland in Germany, whereas Ng et al. (2018) inspected cLHS, FSCS and SRS at continental, regional, and local scale to predict CEC and clay and found FSCS was suitable when sample size was large (i.e., ~1000) but performed poorly when it was small. Moreover, cLHS appeared more robust regardless of sample size. However, the converse was shown by Wadoux et al. (2019) and Ma et al. (2020), who found the accuracy of FSCS was superior to cLHS and SRS over all sample sizes, for mapping topsoil (0–0.3 m) organic carbon and soil classes across continental and catchment scales, respectively.
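To make the contrast between designs concrete, the sketch below shows one common way of implementing FSCS: run k-means on the standardised covariates and take the candidate site nearest each centroid, with SRS shown alongside as a baseline. It is a minimal pure-Python illustration under stated assumptions, not the procedure used in any of the cited studies; the function names, the seed-based initialisation and the stopping rule are all hypothetical choices.

```python
import random


def standardise(rows):
    """Scale each covariate column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fscs(covariates, n_samples, seed=0, max_iter=100):
    """Indices of candidate sites nearest to k-means centroids in covariate space."""
    rng = random.Random(seed)
    x = standardise(covariates)
    # Initialise centroids at randomly chosen candidate sites.
    centroids = [x[i] for i in rng.sample(range(len(x)), n_samples)]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for row in x:
            nearest = min(range(n_samples), key=lambda k: sq_dist(row, centroids[k]))
            clusters[nearest].append(row)
        # Recompute each centroid; an empty cluster keeps its previous centroid.
        updated = [
            [sum(col) / len(members) for col in zip(*members)] if members else old
            for members, old in zip(clusters, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    chosen = []
    for c in centroids:
        # Pick the nearest site that has not already been selected.
        best = min((i for i in range(len(x)) if i not in chosen),
                   key=lambda i: sq_dist(x[i], c))
        chosen.append(best)
    return sorted(chosen)


def srs(n_candidates, n_samples, seed=0):
    """Simple random sampling without replacement, as a baseline design."""
    return sorted(random.Random(seed).sample(range(n_candidates), n_samples))
```

Passing spatial coordinates instead of covariates gives a rough analogue of spatial coverage sampling, since both designs rest on k-means partitioning. Because k-means can settle in a local optimum, several random restarts would normally be compared in practice.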

In this study, we examine statistical models (i.e., OK, LMM, QRF, SVM, CubistRK) based on a large calibration sample size (i.e., n = 180) to predict topsoil (0–0.3 m) and subsoil (0.6–0.9 m) Ca and Mg, using digital data in combination (i.e., proximal sensing EMI, γ-ray spectrometry and DEM derivatives). Different sampling designs (i.e., SCS, FSCS, cLHS and SRS) were investigated, along with sample size (i.e., n = 180, 150, 120, 90, 60, 30) evaluated using an independent validation dataset (i.e., n = 40), by Lin's concordance correlation coefficient (LCCC) and ratio of performance to interquartile distance (RPIQ). DSMs of topsoil and subsoil Ca and Mg were produced to recommend lime and Mg application rates according to the Six-Easy-Steps management guidelines for Herbert Valley.
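For readers unfamiliar with the two validation metrics, the sketch below computes LCCC and RPIQ from their standard definitions and labels LCCC values using the agreement classes quoted in the abstract (moderate for 0.65 < LCCC < 0.80, substantial for LCCC > 0.80). The function names, the use of population moments, the quartile method and the handling of values exactly on a class boundary are assumptions made for illustration.

```python
from statistics import fmean, quantiles


def lccc(observed, predicted):
    """Lin's concordance correlation coefficient using population moments."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_o, mean_p = fmean(observed), fmean(predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    # Penalises both scatter (via cov) and bias (via the squared mean difference).
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


def rpiq(observed, predicted):
    """Interquartile range of the observations divided by the RMSE.

    Undefined (division by zero) when the predictions are perfect.
    """
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return (q3 - q1) / mse ** 0.5


def agreement_class(value):
    """Label an LCCC value with the classes quoted in the abstract.

    Values of 0.65 or less are not labelled in the text; boundary handling is assumed.
    """
    if value > 0.80:
        return "substantial"
    if value > 0.65:
        return "moderate"
    return "below moderate"
```

With the values reported in the abstract, agreement_class(0.82) gives "substantial" for CubistRK and agreement_class(0.79) gives "moderate" for LMM.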

Study site

The study site is located in the Herbert Valley, north Queensland, Australia (Fig. 1a). It covers 47 ha in the area known as the Lannercost Extension. The annual rainfall is 2218 mm, with approximately 75 % of rainfall in summer (i.e., December–March). The average annual temperature is 23.7 °C, with the warmest month on average being January (32.4 °C) and coldest in July (13.9 °C). The Herbert River is located on the northeastern margin and it gives rise to alluvial plains, derived from the

Descriptive data analysis

Table 2 shows summary statistics of measured topsoil (0–0.3 m) and subsoil (0.6–0.9 m) Ca and Mg for the calibration (n = 180) and validation (n = 40) datasets. In the calibration data (n = 180), the minimum (0.05) topsoil Ca was small (<0.2) while the maximum (3.8) was very large (>2.0). The median (0.8) was intermediate-large (0.6–0.8). On average, and with respect to the Six-Easy-Steps nutrient management guidelines (Table 1a), it would require 2.5 t/ha of lime to be applied across the

Conclusions

The results show that when employing all the proximal sensing digital data (i.e., EMI and γ-ray spectrometry) as well as DEM derivatives in combination and considering the largest sample size (i.e., 180), CubistRK generally performed best for predicting topsoil (0–0.3 m) and subsoil (0.6–0.9 m) Ca and Mg, except that QRF performed slightly better for topsoil Ca. Utilising CubistRK, no single sampling design was universally better, and 180 samples were required for predicting topsoil Ca and Mg with

Credit author statement

Jie Wang: Conceptualisation, Methodology, Formal analysis, Data curation, Resources, Writing – original draft. Xueyu Zhao: Data curation, Visualisation, Software, Validation. Dongxue Zhao: Investigation, Visualisation. John Triantafilis: Supervision, Project administration, Funding acquisition, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The Australian Federal Government's Sugar Research Australia (SRA) is acknowledged for the financial support to conduct EMI and γ-ray surveys, soil sampling and laboratory analysis as part of project (2017/014) entitled “Seeing is believing: managing soil variability, improve crop yield and minimising off-site impacts in sugarcane using digital soil mapping”. The first author acknowledges a University of New South Wales (UNSW) “University International Postgraduate Award (UIPA)” scholarship

References (53)

  • J. Padarian et al. Chile and the Chilean soil grid: a contribution to GlobalSoilMap. Geoderma Regional (2017)
  • E.J. Pebesma. Multivariable geostatistics in S: the gstat package. Comput. Geosci. (2004)
  • K. Schmidt et al. A comparison of calibration sampling schemes at the field scale. Geoderma (2014)
  • K. Vaysse et al. Using quantile regression forest to estimate uncertainty of digital soil mapping products. Geoderma (2017)
  • R.A. Viscarra Rossel et al. A global spectral library to characterize the world's soil. Earth Sci. Rev. (2016)
  • A.M.C. Wadoux et al. Sampling design optimization for soil mapping with random forest. Geoderma (2019)
  • D.J. Walvoort et al. An R package for spatial coverage sampling and random sampling from compact geographical strata by k-means. Comput. Geosci. (2010)
  • X. Zhang et al. Effect of exchangeable Mg on saturated hydraulic conductivity, disaggregation and clay dispersion of disturbed soils. J. Hydrol. (2002)
  • D. Zhao et al. Soil exchangeable cations estimation using Vis-NIR spectroscopy in different depths: effects of multiple calibration models and spiking. Comput. Electron. Agric. (2021)
  • K. Adhikari et al. Digital mapping of soil organic carbon contents and stocks in Denmark. PloS One (2014)
  • S.K. Behera et al. Spatial distribution of surface soil acidity, electrical conductivity, soil organic carbon content and exchangeable potassium, calcium and magnesium in some cropped acid soils of India. Land Degrad. Dev. (2015)
  • P.N. Bierwith. Gamma-radiometrics, a remote sensing tool for understanding soils. Australian Collaborative Land Evaluation Program Newsletter (1996)
  • L. Breiman. Random forests. Mach. Learn. (2001)
  • I. Cakmak et al. Role of magnesium in carbon partitioning and alleviating photooxidative damage. Physiol. Plantarum (2008)
  • C. Cortes et al. Support-vector networks. Mach. Learn. (1995)
  • D.L. Corwin et al. Field-scale apparent soil electrical conductivity
