Selecting optimal calibration samples using proximal sensing EM induction and γ-ray spectrometry data: An application to managing lime and magnesium in sugarcane growing soil
Introduction
Sugarcane requires large amounts of calcium (Ca), which strengthens cell walls and is essential for the development of leaves and roots. Magnesium (Mg) is important for photosynthesis and the movement of phosphorus, and is required for respiration and for nitrogen metabolism and assimilation. Ca deficiency can lead to pale green/yellow mottling of leaves and death of the primary shoot (White and Broadley, 2003), while Mg deficiency may cause leaf chlorosis and growth retardation (Cakmak and Kirkby, 2008). Therefore, Ca and Mg fertilisation is necessary to maintain sugarcane productivity. However, Ca fertilisation that is inconsistent with nutritional needs may cause soil pH to rise, which restricts zinc and copper uptake (Dontsova and Norton, 2002), whereas excess Mg could interfere with potassium uptake and harden the soil, reducing water infiltration and drainage (Zhang and Norton, 2002). To improve Ca and Mg management, the sugarcane industry formulated the Six-Easy-Steps nutrient management guidelines (Schroeder et al., 2010), which aim to determine suitable application rates of lime and Mg based on topsoil (0–0.3 m) Ca and Mg (cmol(+) kg−1). For example, Table 1 shows that small (<0.2) Ca requires a lime application rate of 4 t/ha, whereas large (>1.6) Ca requires 1 t/ha. Similarly, the guidelines indicate that for small (<0.05) and large (>0.2) Mg, the rates should be 150 and 50 kg/ha, respectively. Therefore, knowledge of soil Ca and Mg is necessary to enable site-specific fertilisation that addresses infertility and mitigates environmental impacts.
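The threshold logic quoted above can be sketched as a simple lookup. This is a minimal illustration, not the guidelines themselves: only the two end classes stated in the text are encoded, the function names are hypothetical, and the intermediate classes of Table 1 (not reproduced in this section) deliberately return no rate.

```python
from typing import Optional


def lime_rate_t_ha(topsoil_ca: float) -> Optional[float]:
    """Lime rate (t/ha) for topsoil exchangeable Ca in cmol(+)/kg.

    Only the end classes quoted in the text are encoded; values in
    between return None because Table 1 is not reproduced here.
    """
    if topsoil_ca < 0.2:   # "small" Ca
        return 4.0
    if topsoil_ca > 1.6:   # "large" Ca
        return 1.0
    return None            # intermediate class: consult Table 1


def mg_rate_kg_ha(topsoil_mg: float) -> Optional[float]:
    """Mg rate (kg/ha) for topsoil exchangeable Mg in cmol(+)/kg."""
    if topsoil_mg < 0.05:  # "small" Mg
        return 150.0
    if topsoil_mg > 0.2:   # "large" Mg
        return 50.0
    return None            # intermediate class: consult Table 1


if __name__ == "__main__":
    for ca in (0.1, 0.8, 2.0):
        print(ca, lime_rate_t_ha(ca))
```

Applied to each cell of a predicted Ca or Mg map, a lookup of this kind would turn a DSM into a rate map, provided the full set of classes from Table 1 is supplied.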
While geostatistical models (e.g., ordinary kriging) have been implemented to map topsoil Ca and Mg (Behera and Shukla, 2015), digital soil mapping (DSM) approaches are increasingly being applied to predict Ca and Mg from limited soil data by coupling them, through models, with easier-to-acquire digital data (Wang et al., 2020). In terms of digital data, proximal sensing electromagnetic induction (EMI) and/or gamma-ray (γ-ray) spectrometry could be employed. For instance, in Australia, Holland et al. (2017) quantified the distribution of cation exchange capacity (CEC) using γ-ray data in cropping lands of the Glenelg-Hopkins catchment. More recently, Li et al. (2019) produced a DSM of topsoil (0–0.15 m) Ca and Mg in a sugarcane field in Burdekin by synergistically combining γ-ray and EMI data. With respect to models, linear (e.g., linear mixed model [LMM]) and geostatistical (e.g., ordinary kriging [OK] and regression kriging [RK]) methods are commonly implemented (Keskin and Grunwald, 2018). When non-linear or multi-dimensional hierarchical relationships exist, machine learning (ML) techniques, including quantile regression forests (QRF) (Vaysse and Lagacherie, 2017), Cubist (Ma et al., 2017) and support vector machine (SVM) (Liao et al., 2014), as well as hybrid models that combine ML with geostatistics (e.g., CubistRK), may produce better prediction accuracy (Ma et al., 2017).
To fit these statistical models, the sampling design and the number of calibration samples should also be considered (Brus, 2019). Regarding sampling design, spatial coverage sampling (SCS), feature space coverage sampling (FSCS), conditioned Latin hypercube sampling (cLHS) and simple random sampling (SRS) are common (Brus, 2019). For example, Schmidt et al. (2014) evaluated several sampling designs and concluded that cLHS with only 20 samples returned the most accurate predictions of topsoil (0–0.3 m) pH, soil organic carbon and particle fractions across a grassland in Germany. Ng et al. (2018) evaluated cLHS, FSCS and SRS at continental, regional and local scales to predict CEC and clay, and found that FSCS was suitable when the sample size was large (i.e., ~1000) but performed poorly when it was small, whereas cLHS appeared more robust regardless of sample size. However, the converse was shown by Wadoux et al. (2019) and Ma et al. (2020), who found the accuracy of FSCS was superior to that of cLHS and SRS over all sample sizes, for mapping topsoil (0–0.3 m) organic carbon and soil classes across continental and catchment scales, respectively.
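As an illustration of how a feature space coverage design can be drawn, the sketch below clusters standardised covariates with k-means and selects the candidate site nearest each cluster centre. It is a simplified, standard-library-only sketch under stated assumptions, not the procedure used in any of the cited studies; the function names, the random initialisation and the iteration cap are hypothetical choices. Supplying site coordinates instead of covariates gives a spatial coverage design in the same way.

```python
import random
from typing import List, Sequence


def _sqdist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def standardise(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each covariate to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant covariate has zero spread; fall back to 1.0 to avoid dividing by zero.
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def fscs(covariates: Sequence[Sequence[float]], n_samples: int,
         n_iter: int = 50, seed: int = 0) -> List[int]:
    """Return indices of n_samples candidate sites spread over feature space."""
    rng = random.Random(seed)
    z = standardise(covariates)
    # Initialise the cluster centres at randomly chosen candidate sites.
    centres = [list(z[i]) for i in rng.sample(range(len(z)), n_samples)]
    for _ in range(n_iter):
        clusters: List[List[List[float]]] = [[] for _ in centres]
        for row in z:
            nearest = min(range(n_samples), key=lambda k: _sqdist(row, centres[k]))
            clusters[nearest].append(row)
        # Move each centre to the mean of its members; keep it if the cluster is empty.
        new_centres = [[sum(c) / len(c) for c in zip(*members)] if members else centre
                       for members, centre in zip(clusters, centres)]
        if new_centres == centres:
            break
        centres = new_centres
    chosen: List[int] = []
    for centre in centres:
        # Take the closest candidate that has not already been selected.
        order = sorted(range(len(z)), key=lambda i: _sqdist(z[i], centre))
        chosen.append(next(i for i in order if i not in chosen))
    return chosen


if __name__ == "__main__":
    # Hypothetical two-covariate grid, for demonstration only.
    grid = [[float(i), float((i * 7) % 13)] for i in range(40)]
    print(fscs(grid, n_samples=5))
```

Because k-means depends on its starting centres, different seeds can give different designs; in practice several restarts are usually compared and the one with the smallest within-cluster spread is kept.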
In this study, we examine statistical models (i.e., OK, LMM, QRF, SVM, CubistRK) based on a large calibration sample size (i.e., n = 180) to predict topsoil (0–0.3 m) and subsoil (0.6–0.9 m) Ca and Mg, using digital data in combination (i.e., proximal sensing EMI, γ-ray spectrometry and DEM derivatives). We also investigate different sampling designs (i.e., SCS, FSCS, cLHS and SRS) and sample sizes (i.e., n = 180, 150, 120, 90, 60, 30), evaluating predictions against an independent validation dataset (i.e., n = 40) using Lin's concordance correlation coefficient (LCCC) and the ratio of performance to interquartile distance (RPIQ). DSMs of topsoil and subsoil Ca and Mg are produced to recommend lime and Mg application rates according to the Six-Easy-Steps management guidelines for the Herbert Valley.
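For readers unfamiliar with the two validation statistics, the sketch below computes them from paired observed and predicted values. It uses the usual definitions (LCCC as twice the covariance divided by the sum of the two variances and the squared difference in means; RPIQ as the interquartile range of the observations divided by the root mean square error). The function names are hypothetical, and the quartile convention (the default of Python's `statistics.quantiles`) is an assumption, since the convention used in the study is not stated in this section.

```python
from statistics import mean, quantiles
from typing import Sequence


def lccc(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(observed)
    mo, mp = mean(observed), mean(predicted)
    var_o = sum((o - mo) ** 2 for o in observed) / n
    var_p = sum((p - mp) ** 2 for p in predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted)) / n
    return 2 * cov / (var_o + var_p + (mo - mp) ** 2)


def rpiq(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Interquartile range of the observations divided by the RMSE."""
    q1, _, q3 = quantiles(observed, n=4)  # assumed quartile convention
    rmse = (sum((o - p) ** 2 for o, p in zip(observed, predicted))
            / len(observed)) ** 0.5
    return (q3 - q1) / rmse  # undefined (division by zero) for a perfect fit


if __name__ == "__main__":
    # Hypothetical values, for demonstration only.
    obs = [0.2, 0.5, 0.8, 1.1, 1.9, 2.4, 3.0]
    pred = [0.3, 0.4, 0.9, 1.3, 1.6, 2.5, 2.8]
    print(round(lccc(obs, pred), 3), round(rpiq(obs, pred), 2))
```

Unlike a plain correlation, LCCC penalises bias as well as scatter, so predictions that are consistently offset from the observations score below 1 even when perfectly correlated with them.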
Section snippets
Study site
The study site is located in the Herbert Valley, north Queensland, Australia (Fig. 1a). It covers 47 ha in the area known as the Lannercost Extension. The annual rainfall is 2218 mm, with approximately 75 % of rainfall in summer (i.e., December–March). The average annual temperature is 23.7 °C, with the warmest month on average being January (32.4 °C) and coldest in July (13.9 °C). The Herbert River is located on the northeastern margin and it gives rise to alluvial plains, derived from the
Descriptive data analysis
Table 2 shows summary statistics of measured topsoil (0–0.3 m) and subsoil (0.6–0.9 m) Ca and Mg for the calibration (n = 180) and validation (n = 40) datasets. In the calibration data (n = 180), the minimum (0.05) topsoil Ca was small (<0.2) while the maximum (3.8) was very large (>2.0). The median (0.8) was intermediate-large (0.6–0.8). On average, and with respect to the Six-Easy-Steps nutrient management guidelines (Table 1a), it would require 2.5 t/ha of lime to be applied across the
Conclusions
The results show that when all the proximal sensing digital data (i.e., EMI and γ-ray spectrometry) and DEM derivatives were employed in combination and the largest sample size (i.e., 180) was considered, CubistRK generally performed best for predicting topsoil (0–0.3 m) and subsoil (0.6–0.9 m) Ca and Mg, except that QRF performed slightly better for topsoil Ca. Utilising CubistRK, no single sampling design was universally better, and 180 samples were required for predicting topsoil Ca and Mg with
Credit author statement
Jie Wang: Conceptualisation, Methodology, Formal analysis, Data curation, Resources, Writing – original draft. Xueyu Zhao: Data curation, Visualisation, Software, Validation. Dongxue Zhao: Investigation, Visualisation. John Triantafilis: Supervision, Project administration, Funding acquisition, Writing – review & editing.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
The Australian Federal Government's Sugar Research Australia (SRA) is acknowledged for the financial support to conduct EMI and γ-ray surveys, soil sampling and laboratory analysis as part of project (2017/014) entitled “Seeing is believing: managing soil variability, improve crop yield and minimising off-site impacts in sugarcane using digital soil mapping”. The first author acknowledges a University of New South Wales (UNSW) “University International Postgraduate Award (UIPA)” scholarship
References (53)
- et al. Critical review of chemometric indicators commonly used for assessing the quality of the prediction of soil attributes by NIR spectroscopy. Trac. Trends Anal. Chem. (2010)
- Sampling for digital soil mapping: a tutorial supported by R scripts. Geoderma (2019)
- et al. Scoping for scale-dependent relationships between proximal gamma radiometrics and soil properties. Catena (2017)
- et al. Regression kriging as a workhorse in the digital soil mapper's toolbox. Geoderma (2018)
- et al. How far can the uncertainty on a Digital Soil Map be known?: a numerical experiment using pseudo values of clay content obtained from Vis-SWIR hyperspectral imagery. Geoderma (2019)
- et al. Digital soil mapping based site-specific nutrient management in a sugarcane field in Burdekin. Geoderma (2019)
- et al. Mapping key soil properties to support agricultural production in Eastern China. Geoderma Regional (2017)
- et al. Comparison of conditioned Latin hypercube and feature space coverage sampling for predicting soil classes using simulation from soil maps. Geoderma (2020)
- et al. A conditioned Latin hypercube method for sampling in the presence of ancillary information. Comput. Geosci. (2006)
- et al. Relative prediction intervals reveal larger uncertainty in 3D approaches to predictive digital soil mapping of soil properties with legacy data. Geoderma (2019)
- Chile and the Chilean soil grid: a contribution to GlobalSoilMap. Geoderma Regional
- Multivariable geostatistics in S: the gstat package. Comput. Geosci.
- A comparison of calibration sampling schemes at the field scale. Geoderma
- Using quantile regression forest to estimate uncertainty of digital soil mapping products. Geoderma
- A global spectral library to characterize the world's soil. Earth Sci. Rev.
- Sampling design optimization for soil mapping with random forest. Geoderma
- An R package for spatial coverage sampling and random sampling from compact geographical strata by k-means. Comput. Geosci.
- Effect of exchangeable Mg on saturated hydraulic conductivity, disaggregation and clay dispersion of disturbed soils. J. Hydrol.
- Soil exchangeable cations estimation using Vis-NIR spectroscopy in different depths: effects of multiple calibration models and spiking. Comput. Electron. Agric.
- Digital mapping of soil organic carbon contents and stocks in Denmark. PloS One
- Spatial distribution of surface soil acidity, electrical conductivity, soil organic carbon content and exchangeable potassium, calcium and magnesium in some cropped acid soils of India. Land Degrad. Dev.
- Gamma-radiometrics, a remote sensing tool for understanding soils. Australian Collaborative Land Evaluation Program Newsletter
- Random forests. Mach. Learn.
- Role of magnesium in carbon partitioning and alleviating photooxidative damage. Physiol. Plantarum
- Support-vector networks. Mach. Learn.
- Field-scale apparent soil electrical conductivity