Modelling bioaccumulation of heavy metals in soil-crop ecosystems and identifying its controlling factors using machine learning

https://doi.org/10.1016/j.envpol.2020.114308

Highlights

  • Three methods were used to model the accumulation of heavy metals in soil-crop systems.

  • Random forest performed best in modelling heavy metal transport in soil-crop systems.

  • The model performed best for Zn (R² = 0.84), followed by Cu, Cr, Ni, Hg, Cd, As, and Pb.

  • Plant type was the primary control on heavy metal transport in soil-crop systems.

Abstract

Predicting heavy metal transfer in soil-crop ecosystems and identifying the factors that control it are of critical importance. In this study, random forest (RF), gradient boosted machine (GBM), and generalised linear (GLM) models were used to model the transfer of heavy metals (HMs) in soil-crop systems in the Yangtze River Delta, China, and to identify the main factors affecting it, based on 13 covariates and 1822 pairs of soil-crop samples. The three models were then compared. The mean bioaccumulation factors (BAFs) for all crops followed the order Cd > Zn > As > Cu > Ni > Hg > Cr > Pb. The RF model showed the best prediction ability for the BAFs of HMs in soil-crop ecosystems, followed by GBM and GLM. The R² values of the RF models for the BAFs of Zn, Cu, Cr, Ni, Hg, Cd, As, and Pb were 0.84, 0.66, 0.59, 0.58, 0.58, 0.51, 0.30, and 0.17, respectively. The primary controlling factor in the soil-to-crop transfer of all HMs under study was plant type, followed by soil heavy metal content and soil organic matter. The model can assist the prediction of heavy metal contents in crops from heavy metal contents in soil and other covariates, and can significantly reduce the cost, labour, and time required for laboratory analysis. It can also be used to quantify the importance of variables and identify potential controlling factors in heavy metal bioaccumulation in soil-crop ecosystems.
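To make the quantity being modelled concrete, the following minimal sketch computes a BAF and uses it to estimate a crop content from a soil content. It assumes the common definition of the BAF as the ratio of the HM concentration in the crop to that in the soil; this excerpt does not state the exact definition used in the study. The function names and the sample values are hypothetical and are not data from the paper.

```python
def bioaccumulation_factor(crop_mg_kg, soil_mg_kg):
    """BAF = HM concentration in crop / HM concentration in soil (assumed definition)."""
    if soil_mg_kg <= 0:
        raise ValueError("soil concentration must be positive")
    return crop_mg_kg / soil_mg_kg


def predicted_crop_content(baf, soil_mg_kg):
    """Invert the definition: estimate crop content from a (predicted) BAF."""
    return baf * soil_mg_kg


# Hypothetical paired (crop, soil) concentrations in mg/kg; NOT data from the study.
samples = {
    "Cd": [(0.04, 0.20), (0.06, 0.24)],
    "Zn": [(22.0, 110.0), (18.0, 120.0)],
    "Pb": [(0.10, 40.0), (0.09, 45.0)],
}

# Mean BAF per metal, then rank metals from highest to lowest mean BAF.
mean_baf = {
    metal: sum(bioaccumulation_factor(c, s) for c, s in pairs) / len(pairs)
    for metal, pairs in samples.items()
}
ranking = sorted(mean_baf, key=mean_baf.get, reverse=True)
print(ranking)
```

With a BAF predicted by a model, `predicted_crop_content` gives the corresponding crop content estimate for a measured soil content, which is the use case the abstract describes.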

Introduction

Soil plays a critical role in ecosystems and is the basic resource for food production. However, soil is threatened by many pollutants, among which heavy metals (HMs) are of great concern (Hu and Cheng, 2013; Zhao et al., 2014; Shao et al., 2018; Hu et al., 2019; Peng et al., 2019). In contrast to organic matter, HMs can reside in soils for long periods of time (Hu et al., 2017a; Jia et al., 2019). The wide distribution of HM-polluted soils has resulted in profound environmental and health issues (Brus et al., 2009; Hu et al., 2017b; Hu et al., 2020). HMs in soils can accumulate slowly but steadily in the human body via different pathways such as inhalation, ingestion, and dermal contact (De Miguel et al., 2007; Wei and Yang, 2010; Hui et al., 2016). Moreover, the intake of HMs through soil-crop ecosystems has proven to be an important pathway for human exposure to HMs (Liu et al., 2007; Qian et al., 2010; Liu et al., 2013).

Over the past four decades, China has experienced rapid economic growth and has undergone a significant transformation from a traditional agriculture-based economy to a manufacturing-based economy (Cheng and Hu, 2010). Industrial development and fast urban expansion have resulted in sharp increases in the amount of sewage sludge produced by industrial, transportation-related, and municipal activities nationwide (Cheng and Hu, 2012). The Yangtze River Delta (YRD), located in Eastern China, is the most developed area in the country and yields 23.6% of the national gross domestic product (Yan et al., 2018). With a population of more than 150 million, the YRD is one of the most densely populated regions in China (Shao et al., 2018; Hu et al., 2017b). It is also a major production area for many important agricultural products such as rice, tea, vegetables, and certain fruits. Many studies have revealed significant accumulation of HMs in soil and crops in the YRD region (Chen and Pu, 2007; Hu et al., 2017c; Fei et al., 2019; Xia et al., 2019). The accumulation of HMs in soil and their subsequent bioaccumulation through the food chain pose great potential health risks to citizens in the YRD. Therefore, it is of great importance to explore HM bioaccumulation in soil-crop systems and identify its controlling factors.

Bioaccumulation of soil HMs by plants is mainly governed by the uptake mechanism, the physical and chemical properties of the soil, the chemical characteristics of the HMs, the physiological characteristics of the plants, and other environmental factors (Peijnenburg et al., 2007). Previous studies analysing the factors affecting the bioaccumulation of HMs in soil-plant systems have mainly focused on these factors and have employed traditional statistical analyses (Jin et al., 2005; Liu et al., 2017; Liu et al., 2018). These models only consider quantitative variables, focusing mostly on critical soil properties such as soil pH, soil organic matter (SOM) content, and soil HM content. However, they cannot be used to quantitatively analyse the effect of qualitative variables such as soil type and soil parent material, which have also been shown to exert substantial effects on HM accumulation in soil-crop systems. Few studies have considered these different factors together or quantitatively analysed the detailed effects of each potential factor and the spatial characteristics of the bioaccumulation of different HMs. Further, few researchers have succeeded in building a comprehensive model that is able to predict bioaccumulation factors (BAFs) of HMs in soil-crop systems. Moreover, the methods used in previous studies were unable to identify the importance of the different variables in the modelling process, which is vital for decision-makers in identifying the essential factors for HM pollution control and remediation in soils.

To fill this research gap, we compared three widely used modelling methods: two machine learning methods, namely the gradient boosted machine (GBM) (Friedman, 2002) and random forest (RF) (Liaw and Wiener, 2002), and a classic linear statistical model, the generalised linear model (GLM) (Nelder and Wedderburn, 1972). These were used to predict the BAFs of eight HMs and to quantify the effects of different factors on their bioaccumulation. This study had three main aims. The first was to investigate the HM contents and the BAFs of each HM in soil-plant ecosystems. The second was to build and compare models constructed using the GBM, RF, and GLM methods for predicting the BAFs of HMs in soil-plant ecosystems, and to identify the optimal method. The third was to identify potential controlling factors in the transfer of HMs in soil-crop ecosystems. This study could contribute to the regulation of HM contamination in soil-crop ecosystems and to guaranteeing the safety of agricultural products.
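The comparison outlined in these aims can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses Python with scikit-learn (this excerpt does not name the software used), `LinearRegression` stands in for a Gaussian GLM with an identity link, the data are synthetic, and the four covariates (plant type, soil HM content, SOM, pH) are a hypothetical subset of the 13 used in the study. The categorical plant type is one-hot encoded so that all three models can use it.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 300

# Synthetic covariates (hypothetical ranges, NOT data from the study).
plant_type = rng.integers(0, 3, size=n)   # three hypothetical crop classes
soil_hm = rng.uniform(50.0, 200.0, size=n)
som = rng.uniform(10.0, 50.0, size=n)
ph = rng.uniform(4.5, 8.0, size=n)

# Synthetic BAF in which plant type is the dominant control (an assumption).
plant_effect = np.array([0.05, 0.30, 0.15])[plant_type]
baf = plant_effect * (100.0 / soil_hm) + 0.001 * som + rng.normal(0.0, 0.01, size=n)

# One-hot encode the qualitative variable and stack it with the quantitative ones.
X = np.column_stack([np.eye(3)[plant_type], soil_hm, som, ph])

models = {
    "RF": RandomForestRegressor(n_estimators=100, random_state=0),
    "GBM": GradientBoostingRegressor(random_state=0),
    "GLM": LinearRegression(),  # Gaussian GLM with identity link
}

# Mean 5-fold cross-validated R² for each model.
scores = {
    name: float(cross_val_score(model, X, baf, cv=5, scoring="r2").mean())
    for name, model in models.items()
}

# Impurity-based variable importance from a random forest fitted on all data;
# the three one-hot columns are summed into a single plant-type importance.
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, baf)
imp = rf.feature_importances_
importance = {
    "plant_type": float(imp[:3].sum()),
    "soil_hm": float(imp[3]),
    "som": float(imp[4]),
    "ph": float(imp[5]),
}

print(scores)
print(importance)
```

Summing the one-hot columns is one simple way to report a single importance for a qualitative variable. Impurity-based importance is only one of several possible measures, and the study may have used a different one.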

Section snippets

Study area

The region under study (28°51′–30°33′N and 120°55′–122°16′E) lies in the southern part of the YRD, the most developed region in China (Fig. 1). The study region has a population of roughly 6 million people and an approximate area of 9800 km². It is well known for industrial and commercial activities as well as foreign trade. It is of interest to note that the area is home to the largest international port in the world. In recent decades, the studied region has been experiencing rapid and intensive

Summary statistics of HMs in soil-crop ecosystems

The basic descriptive statistics of HM contents in the soil samples are listed in Table S5. The average contents of HMs in soil followed the order Zn (111.16 mg kg⁻¹) > Cr (69.64 mg kg⁻¹) > Pb (42.89 mg kg⁻¹) > Cu (35.50 mg kg⁻¹) > Ni (29.99 mg kg⁻¹) > As (6.67 mg kg⁻¹) > Hg (0.31 mg kg⁻¹) > Cd (0.20 mg kg⁻¹). The contents of all HMs in soil were higher than their background levels in soils (Table S5) but significantly lower than the national limit values (Table S6) (CNEMC, 1990). The

HM content in soil-crop systems

The soil contents of the eight HMs targeted in this study were significantly higher than the corresponding soil background values in the research area (Table S5), which clearly demonstrates accumulation caused by anthropogenic contributions. For HMs in plants, there is still no complete national standard regulating HM concentrations in different kinds of crops. This makes it difficult to assess the levels of HMs in crops.

In the research area studied, crops more easily absorbed

Conclusions

In this study, we analysed HM contents in 1822 paired soil-crop samples. The GBM, RF, and GLM models were adopted to predict the BAFs of different HMs in soil-crop ecosystems based on 13 auxiliary variables, and the importance of the different variables in the models was quantified. The method used in this study could contribute to the prediction of HM contents in crops based on the HM contents in soil and other available information about the soil. This could result in significant savings in

CRediT authorship contribution statement

Bifeng Hu: Conceptualization, Data curation, Formal analysis, Methodology, Writing - original draft, Writing - review & editing. Jie Xue: Data curation. Yin Zhou: Formal analysis, Writing - original draft. Shuai Shao: Data curation, Formal analysis. Zhiyi Fu: Formal analysis, Methodology. Yan Li: Supervision. Songchao Chen: Methodology, Writing - review & editing. Lin Qi: Data curation. Zhou Shi: Conceptualization, Funding acquisition, Supervision, Writing - original draft, Writing - review &

Declaration of competing interest

The authors have declared that no competing interests exist.

Acknowledgment

This work was supported by the National Key Research and Development Program of China (2018YFC1800105) and the Key Research and Development Project of Zhejiang Province, China (2015C02011). We also acknowledge the support received by Bifeng Hu from the China Scholarship Council (under grant agreement No. 201706320317) for 3 years’ Ph.D. study in INRA and Orléans University.

References (81)

  • B.F. Hu et al.

    Identifying heavy metal pollution hot spots in soil-rice systems: a case study in South of Yangtze River Delta, China

    Sci. Total Environ.

    (2019)
  • B.F. Hu et al.

    Composite assessment of human health risk from potentially toxic elements through multiple exposure routes: a case study in farmland in an important industrial city in East China

    J. Geochem. Explor.

    (2020)
  • X.L. Jia et al.

    A methodological framework for identifying potential sources of soil heavy metal pollution based on machine learning: a case study in the Yangtze Delta, China

    Environ. Pollut.

    (2019)
  • C.W. Jin et al.

    Lead contamination in tea garden soils and factors affecting its bioavailability

    Chemosphere

    (2005)
  • K. Kalbitz et al.

    Mobilization of heavy metals and arsenic in polluted wetland soils and its dependence on dissolved organic matter

    Sci. Total Environ.

    (1998)
  • P.S. Kidd et al.

    Bioavailability and plant accumulation of heavy metals and phosphorus in agricultural soils amended by long-term application of sewage sludge

    Chemosphere

    (2007)
  • N. Li et al.

    Concentration and transportation of heavy metals in vegetables and risk assessment of human exposure to bioaccessible heavy metals in soil near a waste-incinerator site, South China

    Sci. Total Environ.

    (2015)
  • X. Liu et al.

    Human health risk assessment of heavy metals in soil-vegetable system: a multi-medium analysis

    Sci. Total Environ.

    (2013)
  • B. Liu et al.

    Assessment of the bioavailability, bioaccessibility and transfer of heavy metals in the soil-grain-human systems near a mining and smelting area in NW China

    Sci. Total Environ.

    (2017)
  • C. Liu et al.

    Cadmium accumulation in edible flowering cabbages in the Pearl River Delta, China: critical soil factors and enrichment models

    Environ. Pollut.

    (2018)
  • M. Malandrino et al.

    Accumulation of heavy metals from contaminated soil to plants and evaluation of soil remediation by vermiculite

    Chemosphere

    (2011)
  • D.S. Manta et al.

    Heavy metals in urban soils: a case study from the city of Palermo (Sicily), Italy

    Sci. Total Environ.

    (2002)
  • M. Meng et al.

    Accumulation of total mercury and methylmercury in rice plants collected from different mining areas in China

    Environ. Pollut.

    (2014)
  • W.J.G.M. Peijnenburg et al.

    Monitoring metals in terrestrial environments within a bioavailability framework and a focus on soil extraction

    Ecotoxicol. Environ. Saf.

    (2007)
  • J. Peng et al.

    Estimating soil salinity from remote sensing and terrain data in southern Xinjiang Province, China

    Geoderma

    (2019)
  • N. Pouladi et al.

    Mapping soil organic matter contents at field level with Cubist, Random Forest and kriging

    Geoderma

    (2019)
  • Y.Z. Qian et al.

    Concentrations of cadmium, lead, mercury and arsenic in Chinese market milled rice and associated population health risk

    Food Contr.

    (2010)
  • N. Rascio et al.

    Heavy metal hyperaccumulating plants: how and why do they do it? And what makes them so interesting?

    Plant Sci.

    (2011)
  • C.Y. Sun et al.

    Multivariate and geostatistical analyses of the spatial distribution and sources of heavy metals in agricultural soil in Dehui, Northeast China

    Chemosphere

    (2013)
  • P. Wang et al.

    Effects of Pb on the oxidative stress and antioxidant response in a Pb bioaccumulator plant Vallisneria natans

    Ecotoxicol. Environ. Saf.

    (2012)
  • B. Wei et al.

    A review of heavy metal contaminations in urban soils, urban road dusts and agricultural soils from China

    Microchem. J.

    (2010)
  • B. Wen et al.

    Zn, Ni, Mn, Cr, Pb and Cu in soil-tea ecosystem: the concentrations, spatial relationship and potential control

    Chemosphere

    (2018)
  • L. Xiao et al.

    The influence of bioavailable heavy metals and microbial parameters of soil on the metal accumulation in rice grain

    Chemosphere

    (2017)
  • D. Xue et al.

    Comparative proteomic analysis provides new insights into cadmium accumulation in rice grain under cadmium stress

    J. Hazard Mater.

    (2014)
  • Q.W. Yang et al.

    Cadmium in soil–rice system and health risk associated with the use of untreated mining wastewater for irrigation in Lechang, China

    Agric. Water Manag.

    (2006)
  • Y. Yang et al.

    Assessing cadmium exposure risks of vegetables with plant uptake factor and soil property

    Environ. Pollut.

    (2018)
  • F. Zang et al.

    Accumulation, spatio-temporal distribution, and risk assessment of heavy metals in the soil-corn system around a polymetallic mining area from the Loess Plateau, northwest China

    Geoderma

    (2017)
  • F. Zeng et al.

    The influence of pH and organic matter content in paddy soil on heavy metal availability and their uptake by rice plants

    Environ. Pollut.

    (2011)
  • J.R. Zhang et al.

    Bioavailability and soil-to-crop transfer of heavy metals in farmland soils: a case study in the Pearl River Delta, South China

    Environ. Pollut.

    (2018)
  • K. Zhao et al.

    Heavy metal contaminations in a soil–rice system: identification of spatial dependence in relation to soil properties of paddy fields

    J. Hazard Mater.

    (2010)

This paper has been recommended for acceptance by Wen-Xiong Wang.