Novel approach for soil classification using machine learning methods

  • Original Paper
  • Bulletin of Engineering Geology and the Environment

Abstract

In this study, we propose a new classification method for determining soil classes based on three machine learning approaches: support vector classification (SVC), multilayer perceptron (MLP), and random forest (RF). To develop the models, we used a database of 4888 soil samples obtained from projects in Vietnam. Fifteen soil property factors (variables) were selected as input parameters for classifying the samples into five soil classes: lean clay (CL), elastic silt (MH), fat clay (CH), clayey sand (SC), and silt (ML). To evaluate and analyze the results quantitatively and qualitatively, we used learning curves (time and number of training samples), confusion matrices, and statistical metrics including precision, recall, accuracy, and F1-score. The results indicate that all three models perform well (average accuracy score = 0.968) and that the SVC model (accuracy score = 0.984) classifies soils most accurately.
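To make the evaluation metrics named in the abstract concrete, the following minimal sketch derives accuracy and per-class precision, recall, and F1-score from a multiclass confusion matrix. It uses the standard definitions of these metrics; the abstract does not state how the authors computed or averaged them. The confusion matrix counts are invented for illustration and are not results from the paper, and the function names are hypothetical.

```python
from typing import Dict, List

# Class labels taken from the abstract.
LABELS = ["CL", "MH", "CH", "SC", "ML"]

# Hypothetical confusion matrix (made-up counts, NOT the paper's results).
# Rows are true classes, columns are predicted classes, in LABELS order.
CONFUSION = [
    [50, 2, 1, 0, 1],
    [3, 40, 2, 0, 0],
    [1, 1, 45, 0, 0],
    [0, 0, 0, 30, 2],
    [2, 0, 0, 1, 19],
]


def accuracy(cm: List[List[int]]) -> float:
    """Fraction of all samples that lie on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_metrics(
    cm: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """Precision, recall, and F1-score for each class, treated one-vs-rest."""
    n = len(cm)
    metrics: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum: TP + FP
        actual_k = sum(cm[k])                          # row sum: TP + FN
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


if __name__ == "__main__":
    print(f"accuracy = {accuracy(CONFUSION):.3f}")
    for label, m in per_class_metrics(CONFUSION, LABELS).items():
        print(
            f"{label}: precision={m['precision']:.3f} "
            f"recall={m['recall']:.3f} f1={m['f1']:.3f}"
        )
```

Reading the matrix by rows gives recall (how many samples of a true class were recovered), while reading it by columns gives precision (how many predictions of a class were correct), which is why a single accuracy figure can hide weak performance on a small class.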

Acknowledgements

This study was funded by the Ministry of Education and Training under grant number B2020-GHA-03, chaired by the University of Transportation. The authors would like to thank the Department of Science, Technology, and Environment (Ministry of Education and Training), the University of Transport and Communications, and other agencies for their support and for providing the data used in this research.

Author information

Contributions

Manh Duc Nguyen: conceptualization, methodology, software, writing—review and editing, validation, and supervision. Romulus Costache: conceptualization, methodology, software, writing—review and editing, validation, and supervision. An Ho Sy: data curation, writing—original draft, software, and validation. Peyman Yariyan: data curation, writing—original draft, software, and validation. Hassan Ahmadzadeh: data curation, writing—original draft, software, and validation. Hiep Van Le: data curation, writing—original draft, software, and validation. Indra Prakash: writing—review and editing, validation, and supervision. Binh Thai Pham: conceptualization, methodology, software, writing—review and editing, validation, and supervision.

Corresponding author

Correspondence to Binh Thai Pham.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Nguyen, M.D., Costache, R., Sy, A.H. et al. Novel approach for soil classification using machine learning methods. Bull Eng Geol Environ 81, 468 (2022). https://doi.org/10.1007/s10064-022-02967-7

  • DOI: https://doi.org/10.1007/s10064-022-02967-7
