Spatial prediction using random forest spatial interpolation with sample augmentation: a case study for precipitation mapping

  • Research
Earth Science Informatics

Abstract

Spatial prediction (SP) based on machine learning (ML) has been applied to soil water quality, air quality, the marine environment, and other domains. However, it still handles small samples poorly: ML normally requires large numbers of training samples to prevent underfitting, and the data augmentation (DA) methods mixup and synthetic minority over-sampling technique (SMOTE) ignore the similarity of geographic information. This paper therefore proposes a modified upsampling method and combines it with random forest spatial interpolation (RFSI) to address the small-sample problem in geographical space. The modification has two aspects. First, when selecting nearest points, the method chooses points that belong to the same category after classification and therefore share similar geographic information in some respects. Second, the difference used in upsampling is computed within each category. To verify the effectiveness of the proposed method, we use daily precipitation data for Chongqing for January 2018. The experimental results show that combining the modified upsampling method with RFSI effectively improves the accuracy of SP.
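The abstract does not give the algorithm in detail, so the following is only a minimal sketch of what a category-constrained, SMOTE-style upsampling step might look like, not the authors' implementation. It assumes that each sample already carries a category label from a prior classification step, that neighbors are chosen by Euclidean distance among samples of the same category, and that the target value is interpolated with the same weight as the features. The function name `category_constrained_upsample`, its parameters, and the toy data are hypothetical.

```python
import numpy as np

def category_constrained_upsample(X, y, labels, n_new, k=3, seed=0):
    """SMOTE-style interpolation restricted to samples of the same category.

    X      : (n, d) features, e.g. longitude, latitude, elevation (assumed)
    y      : (n,) target values, e.g. daily precipitation (assumed)
    labels : (n,) category ids from a prior classification step (assumed)
    n_new  : number of synthetic samples to generate
    k      : number of same-category nearest neighbors to choose from
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    labels = np.asarray(labels)

    # Only categories with at least two members can supply a neighbor.
    cats, counts = np.unique(labels, return_counts=True)
    usable = np.flatnonzero(np.isin(labels, cats[counts >= 2]))
    if usable.size == 0:
        raise ValueError("no category has two or more samples")

    index = np.arange(len(y))
    X_new, y_new, l_new = [], [], []
    for _ in range(n_new):
        i = rng.choice(usable)
        # Candidate neighbors: same category as sample i, excluding i itself.
        same = np.flatnonzero((labels == labels[i]) & (index != i))
        dist = np.linalg.norm(X[same] - X[i], axis=1)
        nearest = same[np.argsort(dist)[:k]]
        j = rng.choice(nearest)
        # The difference (X[j] - X[i]) is taken within one category only.
        lam = rng.random()
        X_new.append(X[i] + lam * (X[j] - X[i]))
        y_new.append(y[i] + lam * (y[j] - y[i]))  # assumption: target is interpolated too
        l_new.append(labels[i])
    return np.array(X_new), np.array(y_new), np.array(l_new)

# Toy data: two well-separated categories (illustrative values only).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0], [11.0, 10.0]])
y = np.array([1.0, 2.0, 3.0, 20.0, 22.0])
labels = np.array([0, 0, 0, 1, 1])
X_aug, y_aug, l_aug = category_constrained_upsample(X, y, labels, n_new=6)
```

Because each synthetic point is a convex combination of two samples from one category, it stays inside that category's range of features and target values. The augmented samples could then be appended to the training set of a spatial model such as RFSI.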

Data availability

For the data and materials used in this paper, please contact 2020112038@chd.edu.cn.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 42071316 and 12001057, the Science and Technology Project of Inner Mongolia Autonomous Region under Grant 2021ZD0045, the Project of Chongqing Agricultural Industry Digital Map under Grant 21C00346, the Key Research and Development Program of Shaanxi under Grant 2021NY-170, the National Key Research and Development Program under Grants 2021YFB3900905 and 2021YFB3901300, and the Fundamental Research Funds for the Central Universities, CHD, under Grants 300102122101 and 300102269103.

Author information

Contributions

Jiao Sijia carried out the data preparation, performed the experiments and the experimental analysis, and wrote the manuscript. Wu Tianjun outlined the research topic, proposed the research methodology, and designed the experiments. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Wu Tianjun.

Ethics declarations

Conflict of interest

There is no conflict of interest.

Additional information

Communicated by: H. Babaie

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sijia, J., Tianjun, W., Jiancheng, L. et al. Spatial prediction using random forest spatial interpolation with sample augmentation: a case study for precipitation mapping. Earth Sci Inform 16, 863–875 (2023). https://doi.org/10.1007/s12145-023-00936-6

  • DOI: https://doi.org/10.1007/s12145-023-00936-6
