Data mining techniques applied to statistical prediction of monthly precipitation in Gran Chaco Argentina

  • Original Paper
Theoretical and Applied Climatology

Abstract

Data mining techniques are currently a powerful tool for forecasting on seasonal time scales. In this work, neural networks, support vector regression and generalized additive models are considered alongside the most commonly used methodology, multiple linear regression, to obtain precipitation forecasting models for the “Gran Chaco Argentino” area. The results indicate that data mining techniques improve on forecasts derived from other methodologies, although the efficiency of each methodology is highly dependent on the month and the region. The non-linear techniques improve the forecasts and show lower mean square error than multiple linear regression and support vector regression. The root mean square error is higher in the east of the study area than in the west because precipitation is higher there. The coefficient of variation is quite low in all months in the central and southwest parts of the area. The precipitation interval with the highest probability of occurrence showed a value of 1.5. In addition, generating ensemble means of several models and deriving categorical forecasts is a highly advisable alternative for prediction in this region of Argentina. The derived forecasts improve on the dynamical models of the world forecasting centers only in some regions of the study area.
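The verification quantities named in the abstract (root mean square error, coefficient of variation, ensemble mean, categorical forecast) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the use of NumPy, the sample standard deviation in the coefficient of variation, and the choice of tercile boundaries for the categories are all assumptions made for illustration only.

```python
import numpy as np


def rmse(obs, pred):
    """Root mean square error between observed and predicted precipitation."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    values = np.asarray(values, dtype=float)
    return float(np.std(values, ddof=1) / np.mean(values))


def ensemble_mean(member_forecasts):
    """Average several model forecasts; rows are models, columns are cases."""
    return np.mean(np.asarray(member_forecasts, dtype=float), axis=0)


def tercile_category(values, climatology):
    """Label each value as below, normal or above, relative to the terciles
    of a climatological sample (tercile categories are an assumption here)."""
    lower, upper = np.percentile(climatology, [100 / 3, 200 / 3])
    values = np.asarray(values, dtype=float)
    return np.where(values < lower, "below",
                    np.where(values > upper, "above", "normal"))
```

Under these assumptions, a forecast from each statistical model would be stacked row-wise, averaged with `ensemble_mean`, scored against observations with `rmse`, and converted to categories with `tercile_category` using the historical record for the same month and station as the climatology.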

Data availability

The datasets generated during and/or analyzed during the current study are not publicly available but are available from the corresponding author on reasonable request.

Code availability

Not applicable.

Acknowledgements

Rainfall data was provided by the National Meteorological Service of Argentina (SMN) and the National Institute of Agricultural Technology (INTA). Thanks to the Copernicus Climate Change Service for providing the dynamical models data. The forecasts from the US National Centers for Environmental Prediction (NCEP), the Japan Meteorological Agency (JMA), and Environment and Climate Change Canada (ECCC) are in-kind contributions to the Copernicus Climate Change Service, which we acknowledge with gratitude.

Funding

This work was supported by 2020–2022 UBACyT 20020190100090BA and 2018–2020 UBACyT 20620170100012BA projects.

Author information

Contributions

All the authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by all the authors. All the authors read and approved the final manuscript.

Corresponding author

Correspondence to Alfredo L. Rolla.

Ethics declarations

Ethics approval

Not applicable.

Consent to participate

All the authors have consented to participate in the study.

Consent for publication

All the authors have consented to publish the study.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

González, M.H., Rolla, A.L. Data mining techniques applied to statistical prediction of monthly precipitation in Gran Chaco Argentina. Theor Appl Climatol 150, 1027–1043 (2022). https://doi.org/10.1007/s00704-022-04209-y

  • DOI: https://doi.org/10.1007/s00704-022-04209-y
