Abstract
Bayesian networks (BNs) are increasingly applied in environmental research. Nonetheless, most of the environmental science literature uses discrete or discretized data, which entails a loss of information. We propose a novel methodology based on continuous BNs to predict the probability that surface waters fail to meet the nitrate concentration standards established by the European Water Framework Directive. To this end, a Tree Augmented Naive Bayes (TAN) model was developed and applied to estimate and map the risk of failing to meet those standards. The TAN models were tested by means of k-fold cross-validation. The results showed that the TAN model produced adequate risk maps and suggested that poor water quality is highly probable in watersheds dominated by irrigated herbaceous crops. In contrast, “good surface water status” is more likely to occur in areas where forest is notably present.









Notes
The digital terrain model (DTM), with a grid width of 200 m, was provided by the Spanish National Geographic Institute (http://www.ign.es/ign/layoutIn/modeloDigitalTerreno.do).
Risk level classes: very low=[0–0.1], low=(0.1–0.3], moderate=(0.3–0.5], high=(0.5–0.8] and very high=(0.8–1] (Dlamini 2011).
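As an illustration of the class boundaries listed in the note above, the following minimal sketch maps a predicted probability to its risk level. The names `UPPER_BOUNDS`, `LABELS` and `risk_class` are hypothetical and are not taken from the article; the sketch only assumes that each class is closed at its upper bound, as the interval notation indicates.

```python
from bisect import bisect_left

# Upper bounds of the five risk classes. Each class includes its upper bound,
# e.g. low = (0.1, 0.3]. These names are illustrative assumptions.
UPPER_BOUNDS = [0.1, 0.3, 0.5, 0.8, 1.0]
LABELS = ["very low", "low", "moderate", "high", "very high"]


def risk_class(probability: float) -> str:
    """Return the risk level class for a probability in [0, 1]."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # bisect_left returns the index of the first bound >= probability,
    # so a value equal to a bound falls in the lower class.
    return LABELS[bisect_left(UPPER_BOUNDS, probability)]


if __name__ == "__main__":
    for p in (0.05, 0.3, 0.65, 0.95):
        print(p, risk_class(p))
```

Using `bisect_left` rather than `bisect_right` is what makes the intervals right-closed, so a probability of exactly 0.3 is classed as low rather than moderate.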
References
Aalders I, Hough RL, Towers W (2011) Risk of erosion in peat soils—an investigation using Bayesian belief networks. Soil Use Manag 27:538–549
Aguilera PA, Fernández A, Reche F, Rumí R (2010) Hybrid Bayesian network classifiers: application to species distribution models. Environ Model Softw 25(12):1630–1639
Aguilera PA, Fernández A, Fernández R, Rumí R, Salmerón A (2011) Bayesian networks in environmental modelling. Environ Model Softw 26:1376–1388
Aguilera PA, Fernández A, Ropero RF, Molina L (2013) Groundwater quality assessment using data clustering based on hybrid Bayesian networks. Stoch Environ Res Risk Assess 27(2):435–447
Ames DP, Neilson BT, Stevens DK, Lall U (2005) Using Bayesian networks to model watershed management decisions: an East Canyon Creek case study. J Hydroinformatics 7:267–282
Barca E, Passarella G (2008) Spatial evaluation of the risk of groundwater quality degradation: a comparison between disjunctive kriging and geostatistical simulation. Environ Monit Assess 137:261–273
Borin M, Vianello M, Morari F, Zanin G (2005) Effectiveness of buffer strips in removing pollutants in runoff from a cultivated field in North-East Italy. Agric Ecosyst Environ 101:101–114
Borsuk ME, Stow CA, Reckhow KH (2004) A Bayesian network of eutrophication models for synthesis, prediction, and uncertainty analysis. Ecol Model 173:219–239
Bressan GM, Oliveira VA, Hruschka ER, Nicoletti MC (2009) Using Bayesian networks with rule extraction to infer risk of weed infestation in a corn-crop. Eng Appl Artif Intell 22:579–592
Causapé J, Quílez D, Aragüés R (2006) Irrigation efficiency and quality of irrigation return flows in the Ebro river basin: an overview. Environ Monit Assess 117:451–461. doi:10.1007/s10661-006-0763-8
Chan T, Ross H, Hoverman S, Powell B (2010) Participatory development of a Bayesian network model for catchment-based water resource management. Water Resour Res. doi:10.1029/2009WR008848
Diamantino C, Henriques MJ, Oliveira MM, Lobo-Ferreira JP (2005) Methodologies for pollution risk assessment of water resources systems. In: The fourth inter-Celtic colloquium on hydrology and management of water resources
Dlamini WM (2011) Application of Bayesian networks for fire risk mapping using GIS and remote sensing data. GeoJournal 76:283–296
Dyer F, ElSawah S, Croke B, Griffiths R, Harrison E, Lucena-Moya P, Jakeman A (2014) The effects of climate change on ecologically-relevant flow regime and water quality attributes. Stoch Environ Res Risk Assess 28:67–82
Eimers L, Weaver JC, Terziotti S, Midgette RW (2000) Methods of rating unsaturated zone and watershed characteristics of public water supplies in North Carolina. Technical report, U.S. Geological Survey. Water-Resources Investigations
Elvira-Consortium (2002) Elvira: An environment for probabilistic graphical models. In: Proceedings of the first European workshop on probabilistic graphical models (PGM’02), pp 222–230
Fernández A, Salmerón A (2008) Extension of Bayesian network classifiers to regression problems. In: Geffner H, Prada R, Alexandre IM, David N (eds) Advances in artificial intelligence—IBERAMIA 2008. Lecture Notes in Artificial Intelligence, vol 5290. Springer, Lisbon, pp 83–92
Fernández A, Morales M, Salmerón A (2007) Tree augmented naïve Bayes for regression using mixtures of truncated exponentials: applications to higher education management. IDA’07. Lect Notes Comput Sci 4723:59–69
Fernández-Escobar R, Marin L, Sánchez-Zamora MA, García-Novelo JM, Molina-Soria C, Parra MA (2009) Long-term effects of N fertilization on cropping and growth of olive trees and on N accumulation in soil profile. Eur J Agron 31:223–232. doi:10.1016/j.eja.2009.08.001
Fienen MN, Masterson JP, Plant NG, Gutierrez BT, Thieler ER (2013) Bridging groundwater models and decision support with a Bayesian network. Water Resour Res 49:6459–6473
Foley JA, DeFries R, Asner GP, Barford C, Bonan G, Carpenter SR, Chapin FS, Coe MT, Daily GC, Gibbs HK, Helkowski JH, Holloway T, Howard EA, Kucharik CJ, Monfreda C, Patz JA, Prentice IC, Ramankutty N, Snyder PK (2005) Global consequences of land use. Science 309:570–574. doi:10.1126/science.1111772
Friedman N, Geiger D, Goldszmidt M (1997) Bayesian network classifiers. Mach Learn 29:131–163
Fytilis N, Rizzo DM (2013) Coupling self-organizing maps with a Naïve Bayesian classifier: stream classification studies using multiple assessment data. Water Resour Res 49:7747–7762
Gillentine J (2000) Source Water Assessment and Protection Program. Technical report, New Mexico Environmental Department. Drinking Water Bureau, Appendix E - WRASTIC Index: Watershed vulnerability estimation using WRASTIC
Giupponi C, Eiselt E, Ghetti P (1999) A multicriteria approach for mapping risks of agricultural pollution for water resources: the Venice Lagoon watershed case study. J Environ Manag 56:259–269
Grimm JW, Lynch J (2005) Improved daily precipitation nitrate and ammonium concentration models for the Chesapeake Bay Watershed. Environ Pollut 135:445–455
Huang C (2009) Integration degree of risk in terms of scene and application. Stoch Environ Res Risk Assess 23:473–484
Lahr J, Kooistra L (2010) Environmental risk mapping of pollutants: state of the art and communication aspects. Sci Total Environ 408:3899–3907
Langseth H, Nielsen TD, Rumí R, Salmerón A (2012) Mixtures of truncated basis functions. Int J Approx Reason 53(2):212–227
Langseth H, Nielsen T, Pérez-Bernabé I, Salmerón A (2014) Learning mixtures of truncated basis functions from data. Int J Approx Reason 55:940–956
Larrañaga P, Moral S (2011) Probabilistic graphical models in artificial intelligence. Appl Soft Comput 11:1511–1528. doi:10.1016/j.asoc.2008.01.003
Lauritzen SL (1992) Propagation of probabilities, means and variances in mixed graphical association models. J Am Stat Assoc 87:1098–1108
Lee SW, Hwang SJ, Lee SB, Hwang HS, Sung HC (2009) Landscape ecological approach to the relationships of land use patterns in watersheds to water quality characteristics. Landsc Urban Plann 92:80–89
Liang Wj, Zhuang Df, Jiang D, Pan Jj, Ren Hy (2012) Assessment of debris flow hazards using a Bayesian Network. Geomorphology 171–172:94–100
Liao Y, Wang J, Guo Y, Zheng X (2010) Risk assessment of human neural tube defects using a Bayesian belief network. Stoch Environ Res Risk Assess 24:93–100
Markus M, Hejazi MI, Bajcsy P, Giustolisi O, Savic DA (2010) Prediction of weekly nitrate-N fluctuations in a small agricultural watershed in Illinois. J Hydroinformatics 12(3):251–261. doi:10.2166/hydro.2010.064
Moral S, Rumí R, Salmerón A (2001) Mixtures of truncated exponentials in hybrid bayesian networks. In: Benferhat S, Besnard P (eds) Symbolic and quantitative approaches to reasoning with uncertainty. Lecture notes in artificial intelligence, vol 2143. Springer, Lisbon, pp 156–167
Morales M, Rodríguez C, Salmerón A (2007) Selective naïve Bayes for regression using mixtures of truncated exponentials. Int J Uncertain Fuzz Knowl Based Syst 15:697–716
Moratalla A, Gómez-Alday JJ, Sanz D, Castano SC, de las Heras J (2011) Evaluation of a GIS-Based integrated vulnerability risk assessment for the Mancha Oriental System (SE Spain). Water Resour Manag 25:3677–3697
Moreno JL, Navarro C, de las Heras JD (2006) Abiotic ecotypes in south-central Spanish rivers: Reference conditions and pollution. Environ Pollut 143:388–396. doi:10.1016/j.envpol.2005.12.012
Palmsten ML, Holland KT, Plant NG (2013) Velocity estimation using a Bayesian network in a critical-habitat reach of the Kootenai River, Idaho. Water Resour Res 49:5865–5879
Passarella G, Vurro M, D’Agostino V, Giuliano G, Barcelona M (2002) A probabilistic methodology to assess the risk of groundwater quality degradation. Environ Monit Assess 79:57–74
Payraudeau S, van der Werf HM (2005) Environmental impact assessment for farming region: a review of methods. Agric Ecosyst Environ 107:1–19
Pérez-Miñana E, Krause PJ, Thornton J (2012) Bayesian Network for the management of greenhouse gas emissions in the British agricultural sector. Environ Model Softw 35:132–148
Pollino CA, Woodberry O, Nicholson A, Korb K, Hart BT (2007) Parameterisation and evaluation of a Bayesian network for use in an ecological risk assessment. Environ Model Softw 22:1140–1152
Quinn JM, Monaghan RM, Bidwell VJ, Harris SR (2013) A Bayesian Belief Network approach to evaluating complex effects of irrigation-driven agricultural intensification scenarios on future aquatic environmental and economic values in a New Zealand catchment. Mar Freshw Res 64:460–474. doi:10.1071/MF12141
Rennie SE, Brandt A, Plant N (2007) A probabilistic expert system approach for sea mine burial prediction. IEEE J Ocean Eng 32:260–272
Ropero RF, Aguilera PA, Fernández A, Rumí R (2014) Regression using hybrid Bayesian networks: modelling landscape-socioeconomy relationships. Environ Model Softw 54:127–137
Ruiz R, Riquelme J, Aguilar-Ruiz JS (2006) Incremental wrapper-based gene selection from microarray data for cancer classification. Pattern Recognit 39:2383–2392
Scanlon BR, Jolly I, Sophocleous M, Zhang L (2007) Global impacts of conversions from natural to agricultural ecosystems on water resources: quantity versus quality. Water Resour Res 43:W03437
Shenoy PP, West JC (2011) Inference in hybrid Bayesian networks using mixtures of polynomials. Int J Approx Reason 52(5):641–657
Shenton W, Hart BT, Chan TU (2014) A Bayesian network approach to support environmental flow restoration decisions in the Yarra River, Australia. Stoch Environ Res Risk Assess 28:57–65
Stone M (1974) Cross-validatory choice and assessment of statistical predictions. J R Stat Soc Ser B (Methodol) 36(2):111–147
Sun R, Chen L, Chen W, Ji Y (2013) Effect of land use patterns on total nitrogen concentration in the upstream regions of the Haihe River Basin, China. Environ Manag 51:45–58
Tilman D, Fargione J, Wolf B, D’Antonio C, Dobson A, Howarth R, Schindler D, Schlesinger WH, Simberloff D, Swackhamer D (2001) Forecasting agriculturally driven global environmental change. Science 292:281–284. doi:10.1126/science.1057544
Troldborg M, Aalders I, Towers W, Hallett PD, McKenzie BM, Bengough AG, Lilly A, Ball BC, Hough RL (2013) Application of Bayesian Belief Networks to quantify and map areas at risk to soil threats: using soil compaction as an example. Soil Tillage Res 132:56–68
Uusitalo L (2007) Advantages and challenges of Bayesian networks in environmental modelling. Ecol Model 203:312–318
Verro R, Calliera M, Maffioli G, Auteri D, Sala S, Finizio A, Vighi M (2002) GIS-Based system for surface water risk assessment of agricultural chemicals. 1. Methodological approach. Environ Sci Technol 36:1532–1538
Verro R, Finizio A, Otto S, Vighi M (2009) Predicting pesticide environmental risk in intensive agricultural areas I: screening level risk assessment of individual chemicals in surface waters. Environ Sci Technol 43:522–529
Wang QJ, Robertson DE, Haines CL (2009) A Bayesian network approach to knowledge integration and representation of farm irrigation: 1. Model development. Water Resour Res. doi:10.1029/2006WR005419
Wang Y, Witten IH (1997) Induction of model trees for predicting continuous cases. In: Proceedings of the poster papers of the European conference on machine learning, pp 128–137
Zhang NL, Poole D (1996) Exploiting causal independence in Bayesian network inference. J Artif Intell Res 5:301–328
Zhang W, Li H, Sun D, Zhou L (2012) A statistical assessment of the impact of agricultural land use intensity on regional surface water quality at multiple scales. Int J Environ Res Public Health 9:4170–4186. doi:10.3390/ijerph9114170
Zou Q, Zhou J, Zhou C, Song L, Guo J (2013) Comprehensive flood risk assessment based on set pair analysis-variable fuzzy sets model and fuzzy AHP. Stoch Environ Res Risk Assess 27:525–546
Acknowledgments
This work has been supported by the Spanish Ministry of Economy and Competitiveness through project TIN2013-46638-C3-1-P, by Junta de Andalucía through project P11-TIC-7821, and by ERDF-FEDER funds. A.D. Maldonado is supported by the Spanish Ministry of Education, Culture and Sport through an FPU research grant, FPU2013/00547.
Ethics declarations
Conflict of interest
We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome. We confirm that we have given due consideration to the protection of intellectual property associated with this work and that there are no impediments to publication, including the timing of publication, with respect to intellectual property. In so doing we confirm that we have followed the regulations of our institutions concerning intellectual property.
Appendix: Land uses
Land use variables incorporated into the data matrix. Percentage of occupation of each land use in the study area is shown in brackets.
1. | Urban (2.19 %) includes urban, industrial and commercial areas, landfills, mining deposits, communication infrastructure, parks, recreational and sport facilities and areas under construction |
2. | Water (2.45 %) comprises surface waters in Andalusia, including rivers, artificial channels, lakes and reservoirs. For the study purposes, only waterbodies corresponding to rivers were taken into account |
3. | Forest (47.52 %) includes forest, shrub and grassland cover |
4. | Rainfed herbaceous crops (13.89 %) consist of non-irrigated herbaceous monocultures, with cereals (wheat, barley, oats) and leguminous crops (peas, chickpeas, beans) being the most common crops |
5. | Olive grove (21.08 %) is the main crop of inland Andalusia and consists of non-irrigated monocultures, excluding wild olive trees |
6. | Vineyard a (0.28 %) consists of non-irrigated monocultures, devoted to grape production |
7. | Rainfed woody crops a (0.67 %) comprise woody monocultures under rainfed conditions, such as almond, carob, fig, walnut or chestnut trees, excluding plots dedicated to logging activities |
8. | Rainfed olive grove and vineyard crops a (0.03 %) are composed of mixtures of vine and olive trees under dry farming conditions |
9. | Rainfed woody heterogeneous crops a (0.03 %) comprise vine and olive tree associations with other rainfed woody crops, where no dominance of any of the crops exists |
10. | Abandoned olive grove a (0.19 %) comprises abandoned plots of woody crops, patently dominated by olive grove |
11. | Abandoned woody crops a (0.03 %) include abandoned plots of undifferentiated woody crops |
12. | Paddy fields b (0.07 %) consist of flooded parcels devoted to rice cultivation |
13. | Greenhouse crops b (0.10 %) consist of high yield crops under controlled conditions |
14. | Irrigated herbaceous crops b (1.96 %) comprise permanently irrigated intensive herbaceous monocultures, including lettuce, asparagus, carrot, onion and garlic crops |
15. | Partly irrigated herbaceous crops b (2.07 %) comprise both irrigated and non-irrigated (but liable to be irrigated) plots where herbaceous crops are grown |
16. | Non-irrigated herbaceous crops b (0.49 %) consist of irrigated herbaceous crop areas that were not being watered at the time the image was taken |
17. | Partly irrigated woody crops c (0.14 %) are composed of both irrigated and non-irrigated (but liable to be irrigated) plots where woody crops are grown |
18. | Citrus crops c (0.64 %) include orange, lemon, mandarin and grapefruit trees, among other irrigated woody species |
19. | Irrigated olive grove c (1.17 %) consists of irrigated olive tree monocultures |
20. | Tropical crops c (0.00003 %) include avocado, cherimoya, mango and medlar trees, among other irrigated woody species |
21. | Irrigated woody crops c (0.13 %) include other irrigated woody crops not aforementioned |
22. | Irrigated woody heterogeneous crops c (0.1 %) consist of undifferentiated woody crop mixtures under irrigated conditions |
23. | Rainfed herbaceous and woody crops d (0.63 %) consist of annual herbaceous crops associated with permanent woody crops under dry farming conditions |
24. | Irrigated herbaceous and woody crops d (0.23 %) consist of annual herbaceous crops associated with permanent woody crops under irrigated conditions |
25. | Partly irrigated herbaceous and woody crops d (0.02 %) comprise woody and herbaceous crop mixtures under either dry farming or irrigated conditions |
26. | Non-irrigated herbaceous and woody crops d (0.01 %) are composed of woody and herbaceous crop mixtures situated on plots that were not irrigated at the time the image was taken |
27. | Rainfed and irrigated herbaceous crops d (2.58 %) comprise undifferentiated herbaceous crop mixtures under either rainfed or irrigated conditions |
28. | Rainfed and irrigated herbaceous and woody crops d (0.31 %) are composed of undifferentiated woody and herbaceous crop mixtures under either rainfed or irrigated conditions |
29. | Rainfed and irrigated woody crops d (0.01 %) comprise undifferentiated woody crop mixtures under either rainfed or irrigated conditions |
30. | Herbaceous crops and grasslands e (0.25 %) consists of land mainly occupied by undifferentiated herbaceous crops, with significant areas of grassland |
31. | Herbaceous crops and woody natural vegetation e (0.19 %) is composed of land mainly covered by herbaceous crops, with an important portion occupied by woody natural vegetation |
32. | Woody crops and grasslands e (0.02 %) consists of land mainly occupied by undifferentiated woody crops, with significant areas of grassland |
33. | Woody crops and woody natural vegetation e (0.43 %) is composed of land mainly covered by woody crops, with an important portion occupied by woody natural vegetation |
34. | Herbaceous and woody crops and natural vegetation e (0.19 %) includes other undifferentiated crop mixtures associated with natural vegetation, not aforementioned |
About this article
Cite this article
Maldonado, A.D., Aguilera, P.A. & Salmerón, A. Continuous Bayesian networks for probabilistic environmental risk mapping. Stoch Environ Res Risk Assess 30, 1441–1455 (2016). https://doi.org/10.1007/s00477-015-1133-2
DOI: https://doi.org/10.1007/s00477-015-1133-2