Abstract
The general problem of data integration is expressed as that of combining probability distributions conditioned to each individual datum or data event into a posterior probability for the unknown conditioned jointly to all data. Any such combination of information requires taking into account data interaction for the specific event being assessed. The nu expression provides an exact analytical representation of such a combination. This representation allows a clear and useful separation of the two components of any data integration algorithm: individual data information content and data interaction, the latter being different from data dependence. Any estimation workflow that fails to address data interaction is not only suboptimal, but may result in severe bias. The nu expression reduces the possibly very complex joint data interaction to a single multiplicative correction parameter ν₀, which is difficult to evaluate but whose exact analytical expression is given; the availability of such an expression provides avenues for its determination or approximation. The case ν₀ = 1 is more comprehensive than data conditional independence; it delivers a preliminary robust approximation in the presence of actual data interaction. An experiment where the exact results are known allows the results of the ν₀ = 1 approximation to be checked against the traditional estimators based on the assumption of data independence.
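The abstract does not state the nu expression itself. As a minimal sketch, assuming the odds-ratio form usually given for it, x/x₀ = ν₀ ∏ (xᵢ/x₀) with x = (1 − P)/P the "distance" to the event occurring, the combination could be written as below. The function names `odds_against` and `nu_combine` are hypothetical, and ν₀ is simply passed in as a number rather than evaluated from its analytical expression.

```python
from math import prod


def odds_against(p):
    """Distance to the event occurring: x = (1 - p) / p, for 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return (1.0 - p) / p


def nu_combine(prior, conditionals, nu0=1.0):
    """Combine P(A | D_i) into P(A | D_1, ..., D_n).

    Assumed form (not quoted from the abstract):
        x / x0 = nu0 * prod(x_i / x0)
    where x0, x_i and x are the odds against A under the prior, under each
    single-datum conditional, and under the joint conditional respectively.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if nu0 < 0.0:
        raise ValueError("nu0 must be non-negative")
    x0 = odds_against(prior)
    ratios = [odds_against(p) / x0 for p in conditionals]
    x = nu0 * x0 * prod(ratios)
    # Convert the combined distance back into a probability.
    return 1.0 / (1.0 + x)


# Two data that each raise a 0.5 prior to 0.8, combined with nu0 = 1.
posterior = nu_combine(0.5, [0.8, 0.8])
```

With ν₀ = 1 the two data compound to a posterior of about 0.94; a value ν₀ > 1 pulls the posterior back toward the single-datum probabilities, which is one way to picture a correction for data interaction under this assumed form.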
Cite this article
Polyakova, E.I., Journel, A.G. The Nu Expression for Probabilistic Data Integration. Math Geol 39, 715–733 (2007). https://doi.org/10.1007/s11004-007-9117-5