Spatial Statistics

Volume 16, May 2016, Pages 53-71

Conditioning multiple-point statistics simulations to block data

https://doi.org/10.1016/j.spasta.2016.02.005

Abstract

Multiple-point statistics (MPS) makes it possible to generate random fields reproducing spatial statistics derived from a training image. MPS methods consist of borrowing patterns from the training set. Therefore, the simulation domain is assumed to be at the same resolution as the conceptual model, although geometrical deformations can be handled by such techniques. Whereas point conditioning data corresponding to the scale of the grid node can easily be integrated, accounting for data available at larger scales is challenging. In this paper, we propose an extension of MPS able to deal with block data, i.e. target mean values over subsets of the simulation domain. Our extension is based on the direct sampling algorithm and consists in adding a criterion for the acceptance of the candidate node scanned in the training image, constraining the simulation to honor the block data. Likelihood ratios are used to compare the averages of the simulated variable, taken over the informed nodes in the blocks, with the target mean values. Moreover, the block data may overlap, and their supports can be of any shape and size. Illustrative examples show the potential of the presented algorithm for practical applications.

Introduction

Multiple-point statistics (MPS) methods have become very popular in the earth sciences because they make it possible to generate highly heterogeneous random fields reproducing the spatial statistics of a conceptual geological model, the training image, given by the user. These methods overcome some limitations of classical geostatistical simulation techniques based on two-point statistics: variogram-based methods such as sequential Gaussian simulation and sequential indicator simulation (sisim) (Deutsch and Journel, 1998), transition-probability-based approaches such as TProgs (Carle, 1996), or Markovian-type categorical prediction (MCP) based on a maximum entropy principle (Allard et al., 2011).

Among the existing MPS simulation algorithms, snesim (Strebelle, 2002) and impala (Straubhaar et al., 2011, Straubhaar et al., 2013) successively populate each node of the simulation grid by randomly drawing a facies category according to a probability distribution conditioned to the data event centered at the simulated node, computed from a catalog of patterns found in the training image. A tree-based storage is used in snesim to ensure computational efficiency, whereas the list-based catalog employed in impala guarantees low memory requirements. Using a catalog implies considering only categorical variables and patterns of fixed geometry. A multiple-grid approach (Tran, 1994) is employed in these algorithms to capture large-scale structures while keeping data events of reduced size.

On the other hand, the direct sampling algorithm (Mariethoz et al., 2010) is a distance-based MPS algorithm. To simulate a node, the method randomly scans the training image until a pattern in the training image is compatible with the pattern retrieved from the simulation grid and centered at the simulated node. Then, the central node value is copied from the training image and pasted into the simulation grid. The compatibility between two patterns is measured by a distance. This basic simulation principle leads to a very flexible method. Indeed, since no catalog of patterns is used, categorical as well as continuous variables can be considered by defining an appropriate distance between data events, and the geometry of the patterns can vary during the simulation, making it possible to reproduce large-scale structures without a multiple-grid approach. In particular, point conditioning data can simply be assigned in the simulation grid at the beginning of the simulation, whereas methods based on a multiple-grid approach require some precautions to properly address point data (Straubhaar and Malinverni, 2014).

Distance-based MPS algorithms also include techniques that paste patches of the training image into the simulation grid at each step, instead of a single pixel value, such as filtersim (Zhang and Journel, 2006) or simpat (Arpat and Caers, 2007). These latter methods use a pattern database built from the training image and are also based on multiple-grid approaches. Other patch-based MPS algorithms, using neither multiple grids nor databases, paste overlapping boxes of pixels along a raster path, either by minimizing a cross-correlation function over the overlapping region, as in ccsim (Tahmasebi et al., 2012), or by minimizing an error over the common area followed by an optimal cut through this area, as in conditional image quilting (CIQ) (Mahmud et al., 2014). These methods better model the connectivity of the structures, but make conditioning difficult.
Therefore, the direct sampling method is appealing due to its simplicity and its flexibility. In particular, it can easily be extended to the simulation of multivariate fields (Mariethoz et al., 2010, Mariethoz et al., 2012), providing an intuitive tool to manage various types of nonstationarity.

No matter what MPS technique is considered, the simulation domain is filled by borrowing patterns from the training image, which is assumed to have the same resolution as the simulation grid. Hence, whereas point conditioning data (corresponding to the scale of the simulation domain) can be straightforwardly handled, conditioning a simulation at the local scale with data defined at a larger scale is quite challenging. Classical parametric methods based on covariance models can be used to integrate data with different support sizes (Liu and Journel, 2009, Journel, 1999). Essentially based on cokriging theory, such techniques nevertheless require point-to-point, point-to-block and block-to-block (cross-)covariance models and rely on Gaussian assumptions.

In this paper, we propose an extension of the direct sampling algorithm able to deal with block data, i.e. target values for the average of the simulated variable over subsets of the simulation domain. The principle is to use the block data as a criterion for accepting a candidate location, in addition to the comparison of the patterns. The current averages, accounting for the already simulated pixels in the blocks plus the candidate value, are compared to the target mean values, and a related mismatch is computed for each block datum. In this, we follow a strategy similar to that used for multivariate simulations (Mariethoz et al., 2010), where a mismatch (distance) between the training and simulated patterns is computed for each variable. As several mismatches are computed, the condition for accepting a candidate node has to be defined. One option is to define a global misfit as a weighted average of the considered mismatches and check whether it is below a certain threshold, as in the original direct sampling algorithm. This requires specifying a weight per mismatch and a global threshold. Another option is to specify a threshold per compared quantity and to accept the candidate value when each mismatch is less than its corresponding threshold. To develop our extension, we use an implementation of the direct sampling algorithm, called DeeSse (Straubhaar, 2015), based on the latter alternative.
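To make this per-quantity threshold rule concrete, the following sketch shows how a candidate value scanned in the TI could be screened: the pattern distance must pass its own threshold, and the block average updated with the candidate value must stay close enough to each target mean. The function names and the simple absolute-error mismatch are our own illustrative assumptions, not the DeeSse implementation.

    # Illustrative sketch (not the DeeSse code): a candidate is accepted only
    # if the pattern distance and every block mismatch pass their own threshold.

    def block_mismatch(block_sum, block_count, candidate, target_mean):
        # Average over the informed nodes of the block plus the candidate
        # value, compared to the target mean (absolute error: an assumption)
        avg = (block_sum + candidate) / (block_count + 1)
        return abs(avg - target_mean)

    def accept_candidate(pattern_dist, pattern_threshold,
                         block_states, candidate, block_thresholds):
        # block_states: list of (sum, count, target_mean), one entry per
        # block containing the simulated node
        if pattern_dist > pattern_threshold:
            return False
        return all(
            block_mismatch(s, n, candidate, mu) <= t
            for (s, n, mu), t in zip(block_states, block_thresholds)
        )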

It is important to highlight the differences between block data and local probability constraints. Local probability constraints are classically given by probability maps, which give, at any point of the simulation domain, the probability of having a given category (facies) in a certain neighborhood. These maps are related to an underlying fixed support size and represent moving local probabilities or proportions. Such constraints are designed for categorical variables and are handled in classical techniques (Liu, 2006, Krishnan, 2008) by using probability aggregation methods (Allard et al., 2012). Mariethoz et al. (2015) proposed a method to constrain distance-based MPS methods, such as the direct sampling algorithm, to such local probability constraints. On the other hand, although block data for binary variables can be viewed as proportion constraints, block data are defined on fixed groups of nodes of any shape, on which target mean values are given. Moreover, for block data conditioning to make sense, the simulated variable can be either continuous or discrete, provided that it does not represent arbitrary category codes.

Downscaling, or increasing the resolution of an image, is a classical example where the support of the data (the input coarse-scale image) is larger than the resolution of the output image. Mariethoz et al. (2011) developed a super-resolution method which consists in using the direct sampling algorithm to generate small-scale structures from the patterns found in the coarse-scale (training) image. This tool assumes that the spatial structures have a property of scale invariance, also called the fractal property. Tang et al. (2015) proposed to downscale remotely sensed images accounting for available images at different resolutions, using filtersim as the MPS algorithm at the coarse scale, combined with an area-to-point cokriging method to integrate the fine-scale information.

More generally, the algorithm presented in this paper for MPS simulation accounting for block data can be used in a range of applications, not restricted to downscaling, where a conceptual model for the fine scale is known and the available conditioning data are defined on supports of varying size and shape that are larger than the resolution of the simulation domain. In its current implementation, the proposed method assumes that the block data are defined as the arithmetic mean of the values located within a block of known geometry. In practice, in certain situations, the block values are defined in a more complex manner, especially for non-additive variables such as permeability. Nevertheless, our method can still be useful to generate simulations that offer an approximation of the fine-scale structures.
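Read as an additive constraint, the block datum thus fixes the arithmetic mean of the simulated variable over the block. In LaTeX notation (our paraphrase of this definition, not an equation quoted from the paper):

    \mu_B \approx \frac{1}{|B|} \sum_{x \in B} Z(x),

where |B| denotes the number of nodes in the block B.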

This paper is organized as follows. In Section 2, some background information on the direct sampling algorithm is given. Then, in Section 3, block data are defined and we present how the algorithm is extended to account for them. The manner in which the block data constraints are treated is presented in detail in Section 4. In Section 5, a multiGaussian test case is presented. MultiGaussian simulation offers a well-known framework in which the random fields are entirely described by an analytical covariance model used for the simulation. Hence, the expected results are known and provide a point of comparison for studying the performance of the proposed method. Then, application examples displaying more complex structures and justifying the use of MPS are given in Section 6. As illustrations, we apply the proposed method to simulate log-permeability fields in a downscaling context and in a situation where conditioning data are available at three different scales. In another synthetic example, we propose to model the subsurface geology conditionally to geophysical data. Finally, the method is discussed in Section 7.


Basic direct sampling algorithm

The basic principle of the direct sampling technique (Mariethoz et al., 2010) for simulating a node x in the simulation grid is to compare the pattern (or data event) d(x) centered at x with patterns of the same geometry d(y) centered at random locations y in the training image (TI), until d(x) and d(y) are sufficiently similar. For univariate simulation, the direct sampling algorithm requires as input a TI with a unique variable Z, a simulation grid (SG), and a normalized distance D used for comparing
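To illustrate this scanning principle, here is a minimal Python sketch of one direct-sampling step for a continuous variable. It is a simplified reading of the algorithm described above; the fallback to the best scanned candidate, the normalization by the TI value range, and all parameter names are assumptions made for illustration.

    import numpy as np

    def simulate_node(ti, sim, x, offsets, threshold, max_scan, rng):
        # One direct-sampling step: simulate node x of the simulation grid
        # `sim` by scanning the training image `ti` at random locations.
        # offsets: (k, 2) integer array of lags to the informed neighbors of x.
        d_x = np.array([sim[x[0] + di, x[1] + dj] for di, dj in offsets])

        # Normalize the distance by the range of the TI values
        z_range = ti.max() - ti.min()

        # Bounds so that every offset stays inside the TI
        lo_i = max(0, -offsets[:, 0].min())
        hi_i = ti.shape[0] - max(offsets[:, 0].max(), 0)
        lo_j = max(0, -offsets[:, 1].min())
        hi_j = ti.shape[1] - max(offsets[:, 1].max(), 0)

        best_y, best_d = None, np.inf
        for _ in range(max_scan):
            # Random candidate location y in the TI
            y = (rng.integers(lo_i, hi_i), rng.integers(lo_j, hi_j))
            d_y = np.array([ti[y[0] + di, y[1] + dj] for di, dj in offsets])
            dist = np.mean(np.abs(d_x - d_y)) / z_range  # normalized distance
            if dist <= threshold:
                return ti[y]          # compatible pattern: copy central value
            if dist < best_d:
                best_y, best_d = y, dist
        return ti[best_y]             # fallback: best candidate after max_scan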

Algorithm accounting for block data

Although initially designed for continuous variables, block data can be considered for categorical variables as well. In this situation, a unique numerical value is assigned to each category. These values should have a physical significance, so that they can be meaningfully ordered. Under these conditions, the distance (3) can still be used for comparing patterns.

Considering a variable Z in the SG, a block datum is defined by a triplet (B,μB,tB) (a possible representation is sketched after this list), where:

  • B is a set of nodes (pixels) in the SG,

  • μB is a target mean value for Z over B
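A possible in-memory representation of such a triplet, assuming (as the per-constraint thresholds discussed in the introduction suggest) that tB is a tolerance attached to the target mean μB, could look as follows. This is our own sketch, not the paper's data structure.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class BlockData:
        # One block-data constraint (B, mu_B, t_B); t_B is read here as a
        # tolerance on the block mean, which is an assumption of this sketch.
        nodes: np.ndarray    # (n, 2) array of (i, j) pixel indices: the set B
        target_mean: float   # mu_B: target mean of Z over B
        tolerance: float     # t_B: acceptable deviation from mu_B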

Defining the target interval and the error for a block data constraint

In this section, the error E_B (Eq. (8)) required by the method is defined in detail.

MultiGaussian test case

In this section, we use our method to generate multiGaussian fields conditioned to block data. The aim is to better understand how the method performs in a situation where the theoretical solution is known, because it can be expressed analytically. As a point of comparison, we will also use multiGaussian simulations based on all the point-to-point, point-to-block, and block-to-block covariances. The MPS technique is expected to work less efficiently in this situation because it does not use
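For the covariance-based comparison method, the point-to-block and block-to-block covariances of an additive block average follow from regularizing the point covariance over the block supports. A small sketch, with an illustrative Gaussian covariance model (the function names and parameters are our own):

    import numpy as np

    def gaussian_cov(x, y, sill=1.0, range_=10.0):
        # Illustrative Gaussian (squared-exponential) covariance model
        h = np.linalg.norm(np.asarray(x, float) - np.asarray(y, float))
        return sill * np.exp(-3.0 * (h / range_) ** 2)

    def cov_point_block(c, x, block):
        # Cov(Z(x), Z_B) = (1/|B|) * sum over x' in B of C(x, x')
        return np.mean([c(x, xp) for xp in block])

    def cov_block_block(c, block_a, block_b):
        # Cov(Z_A, Z_B): average of point covariances over both supports
        return np.mean([[c(xa, xb) for xb in block_b] for xa in block_a])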

Application examples

In this section, the method is applied to synthetic cases, and the computation of the local standard deviation (Eq. (18)) required to handle the error on a block data constraint is illustrated.

Discussion and conclusion

In this paper, an MPS simulation method that accounts for data at different scales is proposed. The technique is based on the direct sampling algorithm (Mariethoz et al., 2010), which consists in randomly searching the TI for a pattern compatible with the one centered at the simulated node. Several constraints, in addition to the similarity of patterns, can be handled. When scanning the TI, a misfit for each constraint is computed and the scan is stopped as soon as all misfits are

Acknowledgments

We are grateful to Fabio Oriani for helpful discussions, and we also thank Denis Allard and an anonymous reviewer for their constructive criticism, which gave us the opportunity to improve the quality of this paper.

References (31)

  • P. Bayer et al., High resolution multi-facies realizations of sedimentary reservoir and aquifer analogs, Sci. Data (2015)
  • S.F. Carle, A transition probability-based approach to geostatistical characterization of hydrostratigraphic architecture (1996)
  • S. Demir et al., On the adaptive Nadaraya–Watson kernel regression estimators, Hacet. J. Math. Stat. (2010)
  • C. Deutsch et al., GSLIB: Geostatistical Software Library and User’s Guide (1998)
  • C.R. Dietrich et al., A fast and exact method for multidimensional Gaussian stochastic simulations: Extension to realizations conditioned on direct and indirect measurements, Water Resour. Res. (1996)