Kernel principal component analysis for stochastic input model generation

https://doi.org/10.1016/j.jcp.2011.05.037

Abstract

Stochastic analysis of random heterogeneous media provides useful information only if realistic input models of the material property variations are used. These input models are often constructed from a set of experimental samples of the underlying random field. To this end, the Karhunen–Loève (K–L) expansion, also known as principal component analysis (PCA), is the most popular model reduction method due to its uniform mean-square convergence. However, it only projects the samples onto an optimal linear subspace, which results in a poor representation of the original data if they are non-linearly related to each other. In other words, it only preserves the first-order (mean) and second-order (covariance) statistics of a random field, which is insufficient for reproducing complex structures. This paper applies kernel principal component analysis (KPCA) to construct a reduced-order stochastic input model for the material property variation in heterogeneous media. KPCA can be considered a nonlinear version of PCA. Through the use of kernel functions, KPCA further enables the preservation of higher-order statistics of the random field, instead of just the two-point statistics captured by the standard Karhunen–Loève (K–L) expansion. Thus, this method can model non-Gaussian, non-stationary random fields. In this work, we also propose a new approach to solve the pre-image problem involved in KPCA. In addition, a polynomial chaos (PC) expansion is used to represent the random coefficients in KPCA, which provides a parametric stochastic input model. Thus, realizations that are statistically consistent with the experimental data can be generated efficiently. We showcase the methodology by constructing a low-dimensional stochastic input model to represent channelized permeability in porous media.

Highlights

  • KPCA is used to construct a reduced-order stochastic model of permeability.
  • A new approach is proposed to solve the pre-image problem in KPCA.
  • Polynomial chaos is used to provide a parametric stochastic input model.
  • Flow in porous media with channelized permeability is considered.

Introduction

Over the past few decades there has been considerable interest among the scientific community in studying physical processes with stochastic inputs. These stochastic input conditions arise from uncertainties in boundary and initial conditions as well as from inherent random material heterogeneities. Material heterogeneities are usually difficult to quantify since it is physically impossible to know the exact property at every point in the domain. In most cases, only a few statistical descriptors of the property variation or only a set of samples can be experimentally determined. This limited information necessitates viewing the property variation as a random field that satisfies certain statistical properties/correlations, which naturally results in describing the physical phenomena using stochastic partial differential equations (SPDEs).

In the past few years, several numerical methods have been developed to solve SPDEs, such as the Monte Carlo (MC) method, the perturbation approach [1], [2], generalized polynomial chaos expansion (gPC) [3], [4], [5], [6] and the stochastic collocation method [7], [8], [9], [10], [11], [12], [13]. However, implicit in all these developments is the assumption that the uncertainties in the SPDEs have been accurately characterized as random variables or random fields through the finite-dimensional noise assumption [11]. The most common choice is the Karhunen–Loève (K–L) expansion [2], [3], [14], [15], which is also known as linear principal component analysis or PCA [16]. Through the K–L expansion, the random field can be represented as a linear combination of deterministic basis functions (eigenfunctions) and the corresponding uncorrelated random variables (random coefficients). The computation of the K–L expansion requires an analytic expression for the covariance matrix of the underlying random field. In addition, the probability distribution of the random variables is always assumed known a priori. These two assumptions are rarely satisfied in realistic engineering problems. In most cases, only a few experimentally obtained realizations of the random field are available. Reconstruction techniques are then applied to generate samples of the random field after extracting its statistical properties from these limited experimental measurements. These processes are quite expensive and numerically demanding if thousands of samples are needed. This leads to the problem of probabilistic model identification or stochastic input model generation, where the purpose is to find a parametric representation of the random field through only limited experimental data.
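
To make the data-driven K–L construction concrete, the following is a minimal sketch, in Python with NumPy, of an empirical K–L expansion (linear PCA) computed directly from a set of sample realizations; all names, array shapes and the truncation level r are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch: empirical K-L expansion (linear PCA) from samples.
# Names, shapes and the truncation level r are illustrative assumptions.
import numpy as np

def empirical_kl(samples, r):
    """samples: (N, M) array of N realizations of an M-point random field.
    Returns the sample mean, the leading r eigenpairs of the sample
    covariance, and the K-L projected random coefficients."""
    mean = samples.mean(axis=0)
    X = samples - mean                       # centered realizations
    C = X.T @ X / samples.shape[0]           # (M, M) sample covariance
    vals, vecs = np.linalg.eigh(C)           # eigenvalues in ascending order
    vals, vecs = vals[::-1][:r], vecs[:, ::-1][:, :r]
    coeffs = X @ vecs                        # one r-vector per realization
    return mean, vals, vecs, coeffs

def reconstruct(mean, vecs, coeffs):
    """y = mean + sum_k xi_k * phi_k: linear combination of eigenvectors."""
    return mean + coeffs @ vecs.T
```

For fields discretized at many points, the eigenproblem is usually solved on the N-by-N Gram matrix of the snapshots instead of the M-by-M covariance matrix; the construction is otherwise identical.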

To this end, a polynomial chaos (PC) representation of the random field through experimental data was first proposed in [17] and improved in subsequent papers [18], [19], [20], [21], [22], [23]. This framework consists of three steps: (1) computing the covariance matrix of the data numerically from the available experimental data; (2) performing stochastic reduced-order modeling with a K–L expansion to obtain a set of deterministic basis functions (eigenfunctions) and the corresponding random expansion coefficients (called K–L projected random variables); (3) constructing a polynomial chaos representation of these random coefficients given their realizations, which are computed from the data. These realizations are then used to estimate the deterministic coefficients in the PC representation, for which several methods have been proposed. In the pioneering work [17], maximum likelihood estimation was used to find the PC representation of the K–L random variables. In [18], a Bayesian inference approach was used instead to construct the PC representation of the random field. However, these two works did not take into account the dependencies between the various components of the K–L projected random variables. In [19], the Rosenblatt transformation [24] was used to capture these dependencies, and the maximum entropy approach together with the Fisher information matrix was used to estimate the PC coefficients. In [20], [21], [25], a non-intrusive projection approach with the Rosenblatt transformation was developed for the PC estimation. Apart from PC representations, in [26], the distribution of the K–L random variables was assumed directly to be uniform within the range of the realizations of these coefficients computed from data. Later, in [27], the distribution of the K–L random variables was still assumed to be uniform; however, the range of the uniform distribution was found by enforcing the statistical constraints of the random field and solving the resulting optimization problems. In [28], the uncertain input parameters were modeled as independent random variables, whose distributions were estimated using a diffusion-mixed-based estimator. Except for the work in [28], all the previous developments rely heavily on the K–L expansion for the reduced-order modeling. However, the K–L expansion has one major drawback: the K–L-expansion-based stochastic model reduction scheme constructs the closest linear subspace of the high-dimensional input space. In other words, it only preserves the mean and covariance of the random field, and is therefore suitable only for multi-Gaussian random fields. However, most random samples contain essential nonlinear structure, e.g. higher-order statistics. This directly translates into the fact that the standard linear K–L expansion tends to over-estimate the actual intrinsic dimensionality of the underlying random field. Hence, one needs to go beyond a linear representation of the high-dimensional input space to accurately assess the effect of its variability on the output variables.

To resolve this issue, the authors in [29] proposed a nonlinear model reduction strategy for generating data-driven stochastic input models. This method is based on manifold learning, where multidimensional scaling (MDS) is utilized to map the space of viable property variations to a low-dimensional region. An isometric mapping from this high-dimensional space to a low-dimensional, compact, connected set is then constructed by preserving the geodesic distances between data points via the IsoMap algorithm [30]. However, this method has two major issues. First, after dimensionality reduction, it only yields a set of low-dimensional points. It does not reveal the inherent patterns (analogous to the eigenfunctions in the K–L expansion) of the embedded random space. Therefore, it cannot provide a mathematical parametric input model as the K–L expansion does. Here, we are interested in a representation of the form y = f(ξ), where the vector y is a realization of a discrete random field, and the vector ξ, of dimension much smaller than the original input stochastic dimension, is a set of independent random variables with a specified distribution. In addition, when new experimental data become available, this nonlinear mapping needs to be recomputed. Second, the IsoMap algorithm requires the computation of the geodesic distance matrix among the data. In general, this matrix may not be well defined for real data, and even when it is well defined, the algorithm is computationally expensive.

The problems of both the K–L expansion and the non-linear model reduction algorithm in [29] can be resolved with kernel principal component analysis (KPCA), which is a nonlinear form of PCA [31], [32]. KPCA has proven to be a powerful nonlinear feature extractor for classification algorithms [31], pattern recognition [33], image denoising [34] and statistical shape analysis [35]. The basic idea is to map the data in the input space to a feature space F through some nonlinear map Φ, and then apply linear PCA there. Through the use of a suitably chosen kernel function, the data become more linearly related in the feature space F. In the context of stochastic modeling, there are two pioneering works [36], [37]. In [36], KPCA was used to construct the prior model of the unknown permeability field, and a gradient-based algorithm was then used to solve the history-matching problem. However, the random expansion coefficients of the linear PCA in the feature space were assumed to be i.i.d. standard normal random variables. This choice clearly does not capture the statistical information in the data. In [37], KPCA was used in the feature space F to select a subset of representative realizations containing properties similar to those of the larger set.
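
As a rough illustration of the kernel trick described above, here is a minimal KPCA sketch with a Gaussian kernel; the kernel width sigma, the number of retained components r and all names are illustrative assumptions, not values or code from the paper.

```python
# Minimal KPCA sketch with a Gaussian kernel; sigma and r are
# illustrative assumptions, not values used in the paper.
import numpy as np

def kpca(samples, r, sigma=1.0):
    """samples: (N, M) array. Returns the kernel expansion coefficients
    of the first r feature-space principal components and the projections
    of the training data onto them."""
    sq = np.sum(samples**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * samples @ samples.T
    K = np.exp(-d2 / (2.0 * sigma**2))       # kernel matrix k(y_i, y_j)
    N = K.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N
    Kc = H @ K @ H                           # center Phi(y_i) in feature space
    vals, vecs = np.linalg.eigh(Kc)          # eigenvalues in ascending order
    vals, vecs = vals[::-1][:r], vecs[:, ::-1][:, :r]
    alphas = vecs / np.sqrt(np.maximum(vals, 1e-12))  # unit-norm components
    return alphas, Kc @ alphas               # projections are (N, r)
```

Because everything is expressed through inner products k(y_i, y_j), the map Φ never has to be evaluated explicitly; this is precisely what allows the feature space to be very high (even infinite) dimensional.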

Motivated by all the above mentioned works, in this paper, a stochastic non-linear model reduction framework based on KPCA is developed. To be specific, KPCA is first used to construct the stochastic reduced-order model in the feature space. The random coefficients of the linear PCA are then represented via a PC expansion. Because the K–L expansion is performed in the feature space, the resulting realizations lie in the feature space, and therefore a mapping from the feature space back to the input space is needed. This is called the “pre-image problem” [34], [38]. The pioneering work in solving the pre-image problem is Mika’s fixed-point iterative optimization algorithm [34]. However, this method suffers from numerical instabilities: it is sensitive to the initial guess and is likely to get stuck in local minima. Therefore, it is not suitable for our stochastic modeling. It is noted that this algorithm was also used in [36], [37] to find the pre-image. In [38], the authors determined a relationship between distances in the feature space and distances in the input space; MDS is then used to find the inverse mapping and thus the pre-image. This method is expensive since it involves a singular value decomposition on a matrix of nearest neighbors. Recently, it was shown that Euclidean distances in the feature space correspond to field-dependent distances in the input space [39]. In this work, we propose a new approach to find the pre-image. It is based on local linear interpolation among the n nearest neighbors using only distances in the input space, which is similar to the method proposed in our earlier work [29].

This paper is organized as follows: In the next section, the mathematical framework of KPCA is considered. In Section 3, the PC representation of the random coefficients is developed. In Section 4, the new pre-image algorithm is introduced. An example with channelized permeability is given in Section 5. Finally, concluding remarks are provided in Section 6.

Section snippets

Problem definition

Let us define a complete probability space (Ω, F, P), where the sample space Ω corresponds to the outcomes of some experiments, F ⊆ 2^Ω is the σ-algebra of subsets of Ω and P : F → [0, 1] is the probability measure. Also, let us define D as a two-dimensional bounded domain. Denote by a(x, ω) the random field (property) used to describe and provide a mathematical model of the available experimental data. The random field a(x, ω) in general belongs to an infinite-dimensional probability space. However, in most

Polynomial chaos representation of the stochastic reduced-order model

The problem is now to identify a random vector ξ with values in R^r, given N independent samples ξ^(1), …, ξ^(N). For this purpose, we use a polynomial chaos (PC) expansion to represent ξ. Several methods have been proposed to compute the expansion coefficients in the resulting PC expansion, such as maximum likelihood [17], Bayesian inference [18], the maximum entropy method [19] and the non-intrusive projection method [21], [25]. As mentioned before, the components of the random vector ξ are uncorrelated but not independent.
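
As one concrete possibility, the fragment below sketches a one-dimensional Hermite PC fit for a single component of ξ via non-intrusive projection through the empirical inverse CDF; it treats the component in isolation and so ignores the dependence between components that the Rosenblatt transformation is meant to capture. The quadrature order and all names are illustrative assumptions, not the paper's algorithm.

```python
# Minimal sketch: one-dimensional Hermite PC fit for one component of xi
# by non-intrusive projection through the empirical inverse CDF. It
# ignores dependence between components (cf. the Rosenblatt transform).
from math import factorial
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval
from scipy.stats import norm

def pc_coefficients(samples_1d, order, quad_pts=40):
    """Fit xi ~ sum_k c_k He_k(eta), eta ~ N(0, 1), with
    c_k = E[F_xi^{-1}(F_eta(eta)) He_k(eta)] / k!"""
    nodes, weights = hermegauss(quad_pts)    # Gauss quadrature, weight exp(-x^2/2)
    weights = weights / weights.sum()        # normalize to the N(0, 1) density
    u = norm.cdf(nodes)                      # standard-normal CDF at the nodes
    xi_at_nodes = np.quantile(samples_1d, u) # empirical inverse CDF of the data
    coeffs = np.empty(order + 1)
    for k in range(order + 1):
        ek = np.zeros(k + 1); ek[k] = 1.0
        psi_k = hermeval(nodes, ek)          # He_k evaluated at the nodes
        coeffs[k] = np.sum(weights * xi_at_nodes * psi_k) / factorial(k)
    return coeffs
```

Sampling η ~ N(0, 1) and evaluating the fitted series then reproduces the marginal distribution of that component; capturing the joint dependence requires the Rosenblatt-based constructions cited above.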

The pre-image problem in KPCA

The simulated realizations of the random field from Eq. (39) lie in the feature space F. However, we are interested in obtaining realizations in the physical input space. Therefore, an inverse mapping y = Φ^(-1)(Y) is required. This is known as the pre-image problem [34], [38]. As demonstrated in [34], this pre-image may not exist and, even if it exists, it may not be unique. Therefore, we can only settle for an approximate pre-image ŷ such that Φ(ŷ) ≈ Y_r.
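
As a rough illustration of the kind of distance-based construction pursued in this work, the sketch below recovers input-space distances from feature-space distances (assuming a Gaussian kernel, for which k(y, y) = 1) and forms the pre-image by inverse-distance-weighted local linear interpolation among the n nearest training samples; the weighting scheme and all names are illustrative assumptions, not the paper's exact algorithm.

```python
# Minimal sketch of an approximate pre-image by local linear interpolation
# among nearest neighbors. Assumes a Gaussian kernel, so that
# ||Phi(y) - Phi(y')||^2 = 2 - 2 k(y, y') can be inverted for the
# input-space distance. The inverse-distance weights are an illustrative
# choice, not necessarily the paper's exact scheme.
import numpy as np

def preimage(d_feat2, samples, n=5, sigma=1.0):
    """d_feat2: squared feature-space distances from the realization Y to
    each mapped training sample Phi(y_i); samples: (N, M) training data."""
    k = np.clip(1.0 - 0.5 * d_feat2, 1e-12, 1.0)   # implied kernel values
    d_in2 = -2.0 * sigma**2 * np.log(k)            # squared input distances
    idx = np.argsort(d_in2)[:n]                    # n nearest neighbors
    w = 1.0 / np.maximum(np.sqrt(d_in2[idx]), 1e-12)
    w = w / w.sum()                                # inverse-distance weights
    return w @ samples[idx]                        # interpolated pre-image
```

Unlike fixed-point iteration, this construction involves no optimization, so it cannot diverge or get trapped in local minima; the trade-off is that the pre-image is constrained to the local linear span of the training samples.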

Numerical example

In this section, we apply kernel PCA to model the random permeability field of complex channelized geological structures. Such structures are characterized by multipoint statistics (MPS), which express the joint variability at more than two locations [43]. Therefore, the mean and correlation structure (two-point statistics) alone cannot capture the underlying uncertainty of the random field, and linear PCA (the standard K–L expansion) is expected to fail.

Conclusion

In this paper, a new parametric stochastic reduced-order modeling method is proposed, which can efficiently preserve higher-order statistics of the random field given limited experimental data. This method relies on the theory of kernel PCA to transform the data into a feature space, in which the data is more linearly related than in the original input space. PC expansion is then utilized to identify the random coefficients in the kernel PCA formulation. A simple approximate pre-image algorithm

Acknowledgments

This research was supported by the US Department of Energy, Office of Science, Advanced Scientific Computing Research, the Computational Mathematics program of the National Science Foundation (NSF) (award DMS-0809062) and an OSD/AFOSR MURI09 award on uncertainty quantification. The computing for this research was supported by the NSF through TeraGrid resources provided by NCSA under Grant No. TG-DMS090007.

References (47)

  • C. Soize, Identification of high-dimension polynomial chaos expansions with random coefficients for non-Gaussian tensor-valued random fields using partial and limited experimental data, Computer Methods in Applied Mechanics and Engineering (2010)
  • B. Ganapathysubramanian et al., Modeling diffusion in random heterogeneous media: data-driven models, stochastic collocation and the variational multiscale method, Journal of Computational Physics (2007)
  • B. Ganapathysubramanian et al., A non-linear dimension reduction methodology for generating data-driven stochastic input models, Journal of Computational Physics (2008)
  • D. Venturi, A fully symmetric nonlinear biorthogonal decomposition theory for random fields, Physica D (2011)
  • X. Ma et al., A stochastic mixed finite element heterogeneous multiscale method for flow in porous media, Journal of Computational Physics (2011)
  • D. Zhang, Stochastic Methods for Flow in Porous Media: Coping with Uncertainties (2002)
  • R. Ghanem et al., Stochastic Finite Elements: A Spectral Approach (1991)
  • D. Xiu et al., The Wiener–Askey polynomial chaos for stochastic differential equations, SIAM Journal on Scientific Computing (2002)
  • I. Babuška et al., A stochastic collocation method for elliptic partial differential equations with random input data, SIAM Journal on Numerical Analysis (2007)
  • D. Xiu et al., High-order collocation methods for differential equations with random inputs, SIAM Journal on Scientific Computing (2005)
  • F. Nobile et al., A sparse grid stochastic collocation method for partial differential equations with random input data, SIAM Journal on Numerical Analysis (2008)
  • X. Ma et al., An efficient Bayesian inference approach to inverse problems based on an adaptive sparse grid collocation method, Inverse Problems (2009)
  • M. Loève, Probability Theory (1977)