Learning multivariate functions with low-dimensional structures using polynomial bases
Introduction
The approximation of high-dimensional functions is an active research topic of high relevance in numerous applications. We assume a setting where we are given scattered data about an unknown function; the related approximation problem is generally referred to as scattered data approximation. Classical methods suffer from the curse of dimensionality in this setting, i.e., the amount of required data increases exponentially with the spatial dimension. Finding ways to circumvent this curse poses the main challenge in the high-dimensional setting. Besides finding an approximation, there is the increasingly important question of interpretability: in many applications one wishes to understand how important the different dimensions and dimension interactions are in order to interpret the results.
In this paper we consider functions f defined over a cube with a high spatial dimension d. Given scattered data about f, i.e., a finite sampling set X and evaluations of f at the nodes in X, we aim to construct an approximation of f and simultaneously understand its structure, i.e., how important the variables and their interactions are. As opposed to black-box approximation or active learning, we may not choose the locations of the nodes in X. This prohibits us from using well-established spatial discretizations such as sparse grids, see [1], [2], or rank-1 lattices, see [3], [4], [5], which exploit low-dimensional structures in the node set. Our approach to circumventing the curse of dimensionality is to assume sparsity in the analysis of variance (ANOVA) decomposition of the function, i.e., we assume that f is dominated by a small number of low-complexity interactions. This may also be referred to as sparsity-of-effects, see e.g. [6].
We focus on complete orthonormal systems {φ_k} in a weighted Lebesgue space where the basis functions are tensor products of univariate polynomials, e.g., the Chebyshev polynomials. Any function f from this space can then be written as a series f = Σ_k c_k φ_k with coefficients c_k. Our method focuses on approximations using partial sums of the type Σ_{k ∈ I} c_k φ_k with grouped finite index sets I that reflect the low-dimensional structure of f. Determining a frequency index set I that yields a good approximation while not scaling exponentially in the dimension d poses one of the main challenges.
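As a minimal sketch of such a partial sum (assuming the Chebyshev system T_k(x) = cos(k arccos x) on [-1, 1]; the index set and coefficient values below are illustrative, not from the paper):

```python
import numpy as np

def cheb(k, x):
    # Chebyshev polynomial of the first kind: T_k(x) = cos(k * arccos(x))
    return np.cos(k * np.arccos(x))

def partial_sum(coeffs, x):
    # coeffs maps multi-indices k (tuples) to coefficients c_k; evaluates
    # S(x) = sum_k c_k * prod_j T_{k_j}(x_j) over the finite index set
    return sum(c * np.prod([cheb(kj, xj) for kj, xj in zip(k, x)])
               for k, c in coeffs.items())

# a grouped index set: every index touches at most two of the three variables,
# reflecting an assumed low-dimensional interaction structure
coeffs = {(0, 0, 0): 1.0, (2, 0, 0): 0.5, (1, 1, 0): 0.25}
value = partial_sum(coeffs, np.array([0.3, -0.7, 0.1]))
# T_2(0.3) = 2*0.3^2 - 1 = -0.82, so value = 1.0 - 0.41 - 0.0525 = 0.5375
```

The point of the grouping is that no multi-index activates more than a few coordinates at once, which keeps the index set from growing exponentially in d.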
The method presented here uses the classical ANOVA decomposition, see [2], [7], [8], [9], as its main tool. The decomposition is an important instrument in analyzing the dimensions of multivariate, high-dimensional functions. It has also been used to explain the success of certain quadrature methods for high-dimensional integration [10], [11], [12] as well as infinite-dimensional integration [13], [14], [15]. The unique and orthogonal ANOVA decomposition splits a d-variate function f into 2^d ANOVA terms, where each term corresponds to a subset of the coordinate indices {1, …, d}. A term depends only on the variables in the corresponding subset, and the number of these variables is the order of the ANOVA term.
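The combinatorics behind an order-q truncation of this decomposition can be sketched as follows (a generic enumeration, not code from the paper):

```python
from itertools import combinations

def anova_index_sets(d, q):
    # all subsets u of {1, ..., d} with order |u| <= q
    return [u for k in range(q + 1) for u in combinations(range(1, d + 1), k)]

# truncating at order q = 2 for d = 10 keeps 1 + 10 + 45 = 56 ANOVA terms
# instead of the 2^10 = 1024 terms of the full decomposition
subsets = anova_index_sets(10, 2)
print(len(subsets))  # 56
```

The count grows only polynomially in d for fixed q, which is what makes the sparsity-of-effects assumption computationally useful.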
Our method assumes sparsity by restricting the number of possible simultaneous dimension interactions. The only a priori information we require is that the function has a structure which can be well approximated under this sparsity assumption. The approach allows us to learn the basis coefficients by solving a least-squares problem. The problem is hard to solve in general since we are dealing with a large system matrix, but we are able to apply the concept of grouped transformations, see [16], to tackle this issue. In summary, we present a method for the approximation of high-dimensional functions with a low-dimensional structure from possibly noisy scattered data.
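The least-squares step can be sketched on a toy problem. This dense version is only an illustration under assumed data and an assumed (hypothetical) grouped index set; the paper's grouped transformation [16] replaces the explicit matrix with fast matrix-vector products:

```python
import numpy as np

rng = np.random.default_rng(0)
d, M = 3, 200
X = rng.uniform(-1.0, 1.0, (M, d))              # scattered nodes, not chosen by us
y = np.cos(np.pi * X[:, 0]) * X[:, 1] + 0.01 * rng.standard_normal(M)

# hypothetical grouped index set: constant, univariate, and one pairwise group
index_set = [(0, 0, 0), (1, 0, 0), (2, 0, 0),
             (0, 1, 0), (2, 1, 0), (4, 1, 0)]
T = lambda k, x: np.cos(k * np.arccos(x))       # Chebyshev T_k
A = np.column_stack([np.prod([T(kj, X[:, j]) for j, kj in enumerate(k)], axis=0)
                     for k in index_set])

# dense least squares for illustration; in the actual method the system is
# solved iteratively, with A applied via the grouped transformation
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
rmse = np.sqrt(np.mean((A @ coef - y) ** 2))
```

Because cos(πx₀)·x₁ is well captured by low even Chebyshev degrees in x₀ times T₁ in x₁, the small grouped index set already yields a residual on the order of the noise level.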
The outline of the paper is as follows. In Section 2 we introduce the necessary preliminaries for weighted Lebesgue spaces with complete orthonormal systems of polynomials. Moreover, we discuss the non-equispaced fast cosine transform for the evaluation of Chebyshev partial sums and the fast polynomial transform for computing the basis exchange from other polynomial bases to the Chebyshev system. In Section 3 we consider the properties of the ANOVA decomposition in the previously described setting of weighted Lebesgue spaces. The approximation method itself is discussed in Section 4, with numerical examples in Section 5.
Prerequisites, notation and orthogonal polynomials
Let ω : D → [0, ∞) be a non-negative weight function with ∫_D ω(x) dx = 1; then we define the weighted Lebesgue space L_2(D, ω) with the inner product ⟨f, g⟩ = ∫_D f(x) g(x) ω(x) dx. Moreover, we consider a complete orthonormal system of polynomials {φ_k : k ∈ ℕ_0} in L_2(D, ω). Here, we have φ_k ∈ P_k with P_k denoting the set of polynomials of degree at most k. Taking the products of the univariate basis functions we find that the resulting system is an orthonormal basis in the tensor product space.
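For the Chebyshev case this orthonormality can be checked numerically. The sketch below assumes the normalization φ_0 = T_0 and φ_k = √2·T_k for k ≥ 1 with respect to the Chebyshev weight w(x) = 1/(π√(1−x²)), and uses the fact that Gauss–Chebyshev quadrature integrates polynomials of degree up to 2n−1 exactly against this weight:

```python
import numpy as np

# assumed normalization: w.r.t. the weight w(x) = 1/(pi*sqrt(1 - x^2)) the
# system phi_0 = T_0 and phi_k = sqrt(2)*T_k, k >= 1, is orthonormal
def phi(k, x):
    return np.cos(k * np.arccos(x)) * (np.sqrt(2.0) if k > 0 else 1.0)

# Gauss-Chebyshev quadrature: the mean over these n nodes equals the
# weighted integral exactly for polynomials of degree up to 2n - 1
n = 16
nodes = np.cos((2.0 * np.arange(1, n + 1) - 1.0) * np.pi / (2.0 * n))
gram = np.array([[np.mean(phi(i, nodes) * phi(j, nodes)) for j in range(5)]
                 for i in range(5)])
print(np.allclose(gram, np.eye(5)))  # True
```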
Classical analysis of variance decomposition on the interval
In this section we introduce the ANOVA decomposition in the setting of weighted Lebesgue spaces with orthonormal polynomials as bases; see also [2], [7], [9], [20], [21]. For a given spatial dimension d we denote the set of coordinate indices by {1, 2, …, d} and write its subsets as bold small letters, e.g., u ⊆ {1, …, d}. The complement of such a subset is always taken with respect to the full index set, i.e., uᶜ = {1, …, d} \ u. For a vector x and a subset u we define x_u = (x_j)_{j ∈ u}. Furthermore, we use the 0-norm (or quasi-norm) of a vector, which counts its nonzero entries.
Approximation method
In this section, we present a method for the approximation of functions f with a high spatial dimension d. In scattered data approximation, the data consists of a finite set X of sampling nodes and a vector y of values. We assume that the entries of y are noisy evaluations of the function f at the nodes in X. Here, it is especially important that we cannot choose the locations of the nodes.
Numerical experiments
In this section we apply the proposed approximation method to high-dimensional benchmark functions. We start with a function that is a sum of products of B-splines in Section 5.1; a similar function has been considered in [4]. In Section 5.2 we consider the well-known Friedman benchmark functions, which have previously been used as examples for synthetic regression problems, cf. [32], [33], [34], [35]. The method has been implemented as a Julia package [36].
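For concreteness, the commonly used Friedman #1 benchmark has the following form (a standard variant from the regression literature; the sample size, noise level, and ambient dimension below are illustrative, not the paper's experimental setup):

```python
import numpy as np

def friedman1(X):
    # Friedman #1 benchmark: only the first five inputs are active, so the
    # function has a genuinely low-dimensional ANOVA structure
    return (10.0 * np.sin(np.pi * X[:, 0] * X[:, 1])
            + 20.0 * (X[:, 2] - 0.5) ** 2
            + 10.0 * X[:, 3] + 5.0 * X[:, 4])

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (1000, 10))               # scattered nodes in [0, 1]^10
y = friedman1(X) + 0.1 * rng.standard_normal(1000)  # noisy evaluations
```

Variables x₆ through x₁₀ are pure noise dimensions, which is exactly the situation where detecting the important ANOVA terms matters.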
Summary
In this paper we considered the classical ANOVA decomposition for functions in weighted Lebesgue spaces with orthogonal polynomials as bases. Specifically, we proved relations between the basis coefficients of the projections, the ANOVA terms, and the function itself. Furthermore, we considered sensitivity analysis and the truncation of the ANOVA decomposition to a certain subset of terms.
We introduced a method to determine the important ANOVA terms, i.e., terms with a high global sensitivity index.
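The standard route from basis coefficients to global sensitivity indices can be sketched as follows (a generic Parseval-based computation in the spirit of Sobol indices; the coefficient values are made up for illustration):

```python
def global_sensitivity_indices(coeffs):
    # coeffs: multi-index -> coefficient in an orthonormal product basis.
    # By Parseval, the variance of the ANOVA term f_u is the sum of |c_k|^2
    # over all indices k whose support (positions of nonzero entries) is u.
    var = {}
    for k, c in coeffs.items():
        u = tuple(j for j, kj in enumerate(k) if kj != 0)
        if u:                                  # the constant term carries no variance
            var[u] = var.get(u, 0.0) + abs(c) ** 2
    total = sum(var.values())
    return {u: v / total for u, v in var.items()}

gsi = global_sensitivity_indices({(0, 0): 3.0, (1, 0): 2.0, (0, 2): 1.0, (1, 1): 1.0})
# term variances: {1}: 4, {2}: 1, {1, 2}: 1  ->  indices 2/3, 1/6, 1/6
```

A term with a small index contributes little variance and can be dropped from the truncated decomposition.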
Acknowledgments
We thank Tino Ullrich and Toni Volkmer for fruitful discussions on the contents of this paper. Daniel Potts acknowledges funding by Deutsche Forschungsgemeinschaft (German Research Foundation) – Project-ID 416228727 – SFB 1410. Michael Schmischke is supported by the BMBF, Germany grant 01S20053A.
References (36)
- M. Griebel, M. Holtz, Dimension-wise integration of high-dimensional functions with applications to finance, J. Complexity (2010)
- F.Y. Kuo et al., Infinite-dimensional integration and the multivariate decomposition method, J. Comput. Appl. Math. (2017)
- J.R. Driscoll, D.M. Healy, Computing Fourier transforms and convolutions on the 2-sphere, Adv. Appl. Math. (1994)
- I.M. Sobol, Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates, Math. Comput. Simulation (2001)
- M. Griebel et al., The smoothing effect of the ANOVA decomposition, J. Complexity (2010)
- A. Chkifa et al., Breaking the curse of dimensionality in sparse polynomial approximation of parametric PDEs, J. Math. Pures Appl. (2015)
- D. Meyer et al., The support vector machine under test, Neurocomputing (2003)
- P. Binev et al., Fast high-dimensional approximation with sparse occupancy trees, J. Comput. Appl. Math. (2011)
- M. Griebel, Sparse grids and related approximation schemes for higher dimensional problems
- D. Potts, T. Volkmer, Multivariate sparse FFT based on rank-1 Chebyshev lattice sampling
- C.F.J. Wu, M.S. Hamada, Experiments: Planning, Analysis, and Optimization
- R.E. Caflisch, W. Morokoff, A.B. Owen, Valuation of mortgage-backed securities using Brownian bridges to reduce effective dimension, J. Comput. Finance
- H. Rabitz, Ö.F. Aliş, General foundations of high dimensional model representations, J. Math. Chem.
- R. Liu, A.B. Owen, Estimating mean dimensionality of analysis of variance decompositions, J. Amer. Statist. Assoc.