Gaussian mixture discriminant analysis and sub-pixel land cover characterization in remote sensing

https://doi.org/10.1016/S0034-4257(02)00172-4

Abstract

Mixture analysis is a necessary component for capturing sub-pixel heterogeneity in the characterization of land cover from remotely sensed images. Mixture analysis approaches in remote sensing vary from conventional linear mixture models to nonlinear neural network mixture models. Linear mixture models are fairly simple and generally result in poor mixture analysis accuracy. Neural network models can achieve much higher accuracy, but typically lack interpretability. In this paper we present a mixture discriminant analysis (MDA) model for inferring land cover fractions within forest stands from Landsat Thematic Mapper images. Specifically, individual class distributions are modeled as mixtures of subclasses of Gaussian distributions, and land cover fractions are estimated using the corresponding posterior probabilities. Compared to a benchmark study on accuracy of mixture models with Plumas National Forest data, this MDA model easily outperforms traditional linear mixture models and parallels the performance of the ARTMAP neural network mixture model. In other words, the MDA model is observed to successfully combine the performance characteristics of more complex neural network models (due to the nonlinear nature of its classification rules), with the ease of interpretation associated with linear mixture models (due to its relatively simple structure). MDA models therefore offer an attractive alternative for addressing the mixture modeling problem in remote sensing.

Introduction

The extraction of land cover information from remote sensing images traditionally is viewed as a classification problem which labels each pixel in the image as one of only a few possible classes. However, in reality, all degrees of mixing of pure land cover classes within pixels can be found due to the continuum of variation found in the landscape (Foody, 1996b) and the intrinsic mixed nature of most land covers (Schowengerdt, 1996). Hence, discretization of land cover into a limited number of categories contributes to a loss of information. Alternatively, mixture modeling in remote sensing predicts the respective fractions of land cover classes within pixels and characterizes land cover more accurately by decomposing a pixel into a small number of “pure” land cover classes. The resulting mixture map represents the fractions of pure land covers within pixels. For example, a pixel may be denoted in such maps by the following mixture fractions—80% conifer, 10% hardwood, and 10% brush. A traditional thematic map would label it as conifer by majority rule. This mixture information is very important for forestry, wildlife conservation (Woodcock, Gopal, & Albert, 1996), and global climate modeling (DeFries, Townshend, & Hansen, 1999). Classification and mixture analysis are not mutually exclusive. One nontrivial use of the mixture information is that discrete classification maps of any type can be produced out of the continuous land cover information if desired. For example, Adams et al. (1995) classified remote sensing images according to the dominant ground cover inferred from mixture analysis. In both classification and mixture analysis, the intra-class variability caused by factors such as age, health condition, and species uniformity often adds complexity to the task.

Several attempts have been made to characterize land cover at the sub-pixel level using remote sensing data, including linear mixture models (Adams et al., 1995; Roberts et al., 1998; Roberts et al., 1993; Smith et al., 1990), neural networks (Atkinson et al., 1997; Carpenter et al., 1999; Foody, 1998; Foody et al., 1997), fuzzy classifiers (Foody, 1996a), maximum likelihood classifiers (Foody et al., 1992; Häme et al., 2001; Schowengerdt, 1996), regression trees (DeFries et al., 1997), and decision trees (McIver & Friedl, 2002). Prior research has shown that linear mixture models, which yield simple linear decision rules, often generate poor to moderate results. In addition, as demonstrated by Borel and Gerstl (1994) and Ray and Murray (1996), linear mixture models may not be suitable when multiple scattering results in nonlinear mixing. In this context, a nonlinear decision rule can produce better results. Atkinson et al. (1997) applied a mixture model based on a Multilayer Perceptron neural network to decompose AVHRR imagery. The mixture information from their model was more accurate than that generated through linear mixture models and fuzzy c-means classifiers. Carpenter et al. (1999) presented an algorithm for mixture estimation based on an ARTMAP neural network, and applied it to identify life form components of the vegetation mixture from Landsat Thematic Mapper (TM) imagery. The ARTMAP-based mixture model was able to capture nonlinear boundaries between classes and was thus more accurate than the conventional linear mixture models.

For purposes of comparison, we will take methods based on linear mixture models and ARTMAP models as representative extremes from this literature. The commonly used linear mixture model is effectively a constrained linear model, wherein the spectrum of a mixed pixel is modeled as a linear combination of the spectra of “pure” land covers called “endmembers” (with weights constrained to be positive and sum to one) (Adams et al., 1995; Smith et al., 1990). The assumptions are independent sampling and a single common, multivariate Gaussian error model. In addition, least squares is used to estimate class fractions from observed data. The end result captures the effect of land cover mixing at the level of the mean, but is often unable to truly match the multi-modal nature of the underlying data. Correspondingly, these methods tend to exhibit a relatively poor level of accuracy in most situations. For example, in the study by Carpenter et al. (1999), the spectra of endmembers of each of the four land cover classes varied to a great degree, resulting in poor accuracy. That prompted a selection of two sets of endmembers, “exterior” endmembers and “interior” endmembers. The “exterior” endmembers were the pixels whose spectral values were at the exterior of the scatter plots of TM Bands 3 and 4 and TM Bands 4 and 5 formed by endmembers of all classes. The “interior” endmembers were at the interior of the scatter plots. These “exterior” and “interior” endmembers were averaged respectively to get two sets of mean spectra. While the two sets of endmembers could give fairly good results for some classes, neither could produce good overall accuracy. Neural network models, however, such as those based on the ARTMAP architecture of Carpenter et al. (1999), have been reported to estimate fractional coverage with much higher accuracy, due to their ability to represent highly complex nonlinear functions.
However, this feature has also been accompanied by criticism that neural networks can be difficult to use and yield little in the way of explanation or interpretation, despite the fact that researchers in the neural network community have attempted to interpret their “black box” nature to some degree (e.g. Liu, Gopal, & Woodcock, 2001, Chap. 12).
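The constrained linear mixture model described above can be illustrated with a minimal sketch. It assumes a pixel spectrum is approximated by a weighted sum of endmember spectra, with fractions found by least squares subject to non-negativity and a sum-to-one constraint. The solver used here (projected gradient descent onto the probability simplex), the function names, and the spectra are all illustrative assumptions; they are not the procedure or data of Carpenter et al. (1999) or of this paper.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {f : f_k >= 0, sum(f) == 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # keep the threshold from the last index that passes
    return [max(x - theta, 0.0) for x in v]


def unmix(pixel, endmembers, steps=2000):
    """Least-squares fractions for pixel ~ sum_k f_k * endmembers[k]."""
    k = len(endmembers)
    fractions = [1.0 / k] * k  # start from an even mixture
    # Conservative step size: 1 / (sum of squared endmember entries).
    step = 1.0 / sum(e_b ** 2 for e in endmembers for e_b in e)
    for _ in range(steps):
        modeled = [sum(f * e[b] for f, e in zip(fractions, endmembers))
                   for b in range(len(pixel))]
        residual = [m - x for m, x in zip(modeled, pixel)]
        gradient = [sum(r * e_b for r, e_b in zip(residual, e))
                    for e in endmembers]
        fractions = project_to_simplex(
            [f - step * g for f, g in zip(fractions, gradient)])
    return fractions


# Hypothetical three-band endmember spectra (not values from the study).
endmembers = [[0.9, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.9]]
pixel = [0.74, 0.18, 0.18]  # built as 0.8, 0.1, 0.1 of the spectra above
print(unmix(pixel, endmembers))
```

Because each class is represented by a single mean spectrum, a sketch of this kind can only capture mixing at the level of the mean, which is the limitation noted above when endmember spectra vary widely within a class.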

In this paper, we develop a mixture discriminant analysis (MDA) model for estimating fractions of four land cover classes (conifer, hardwood, barren, and brush) within forest stands from Landsat TM images. While MDA models have been around informally for some years now in fields like statistics and pattern recognition, they seem to have been explored formally only recently by Hastie and Tibshirani (1996) within statistics and, to the best of our knowledge, have found no application to land cover characterization in remote sensing to date. Employing this MDA framework, we model each land cover class distribution as a mixture of subclasses of multivariate Gaussian distributions. We discuss the training of this model and propose an estimator based on the posterior distribution of classes, given data. In the spirit of Carpenter et al. (1999), and using the same data, we conduct a numerical study in which we compare our MDA-based method with the methods based on the linear mixture and the ARTMAP neural network mixture models described by Carpenter et al. (1999). We find that with little loss in simplicity and interpretability over the linear mixing approach, our MDA approach is able to nearly match the performance of the neural network approach. In addition, the results of the MDA method are more interpretable and statistically based compared with the neural network approach.
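As a rough illustration of the posterior-based estimator, the sketch below computes class posterior probabilities when each class density is a weighted mixture of Gaussian subclasses, and reads those posteriors as land cover fractions. Several things here are assumptions made for brevity and are not taken from the paper: the covariance is diagonal, the parameters are supplied by hand rather than fitted (training is not shown), and the function names, data layout, and numbers are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (a simplification)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_sum_exp(values):
    """Numerically stable log(sum(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def mda_fractions(x, model):
    """Posterior class probabilities for spectrum x, read as fractions.

    model maps class name -> (class prior, [(weight, mean, var), ...]),
    with one (weight, mean, var) triple per Gaussian subclass.
    """
    log_joint = {}
    for name, (prior, subclasses) in model.items():
        terms = [math.log(w) + log_gaussian_diag(x, m, v)
                 for w, m, v in subclasses]
        log_joint[name] = math.log(prior) + log_sum_exp(terms)
    norm = log_sum_exp(list(log_joint.values()))
    return {name: math.exp(lj - norm) for name, lj in log_joint.items()}


# Hypothetical two-band model with two subclasses for one class.
model = {
    "conifer": (0.6, [(0.5, [0.05, 0.30], [0.01, 0.02]),
                      (0.5, [0.08, 0.40], [0.01, 0.02])]),
    "hardwood": (0.4, [(1.0, [0.06, 0.55], [0.01, 0.02])]),
}
print(mda_fractions([0.07, 0.45], model))
```

Allowing several subclasses per class is what lets the class density be multi-modal, so the resulting decision boundaries are nonlinear even though each component is a simple Gaussian.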

The outline of this paper is as follows. Section 2 describes the MDA model for the mixture problem in remote sensing. Section 3 describes the Plumas National Forest data used in this study, and illustrates how MDA is trained and applied for mixture analysis. Section 4 compares the performance of these mixture methods in terms of mixture analysis accuracy. Finally, the paper ends with a discussion and conclusions in Sections 5 and 6.

Model description: mixture discriminant analysis (MDA)

In this section, we briefly describe the mixture discriminant analysis (MDA) modeling framework, as outlined in Hastie and Tibshirani (1996). These authors consider MDA in some generality, focusing in particular on a number of variations on the basic modeling and fitting strategy, and consider its application to tasks such as the recognition of handwritten digits. Our focus is on the adaptation of this framework to sub-pixel land cover characterization in remote sensing. MDA can be viewed as an

Field measurement and satellite sensor data

The study area, Plumas National Forest of California, is characterized by temperate conifer forests mixed with chaparral brush fields and hardwood forests. For the purpose of forest management, the quantification of conifer, hardwood, and brush within stands is useful (Carpenter et al., 1999). As a result, four land cover classes are identified, i.e., conifer, hardwood, brush, and barren.

The data used in this study consist of two components: field measurements of land cover fractions and the

Results

We compare the results of MDA with those of the two linear mixture models and the ARTMAP mixture model previously published by Carpenter et al. (1999). In that study, a linear mixture model was tested using two different sets of endmembers—exterior endmembers and interior endmembers—which were chosen to address the variability among the “pure” endmembers. Exterior endmembers were the means of the exterior sites in spectral measurement space of all the pure sites while interior endmembers are

Discussion

Classification of land cover is one of the primary objectives of the use of remote sensing data. Increasingly, global climate models and terrestrial ecosystem models require specification of mixtures of land covers (DeFries et al., 1999; Woodcock et al., 1996). Fraction estimation is a difficult task given the spectral overlap of the land cover classes and the spectral variability within classes, as is manifested in the scatter plot (Fig. 1) of the TM Bands 3 and 4, and the histograms of Bands 4

Conclusions

The MDA approach captures intra-class variability by modeling each class distribution as a mixture of Gaussian subclass distributions. The posterior probabilities from MDA are assumed to represent the sub-pixel land cover fractions. MDA outperforms linear mixture models and is similar in performance to the ARTMAP neural network mixture model due to the nonlinear nature of its decision boundaries, but without losing the ease of interpretation due to its relatively simple structure. MDA models

Acknowledgements

This research was supported by NSF Grant BCS 0079077 and ONR Award N00014-99-1-0219. We would like to thank Curtis Woodcock, Gail Carpenter, and the staff at the Region 5 Remote Sensing Laboratory of the U.S. Forest Service for providing the data. We also thank the two anonymous reviewers whose valuable comments and suggestions greatly improved this manuscript.

References (30)

  • A.H. Strahler. The use of prior probabilities in maximum likelihood classification of remotely sensed data. Remote Sensing of Environment (1980)
  • P.M. Atkinson et al. Mapping sub-pixel proportional land cover with AVHRR imagery. International Journal of Remote Sensing (1997)
  • C.A. Bateson et al. Endmember bundles: a new approach to incorporating endmember variability into spectral mixture analysis. IEEE Transactions on Geoscience and Remote Sensing (2000)
  • G.A. Carpenter et al. A neural network method for mixture estimation for vegetation mapping. Remote Sensing of Environment (1999)
  • R.S. DeFries et al. Continuous fields of vegetation characteristics at the global scale at 1-km resolution. Journal of Geophysical Research (1999)