Building detection by fusion of airborne laser scanner data and multi-spectral images: Performance evaluation and sensitivity analysis

https://doi.org/10.1016/j.isprsjprs.2007.03.001

Abstract

In this paper, we describe the evaluation of a method for building detection by the Dempster–Shafer fusion of airborne laser scanner (ALS) data and multi-spectral images. For this purpose, ground truth was digitised for two test sites with quite different characteristics. Using these data sets, the heuristic models for the probability mass assignments are validated and improved, and rules for tuning the parameters are discussed. The sensitivity of the results to the most important control parameters of the method is assessed. Further, we evaluate the contributions of the individual cues used in the classification process to determine the quality of the results. Applying our method with a standard set of parameters on two different ALS data sets with a spacing of about 1 point/m², 95% of all buildings larger than 70 m² could be detected and 95% of all detected buildings larger than 70 m² were correct in both cases. Buildings smaller than 30 m² could not be detected. The parameters used in the method have to be appropriately defined, but all except one (which must be determined in a training phase) can be determined from meaningful physical entities. Our research also shows that adding the multi-spectral images to the classification process improves the correctness of the results for small residential buildings by up to 20%.

Introduction

Point clouds generated by airborne laser scanning (ALS) are well suited for the automatic detection of buildings. Building detection essentially requires a classification of the input data that separates points situated on buildings from those on other objects, especially trees. In order to accomplish this classification, cues such as the height of ALS points above the terrain or the roughness of the surface described by the ALS points can be used. Additional information can be considered in order to overcome problems occurring with buildings which consist of roof planes that are small in relation to the ALS resolution. These include the height differences between the first and the last echoes of the laser pulse and multi-spectral images of the area. The normalised difference vegetation index (NDVI), derived from multi-spectral images, is well suited for classification in this context (Lu et al., 2006).
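The NDVI mentioned above is conventionally defined as the difference between the near-infrared and red bands divided by their sum. The following is a minimal sketch of that standard definition, not of the pre-processing used in this paper; the function names, the zero-division convention, and the 2 × 2 sample values are assumptions made for illustration.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index for one pixel.

    nir, red: non-negative intensities in the near-infrared and red bands.
    For non-negative inputs the result lies in [-1, 1]. When both bands are
    zero, 0.0 is returned (a convention assumed here to avoid dividing by zero).
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply ndvi() cell by cell to two equally sized 2-D lists."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2 x 2 image: the left column mimics vegetation
# (strong near-infrared response), the right column a neutral surface.
nir_band = [[200, 60], [180, 50]]
red_band = [[50, 60], [60, 50]]
grid = ndvi_grid(nir_band, red_band)
```

High values indicate vegetation, which is why the index helps to separate trees from roofs where the height and roughness cues alone are ambiguous.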

Various classification techniques have been applied for building detection, e.g., unsupervised classification (Haala and Brenner, 1999), rule-based classification (Rottensteiner and Briese, 2002), Bayesian networks (Brunn and Weidner, 1997, Stassopoulou et al., 2000), and fuzzy logic (Vögtle and Steinle, 2003, Matikainen et al., 2003). The probabilistic approaches among the cited ones face the difficulty of having to model all a priori probabilities, which is problematic if the assumption of a normal distribution of the data vectors is unrealistic, e.g. in built-up areas (Gorte, 1999). The theory of Dempster–Shafer can help to overcome these problems, because its capability to handle incomplete information provides a tool to reduce the degree to which assumptions about the distribution of the data have to be made (Klein, 1999).
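To illustrate how Dempster–Shafer theory handles incomplete information, the sketch below implements Dempster's rule of combination for two sources. It is a generic textbook formulation, not the authors' probability mass model: the class names, the two cues, and all mass values are hypothetical. The key point is that a cue may assign mass to a set of classes (or to the whole frame) when it cannot discriminate between them.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of class labels (a hypothesis)
    to a probability mass; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass falling on the empty set measures the conflict between sources.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the two sources cannot be combined")
    # Renormalise so that the combined masses sum to 1 again.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical frame of discernment and mass assignments for one pixel.
THETA = frozenset({"building", "tree", "other"})

# A height cue cannot tell buildings from trees, so it supports their union.
height_cue = {frozenset({"building", "tree"}): 0.8, THETA: 0.2}
# A low NDVI speaks against vegetation but not for any single class.
ndvi_cue = {frozenset({"building", "other"}): 0.7, THETA: 0.3}

fused = combine(height_cue, ndvi_cue)
```

In this example neither cue on its own assigns any mass to the single class "building", yet their combination does, because the intersection of the two supported sets is exactly that class.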

Achieving good results for building detection using an algorithm demonstrates that the method works for the particular application, but it is also important to know how the parameters used in the algorithm are selected. For instance, if parameter selection is based on trial-and-error only, the reproducibility of the results for another data set is questionable. We consider the evaluation of algorithms to be important in order to make different approaches comparable. However, most authors give detection rates and false-alarm rates for the detected buildings, but fail to give a more thorough evaluation of their algorithms. Questions remaining unanswered in this context are related to the dependency of the results on scene and sensor characteristics, the availability of different input data sets, or on the appropriate selection of sensor models and the tuning of the model parameters. In this paper, we especially want to deal with the evaluation of a method of automatic building detection. We have given an extensive overview of classification methods for building detection in Rottensteiner et al. (2005). Therefore this paper will commence with a review of papers dealing with the evaluation of building detection methods.

For an evaluation of automatic feature extraction using a reference data set, two numbers of interest are the completeness and the correctness of the results (Heipke et al., 1997):

$$\text{Completeness} = \frac{TP}{TP + FN} \tag{1}$$

$$\text{Correctness} = \frac{TP}{TP + FP} \tag{2}$$

In Eqs. (1), (2), TP denotes the number of true positives, i.e., the number of entities found to be available in both data sets. FN is the number of false negatives, i.e., the number of entities in the reference data set that were not detected automatically, and FP is the number of false positives, i.e., the number of entities that were detected, but do not correspond to an entity in the reference data set.
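As a small worked example of these two quality measures, the sketch below computes them from counts of true positives, false negatives, and false positives. The function names and the sample counts are hypothetical and are not taken from the experiments in this paper.

```python
def completeness(tp, fn):
    """Share of reference entities that were detected: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("the reference data set contains no entities")
    return tp / (tp + fn)


def correctness(tp, fp):
    """Share of detected entities that are correct: TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("no entities were detected")
    return tp / (tp + fp)


# Hypothetical counts: 90 reference buildings found, 10 missed, 30 false alarms.
tp, fn, fp = 90, 10, 30
comp = completeness(tp, fn)  # 90 / 100
corr = correctness(tp, fp)   # 90 / 120
```

With these counts the completeness is 0.9 and the correctness is 0.75: most reference buildings are found, but a quarter of the detections do not correspond to a reference building.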

Vögtle and Steinle (2003) evaluated their method of building detection using two test data sets of 1 m resolution and achieved detection rates of 93% and 96%, respectively. The authors state that the classification accuracy decreases with the building size, without quantifying this effect. Matikainen et al. (2003) used ALS data for building change detection. Their method detected 90% of all building pixels in a reference map, with a false-alarm rate of 15%. On a per-building basis, completeness and correctness are 91% and 84%, respectively, for buildings larger than 200 m². The respective values for buildings between 0 and 200 m² are 42.1% and 34.9%. A minimum percentage overlap of 70% between a detected building and a building in the reference data set is required for the building to be classified as a true positive.

Vosselman et al. (2004) first separate bare earth ALS points from other points and then further classify the other points according to whether they belong to buildings or trees. They apply their classification to the original ALS point clouds. Their results for points on buildings correspond to a completeness of 85% and a correctness of 92%. In their conclusions they state that using additional colour information increased the classification accuracy for buildings by 3%.

In our previous work, we have presented a method for fusing first and last pulse ALS and multi-spectral image data based on the theory of Dempster–Shafer. Completeness and correctness were evaluated for a test site in Australia (Rottensteiner et al., 2005). The main goals of this paper are to present that method in its revised form and to thoroughly evaluate that method using two test sites of different land cover and sensor characteristics. From that evaluation, we want to assess the applicability of our method to different scenes and data from sensors having different characteristics, by finding answers to questions that are not commonly investigated by other authors:

  • How realistic are the model assumptions about the properties of the sensor data?

  • How can the control parameters be tuned?

  • How sensitive are the results to the settings of these control parameters?

  • How do the individual cues used in data fusion contribute to the quality of the classification results?

  • How do the classification results deteriorate with decreasing sensor resolution?

We start with a description of the two test data sets in Section 2. In Section 3, we will give an outline of our previous work, describing the original algorithm for building detection. In Section 4, we will present how that algorithm has been improved. In this context, the statistical models used for classification are evaluated and the rules for parameter tuning will be presented. This is followed by an evaluation of the building detection results in Section 5. We not only present results obtained for standard parameter settings, but also include a sensitivity analysis with respect to the input parameters and the resolution of the input data, and we assess the impact of the individual classification cues. Section 6 will give the conclusions.


Test data sets

We have used two test data sets. The first data set, captured over Fairfield (Australia) using an Optech ALTM 3025 laser scanner, was also used in the earlier study. The second data set was captured over Memmingen (Germany) with a TopoSys scanner. Both cover an area of 2 × 2 km², and both contain the first and the last echoes of the laser beam. The characteristics of the two test areas are quite different. Fairfield covers a suburban area with low density of development in the southwest half of

The original algorithm for building detection

The input to our method comprises four data sets that have to be generated from the raw data by pre-processing: the two DSM grids corresponding to the first and the last pulse data; a Digital Terrain Model (DTM); and the NDVI. The DTM can be derived from the last pulse DSM by hierarchic morphologic filtering (Rottensteiner et al., 2005) or by robust linear prediction (Rottensteiner and Briese, 2002). In this research we generate the DTM in a pre-processing step using the second method. The

Evaluation of the statistical model and improvements to the algorithm

In this section, we will describe changes to the models for the probability masses in the initial land cover classification that are essential so that the models are more realistic, and a new post-classification technique. The section also contains a discussion on the selection of the model parameters and an empirical validation of our theoretical models.

Methodology

The methodology for evaluation is based on the technique for comparing the classification results with a reference data set described in Rottensteiner et al. (2005), namely on a comparison of two label images: the “automatic label image”, i.e. the output of the building detection algorithm, and the “reference label image” that is generated by rasterizing the reference polygons. We are interested in determining completeness and correctness according to Eqs. (1), (2) for two types of entities.
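One way to compare two label images is cell by cell. The sketch below counts true positives, false positives, and false negatives on a per-pixel basis and then applies Eqs. (1), (2). It assumes that both images have the same size and that 0 marks background while any non-zero value is a building label; these conventions, the function name, and the sample grids are assumptions, since the text shown here does not specify them.

```python
def count_pixels(automatic, reference):
    """Compare two equally sized label images pixel by pixel.

    Returns (tp, fp, fn), where a pixel counts as 'building' if its label
    is non-zero (an assumed convention; 0 is treated as background).
    """
    tp = fp = fn = 0
    for auto_row, ref_row in zip(automatic, reference):
        for auto_label, ref_label in zip(auto_row, ref_row):
            if auto_label and ref_label:
                tp += 1  # building in both images
            elif auto_label:
                fp += 1  # detected, but not in the reference
            elif ref_label:
                fn += 1  # in the reference, but not detected
    return tp, fp, fn


# Hypothetical 3 x 3 label images.
automatic = [[1, 1, 0],
             [0, 2, 0],
             [0, 0, 0]]
reference = [[1, 0, 0],
             [0, 1, 1],
             [0, 0, 0]]

tp, fp, fn = count_pixels(automatic, reference)
pixel_completeness = tp / (tp + fn)
pixel_correctness = tp / (tp + fp)
```

Note that the label values themselves need not agree between the two images (building 2 in the automatic image overlaps building 1 in the reference); only the building versus background decision is compared here.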

Conclusion

We have presented a method for building detection based on Dempster–Shafer fusion of ALS data and multi-spectral images. We have validated the assumptions of the model for assigning probability masses using two data sets comprising both different sensor and scene characteristics. For the pixel-based classification we found simple rules for setting the parameters of that model if an estimate for the area covered by trees is known. This was made possible by a re-parameterisation of the model for

Acknowledgements

This work was supported by the Australian Research Council (ARC) under Discovery Project DP0344678 and Linkage Project LP0230563. The Fairfield data set was provided by AAM Hatch (www.aamhatch.com.au). The Memmingen data set was provided by TopoSys (www.toposys.com). The authors want to thank the students Christian Eberhöfer, Werner Mücke, and Gerhard Summer for carrying out the experiments described in this paper.
