Detection of fallen trees in ALS point clouds using a Normalized Cut approach trained by simulation

https://doi.org/10.1016/j.isprsjprs.2015.01.010

Abstract

Downed dead wood is regarded as an important part of forest ecosystems from an ecological perspective, which drives the need for investigating its spatial distribution. Based on several studies, Airborne Laser Scanning (ALS) has proven to be a valuable remote sensing technique for obtaining such information. This paper describes a unified approach to the detection of fallen trees from ALS point clouds based on merging short segments into whole stems using the Normalized Cut algorithm. We introduce a new method of defining the segment similarity function for the clustering procedure, where the attribute weights are learned from labeled data. Based on a relationship between Normalized Cut’s similarity function and a class of regression models, we show how to learn the similarity function by training a classifier. Furthermore, we propose using an appearance-based stopping criterion for the graph cut algorithm as an alternative to the standard Normalized Cut threshold approach. We set up a virtual fallen tree generation scheme to simulate complex forest scenarios with multiple overlapping fallen stems. This simulated data is then used as a basis to learn both the similarity function and the stopping criterion for Normalized Cut. We evaluate our approach on 5 plots from the strictly protected mixed mountain forest within the Bavarian Forest National Park using reference data obtained via a manual field inventory. The experimental results show that our method is able to detect up to 90% of fallen stems in plots having 30–40% overstory cover with a correctness exceeding 80%, even in quite complex forest scenes. Moreover, the performance for feature weights trained on simulated data is competitive with the case when the weights are calculated using a grid search on the test data, which indicates that the learned similarity function and stopping criterion can generalize well on new plots.

Introduction

Lying dead wood is considered an important component of forest ecosystems from an ecological perspective due to its role in providing habitat for various plants and animals (Siitonen et al., 2000), in facilitating tree regeneration (Weaver et al., 2009), and in forest nutrient cycles through its contribution to soil organic material (Harmon et al., 1986). Dead wood is also a significant element of the total carbon stock in forests (Woodall et al., 2008). For these reasons, qualitative and quantitative information about the spatial distribution of dead wood in forests is important to any organization interested in monitoring biodiversity, carbon sequestration and wildlife habitats.

During the last decade, Airborne Laser Scanning (ALS) has become an established technique for carrying out forest inventory tasks (Hyyppä et al., 2012). The emergence of new, commercially available LiDAR hardware platforms has stimulated research in this field, leading to the development of methods for estimating various forest parameters and thereby offering an alternative to time-consuming field inventories. Airborne LiDAR has been successfully deployed for deriving parameters of entire stands, such as leaf area index (LAI) (Morsdorf et al., 2006), forest land cover characteristics (Antonarakis et al., 2008), mean tree height (Naesset, 1997) and 3D vegetation structure (Lindberg et al., 2012). This is usually done by building a regression model whose independent variables are derived from characteristics of the ALS points and whose dependent variable corresponds to the forest parameter of interest, with the training data obtained through field work. With the gradual increase in precision of laser measurement hardware, and hence in the density of the ALS point cloud, single-tree approaches have become feasible. Consequently, there have been numerous contributions dealing with the detection and delineation of individual trees from LiDAR data. One of the fundamental research topics concerns tree species classification (Reitberger et al., 2008, Heurich, 2008, Holmgren and Persson, 2004). Obtaining the species label for each individual tree is not only a goal in itself: deriving species-specific models for other forest parameters of interest may increase the overall estimation accuracy. Other single-tree applications include detection of standing dead trees (Yao et al., 2012b), estimation of stem volume and DBH (Yao et al., 2012a), as well as biomass (Kankare et al., 2013).
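As a minimal illustration of this area-based regression workflow (a sketch, not any of the cited methods), the following code fits an ordinary least-squares model relating plot-level ALS metrics to a field-measured parameter. All variable names and numbers are hypothetical:

```python
import numpy as np

# Hypothetical plot-level ALS metrics: each row holds, e.g., mean return
# height (m), 90th height percentile (m) and canopy density for one plot.
X = np.array([[12.1, 18.3, 0.62],
              [ 8.4, 13.9, 0.47],
              [15.0, 22.8, 0.71],
              [10.2, 16.5, 0.55]])
# Field-measured response for the same plots, e.g. stand volume (m^3/ha).
y = np.array([310.0, 190.0, 420.0, 260.0])

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Predict the parameter for a new plot from its ALS metrics.
x_new = np.array([11.0, 17.0, 0.58])
y_hat = coef[0] + x_new @ coef[1:]
```

In practice such models are built from many field plots and carefully chosen point cloud metrics; the sketch only shows the structural link between ALS-derived predictors and a field-measured response.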

While there have been several contributions related to area-based estimation of the proportion (Vehmas et al., 2011) and volume (Pesonen et al., 2008) of downed stems using the aforementioned regression techniques, the problem of detecting individual fallen trees from ALS point clouds is relatively new. Among the first methodological studies was the work of Blanchard et al. (2011). This approach relies on rasterizing the point cloud with respect to different metrics which quantify elevation and scene information. A vector thematic layer is created and polygons are manually masked in order to retain those that best match the shape of a downed stem. The layer is refined by setting various thresholds determined through visual interpretation. Finally, a multiresolution, bottom-up pairwise region merging algorithm is applied to the raster and vector layers to produce an object-level segmentation. Each object is then classified into one of three categories (downed logs, canopy cover and ground) using a total of 113 individual refinement and classification rules. The authors report a classification accuracy of 73% and note that their approach has difficulty in situations such as proximity of the stems to ground vegetation as well as large clusters of logs. Also, constructing the classification rules and setting the various thresholds based on visual inspection may take days of the analyst’s time, and the result is not easily transferable to new plots. Although the evaluation of the results was carried out automatically based on the spatial overlap of polygons, the fallen tree reference data itself does not originate from field work; instead, it was derived from the input LiDAR point cloud in combination with orthophotographs. Muecke et al. (2013) also perform the classification on a vectorized layer, derived from the binarized point cloud filtered by distance to the DTM.
Additionally, a pulse echo width filter is applied to remove ground vegetation and shrubs. The authors show that it is possible to reliably detect stems which stand out well above the DTM. However, they note that applying the pulse echo width filter together with the vectorized object shape features is not always enough to separate stems from densely intertwined ground vegetation and piles of twigs. The classification relies on the area-perimeter ratio of the polygons lying within a specified range corresponding to the shape of elongated objects. This is a potential limitation of their method, since for multiple overlapping stems lying in close proximity, the vectorized polygon may have a shape that is not consistent with such an area-perimeter ratio. The evaluation of the results is performed manually by a human expert, which introduces a degree of subjectivity. The next study, by Lindberg et al. (2013), is based on analyzing the height homogeneity of points in raster cells at two resolutions. A line template matching algorithm is proposed, where the templates have a rectangular shape of fixed width and height. A voting scheme is then employed, wherein each raster cell votes for the line parameters most supported by the points inside it. An important methodological advancement is the attempt to model not only the candidate stem, but also its immediate surroundings, in order to distinguish actual fallen trees from ground vegetation or other non-relevant objects. A potential drawback of this approach is that the entire modeling of shape is done explicitly, using quite simple features and a multitude of user-defined thresholds. The authors themselves note that the choice of parameters was geared towards the characteristics of the study area. Also, the method only detects tree segments of equal length, and there is no attempt to merge individual segments.
This precludes successful detection of trees whose parts are not ideally linear. Finally, automatic validation of the obtained results was made difficult by positioning errors in the field measurements, which resulted in a rather lenient criterion for matching the detected and reference trees (10 m horizontal distance and 30° deviation in planimetric direction). In a recent study, Nyström et al. (2014) also apply line template matching. In contrast to Lindberg et al. (2013), they first derive an object height model (OHM) based on the active contour surface algorithm (Elmqvist, 2002). This OHM layer is defined on a raster grid with a resolution of 10 cm and represents the DTM-normalized heights of objects close to the ground. It can be seen as an alternative to the moving least squares interpolation used by Muecke et al. (2013). The matching is done on the OHM raster cells with rectangular templates of fixed length and four width levels. The templates having the highest correlation with the OHM are iteratively picked as stem candidates. Each candidate then undergoes a classification step using linear discriminant analysis, manually trained on a set of positive and negative examples characterized by simple point height statistics within the template. The purpose of this step is to distinguish trees from other linear objects, e.g. edges, ditches and roads. The fixed length of the templates and the lack of any merging step make it impossible for this approach to detect trees of differing lengths and of more complex shapes. Finally, similarly to the previous study, the evaluation method for linking reference and detected trees seems too permissive, as it allows a maximum distance of 12 m and up to 30° of directional deviation.
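To make the line template matching idea concrete, the following is a deliberately simplified, axis-aligned sketch: every rectangular window of an object height model (OHM) raster is scored by its mean normalized height, and the best-scoring window is returned. The cited methods are more elaborate (rotated templates, multiple widths, modeling of the stem's surroundings); the toy OHM below is invented for illustration:

```python
import numpy as np

def match_line_template(ohm, length, width):
    """Score each axis-aligned width-by-length window of the OHM raster
    by its mean normalized height and return (top_left_cell, score) of
    the best window. A crude matched filter for an elongated object."""
    best_score, best_pos = -np.inf, None
    rows, cols = ohm.shape
    for r in range(rows - width + 1):
        for c in range(cols - length + 1):
            score = ohm[r:r + width, c:c + length].mean()
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

# Toy OHM: a 1-cell-wide, 5-cell-long "stem" lying 0.3 m above flat ground.
ohm = np.zeros((8, 10))
ohm[4, 2:7] = 0.3
pos, score = match_line_template(ohm, length=5, width=1)
```

On this toy raster, the best window coincides with the synthetic stem. A fixed `length`, as in the sketch, is exactly the limitation discussed in the text: stems longer, shorter, or bent relative to the template cannot be recovered as single objects without a subsequent merging step.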

The quantitative results presented in the aforementioned studies seem to indicate that correctly detecting individual stems in the presence of ground vegetation and within complicated scenarios of multiple overlapping trees is a challenging task (see Fig. 1). We believe that one of the shortcomings of the presented methods is that they attempt to model quite complex objects using either multiple user-defined thresholds, or inadequate features which fail to capture the full extent of the variability of the target objects’ appearance. Another problem is the use of fixed-length templates, which further constrains the class of fallen tree shapes that can be detected. To mitigate these deficiencies, we propose the use of highly expressive 3D shape descriptors in a machine learning framework where the appearance can be learned from reference data either labeled by the user or obtained via simulation. Our approach consists of two major parts. In the first step, stem segments of equal length are detected, yielding a set of primitives. In the second step, these segments are clustered together to form entire fallen trees. This is achieved by means of the Normalized Cut algorithm (Shi and Malik, 2000), a well-established spectral clustering method which has been successfully applied in numerous fields, including computer vision, speech processing, and also remote sensing (Reitberger et al., 2009). To employ Normalized Cut, a domain-specific similarity function must be specified on the space of input object pairs. Usually, the original exponential model reported by Shi and Malik is used, where each feature possesses an associated weight that controls the magnitude of its contribution. However, the individual feature weights are often very application-specific, and finding a satisfactory set of weights for a new clustering scenario may require significant effort. 
The clustering method itself does not provide any recipe for finding the correct weights, and hence this search is in practice conducted on the basis of intuition, trial and error, or a coarse grid of discretized weight values. These approximate approaches are either computationally expensive or do not provide a substantial improvement over generating the weights at random. This motivates research into ways of automatically learning the similarity function from reference segmentations. Several contributions attempt to incorporate prior knowledge (Maji et al., 2011) or constraints (Yu and Shi, 2001) into the clustering process. Perhaps the most relevant to our work is that of Bach and Jordan (2003), who propose a new formulation of the k-way Normalized Cut optimization objective which makes it possible to learn the similarity matrix from a known reference segmentation. Their approach relies on approximately optimizing the spectral clustering cost function directly with respect to the similarity matrix, which can lead to difficult eigenvalue problems. In this paper, the Normalized Cut similarity function is viewed as a probabilistic binary classifier which, given an input object pair, estimates the likelihood that both objects belong to the same cluster. We build upon the assumption that a similarity function which achieves high accuracy as a classifier will also yield high-quality clustering results when used with Normalized Cut. Such a relationship would be useful in that it would reduce the problem of learning a similarity function to the much simpler and well-studied task of learning a classifier from labeled data. Another important aspect of the Normalized Cut algorithm is the partition stopping criterion. The standard approach is to use a threshold on the Ncut value, which can be seen as a measure of similarity between the two partitioned subsets, calculated using the object similarity matrix.
A high Ncut value indicates that the proposed subdivision results in two parts that are not very distinct, which prompts the end of the partitioning. One problem with this method is that the Ncut value is tightly coupled to the scale of the weight matrix, and hence it is hard to predict whether a given threshold value will be applicable to diverse input data. Instead, we postulate the use of a stopping criterion that is based purely on the appearance of the portion of the scene currently being partitioned. This can also be regarded as a classification problem, where we want to detect the situation in which the point cloud subset represents a single target object. We define a virtual sample generation scheme for synthesizing complex stem scenarios and use the generated exemplars as training data to learn both the similarity function and the stopping criterion for merging segments into entire fallen trees.
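For concreteness, the classic exponential similarity model of Shi and Malik and the Ncut value of a bipartition can be sketched as follows. This is a toy illustration of the standard quantities discussed above, not the paper's trained similarity function; the feature values and similarity matrix are invented:

```python
import numpy as np

def exponential_similarity(f_i, f_j, weights):
    # Shi-Malik style similarity: exponential of the negative weighted
    # sum of squared feature differences. `weights` are the per-feature
    # weights whose manual tuning the text argues against.
    d = np.asarray(f_i) - np.asarray(f_j)
    return float(np.exp(-np.sum(weights * d ** 2)))

def ncut_value(W, mask):
    # Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V),
    # where W is the symmetric similarity matrix and `mask` marks the
    # vertices belonging to subset A.
    A, B = mask, ~mask
    cut = W[np.ix_(A, B)].sum()
    return cut / W[A, :].sum() + cut / W[B, :].sum()

# Toy similarity matrix with two well-separated clusters {0,1} and {2,3}.
W = np.array([[0.00, 1.00, 0.01, 0.01],
              [1.00, 0.00, 0.01, 0.01],
              [0.01, 0.01, 0.00, 1.00],
              [0.01, 0.01, 1.00, 0.00]])
good = ncut_value(W, np.array([True, True, False, False]))  # low: clean split
bad = ncut_value(W, np.array([True, False, True, False]))   # high: mixed split
```

The example also makes the scale problem visible: multiplying the feature weights (and hence the entries of W) rescales the similarities nonlinearly, so a fixed Ncut threshold that works for one data set need not transfer to another, which motivates the appearance-based stopping criterion proposed here.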

We consider the main contributions of this paper to be (i) the idea of approximating the problem of learning the similarity function and stopping criterion for Normalized Cut by the problem of training binary classifiers, (ii) the physical simulation of fallen stems, which serves as a basis for learning the similarity function and stopping criterion, and (iii) the entire pipeline for fallen stem detection, which is able to generalize across different plot conditions. Also, this is the first study we know of in which the reference data is accurate enough to enable a detailed automatic evaluation of the detected stems. This paper is an extended version of Polewski et al. (2014a). Here, we additionally introduce the adaptive stopping criterion as well as the skeletonization algorithm, and evaluate the performance of the complete detection workflow against ground truth obtained from field work, including an error analysis of selected pipeline steps. The remainder of this work is structured as follows: in Section 2 we explain the details of our approach, including the mathematical background. Section 3 describes the study area, experiments and evaluation strategy. The results are presented and discussed in Section 4. Finally, conclusions are stated in Section 5.

Section snippets

Overview of strategy for detecting fallen trees

We develop an integrated, data-driven method for detecting single tree stems from unstructured ALS point clouds given by the three-dimensional coordinates of their constituent points. The approach is suitable for both discrete and full waveform data. The output of our method is a list of subsets of the original points which correspond to individual detected fallen trees. We assume that reference data in the form of accurate fallen stem positions, lengths and diameters is available. The entire

Material

We tested the proposed method on five sample plots located in the Bavarian Forest National Park (49°3′19″N, 13°12′9″E), which is situated in South-Eastern Germany along the border to the Czech Republic. The study was carried out in the mountain mixed forests zone consisting mostly of Norway spruce (Picea abies) and European beech (Fagus sylvatica). From 1988 to 2010, a total of 5800 ha of the Norway spruce stands died off because of a bark beetle (Ips typographus) infestation (Lausch et al., 2013

Evaluation of entire pipeline

Fig. 11, Fig. 12 show examples of detection results for all test plots. The detection performance of the entire pipeline is depicted by Fig. 13, Fig. 14 for the deciduous (1–3) and coniferous (4–5) plots, respectively. The three deciduous datasets share the characteristic that at a detection threshold of 40% of the tree’s length, over 80% of trees can be detected with an accuracy of ca. 0.8. For Plots 1 and 3, we can further improve the 40% completeness to over 0.9 by trading for some correctness.

Conclusions and outlook

In this work we have presented an integrated method for detecting fallen stems from ALS point clouds. The method proceeds in a bottom-up fashion by building up an understanding of the scene starting from the single point level, through the segment level, to the object (fallen tree) perspective. A set of differential features has been proposed which forms the basis for merging stem segments into entire stems using the Normalized Cut algorithm. We have then shown that manipulating the attribute

References (40)

  • M. Nyström et al., Detection of windthrown trees using airborne laser scanning, Int. J. Appl. Earth Obs. (2014)
  • A. Pesonen et al., Airborne laser scanning-based prediction of coarse woody debris volumes in a conservation area, For. Ecol. Manage. (2008)
  • J. Reitberger et al., 3D segmentation of single trees exploiting full waveform LIDAR data, ISPRS J. Photogramm. (2009)
  • J. Siitonen et al., Coarse woody debris and stand characteristics in mature managed and old-growth boreal mesic forests in southern Finland, For. Ecol. Manage. (2000)
  • J.K. Weaver et al., Decaying wood and tree regeneration in the Acadian Forest of Maine, USA, For. Ecol. Manage. (2009)
  • C. Woodall et al., National inventories of down and dead woody material forest carbon stocks in the United States: challenges and opportunities, For. Ecol. Manage. (2008)
  • W. Yao et al., Tree species classification and estimation of stem volume and DBH based on single tree extraction by exploiting airborne full-waveform LiDAR data, Rem. Sens. Environ. (2012)
  • B. Aronov et al., Polyline fitting of planar points under min-sum criteria
  • F.R. Bach, M.I. Jordan, Learning spectral clustering. In: Adv. Neur. In., vol.... (2003)
  • S.D. Blanchard et al., Object-based image analysis of downed logs in disturbed forested landscapes using lidar, Rem. Sens. (2011)