Detection of fallen trees in ALS point clouds using a Normalized Cut approach trained by simulation

https://doi.org/10.1016/j.isprsjprs.2015.01.010

Abstract

Downed dead wood is regarded as an important part of forest ecosystems from an ecological perspective, which drives the need for investigating its spatial distribution. Based on several studies, Airborne Laser Scanning (ALS) has proven to be a valuable remote sensing technique for obtaining such information. This paper describes a unified approach to the detection of fallen trees from ALS point clouds based on merging short segments into whole stems using the Normalized Cut algorithm. We introduce a new method of defining the segment similarity function for the clustering procedure, where the attribute weights are learned from labeled data. Based on a relationship between Normalized Cut’s similarity function and a class of regression models, we show how to learn the similarity function by training a classifier. Furthermore, we propose using an appearance-based stopping criterion for the graph cut algorithm as an alternative to the standard Normalized Cut threshold approach. We set up a virtual fallen tree generation scheme to simulate complex forest scenarios with multiple overlapping fallen stems. This simulated data is then used as a basis to learn both the similarity function and the stopping criterion for Normalized Cut. We evaluate our approach on 5 plots from the strictly protected mixed mountain forest within the Bavarian Forest National Park using reference data obtained via a manual field inventory. The experimental results show that our method is able to detect up to 90% of fallen stems in plots having 30–40% overstory cover with a correctness exceeding 80%, even in quite complex forest scenes. Moreover, the performance for feature weights trained on simulated data is competitive with the case when the weights are calculated using a grid search on the test data, which indicates that the learned similarity function and stopping criterion can generalize well on new plots.

Introduction

Lying dead wood is considered an important component of forest ecosystems from an ecological perspective due to its role in providing habitat for various plants and animals (Siitonen et al., 2000), in facilitating tree regeneration (Weaver et al., 2009), and in forest nutrient cycles through its contribution to soil organic material (Harmon et al., 1986). Dead wood is also a significant element of the total carbon stock in forests (Woodall et al., 2008). For these reasons, qualitative and quantitative information about the spatial distribution of dead wood in forests is important to any organization interested in monitoring biodiversity, carbon sequestration and wildlife habitats.

During the last decade, Airborne Laser Scanning (ALS) has become an established technique for carrying out forest inventory tasks (Hyyppä et al., 2012). The emergence of new, commercially available LiDAR hardware platforms has stimulated research in this field, leading to the development of methods for estimating various forest parameters and thereby offering an alternative to time-consuming field inventories. Airborne LiDAR has been successfully deployed for deriving parameters of entire stands, such as leaf area index (LAI) (Morsdorf et al., 2006), forest land cover characteristics (Antonarakis et al., 2008), mean tree height (Naesset, 1997) and 3D vegetation structure (Lindberg et al., 2012). This is usually done by building a regression model whose independent variables are derived from characteristics of the ALS points and whose dependent variable corresponds to the forest parameter of interest, with the training data obtained through field work. With the gradual increase in precision of laser measurement hardware, and hence in the density of the ALS point cloud, single-tree approaches have become feasible. Consequently, there have been numerous contributions dealing with the detection and delineation of individual trees from LiDAR data. One of the fundamental research topics concerns tree species classification (Reitberger et al., 2008, Heurich, 2008, Holmgren and Persson, 2004). Obtaining the species label for each individual tree is not only a goal in itself: deriving species-specific models for other forest parameters of interest may increase the overall estimation accuracy. Other single-tree applications include detection of standing dead trees (Yao et al., 2012b), estimation of stem volume and DBH (Yao et al., 2012a), as well as biomass (Kankare et al., 2013).
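As a minimal illustration of this area-based regression workflow (a sketch, not any of the cited methods), the following code fits an ordinary least-squares model relating plot-level ALS metrics to a field-measured parameter. All variable names and numbers are hypothetical:

```python
import numpy as np

# Hypothetical plot-level ALS metrics: each row holds, e.g., mean return
# height (m), 90th height percentile (m) and canopy density for one plot.
X = np.array([[12.1, 18.3, 0.62],
              [ 8.4, 13.9, 0.47],
              [15.0, 22.8, 0.71],
              [10.2, 16.5, 0.55]])
# Field-measured response for the same plots, e.g. stand volume (m^3/ha).
y = np.array([310.0, 190.0, 420.0, 260.0])

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Predict the parameter for a new plot from its ALS metrics.
x_new = np.array([11.0, 17.0, 0.58])
y_hat = coef[0] + x_new @ coef[1:]
```

In practice such models are built from many field plots and carefully chosen point cloud metrics; the sketch only shows the structural link between ALS-derived predictors and a field-measured response.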

While there have been several contributions related to area-based estimation of the proportion (Vehmas et al., 2011) and volume (Pesonen et al., 2008) of downed stems using the aforementioned regression techniques, the problem of detecting individual fallen trees from ALS point clouds is relatively new. Among the first methodological studies was the work of Blanchard et al. (2011). This approach relies on rasterizing the point cloud with respect to different metrics which quantify elevation and scene information. A vector thematic layer is created and polygons are manually masked in order to retain those that best match the shape of a downed stem. The layer is refined by setting various thresholds determined through visual interpretation. Finally, a multiresolution, bottom-up pairwise region merging algorithm is applied to the raster and vector layers to produce an object-level segmentation. Each object is then classified into one of three categories (downed logs, canopy cover and ground) using a total of 113 individual refinement and classification rules. The authors report a classification accuracy of 73% and note that their approach has difficulty in situations such as proximity of the stems to ground vegetation as well as large clusters of logs. Also, constructing the classification rules and setting the various thresholds based on visual inspection may take days of the analyst’s time, and the result is not easily transferable to new plots. Although the evaluation of the results was carried out automatically based on the spatial overlap of polygons, the fallen tree reference data itself does not originate from field work; instead, it was derived from the input LiDAR point cloud in combination with orthophotographs. Muecke et al. (2013) also perform the classification on a vectorized layer, derived from the binarized point cloud filtered by distance to the DTM.
Additionally, a pulse echo width filter is applied to remove ground vegetation and shrubs. The authors show that it is possible to reliably detect stems which stand out well above the DTM. However, they note that applying the pulse echo width filter together with the vectorized object shape features is not always enough to separate stems from densely intertwined ground vegetation and piles of twigs. The classification relies on the area-perimeter ratio of the polygons lying within a specified range corresponding to the shape of elongated objects. This is a potential limitation of their method, since for multiple overlapping stems lying in close proximity, the vectorized polygon may have a shape that is not consistent with such an area-perimeter ratio. The evaluation of the results is performed manually by a human expert, which introduces a degree of subjectivity. The next study, by Lindberg et al. (2013), is based on analyzing the height homogeneity of points in raster cells at two resolutions. A line template matching algorithm is proposed, where the templates have a rectangular shape of fixed width and height. A voting scheme is then employed, wherein each raster cell votes for the line parameters most supported by the points inside it. An important methodological advancement is the attempt to model not only the candidate stem, but also its immediate surroundings, in order to distinguish actual fallen trees from ground vegetation or other non-relevant objects. A potential drawback of this approach is that the entire modeling of shape is done explicitly, using quite simple features and a multitude of user-defined thresholds. The authors themselves note that the choice of parameters was geared towards the characteristics of the study area. Also, the method only detects tree segments of equal length, and there is no attempt to merge individual segments.
This precludes successful detection of trees whose parts are not ideally linear. Finally, automatic validation of the obtained results was made difficult by positioning errors in the field measurements, which resulted in a rather lenient criterion for matching the detected and reference trees (10 m horizontal distance and 30° deviation in planimetric direction). In a recent study, Nyström et al. (2014) also apply line template matching. In contrast to Lindberg et al. (2013), they first derive an object height model (OHM) based on the active contour surface algorithm (Elmqvist, 2002). This OHM layer is defined on a raster grid with a resolution of 10 cm and represents the DTM-normalized heights of objects close to the ground. It can be seen as an alternative to the moving least squares interpolation used by Muecke et al. (2013). The matching is done on the OHM raster cells with rectangular templates of fixed length and four width levels. The templates having the highest correlation with the OHM are iteratively picked as stem candidates. Each candidate then undergoes a classification step using linear discriminant analysis, manually trained on a set of positive and negative examples characterized by simple point height statistics within the template. The purpose of this step is to distinguish trees from other linear objects, e.g. edges, ditches and roads. The fixed length of the templates and the lack of any merging step make it impossible for this approach to detect trees of differing lengths and of more complex shapes. Finally, similarly to the previous study, the evaluation method for linking reference and detected trees seems too permissive, as it allows a maximum distance of 12 m and up to 30° of directional deviation.
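To make the line template matching idea concrete, the following is a deliberately simplified, axis-aligned sketch: every rectangular window of an object height model (OHM) raster is scored by its mean normalized height, and the best-scoring window is returned. The cited methods are more elaborate (rotated templates, multiple widths, modeling of the stem's surroundings); the toy OHM below is invented for illustration:

```python
import numpy as np

def match_line_template(ohm, length, width):
    """Score each axis-aligned width-by-length window of the OHM raster
    by its mean normalized height and return (top_left_cell, score) of
    the best window. A crude matched filter for an elongated object."""
    best_score, best_pos = -np.inf, None
    rows, cols = ohm.shape
    for r in range(rows - width + 1):
        for c in range(cols - length + 1):
            score = ohm[r:r + width, c:c + length].mean()
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

# Toy OHM: a 1-cell-wide, 5-cell-long "stem" lying 0.3 m above flat ground.
ohm = np.zeros((8, 10))
ohm[4, 2:7] = 0.3
pos, score = match_line_template(ohm, length=5, width=1)
```

On this toy raster, the best window coincides with the synthetic stem. A fixed `length`, as in the sketch, is exactly the limitation discussed in the text: stems longer, shorter, or bent relative to the template cannot be recovered as single objects without a subsequent merging step.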

The quantitative results presented in the aforementioned studies seem to indicate that correctly detecting individual stems in the presence of ground vegetation and within complicated scenarios of multiple overlapping trees is a challenging task (see Fig. 1). We believe that one of the shortcomings of the presented methods is that they attempt to model quite complex objects using either multiple user-defined thresholds, or inadequate features which fail to capture the full extent of the variability of the target objects’ appearance. Another problem is the use of fixed-length templates, which further constrains the class of fallen tree shapes that can be detected. To mitigate these deficiencies, we propose the use of highly expressive 3D shape descriptors in a machine learning framework where the appearance can be learned from reference data either labeled by the user or obtained via simulation. Our approach consists of two major parts. In the first step, stem segments of equal length are detected, yielding a set of primitives. In the second step, these segments are clustered together to form entire fallen trees. This is achieved by means of the Normalized Cut algorithm (Shi and Malik, 2000), a well-established spectral clustering method which has been successfully applied in numerous fields, including computer vision, speech processing, and also remote sensing (Reitberger et al., 2009). To employ Normalized Cut, a domain-specific similarity function must be specified on the space of input object pairs. Usually, the original exponential model reported by Shi and Malik is used, where each feature possesses an associated weight that controls the magnitude of its contribution. However, the individual feature weights are often very application-specific, and finding a satisfactory set of weights for a new clustering scenario may require significant effort. 
The clustering method itself does not provide any recipe for finding the correct weights, and hence this search is in practice conducted on the basis of intuition, trial and error, or a coarse grid of discretized weight values. These approximate approaches are either computationally expensive or do not provide a substantial improvement over generating the weights at random. This motivates research into ways of automatically learning the similarity function from reference segmentations. Several contributions attempt to incorporate prior knowledge (Maji et al., 2011) or constraints (Yu and Shi, 2001) into the clustering process. Perhaps the most relevant to our work is that of Bach and Jordan (2003), who propose a new formulation of the k-way Normalized Cut optimization objective which makes it possible to learn the similarity matrix from a known reference segmentation. Their approach relies on approximately optimizing the spectral clustering cost function directly with respect to the similarity matrix, which can lead to difficult eigenvalue problems. In this paper, the Normalized Cut similarity function is viewed as a probabilistic binary classifier which, given an input object pair, estimates the likelihood that both objects belong to the same cluster. We build upon the assumption that a similarity function which achieves high accuracy as a classifier will also yield high-quality clustering results when used with Normalized Cut. Such a relationship would be useful in that it would reduce the problem of learning a similarity function to the much simpler and well-studied task of learning a classifier from labeled data. Another important aspect of the Normalized Cut algorithm is the partition stopping criterion. The standard approach is to use a threshold on the Ncut value, which can be seen as a measure of similarity between the two partitioned subsets, calculated using the object similarity matrix.
A high Ncut value indicates that the proposed subdivision results in two parts that are not very distinct, which prompts the end of the partitioning. One problem with this method is that the Ncut value is tightly coupled to the scale of the weight matrix, and hence it is hard to predict whether a given threshold value will be applicable to diverse input data. Instead, we postulate the use of a stopping criterion that is based purely on the appearance of the portion of the scene currently being partitioned. This can also be regarded as a classification problem, where we want to detect the situation in which the point cloud subset represents a single target object. We define a virtual sample generation scheme for synthesizing complex stem scenarios and use the generated exemplars as training data to learn both the similarity function and the stopping criterion for merging segments into entire fallen trees.
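For concreteness, the classic exponential similarity model of Shi and Malik and the Ncut value of a bipartition can be sketched as follows. This is a toy illustration of the standard quantities discussed above, not the paper's trained similarity function; the feature values and similarity matrix are invented:

```python
import numpy as np

def exponential_similarity(f_i, f_j, weights):
    # Shi-Malik style similarity: exponential of the negative weighted
    # sum of squared feature differences. `weights` are the per-feature
    # weights whose manual tuning the text argues against.
    d = np.asarray(f_i) - np.asarray(f_j)
    return float(np.exp(-np.sum(weights * d ** 2)))

def ncut_value(W, mask):
    # Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V),
    # where W is the symmetric similarity matrix and `mask` marks the
    # vertices belonging to subset A.
    A, B = mask, ~mask
    cut = W[np.ix_(A, B)].sum()
    return cut / W[A, :].sum() + cut / W[B, :].sum()

# Toy similarity matrix with two well-separated clusters {0,1} and {2,3}.
W = np.array([[0.00, 1.00, 0.01, 0.01],
              [1.00, 0.00, 0.01, 0.01],
              [0.01, 0.01, 0.00, 1.00],
              [0.01, 0.01, 1.00, 0.00]])
good = ncut_value(W, np.array([True, True, False, False]))  # low: clean split
bad = ncut_value(W, np.array([True, False, True, False]))   # high: mixed split
```

The example also makes the scale problem visible: multiplying the feature weights (and hence the entries of W) rescales the similarities nonlinearly, so a fixed Ncut threshold that works for one data set need not transfer to another, which motivates the appearance-based stopping criterion proposed here.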

We consider the main contributions of this paper to be (i) the idea of approximating the problem of learning the similarity function and stopping criterion for Normalized Cut by the problem of training binary classifiers, (ii) the physical simulation of fallen stems, which serves as a basis for learning the similarity function and stopping criterion, and (iii) the entire pipeline for fallen stem detection, which is able to generalize across different plot conditions. Also, this is the first study we know of in which the reference data is accurate enough to enable a detailed automatic evaluation of the detected stems. This paper is an extended version of Polewski et al. (2014a). Here, we additionally introduce the adaptive stopping criterion as well as the skeletonization algorithm, and evaluate the performance of the complete detection workflow against ground truth obtained from field work, including an error analysis of selected pipeline steps. The remainder of this work is structured as follows: in Section 2 we explain the details of our approach, including the mathematical background. Section 3 describes the study area, experiments and evaluation strategy. The results are presented and discussed in Section 4. Finally, conclusions are stated in Section 5.

Section snippets

Overview of strategy for detecting fallen trees

We develop an integrated, data-driven method for detecting single tree stems from unstructured ALS point clouds given by the three-dimensional coordinates of their constituent points. The approach is suitable for both discrete and full waveform data. The output of our method is a list of subsets of the original points which correspond to individual detected fallen trees. We assume that reference data in the form of accurate fallen stem positions, lengths and diameters is available. The entire

Material

We tested the proposed method on five sample plots located in the Bavarian Forest National Park (49°3′19″N, 13°12′9″E), which is situated in South-Eastern Germany along the border to the Czech Republic. The study was carried out in the mountain mixed forests zone consisting mostly of Norway spruce (Picea abies) and European beech (Fagus sylvatica). From 1988 to 2010, a total of 5800 ha of the Norway spruce stands died off because of a bark beetle (Ips typographus) infestation (Lausch et al., 2013

Evaluation of entire pipeline

Fig. 11, Fig. 12 show examples of detection results for all test plots. The detection performance of the entire pipeline is depicted by Fig. 13, Fig. 14 for the deciduous (1–3) and coniferous (4–5) plots, respectively. The three deciduous datasets share the characteristic that at a detection threshold of 40% of the tree’s length, over 80% of trees can be detected with an accuracy of ca. 0.8. For Plots 1 and 3, we can further improve the 40% completeness to over 0.9 by trading for some correctness.

Conclusions and outlook

In this work we have presented an integrated method for detecting fallen stems from ALS point clouds. The method proceeds in a bottom-up fashion by building up an understanding of the scene starting from the single point level, through the segment level, to the object (fallen tree) perspective. A set of differential features has been proposed which forms the basis for merging stem segments into entire stems using the Normalized Cut algorithm. We have then shown that manipulating the attribute

References (40)

  • M. Nyström et al., Detection of windthrown trees using airborne laser scanning, Int. J. Appl. Earth Obs. (2014)
  • A. Pesonen et al., Airborne laser scanning-based prediction of coarse woody debris volumes in a conservation area, For. Ecol. Manage. (2008)
  • J. Reitberger et al., 3D segmentation of single trees exploiting full waveform LIDAR data, ISPRS J. Photogramm. (2009)
  • J. Siitonen et al., Coarse woody debris and stand characteristics in mature managed and old-growth boreal mesic forests in southern Finland, For. Ecol. Manage. (2000)
  • J.K. Weaver et al., Decaying wood and tree regeneration in the Acadian Forest of Maine, USA, For. Ecol. Manage. (2009)
  • C. Woodall et al., National inventories of down and dead woody material forest carbon stocks in the United States: challenges and opportunities, For. Ecol. Manage. (2008)
  • W. Yao et al., Tree species classification and estimation of stem volume and DBH based on single tree extraction by exploiting airborne full-waveform LiDAR data, Rem. Sens. Environ. (2012)
  • B. Aronov et al., Polyline fitting of planar points under min-sum criteria
  • F.R. Bach, M.I. Jordan, Learning spectral clustering. In: Adv. Neur. In., vol.... (2003)
  • S.D. Blanchard et al., Object-based image analysis of downed logs in disturbed forested landscapes using lidar, Rem. Sens. (2011)