Tools and Resources

Ecology

A remote sensing derived data set of 100 million individual tree crowns for the National Ecological Observatory Network

Department of Wildlife Ecology and Conservation, University of Florida, United States
School of Forest Resources and Conservation, University of Florida, United States
Department of Electrical and Computer Engineering, University of Florida, United States
Department of Agricultural & Biological Engineering, University of Florida, United States
Nelson Institute for Environmental Studies, University of Wisconsin-Madison, United States
Informatics Institute, University of Florida, United States
Biodiversity Institute, University of Florida, United States

Feb 19, 2021

https://doi.org/10.7554/eLife.62922

Open access

Version of Record

Accepted for publication after peer review and revision.


Version of Record published: February 19, 2021
Accepted Manuscript published: February 19, 2021
Accepted: February 15, 2021
Received: September 9, 2020


Abstract

Forests provide biodiversity, ecosystem, and economic services. Information on individual trees is important for understanding forest ecosystems but obtaining individual-level data at broad scales is challenging due to the costs and logistics of data collection. While advances in remote sensing techniques allow surveys of individual trees at unprecedented extents, there remain technical challenges in turning sensor data into tangible information. Using deep learning methods, we produced an open-source data set of individual-level crown estimates for 100 million trees at 37 sites across the United States surveyed by the National Ecological Observatory Network’s Airborne Observation Platform. Each canopy tree crown is represented by a rectangular bounding box and includes information on the height, crown area, and spatial location of the tree. These data have the potential to drive significant expansion of individual-level research on trees by facilitating both regional analyses and cross-region comparisons encompassing forest types from most of the United States.

Introduction

Trees are central organisms in maintaining the ecological function, biodiversity, and the health of the planet. There are estimated to be over three trillion individual trees on earth (Crowther et al., 2015) covering a broad range of environments and geography (Hansen et al., 2013). Counting and measuring trees is central to understanding key environmental and economic issues and has implications for global climate, land management, and wood production. Field-based surveys of trees are generally conducted at local scales (~0.1–100 ha) with measurements of attributes for individual trees within plots collected manually. Connecting these local scale measurements at the plot level to broad scale patterns is challenging because of spatial heterogeneity in forests. Many of the central processes in forests, including change in forest structure and function in response to disturbances such as hurricanes and pest outbreaks, and human modification through forest management and fire, occur at scales beyond those feasible for direct field measurement.

Satellite data with continuous global coverage have been used to quantify important patterns in forest ecology and management such as global tree cover dynamics and disturbances in temperate forests (e.g., Bastin et al., 2018). However, the spatial resolution of satellite data makes it difficult to detect and monitor individual trees that underlie large scale patterns. Individual level data is important for forest ecology, ecosystem services, and forestry applications because it connects sets of remote sensing pixels to a fundamental ecological, evolutionary, and economic unit used in analysis. Without grouping to the crown level, it becomes difficult to compare remotely sensed and field-based measurements on individual trees, since field surveys have no corresponding concept of pixels. In addition, characteristics such as species identity, structural traits, growth, and carbon storage potential are properties of individuals rather than pixels. Delineation of crowns also serves as a first step in species classification (Fassnacht et al., 2016), foliar trait mapping (Zheng et al., 2021), and analyses of tree mortality (Stovall et al., 2019).

High-resolution data from airborne sensors have become increasingly accessible, but converting the data into information on individual trees requires significant technical expertise and access to high-performance computing environments (Aubry-Kientz et al., 2019; Puliti et al., 2020). This prevents most ecologists, foresters, and managers from engaging with large scale data on individual trees, despite the availability of the underlying data products and broad importance for forest ecology and management. In response to the growing need for publicly available and standardized airborne remote sensing data over forested ecosystems, the National Ecological Observatory Network (NEON) is collecting multi-sensor data for more than 40 sites across the United States. We combine NEON sensor data with a semi-supervised deep learning approach (Weinstein et al., 2019; Weinstein et al., 2020b) to produce a data set on the location, height, and crown area of over 100 million individual canopy trees at 37 sites distributed across the United States. To make these data readily accessible, we are releasing easy to access data files to spur biological analyses and to facilitate model development for tasks that rely on individual tree prediction. We describe the components of this open-source data set, compare predicted crowns with hand-labeled crowns for a range of forest types, and discuss how this data set can be used in forest research.

Results

The NEON crowns data set

The NEON Crowns data set contains tree crowns for all canopy trees (those visible from airborne remote sensing) at 37 NEON sites. Since subcanopy trees are not visible from above, they are not included in this data set. We operationally define ‘trees’ as plants over 3 m tall. The 37 NEON sites represent all NEON sites containing trees with co-registered RGB and LiDAR data from 2018 or 2019 (see Figure 1 and Appendix 1 for a list of sites and their locations). Predictions were made using the most recent year for which images were available for each site.

Figure 1


Locations of 37 NEON sites included in the NEON crowns data set and examples of tree predictions shown with RGB imagery for six sites.

Clockwise from bottom right: (1) OSBS: Ordway-Swisher Biological Station, Florida (2) DELA: Dead Lake, Alabama, (3) SJER: San Joaquin Experimental Range, California, (4) WREF: Wind River Experimental Forest, Washington, (5) BONA: Caribou Creek, Alaska and (6) BART: Bartlett Experimental Forest, New Hampshire. Each predicted crown is associated with the spatial position, crown area, maximum height estimates from co-registered LiDAR data, and a predicted confidence score.

The data set includes a total of 104,675,304 crowns. Each predicted crown includes data on the spatial position of the crown bounding box, the area of the bounding box (an approximation of crown area), the 99th percentile of the height of LiDAR returns within the bounding box above ground level (an estimate of tree height), the year of sampling, the site where the tree is located, and a confidence score indicating the model confidence that the box represents a tree. The confidence score can vary from 0 to 1, but based on the results from Weinstein et al., 2020b, boxes with less than 0.15 confidence were not included in the data set.
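As a minimal sketch of how these per-crown attributes relate to a bounding box, the following Python computes the box area and applies the 0.15 confidence cutoff. The field names (`xmin`, `ymin`, `xmax`, `ymax`, `score`) and the sample values are assumptions made for illustration, not the documented schema of the released files.

```python
from dataclasses import dataclass

MIN_SCORE = 0.15  # confidence cutoff stated in the text


@dataclass
class Crown:
    # Hypothetical field names; coordinates are in meters.
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    score: float

    @property
    def area(self) -> float:
        """Bounding-box area, used as an approximation of crown area."""
        return (self.xmax - self.xmin) * (self.ymax - self.ymin)


def keep_confident(crowns, min_score=MIN_SCORE):
    """Keep boxes whose confidence is at least the cutoff."""
    return [c for c in crowns if c.score >= min_score]


# Made-up example boxes.
crowns = [
    Crown(0.0, 0.0, 4.0, 5.0, 0.80),
    Crown(10.0, 10.0, 12.0, 13.0, 0.10),
]
kept = keep_confident(crowns)
```

Because the released data already exclude boxes below 0.15, a filter like this would mainly be useful for applying a stricter cutoff in downstream analyses.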

The data set is provided in two formats: (1) as 11,000 individual files each covering a single 1 km² tile (geospatial ‘shapefiles’ in standard ESRI format); and (2) as 37 csv files, each covering an entire NEON site. Geospatial tiles have embedded spatial projection information and can be read in commonly available GIS software (e.g., ArcGIS, QGIS) and geospatial packages for most common programming languages used in data analysis (e.g., R, Python). All data are publicly available, openly licensed (CC-BY), and permanently archived on Zenodo (https://zenodo.org/record/3765872) (Weinstein et al., 2020a; Weinstein et al., 2020c; Weinstein et al., 2020b).
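The per-site csv files can be summarized with standard tooling. The sketch below uses only the Python standard library and an in-memory stand-in for one file; the column names (`left`, `bottom`, `right`, `top`, `score`, `height`, `area`) and all values are assumptions, so the header of the actual files should be checked before reuse.

```python
import csv
import io
from statistics import mean

# Stand-in for one per-site csv file. Column names and values are assumed.
sample_csv = io.StringIO(
    "left,bottom,right,top,score,height,area\n"
    "404000.0,3285000.0,404004.0,3285005.0,0.62,18.4,20.0\n"
    "404010.0,3285010.0,404013.0,3285013.0,0.33,9.1,9.0\n"
)


def summarize_site(handle):
    """Return the crown count and mean height for one site-level table."""
    rows = list(csv.DictReader(handle))
    heights = [float(r["height"]) for r in rows]
    return {"n_crowns": len(rows), "mean_height_m": mean(heights)}


summary = summarize_site(sample_csv)
```

For a file on disk, the same function could be called as `summarize_site(open(path, newline=""))`, where `path` is a hypothetical location of a downloaded site file.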

To support the visualization of the data set, we developed a web visualization tool using the ViSUS WebViewer (https://visus.org//) to allow users to view all of the trees at the full site scale with the ability to zoom and pan to examine individual groups of trees down to a scale of 20 m (see http://visualize.idtrees.org, Figure 2). This tool will allow the ecological community to assist in identifying areas in need of further refinement within the large area covered by the 37 sites.

Figure 2


The NEON Crowns data set provides individual-level tree predictions at broad scales.

An example from Bartlett Forest, NH shows the ability to continuously zoom from landscape level to stand level views. A single 1 km² tile is shown. NEON sites tend to have between 100 and 400 tiles in the full airborne footprint.

Materials and methods

Crown delineation


The location of individual tree crowns was estimated using a semi-supervised deep learning workflow (Figure 3) developed by Weinstein et al., 2020b, Weinstein et al., 2019, which is implemented in the ‘DeepForest’ Python package (Weinstein et al., 2020c). We extend the workflow by filtering trees using the LiDAR-derived canopy height model (CHM) to remove objects identified by the model with heights of <3 m (Supplementary material). The deep learning model uses a one-shot object detector with a convolutional neural network backbone to predict tree crowns in RGB imagery. The model was pre-trained first on ImageNet (Deng et al., 2009) and then using weak labels generated from a previously published LiDAR tree detection algorithm using NEON data from 30 sites (Silva et al., 2016). The model was then trained on 10,000 hand-annotated crowns from seven NEON sites (Figure 1). Hand-annotations included any vegetation over 3 m in height, including standing dead trees. The LiDAR-derived 3 m threshold is important in sparsely vegetated landscapes, such as oak savannah and deserts, where it was difficult for the model to distinguish between trees and low shrubs in the RGB imagery. We chose this approach because it is flexible enough to allow the data set to be updated and improved by integrating new data and modeling approaches and because it can be effectively applied at large scales with the remote sensing data available from NEON. This required a flexible method that: (1) avoided hand-tuned parameterizations for each site or ecosystem (Weinstein et al., 2020b), (2) accounted for the highly variable data spanning more than 10,000 tiles that included RGB artifacts and sparse LiDAR point densities, and (3) did not rely on site-specific or species information for allometric constraints on crown size (Duncanson et al., 2015; Fischer et al., 2020). For details of the underlying algorithms, see Weinstein et al., 2019, Weinstein et al., 2020b.
Relaxing each of these constraints opens areas of future improvement, especially once species information is available for each label (Maschler et al., 2018). For example, Duncanson and Dubayah, 2018 showed that site-specific allometric functions can be effective at Teakettle Canyon (TEAK) in predicting tree location and measuring growth over time.
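The height filter described above can be sketched as follows. This is an illustration under stated assumptions rather than the DeepForest implementation: the canopy height model is a small list-of-lists raster, boxes are given in pixel indices, and the box height is taken as a nearest-rank 99th percentile of the CHM cells inside the box. The function names are hypothetical.

```python
import math

MIN_HEIGHT_M = 3.0  # operational definition of a tree used in the text


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (one of several conventions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def box_height(chm, box, pct=99):
    """Height for a box given as (row_min, col_min, row_max, col_max), max-exclusive."""
    r0, c0, r1, c1 = box
    cells = [chm[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return percentile(cells, pct)


def filter_by_height(chm, boxes, min_height=MIN_HEIGHT_M):
    """Drop predicted boxes whose CHM-derived height is below the threshold."""
    return [b for b in boxes if box_height(chm, b) >= min_height]


# Made-up 3 x 4 canopy height model (meters): a tree on the left, shrubs on the right.
chm = [
    [12.0, 11.5, 0.5, 0.4],
    [12.5, 13.0, 0.8, 1.2],
    [0.2, 0.3, 1.0, 2.1],
]
boxes = [(0, 0, 2, 2), (0, 2, 3, 4)]
tall = filter_by_height(chm, boxes)
```

In this toy raster the left box is retained and the right box, whose tallest cell is 2.1 m, is removed, which mirrors the shrub-versus-tree case in sparsely vegetated sites.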

Figure 3


Workflow diagram adapted from Weinstein et al., 2020c.

The workflow for model training and development is identical to Weinstein et al., 2020c with the exception of extracting heights from the canopy height model for each bounding box prediction.

Evaluation and validation


The DeepForest method has been compared with leading tree crown detection tools that use an array of sensor data and algorithmic approaches. Weinstein et al., 2020b compared the approach to three commonly used LiDAR algorithms (Coomes et al., 2017; Li et al., 2012; Silva et al., 2016) in the lidR package (Roussel et al., 2020) and showed that DeepForest generalized better across forest types with higher precision and recall. Weinstein et al., 2020c evaluated DeepForest using the data from a recent crown delineation comparison from a tropical forest in French Guiana (Aubry-Kientz et al., 2019). The original paper compared five leading methods (e.g., Ferraz et al., 2016; Hamraz et al., 2016; Williams et al., 2020b) with the authors submitting data to an evaluation data set kept private by the evaluation team. We repeated this setup and found that DeepForest marginally outperformed all previously tested algorithms, despite the fact that the crown evaluation data used convex polygons and DeepForest used bounding boxes to delineate tree crowns.

In this paper, we further improved the delineation method by incorporating a 3 m height filter using the NEON LiDAR-derived canopy model (NEON ID: DP3.30015.001). To validate this addition, we compared predictions to the same set of image-annotated bounding boxes used in Weinstein et al., 2020b (21 NEON sites, 207 images, 6926 trees). Annotations were filtered to 3 m in height by comparing bounding boxes. In rare cases, there were obvious trees that were missed by the height threshold. We chose to maintain these rare occurrences as a measure of cross-sensor error when defining ‘tree’ based on an arbitrary LiDAR-derived height measure. We defined a true positive crown as a predicted bounding box with greater than 50% intersection-over-union (the area of intersection divided by the area of union of the two boxes) between the predicted and ground truth (image-annotated) bounding box. From the true positives and the total number of samples we calculated crown recall and precision. Crown recall is the proportion of image-annotated crowns matched to a crown prediction and crown precision is the proportion of predictions that match image-annotated crowns. The workflow yielded a bounding box recall of 79.1% with a precision of 72.6%. Tests indicate that the model generalizes well across geographic sites and forest conditions (Figure 4; Weinstein et al., 2020c; Weinstein et al., 2020b). There is a general bias toward undersegmenting trees in dense stands where multiple individual trees with similar optical characteristics are grouped into a single delineation. Adding the LiDAR threshold in this implementation resulted in predictions that were 7.0% more precise, but with 0.2% lower recall on average (Figure 4). The decrease in recall is due to sparse LiDAR coverage in the CHM, where trees in the evaluation data that were clearly taller than 3 m were missed.
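The matching rule and the two scores can be sketched in a few lines of Python. This is an illustrative version only: boxes are `(xmin, ymin, xmax, ymax)` tuples, each annotation may be matched by at most one prediction, and predictions are matched greedily in list order, which is an assumption rather than a description of the evaluation code in the NeonTreeEvaluation repository.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, truths, threshold=0.5):
    """Greedy one-to-one matching; a match needs IoU greater than the threshold."""
    unmatched = list(truths)
    true_positives = 0
    for pred in predictions:
        best = max(unmatched, key=lambda t: iou(pred, t), default=None)
        if best is not None and iou(pred, best) > threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall


# Made-up example: two annotated crowns, three predictions (one spurious).
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [(1, 1, 10, 10), (50, 50, 60, 60), (21, 19, 31, 30)]
precision, recall = precision_recall(predictions, truths)
```

Here both annotations are matched, so recall is 1.0, while the spurious box lowers precision to two thirds.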

Figure 4


Precision and recall scores for the algorithm used to create the NEON crowns data set (red points), as well as the DeepForest model from Weinstein et al., 2020a (blue points).

Evaluation is performed on 207 annotated images (6926 trees) in the NEONTreeEvaluation data set (https://github.com/weecology/NeonTreeEvaluation). The small drop in recall with LiDAR thresholding is due to the sparse nature of the LiDAR cloud, which can occasionally miss valid trees over 3 m. Overlapping points show areas without change between the methods.

We also compared crowns delineated by the algorithm to field-collected stems from NEON’s Woody Vegetation Structure data set. This data product contains a single point for each tree with a stem diameter ≥10 cm. We filtered the raw data to only include live trees likely to be visible in the canopy (see Figure 5—figure supplement 1). These overstory tree field data help us analyze the performance of our workflow in matching crown predictions to individual trees by scoring the proportion of field stems that fall within a prediction. Field stems can only be applied to one prediction, so if two predictions overlap over a field stem, only one is considered a positive match. This test produces an overall stem recall rate of 69.4%, which is similar to the bounding box recall rate from the image-annotated data (Figure 5). The analysis of stem recall rate is conservative due to the challenge of aligning the field-collected spatial data with the remote sensing data (Figure 5—figure supplement 1). We found several examples of good predictions that were counted as false positives due to errors in the position of the ground samples within the imagery. The two outliers in OSBS are trees whose most recent field data (2015–2017) are labeled ‘Live’ but that have barely discernible crowns during leaf-on flights in 2019. It is possible that these trees have since died. In the case of two of the 12 missed trees, they are labeled ‘disease damaged’ and are not recorded in subsequent surveys. Capturing mortality events remains an area of further work, as RGB-based detection requires visible crowns.
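A stem recall score of this kind can be sketched as the share of field stems that fall inside at least one predicted box, with each stem counted once even where predictions overlap. The function names and coordinates below are hypothetical, and the sketch omits the overstory filtering step described in Figure 5—figure supplement 1.

```python
def contains(box, point):
    """True if a stem location (x, y) lies inside a box (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax


def stem_recall(stems, boxes):
    """Proportion of field stems covered by a prediction; each stem counts once."""
    matched = sum(1 for s in stems if any(contains(b, s) for b in boxes))
    return matched / len(stems) if stems else 0.0


# Made-up plot: the second stem sits under two overlapping boxes, the third is missed.
stems = [(2.0, 2.0), (5.0, 5.0), (40.0, 40.0)]
boxes = [(0, 0, 6, 6), (4, 4, 9, 9)]
rate = stem_recall(stems, boxes)
```

A positional error of a few meters in the field coordinates can move a stem outside its true crown in this test, which is one reason the score is described as conservative.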

Figure 5


Overstory stem recall rate for NEON sites with available field data.

Each data point is the recall rate for a field-collected plot. NEON plots are either 40 m × 40 m ‘tower’ plots with two 20 m × 20 m subplots, or a single 20 m × 20 m ‘distributed’ plot. See NEON sampling protocols for details. For site abbreviations, see the complete table in Appendix 1.

To assess the utility of our approach for mapping forest structure, we compared remotely sensed predictions of maximum tree height to field measurements of tree height of overstory trees using NEON’s Woody Plant Vegetation Structure Data. We used the same workflow described in Figure 5—figure supplement 1 for determining overstory status for both the stem recall and height verification analysis. Predicted heights showed good correspondence with field-measured heights of reference trees. Using a linear mixed model with a site-level random effect, the predicted crown height had a root mean squared error (RMSE) of 1.73 m (Figure 6). The relationship is stronger in forests with more open canopies (SJER, OSBS) and predictably more prone to error in forests with denser canopies (BART, MLBS). There is a persistent trend of taller predictions from the remote sensing data as compared with field measured heights. This results in part from tree growth during the temporal gap between field data collection and remote sensing acquisition. For example, 73.8% of the field data for Bartlett Forest (BART; RMSE = 1.68 m) came from 2015 to 2017, but the remote sensing data is from 2019. In addition, previous work comparing field heights to remote sensing data usually first identifies trees that are visible from an overhead perspective (e.g., Puliti et al., 2020), whereas all trees with a stem diameter above 10 cm are sampled in NEON plots. This makes it necessary to infer which crowns are visible (for implementation details see Figure 5—figure supplement 1). This process can lead to overestimation of heights if the tree identified in the field data is overtopped by a larger tree, leading to a higher predicted than measured field height. Given the data available, an average RMSE of 1.73 m suggests that overstory height measures are reasonably accurate across the data set.
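For readers who want to reproduce a comparable error summary on their own matched trees, a plain RMSE and mean bias can be computed as below. This sketch is a simplification: it does not fit the mixed-effects model with a site-level random effect used for the 1.73 m figure, and the height values are invented for illustration.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between paired height measurements (meters)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mean_bias(predicted, observed):
    """Mean of predicted minus observed; positive values mean taller predictions."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(predicted)


# Made-up matched heights for three trees.
lidar_heights = [20.0, 15.0, 31.0]
field_heights = [18.0, 15.0, 30.0]
error = rmse(lidar_heights, field_heights)
bias = mean_bias(lidar_heights, field_heights)
```

Reporting the bias alongside the RMSE makes a systematic offset, such as growth between the field survey and the flight, visible separately from random error.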

Figure 6


Comparison of field and remote sensing measurements of tree heights for 11 sites in the National Ecological Observatory Network.

Each point is an individual tree. See text and Figure 5—figure supplement 1 for selection criteria and matching scheme for the field data. The root mean squared error (RMSE) of a mixed-effects model with a site level random effect is 1.73 m.

Discussion

Using the NEON crowns data set for individual, landscape, and biogeographic scale applications

This data set supports individual-level cross-scale ecological research that has not been previously possible. It provides the unique combination of information spanning the entire United States, with sites ranging from Puerto Rico to Alaska, with continuous individual-level data within sites at scales hundreds of times larger than what is possible using field-based survey methods. At the individual level, high-resolution airborne imagery can inform analysis of critical forest properties, such as tree growth and mortality (Clark et al., 2004; Stovall et al., 2019), foliar biochemistry (Chadwick and Asner, 2016; Wang et al., 2020), and landscape-scale carbon storage (Graves et al., 2018b). Because field data on these properties are measured on individual trees, individual level tree detection allows connecting field data directly to image data. In addition, growth, mortality, and changes in carbon storage occur on the scale of individual trees such that detection of individual crowns allows direct tracking of these properties across space and time. This allows researchers to address questions such as how individual level attributes relate to mortality in response to disturbance and pests and how the spatial configuration of individual trees within a landscape influences resilience. As a result, this individual level data may be useful for promoting fire-resistant landscapes and combating large scale pest outbreaks. While it is possible to aggregate information solely at the stand level, we believe that individual level data opens new possibilities in large scale forest monitoring and provides richer insights into the underlying mechanisms that drive these patterns.

At landscape scales, research is often focused on the effect of environmental and anthropogenic factors on forest structure and biodiversity (Denslow, 1995). For example, understanding why tree biomass and traits vary across landscapes has direct applications to numerous ecological questions and economic implications (e.g., Laubhann et al., 2009). Often this requires sampling at a number of disparate locations and either extrapolation to continuous patterns at landscape scales, or assumptions that the range of possible states of the system is captured by the samples. Using the individual level data from this data set, we can now produce continuous high-resolution maps across entire NEON sites to enable landscape scale studies of multiple ecological phenomena (Figure 7). For example, previous work has found that functional and species diversity at local scales promotes biomass and tree growth (Barrufol et al., 2013; Liang et al., 2016). Similar findings have been reported for phylogenetic diversity at local scales (Satdichanh et al., 2019). Especially when combined with species data, using the crown data to investigate the scale and strength of these effects will inform the mechanisms of community assembly, ecological stability, and forest productivity. These landscape scale responses can then be combined with high resolution data on natural and anthropogenic drivers (e.g., topography, soils, and fire management) to model forest dynamics at broad scales.

Figure 7


Tree density maps for Teakettle Canyon, California (left) and Ordway Swisher Biological Station, Florida (right).

For each 100 m² pixel, the total number of predicted crowns was counted.

By focusing on detecting individual trees, this approach to landscape scale forest analysis does not require assumptions about spatial similarity, sufficiently extensive sampling, or consistent responses of the ecosystem to drivers across spatial gradients. This is important because the heterogeneity of forest landscapes makes it difficult to use field plot data on quantities such as tree density and biomass to extrapolate inference to broad scales (Marvin et al., 2014). To illustrate this challenge, we compared field-measured tree densities of NEON field plots to estimated densities of 10,000 remotely sensed plots of the same size placed randomly throughout the landscapes across footprints of the airborne data. We attempted to change the Woody Vegetation data as little as possible (i.e., compared with the more refined filtered data in previous analyses) in order to obtain estimates of tree cover in a plot from the field data. To be included in this analysis, trees needed to have valid spatial coordinates and a minimum height of 3 m. Some older data lacked height estimates, in which case we used a minimum DBH threshold of 15 cm. In each simulated plot, we then counted the total number of predicted tree crowns to create a distribution of tree densities at the site level (Figure 8). Comparing the field plot tree densities with the distribution from the full site shows deviations for most sites, with NEON field plots exhibiting higher tree densities than encountered on average in the airborne data for some sites (e.g., Teakettle Canyon, CA) and lower tree densities than from remote sensing in others (e.g., Ordway-Swisher Biological Station). 
While this kind of comparison is inherently difficult due to differing thresholds and filters for data inclusion in field versus remotely sensed data, it highlights that even well stratified sampling of large landscapes, as was done with NEON plots (see NEON technical documents for NEON.DP1.10098), can produce tree attribute estimates that differ from those obtained by continuous sampling from remote sensing data. Combining representative field sampling with remote sensing to produce data products like the NEON Crowns data set provides an approach to addressing this challenge to improve estimations of the abundance, biomass, and size distributions across large geographic areas.
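The simulated-plot comparison can be approximated with a short Monte Carlo sketch. The version below makes several simplifying assumptions that are not taken from the study: crowns are reduced to centroids, plots are axis-aligned 40 m squares placed uniformly at random inside a rectangular extent, and the quadrant subsampling and land-cover masking described for Figure 8 are omitted. All names and values are illustrative.

```python
import random


def count_in_plot(centroids, x0, y0, size):
    """Number of crown centroids inside a square plot with lower-left corner (x0, y0)."""
    return sum(1 for x, y in centroids
               if x0 <= x < x0 + size and y0 <= y < y0 + size)


def simulate_plot_counts(centroids, extent, plot_size=40.0, n_plots=10_000, seed=0):
    """Crown counts for randomly placed plots within extent (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = extent
    rng = random.Random(seed)
    counts = []
    for _ in range(n_plots):
        x0 = rng.uniform(xmin, xmax - plot_size)
        y0 = rng.uniform(ymin, ymax - plot_size)
        counts.append(count_in_plot(centroids, x0, y0, plot_size))
    return counts


# Made-up landscape: 2000 crowns scattered over a 1 km x 1 km tile.
rng = random.Random(1)
centroids = [(rng.uniform(0, 1000), rng.uniform(0, 1000)) for _ in range(2000)]
counts = simulate_plot_counts(centroids, (0, 0, 1000, 1000), n_plots=200)
mean_count = sum(counts) / len(counts)
```

The resulting list of counts plays the role of the site-level distribution against which field plot densities are compared; for full-size sites a spatial index would be preferable to the linear scan used here.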

Figure 8


Comparison of tree counts between the field-collected NEON plots and the predicted plots from the data set.

For the remote sensing data, 10,000 simulated 40 m × 40 m plots were calculated for each site for the full AOP footprint for each year. To mimic NEON sampling, two quadrants were randomly sampled in each simulated plot. No plots on water, bare ground, or herbaceous land classes were included in the comparison. We selected three sites from three NEON domains to show a sample of sites across the continental United States. Both distributed and tower NEON plots were used for these analyses.

The NEON Crowns data set supports the assessment of cross-site patterns to help understand the influence of large-scale processes on forest structure at biogeographic scales. For example, ecologists are interested in how and why forest characteristics such as abundance, biomass, and allometric relationships vary among forest types (e.g., Jucker et al., 2017) and how these patterns covary across environmental gradients (Feldpausch et al., 2011). Understanding these relationships is important for inferring controls over forest stand structure, understanding individual tree biology, and assessing stand productivity. For example, are local patterns of density and structural biomass primarily the result of historical mechanisms, such as dispersal and adaptation, or local mechanisms such as nutrient availability? By providing standardized data that span near-continental scales, this data set can help inform the fundamental mechanisms that shape forest structure and dynamics. For example, we can calculate tree allometries (e.g., the ratio of tree height to crown area) on a large number of individual trees across NEON sites and explore how allometry varies with stand density and vegetation type (Figure 9). This analysis shows a continental-scale relationship, with denser forests exhibiting trees with narrower crowns for the same tree height compared with less dense forests, but also clustering and variation in the relationship within vegetation types. For example, subalpine forests illustrate relationships between tree density and allometry that are distinct from other forest types. By defining both general biogeographic patterns, and deviations therein, this data set will allow the investigation of factors shaping forest structure at macroecological scales.
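A per-site allometry summary of the kind described here can be sketched as the mean ratio of tree height to crown area. The input layout `(site, height_m, crown_area_m2)`, the placeholder site codes, and the values are assumptions for illustration and do not reproduce the analysis behind Figure 9.

```python
from collections import defaultdict
from statistics import mean


def allometry_by_site(crowns):
    """Mean height-to-crown-area ratio per site.

    crowns: iterable of (site, height_m, crown_area_m2) tuples.
    """
    ratios = defaultdict(list)
    for site, height, area in crowns:
        if area > 0:  # skip degenerate boxes
            ratios[site].append(height / area)
    return {site: mean(values) for site, values in ratios.items()}


# Made-up crowns for two placeholder sites.
crowns = [
    ("SITE_A", 20.0, 40.0),
    ("SITE_A", 10.0, 40.0),
    ("SITE_B", 30.0, 20.0),
]
ratios = allometry_by_site(crowns)
```

A higher ratio indicates narrower crowns for a given height, which is the pattern the text associates with denser forests.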

Figure 9


Individual crown attributes for predictions made at each NEON site.

For site abbreviations see Appendix 1. Crown area is calculated by multiplying the width and height of the predicted crown bounding box. Crown height is the 99th percentile of the LiDAR returns that fall inside the predicted crown bounding box. Sites are colored by the dominant forest type to illustrate the general macroecological relationship among sites in similar biomes.

In addition to these ecological applications, the NEON Crowns data set can also act as a foundation for other machine learning and computer vision applications in forest informatics, such as tree health assessments, species classification, or foliar trait estimation both within NEON sites (Wang et al., 2020) and outside of NEON sites (Schneider et al., 2020). In each of these tasks, individual tree delineation is the first step to associate sensor data with ground measurements. However, delineation requires a distinct technical background and set of computational approaches, and thus many ecological applications skip an explicit delineation step entirely (Williams et al., 2020a). In addition, the growing availability of continental scale data sets of high resolution remote sensing imagery opens up the possibility for broad scale forest monitoring of individual trees (Brandt et al., 2020; Schneider et al., 2020) that can be supported by this data set. Just as we used weak annotations generated from unsupervised LiDAR algorithms, future developers can use this data set to train on the multiple data types provided by the NEON Airborne Observation Platform across a broad range of forest types. While our crown annotations are not perfect, they are specifically tailored to one of the largest data sets that allows pairing individual tree detections with information on species identity, tree health, and leaf traits through NEON’s field sampling, and we believe they are sufficiently robust to serve as the basis for broad scale analysis.

Limitations and further technical developments

An important limitation for this data set is that it only provides information on sun-exposed tree crowns. It is therefore not appropriate for ecological analyses that depend on accurate characterization of subcanopy trees and the three-dimensional structure of forest stands. Fortunately, a number of the major questions and applications in ecology are primarily influenced by large individuals (Enquist et al., 2020). For example, biomass estimation is largely driven by the canopy in most ecosystems, rather than mid or understory trees that are likely to be missed by aerial surveys. Similarly, habitat classification and species abundance curves can depend on the dominant forest structure that can be inferred from coarse resolution airborne data (Shirley et al., 2013) and could be improved using this data set. It may be possible to establish relationships between understory and canopy measures using field data that could allow this data set to be used as part of a broader analysis (Bohlman, 2015; Duncanson et al., 2014; Fischer et al., 2020). However, this would require significant additional research to validate the potential for this type of approach at continental scales.

We experimented with avenues to combine RGB information and a LiDAR CHM to create a jointly learned input and found that no combination of data fusion outperformed the current pipeline (Figure 4—figure supplement 2). The lack of improvement when directly incorporating LiDAR data into the CNN is likely due to a combination of geographic variation in tree shape and LiDAR coverage, sparse LiDAR point densities (~6 pts/m² at many NEON sites), and a lack of joint RGB and LiDAR data for initial pretraining. Most LiDAR based methods are evaluated on data from a single forest type with point densities ranging from ~15 pts/m² (e.g., Duncanson and Dubayah, 2018) to over 100 pts/m² (e.g., Aubry-Kientz et al., 2019). As instrumentation improves to support collecting higher density LiDAR consistently at larger scales and algorithms are improved to allow generalization across forest types, we anticipate updating the data set with improved delineation of the sunlit canopy and beginning to add subcanopy trees.

An additional limitation is the uncertainty inherent in the algorithmic detection of crowns. While we found good correspondence between image-based crown annotations and those produced by the model for many sites, there remained substantial uncertainty across all sites and reasonable levels of error in some sites. It is important to consider how this uncertainty will influence the inference from research using this and similar data sets. The model is biased toward undersegmentation, meaning that multiple trees are prone to being grouped as a single crown. It is also somewhat conservative in estimating crown extent wherein it tends to ignore small extensions of branches from the main crown. These biases could impact studies of tree allometry and biomass if the analysis is particularly sensitive to crown area. When making predictions for ecosystem features such as biomass, it will be important to propagate the uncertainty in individual crowns into downstream analyses. While confidence scores for individual detections are provided to aid uncertainty propagation, the use of additional ground truth data may also be necessary to infer reliability.
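As a minimal sketch of what propagating detection uncertainty into a downstream quantity could look like, the Python example below repeatedly resamples a small set of crowns, keeping each crown with probability equal to its confidence score, and summarizes the resulting total crown area. Treating the score as a calibrated probability that a detection is a true tree is an assumption made here purely for illustration; the function name, crown areas, and scores are hypothetical and do not come from the data set.

```python
import random


def simulate_total_crown_area(crowns, n_draws=1000, seed=0):
    """Monte Carlo summary of total crown area.

    crowns: list of (area_m2, score) tuples, with score in [0, 1].
    ASSUMPTION: score is treated as the probability that the detection
    is a real tree. The scores may not be calibrated this way.
    Returns (mean, lower, upper), where lower and upper are the
    approximate 2.5th and 97.5th percentiles of the simulated totals.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    totals = []
    for _ in range(n_draws):
        # Keep each crown independently with probability equal to its score.
        total = sum(area for area, score in crowns if rng.random() < score)
        totals.append(total)
    totals.sort()
    mean = sum(totals) / n_draws
    lower = totals[int(0.025 * (n_draws - 1))]
    upper = totals[int(0.975 * (n_draws - 1))]
    return mean, lower, upper


# Hypothetical crowns: (crown area in square meters, confidence score).
example_crowns = [(12.5, 0.9), (30.0, 0.6), (8.0, 0.35), (21.0, 0.8)]
mean_area, lower_area, upper_area = simulate_total_crown_area(example_crowns)
print(f"total crown area: mean {mean_area:.1f} m2, "
      f"interval {lower_area:.1f} to {upper_area:.1f} m2")
```

The same resampling loop could wrap any per-crown calculation, such as an allometric biomass estimate. It does not account for the undersegmentation bias described above, which would require ground truth data to correct.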

One aspect of individual crown uncertainty that we have not addressed is the uncertainty related to image-based crown annotations and measurement of trees in the field (Graves et al., 2018a). To allow training and evaluating the model across a broad range of forest types, we used image-based crown annotations. This approach assumes that crowns identifiable in remotely sensed imagery accurately reflect trees on the ground. This will not always be the case, as what appears to be a single crown from above may constitute multiple neighboring trees, and conversely, what appears to be two distinct crowns in an image may be two branches of a single large tree (Graves et al., 2018a). Distinguishing individual trees, especially when considering species with multi-bole stems, can be subjective, even during field surveys. Targeted field surveys will always be needed to validate these predictions, and community annotation efforts will allow for assessment of this component of uncertainty. In particular, combining terrestrial LiDAR sampling with airborne sensors is a promising route to both validate the number of stems and establish subcanopy diversity (Calders et al., 2020). In addition, when co-registered hyperspectral data are available, they may help to separate neighboring trees in diverse forests, provided they do not cause lumping of neighboring trees of the same species. Weighing these tradeoffs across a range of forest types remains an open area of exploration.

The machine learning workflow used to generate this data set also has several areas that could be improved for greater accuracy, transferability, and robustness. The current model contains a single class ‘Tree’ with an associated confidence score. This representation prevents the model from differentiating between objects that are not trees and objects for which sufficient training information is not available. For example, the model has been trained to ignore buildings and other vertical structures that may look like trees. However, when confronted with objects it has never encountered, it often produces unintuitive results. Examples of objects that did not appear in the training data, and as a result were erroneously predicted as trees, include weather stations, floating buoys, and oil wells. Designing models that can identify outliers, anomalies, and ‘unknown’ objects is an active area of research in machine learning and will be useful in increasing accuracy in novel environments. In addition, NEON data can sometimes be afflicted by imaging artifacts due to co-registration issues with LiDAR and raw RGB imagery (Figure 7—figure supplement 1). This effect can lead to distorted imagery that appears fuzzy and swirled, which in turn leads to poor segmentation. An ideal model would detect these areas of poor quality and label them as ‘unknown’ rather than attempting to detect trees in these regions.
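One simple heuristic for flagging fuzzy imagery, offered only as an illustration and not as part of the workflow used to build this data set, is the variance of a Laplacian filter response: blurred tiles have weak edges and therefore a low variance. The sketch below assumes a tile is a grayscale grid of pixel values (at least 3 x 3) and that a sharpness threshold has been tuned on known-good imagery. The function names, the threshold, and the toy tiles are all hypothetical, and a blur check of this kind would not necessarily catch the swirled distortions described above.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbor Laplacian over the interior pixels.

    gray: list of equal-length rows of pixel intensities, at least 3 x 3.
    Low values suggest few sharp edges, i.e. a possibly blurred tile.
    """
    rows, cols = len(gray), len(gray[0])
    responses = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            lap = (gray[r - 1][c] + gray[r + 1][c]
                   + gray[r][c - 1] + gray[r][c + 1]
                   - 4 * gray[r][c])
            responses.append(lap)
    return pvariance(responses)


def label_tile(gray, min_sharpness):
    """Return 'predict' for sharp tiles, 'unknown' for low-detail tiles.

    min_sharpness is a HYPOTHETICAL threshold that would need tuning.
    """
    if laplacian_variance(gray) >= min_sharpness:
        return "predict"
    return "unknown"


# Toy tiles: a high-contrast checkerboard and a featureless gray patch.
sharp_tile = [[255 * ((r + c) % 2) for c in range(8)] for r in range(8)]
flat_tile = [[128] * 8 for _ in range(8)]
print(label_tile(sharp_tile, min_sharpness=100.0))
print(label_tile(flat_tile, min_sharpness=100.0))
```

Tiles labeled 'unknown' would then be skipped or reviewed manually rather than passed to the crown detector.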

Given these limitations, we view this version of the data set as the first step in an iterative process to improve cross-scale individual-level data on trees. Ongoing assessment of these predictions using both our visualization tool and field-based surveys will be crucial to continually identify areas for improvement in both training data and modeling approaches. While iterative improvements are important, the accuracy of the current predictions illustrates that this data set is sufficiently precise for addressing many cross-scale questions related to forest structure. By providing broad-scale crown data, we hope to highlight the promising integration between deep learning, remote sensing, and forest informatics, and to provide access to the results of this next key step in ecological research to the broad range of stakeholders who can benefit from these data.

Appendix 1

NEON site abbreviations

Site name	Site ID	Domain number	State	Latitude	Longitude
Abby Road	ABBY	D16	WA	45.76243	−122.33033
Bartlett Experimental Forest	BART	D01	NH	44.06388	−71.28731
Blandy Experimental Farm	BLAN	D02	VA	39.06026	−78.07164
Caribou-Poker Creeks Research Watershed	BONA	D19	AK	65.15401	−147.50258
LBJ National Grassland	CLBJ	D11	TX	33.40123	−97.57
Rio Cupeyes	CUPE	D04	PR	18.11352	−66.98676
Delta Junction	DEJU	D19	AK	63.88112	−145.75136
Dead Lake	DELA	D08	AL	32.54172	−87.80389
Disney Wilderness Preserve	DSNY	D03	FL	28.12504	−81.4362
Guanica Forest	GUAN	D04	PR	17.96955	−66.8687
Harvard Forest	HARV	D01	MA	42.5369	−72.17266
Healy	HEAL	D19	AK	63.87569	−149.21334
Lower Hop Brook	HOPB	D01	MA	42.47179	−72.32963
Jones Ecological Research Center	JERC	D03	GA	31.19484	−84.46861
Jornada LTER	JORN	D14	NM	32.59068	−106.84254
Konza Prairie Biological Station	KONZ	D06	KS	39.10077	−96.56309
Lajas Experimental Station	LAJA	D04	PR	18.02125	−67.0769
Lenoir Landing	LENO	D08	AL	31.85388	−88.16122
Mountain Lake Biological Station	MLBS	D07	VA	37.37828	−80.52484
Moab	MOAB	D13	UT	38.24833	−109.38827
Niwot Ridge Mountain Research Station	NIWO	D13	CO	40.05425	−105.58237
Northern Great Plains Research Laboratory	NOGP	D09	ND	46.76972	−100.91535
Klemme Range Research Station	OAES	D11	OK	35.41059	−99.05879
Ordway-Swisher Biological Station	OSBS	D03	FL	29.68927	−81.99343
Red Butte Creek	REDB	D15	UT	40.78374	−111.79765
Rocky Mountain National Park, CASTNET	RMNP	D10	CO	40.27591	−105.54592
Smithsonian Conservation Biology Institute	SCBI	D02	VA	38.89292	−78.1395
Smithsonian Environmental Research Center	SERC	D02	MD	38.89008	−76.56001
San Joaquin Experimental Range	SJER	D17	CA	37.10878	−119.73228
Soaproot Saddle	SOAP	D17	CA	37.03337	−119.26219
Santa Rita Experimental Range	SRER	D14	AZ	31.91068	−110.83549
Talladega National Forest	TALL	D08	AL	32.95046	−87.39327
Lower Teakettle	TEAK	D17	CA	37.00583	−119.00602
West St Louis Creek	WLOU	D13	CO	39.89137	−105.9154
Woodworth	WOOD	D09	ND	47.12823	−99.24136
Wind River Experimental Forest	WREF	D16	WA	45.82049	−121.95191
Yellowstone Northern Range (Frog Rock)	YELL	D12	WY	44.95348	−110.53914

Data availability

The data set is available at https://zenodo.org/record/3765872.

The following data sets were generated

Weinstein BG, Marconi S, Zare A, Bohlman SA, Graves SJ, Singh A, White EP (2020) NEON Tree Crowns Dataset. Zenodo. https://doi.org/10.5281/zenodo.3765872

References

Aubry-Kientz M, Dutrieux R, Ferraz A, Saatchi S, Hamraz H, Williams J, Coomes D, Piboule A, Vincent G (2019) A comparative assessment of the performance of individual tree crowns delineation algorithms from ALS data in tropical forests. Remote Sensing 11:1086. https://doi.org/10.3390/rs11091086
Barrufol M, Schmid B, Bruelheide H, Chi X, Hector A, Ma K, Michalski S, Tang Z, Niklaus PA (2013) Biodiversity promotes tree growth during succession in subtropical forest. PLOS ONE 8:e81246. https://doi.org/10.1371/journal.pone.0081246
Bastin J-F, Rutishauser E, Kellner JR, Saatchi S, Pélissier R, Hérault B, Slik F, Bogaert J, De Cannière C, Marshall AR, Poulsen J, Alvarez-Loyayza P, Andrade A, Angbonga-Basia A, Araujo-Murakami A, Arroyo L, Ayyappan N, de Azevedo CP, Banki O, Barbier N, Barroso JG, Beeckman H, Bitariho R, Boeckx P, Boehning-Gaese K, Brandão H, Brearley FQ, Breuer Ndoundou Hockemba M, Brienen R, Camargo JLC, Campos-Arceiz A, Cassart B, Chave J, Chazdon R, Chuyong G, Clark DB, Clark CJ, Condit R, Honorio Coronado EN, Davidar P, de Haulleville T, Descroix L, Doucet J-L, Dourdain A, Droissart V, Duncan T, Silva Espejo J, Espinosa S, Farwig N, Fayolle A, Feldpausch TR, Ferraz A, Fletcher C, Gajapersad K, Gillet J-F, Amaral ILdo, Gonmadje C, Grogan J, Harris D, Herzog SK, Homeier J, Hubau W, Hubbell SP, Hufkens K, Hurtado J, Kamdem NG, Kearsley E, Kenfack D, Kessler M, Labrière N, Laumonier Y, Laurance S, Laurance WF, Lewis SL, Libalah MB, Ligot G, Lloyd J, Lovejoy TE, Malhi Y, Marimon BS, Marimon Junior BH, Martin EH, Matius P, Meyer V, Mendoza Bautista C, Monteagudo-Mendoza A, Mtui A, Neill D, Parada Gutierrez GA, Pardo G, Parren M, Parthasarathy N, Phillips OL, Pitman NCA, Ploton P, Ponette Q, Ramesh BR, Razafimahaimodison J-C, Réjou-Méchain M, Rolim SG, Saltos HR, Rossi LMB, Spironello WR, Rovero F, Saner P, Sasaki D, Schulze M, Silveira M, Singh J, Sist P, Sonke B, Soto JD, de Souza CR, Stropp J, Sullivan MJP, Swanepoel B, Steege Hter, Terborgh J, Texier N, Toma T, Valencia R, Valenzuela L, Ferreira LV, Valverde FC, Van Andel TR, Vasque R, Verbeeck H, Vivek P, Vleminckx J, Vos VA, Wagner FH, Warsudi PP, Wortel V, Zagt RJ, Zebaze D (2018) Pan-tropical prediction of forest structure from the largest trees. Global Ecology and Biogeography 27:1366–1383. https://doi.org/10.1111/geb.12803
Bohlman SA (2015) Species diversity of canopy versus understory trees in a neotropical forest: implications for forest structure, function and monitoring. Ecosystems 18:658–670. https://doi.org/10.1007/s10021-015-9854-0
Brandt M, Tucker CJ, Kariryaa A, Rasmussen K, Abel C, Small J, Chave J, Rasmussen LV, Hiernaux P, Diouf AA, Kergoat L, Mertz O, Igel C, Gieseke F, Schöning J, Li S, Melocik K, Meyer J, Sinno S, Romero E, Glennie E, Montagu A, Dendoncker M, Fensholt R (2020) An unexpectedly large count of trees in the west african sahara and sahel. Nature 587:78–82. https://doi.org/10.1038/s41586-020-2824-5
Calders K, Adams J, Armston J, Bartholomeus H, Bauwens S, Bentley LP, Chave J, Danson FM, Demol M, Disney M, Gaulton R, Krishna Moorthy SM, Levick SR, Saarinen N, Schaaf C, Stovall A, Terryn L, Wilkes P, Verbeeck H (2020) Terrestrial laser scanning in forest ecology: expanding the horizon. Remote Sensing of Environment 251:112102. https://doi.org/10.1016/j.rse.2020.112102
Chadwick K, Asner G (2016) Organismic-Scale remote sensing of canopy foliar traits in lowland tropical forests. Remote Sensing 8:87. https://doi.org/10.3390/rs8020087
(2004) Quantifying mortality of tropical rain forest trees using high-spatial-resolution satellite data. Ecology Letters 7:52–59. https://doi.org/10.1046/j.1461-0248.2003.00547.x
Coomes DA, Dalponte M, Jucker T, Asner GP, Banin LF, Burslem DFRP, Lewis SL, Nilus R, Phillips OL, Phua M-H, Qie L (2017) Area-based vs tree-centric approaches to mapping forest carbon in southeast asian forests from airborne laser scanning data. Remote Sensing of Environment 194:77–88. https://doi.org/10.1016/j.rse.2017.03.017
Crowther TW, Glick HB, Covey KR, Bettigole C, Maynard DS, Thomas SM, Smith JR, Hintler G, Duguid MC, Amatulli G, Tuanmu MN, Jetz W, Salas C, Stam C, Piotto D, Tavani R, Green S, Bruce G, Williams SJ, Wiser SK, Huber MO, Hengeveld GM, Nabuurs GJ, Tikhonova E, Borchardt P, Li CF, Powrie LW, Fischer M, Hemp A, Homeier J, Cho P, Vibrans AC, Umunay PM, Piao SL, Rowe CW, Ashton MS, Crane PR, Bradford MA (2015) Mapping tree density at a global scale. Nature 525:201–205. https://doi.org/10.1038/nature14967
Deng J, Dong W, Socher R, Li L, Kai Li LF-F (2009) ImageNet: a large-scale hierarchical image database. Presented at the 2009 IEEE Conference on Computer Vision and Pattern Recognition. pp. 248–255. https://doi.org/10.1109/CVPR.2009.5206848
Denslow JS (1995) Disturbance and diversity in tropical rain forests: the density effect. Ecological Applications 5:962–968. https://doi.org/10.2307/2269347
(2014) An efficient, multi-layered crown delineation algorithm for mapping individual tree structure across multiple ecosystems. Remote Sensing of Environment 154:378–386. https://doi.org/10.1016/j.rse.2013.07.044
(2015) Small sample sizes yield biased allometric equations in temperate forests. Scientific Reports 5:17153. https://doi.org/10.1038/srep17153
Duncanson L, Dubayah R (2018) Monitoring individual tree-based change with airborne lidar. Ecology and Evolution 8:5079–5089. https://doi.org/10.1002/ece3.4075
(2020) The megabiota are disproportionately important for biosphere functioning. Nature Communications 11:14369-y. https://doi.org/10.1038/s41467-020-14369-y
Fassnacht FE, Latifi H, Stereńczak K, Modzelewska A, Lefsky M, Waser LT, Straub C, Ghosh A (2016) Review of studies on tree species classification from remotely sensed data. Remote Sensing of Environment 186:64–87. https://doi.org/10.1016/j.rse.2016.08.013
Feldpausch TR, Banin L, Phillips OL, Baker TR, Lewis SL, Quesada CA, Affum-Baffoe K, Arets EJMM, Berry NJ, Bird M, Brondizio ES, de Camargo P, Chave J, Djagbletey G, Domingues TF, Drescher M, Fearnside PM, França MB, Fyllas NM, Lopez-Gonzalez G, Hladik A, Higuchi N, Hunter MO, Iida Y, Salim KA, Kassim AR, Keller M, Kemp J, King DA, Lovett JC, Marimon BS, Marimon-Junior BH, Lenza E, Marshall AR, Metcalfe DJ, Mitchard ETA, Moran EF, Nelson BW, Nilus R, Nogueira EM, Palace M, Patiño S, Peh KS-H, Raventos MT, Reitsma JM, Saiz G, Schrodt F, Sonké B, Taedoumg HE, Tan S, White L, Wöll H, Lloyd J (2011) Height-diameter allometry of tropical forest trees. Biogeosciences 8:1081–1106. https://doi.org/10.5194/bg-8-1081-2011
Ferraz A, Saatchi S, Mallet C, Meyer V (2016) Lidar detection of individual tree size in tropical forests. Remote Sensing of Environment 183:318–333. https://doi.org/10.1016/j.rse.2016.05.028
Fischer FJ, Labrière N, Vincent G, Hérault B, Alonso A, Memiaghe H, Bissiengou P, Kenfack D, Saatchi S, Chave J (2020) A simulation method to infer tree allometry and forest structure from airborne laser scanning and forest inventories. Remote Sensing of Environment 251:112056. https://doi.org/10.1016/j.rse.2020.112056
(2018a) A digital mapping method for linking high-resolution remote sensing images to individual tree crowns. PeerJ 6:e27182v1. https://doi.org/10.7287/peerj.preprints.27182v1
(2018b) A tree-based approach to biomass estimation from remote sensing data in a tropical agricultural landscape. Remote Sensing of Environment 218:32–43. https://doi.org/10.1016/j.rse.2018.09.009
(2016) A robust approach for tree segmentation in deciduous forests using small-footprint airborne LiDAR data. International Journal of Applied Earth Observation and Geoinformation 52:532–541. https://doi.org/10.1016/j.jag.2016.07.006
Hansen MC, Potapov PV, Moore R, Hancher M, Turubanova SA, Tyukavina A, Thau D, Stehman SV, Goetz SJ, Loveland TR, Kommareddy A, Egorov A, Chini L, Justice CO, Townshend JR (2013) High-resolution global maps of 21st-century forest cover change. Science 342:850–853. https://doi.org/10.1126/science.1244693
Jucker T, Caspersen J, Chave J, Antin C, Barbier N, Bongers F, Dalponte M, van Ewijk KY, Forrester DI, Haeni M, Higgins SI, Holdaway RJ, Iida Y, Lorimer C, Marshall PL, Momo S, Moncrieff GR, Ploton P, Poorter L, Rahman KA, Schlund M, Sonké B, Sterck FJ, Trugman AT, Usoltsev VA, Vanderwel MC, Waldner P, Wedeux BM, Wirth C, Wöll H, Woods M, Xiang W, Zimmermann NE, Coomes DA (2017) Allometric equations for integrating remote sensing imagery into forest monitoring programmes. Global Change Biology 23:177–190. https://doi.org/10.1111/gcb.13388
(2009) The impact of atmospheric deposition and climate on forest growth in european monitoring plots: an individual tree growth model. Forest Ecology and Management 258:1751–1761. https://doi.org/10.1016/j.foreco.2008.09.050
Li W, Guo Q, Jakubowski MK, Kelly M (2012) A new method for segmenting individual trees from the lidar point cloud. Photogrammetric Engineering & Remote Sensing 78:75–84. https://doi.org/10.14358/PERS.78.1.75
Liang J, Crowther TW, Picard N, Wiser S, Zhou M, Alberti G, Schulze ED, McGuire AD, Bozzato F, Pretzsch H, de-Miguel S, Paquette A, Hérault B, Scherer-Lorenzen M, Barrett CB, Glick HB, Hengeveld GM, Nabuurs GJ, Pfautsch S, Viana H, Vibrans AC, Ammer C, Schall P, Verbyla D, Tchebakova N, Fischer M, Watson JV, Chen HY, Lei X, Schelhaas MJ, Lu H, Gianelle D, Parfenova EI, Salas C, Lee E, Lee B, Kim HS, Bruelheide H, Coomes DA, Piotto D, Sunderland T, Schmid B, Gourlet-Fleury S, Sonké B, Tavani R, Zhu J, Brandl S, Vayreda J, Kitahara F, Searle EB, Neldner VJ, Ngugi MR, Baraloto C, Frizzera L, Bałazy R, Oleksyn J, Zawiła-Niedźwiecki T, Bouriaud O, Bussotti F, Finér L, Jaroszewicz B, Jucker T, Valladares F, Jagodzinski AM, Peri PL, Gonmadje C, Marthy W, O'Brien T, Martin EH, Marshall AR, Rovero F, Bitariho R, Niklaus PA, Alvarez-Loayza P, Chamuya N, Valencia R, Mortier F, Wortel V, Engone-Obiang NL, Ferreira LV, Odeke DE, Vasquez RM, Lewis SL, Reich PB (2016) Positive biodiversity-productivity relationship predominant in global forests. Science 354:aaf8957. https://doi.org/10.1126/science.aaf8957
Marvin DC, Asner GP, Knapp DE, Anderson CB, Martin RE, Sinca F, Tupayachi R (2014) Amazonian landscapes and the Bias in field studies of forest structure and biomass. PNAS 111:E5224–E5232. https://doi.org/10.1073/pnas.1412999111
(2018) Individual tree crown segmentation and classification of 13 tree species using airborne hyperspectral data. Remote Sensing 10:1218. https://doi.org/10.3390/rs10081218
(2020) Estimation of forest growing stock volume with UAV laser scanning data: can it be done without field data? Remote Sensing 12:1245. https://doi.org/10.3390/rs12081245
Roussel J-R, Auty D, Coops NC, Tompalski P, Goodbody TRH, Meador AS, Bourdon J-F, de Boissieu F, Achim A (2020) lidR: an R package for analysis of airborne laser scanning (ALS) data. Remote Sensing of Environment 251:112061. https://doi.org/10.1016/j.rse.2020.112061
Satdichanh M, Ma H, Yan K, Dossa GGO, Winowiecki L, Vågen T‐G, Gassner A, Xu J, Harrison RD (2019) Phylogenetic diversity correlated with above‐ground biomass production during forest succession: evidence from tropical forests in southeast asia. Journal of Ecology 107:1419–1432. https://doi.org/10.1111/1365-2745.13112
(2020) Towards mapping the diversity of canopy structure from space with GEDI. Environmental Research Letters 15:115006. https://doi.org/10.1088/1748-9326/ab9e99
(2013) Species distribution modelling for the people: unclassified landsat TM imagery predicts bird occurrence at fine resolutions. Diversity and Distributions 19:855–866. https://doi.org/10.1111/ddi.12093
(2016) Imputation of Individual Longleaf Pine (Pinus palustris Mill.) Tree Attributes from Field and LiDAR Data. Canadian Journal of Remote Sensing 42:554–573. https://doi.org/10.1080/07038992.2016.1196582
(2019) Tree height explains mortality risk during an intense drought. Nature Communications 10:4385. https://doi.org/10.1038/s41467-019-12380-6
Wang Z, Chlus A, Geygan R, Ye Z, Zheng T, Singh A, Couture JJ, Cavender‐Bares J, Kruger EL, Townsend PA (2020) Foliar functional traits from imaging spectroscopy across biomes in eastern north america. New Phytologist 228:494–511. https://doi.org/10.1111/nph.16711
Weinstein BG, Marconi S, Bohlman S, Zare A, White E (2019) Individual Tree-Crown detection in RGB imagery using Semi-Supervised deep learning neural networks. Remote Sensing 11:1309. https://doi.org/10.3390/rs11111309
Weinstein B, Marconi S, Zare AA, Bohlman S, Graves S, Singh A, White E (2020a) NEON tree crowns dataset, version 0.0.1. Zenodo. https://doi.org/10.5281/zenodo.3765872
Weinstein BG, Marconi S, Bohlman SA, Zare A, White EP (2020b) Cross-site learning in deep learning RGB tree crown detection. Ecological Informatics 56:101061. https://doi.org/10.1016/j.ecoinf.2020.101061
(2020c) DeepForest: a Python package for RGB deep learning tree crown delineation. Methods in Ecology and Evolution 11:1743–1751. https://doi.org/10.1111/2041-210X.13472
Williams J, Schönlieb C-B, Swinfield T, Irawan B, Achmad E, Zudhi M, Habibi GE, Coomes DA (2020a) SLIC-UAV: a method for monitoring recovery in tropical restoration projects through identification of signature species using UAVs. arXiv. https://arxiv.org/abs/2006.06624
Williams J, Schonlieb C-B, Swinfield T, Lee J, Cai X, Qie L, Coomes DA (2020b) 3d segmentation of trees through a flexible multiclass graph cut algorithm. IEEE Transactions on Geoscience and Remote Sensing 58:754–776. https://doi.org/10.1109/TGRS.2019.2940146
Zheng Z, Zeng Y, Schneider FD, Zhao Y, Zhao D, Schmid B, Schaepman ME, Morsdorf F (2021) Mapping functional diversity using individual tree-based morphological and physiological traits in a subtropical forest. Remote Sensing of Environment 252:112170. https://doi.org/10.1016/j.rse.2020.112170

Article and author information

Author details

Ben G Weinstein

Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, United States

Contribution
Conceptualization, Data curation, Investigation, Visualization, Writing - review and editing

For correspondence
ben.weinstein@weecology.org

Competing interests
No competing interests declared

ORCID iD: 0000-0002-2176-7935
Sergio Marconi

Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, United States

Contribution
Conceptualization, Data curation, Writing - original draft, Writing - review and editing

Competing interests
No competing interests declared
Stephanie A Bohlman

School of Forest Resources and Conservation, University of Florida, Gainesville, United States

Contribution
Funding acquisition, Investigation, Methodology, Writing - review and editing

Competing interests
No competing interests declared
Alina Zare

Department of Electrical and Computer Engineering, University of Florida, Gainesville, United States

Contribution
Conceptualization, Supervision, Methodology, Writing - review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-4847-7604
Aditya Singh

Department of Agricultural & Biological Engineering, University of Florida, Gainesville, United States

Contribution
Conceptualization, Software, Methodology, Writing - original draft

Competing interests
No competing interests declared
Sarah J Graves

Nelson Institute for Environmental Studies, University of Wisconsin-Madison, Madison, United States

Contribution
Conceptualization, Data curation, Methodology, Writing - review and editing

Competing interests
No competing interests declared
Ethan P White
1. Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, United States
2. Informatics Institute, University of Florida, Gainesville, United States
3. Biodiversity Institute, University of Florida, Gainesville, United States
Contribution
Conceptualization, Data curation, Supervision, Writing - original draft, Writing - review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0001-6728-7745

Funding

Gordon and Betty Moore Foundation (GBMF4563)

Ethan P White

National Science Foundation (1926542)

Stephanie A Bohlman
Alina Zare
Aditya Singh
Ethan P White

National Institute of Food and Agriculture (McIntire Stennis projects #1007080)

Stephanie A Bohlman

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We would like to thank NEON staff and in particular Tristan Goulden and Courtney Meier for their assistance and support. This research was supported by the Gordon and Betty Moore Foundation’s Data-Driven Discovery Initiative (GBMF4563) to EP White and by the National Science Foundation (1926542) to EP White, SA Bohlman, A Zare, DZ Wang, and A Singh and USDA National Institute of Food and Agriculture McIntire Stennis projects #1007080 to SA Bohlman. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Version history

Received: September 9, 2020
Accepted: February 15, 2021
Accepted Manuscript published: February 19, 2021 (version 1)
Version of Record published: February 19, 2021 (version 2)

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


Cite this article

Ben G Weinstein, Sergio Marconi, Stephanie A Bohlman, Alina Zare, Aditya Singh, Sarah J Graves, Ethan P White (2021) A remote sensing derived data set of 100 million individual tree crowns for the National Ecological Observatory Network. eLife 10:e62922. https://doi.org/10.7554/eLife.62922


Locations of 37 NEON sites included in the NEON crowns data set and examples of tree predictions shown with RGB imagery for six sites.

The NEON crowns data set provides individual-level tree predictions at broad scales.

Workflow diagram adapted from Weinstein et al., 2020c.

Precision and recall scores for the algorithm used to create the NEON crowns data set (red points), as well as the DeepForest model from Weinstein et al., 2020a (blue points).

Overstory stem recall rate for NEON sites with available field data.

Comparison of field and remote sensing measurements of tree heights for 11 sites in the National Ecological Observatory Network.

Tree density maps for Teakettle Canyon, California (left) and Ordway Swisher Biological Station, Florida (right).

Comparison of tree counts between the field-collected NEON plots and the predicted plots from the data set.

Individual crown attributes for predictions made at each NEON site.
