Country-wide high-resolution vegetation height mapping with Sentinel-2

https://doi.org/10.1016/j.rse.2019.111347

Highlights

  • Vegetation height at 10 m ground sampling distance is regressed from Sentinel-2

  • Country-wide maps are computed for Switzerland and Gabon

  • Mean absolute error (MAE) of 1.7 m in Switzerland and 4.3 m in Gabon

  • The deep convolutional neural network correctly predicts vegetation heights up to 50 m

Abstract

Sentinel-2 multi-spectral images collected over periods of several months were used to estimate vegetation height for Gabon and Switzerland. A deep convolutional neural network (CNN) was trained to extract suitable spectral and textural features from reflectance images and to regress per-pixel vegetation height. In Gabon, reference heights for training and validation were derived from airborne LiDAR measurements. In Switzerland, reference heights were taken from an existing canopy height model derived via photogrammetric surface reconstruction. The resulting maps have a mean absolute error (MAE) of 1.7 m in Switzerland and 4.3 m in Gabon (a root mean square error (RMSE) of 3.4 m and 5.6 m, respectively), and correctly estimate vegetation heights up to >50 m. They also show good qualitative agreement with existing vegetation height maps. Our work demonstrates that, given a moderate amount of reference data (i.e., 2000 km² in Gabon and ≈5800 km² in Switzerland), high-resolution vegetation height maps with 10 m ground sampling distance (GSD) can be derived at country scale from Sentinel-2 imagery.

Introduction

Vegetation height is a basic variable to characterise a forest's structure, and is known to correlate with important biophysical parameters like primary productivity (Thomas et al., 2008), above-ground biomass (Anderson et al., 2006) and biodiversity (Goetz et al., 2007). However, direct measurement of tree height does not scale to large areas and/or high spatial resolution: in-situ observations are in practice only feasible for a limited number of sample plots and logging sites. Airborne light detection and ranging (LiDAR) can map canopy height over ground densely and accurately, but the financial cost and the limited area covered per day only allow for small regional projects (some countries of moderate size have complete coverage, but with long intervals of several years between subsequent acquisitions). Finally, space-borne LiDAR provides world-wide coverage, but the measurements are sparse in both space and time: distances between adjacent profiles are in the tens of kilometres, and nearby observations have been acquired up to 6 years apart. After 7 years of data collection, the point density in Gabon, for example, is only 1.26 shots per km² (Baghdadi et al., 2013). Moreover, each measurement is averaged over a ground footprint of 70 m radius.

Hence, dense wide-area maps of canopy height are typically obtained by regression from multi-spectral satellite images, using in-situ or LiDAR heights as reference data to fit the regression model (Lefsky, 2010, Hudak et al., 2002). This approach has made it possible to produce tree height maps with ground resolutions down to 30 m, by exploiting the Landsat archive (Hansen et al., 2016).

Here, we demonstrate country-wide mapping of canopy height with a ground resolution of 10 m, by regression from Sentinel-2 multi-spectral data. At such high resolution, the spectral signature of an individual pixel is no longer sufficient to predict tree height. Rather, the physical phenomena underlying the monocular prediction of tree height, like shadowing, roughness and species distribution, give rise to reflectance patterns across neighbourhoods of multiple pixels. It is, however, not obvious how to encode the resulting image textures into predictive feature descriptors that support the regression. To sidestep this problem, we resort to deep learning. Recent progress in computer vision and image analysis has impressively demonstrated that very deep convolutional neural networks (CNNs) are able to learn a tailored multi-level feature encoding for a given prediction task from raw images, given a sufficiently large amount of training data. Our experiments reveal that texture patterns are particularly important in areas of high (tropical) forest, extending the sensitivity of the regressor to heights up to ≈55 m. End-to-end learning of rich contextual feature hierarchies underlies several successes of image and raster data analysis, including visual recognition of objects (Krizhevsky et al., 2012), understanding human speech from spectrograms (Abdel-Hamid et al., 2014) and assessment of positions in board games like Go or chess (Silver et al., 2018).
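To make the role of multi-pixel context concrete, the following minimal sketch (in PyTorch; the layer count and width are illustrative assumptions, not the architecture used in this work) shows how a stack of 3 × 3 convolutions progressively widens the pixel neighbourhood that informs each per-pixel height estimate:

```python
# Minimal sketch of a fully convolutional height regressor. Each 3x3 layer
# enlarges the receptive field by 2 pixels, so spectral AND textural cues
# from a growing neighbourhood feed into every per-pixel prediction.
# Depth/width are hypothetical; the paper's network is substantially deeper.
import torch
import torch.nn as nn

class TextureHeightRegressor(nn.Module):
    def __init__(self, in_bands: int = 13, width: int = 64, depth: int = 8):
        super().__init__()
        layers = [nn.Conv2d(in_bands, width, 3, padding=1), nn.ReLU()]
        for _ in range(depth - 1):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU()]
        layers.append(nn.Conv2d(width, 1, 1))  # one height value per pixel
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(1)          # (B, 13, H, W) -> (B, H, W)

# With depth = 8, each output pixel sees a 17 x 17 window,
# i.e. 170 m x 170 m of context at 10 m GSD.
model = TextureHeightRegressor()
heights = model(torch.randn(1, 13, 128, 128))  # shape (1, 128, 128)
```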

We employ a deep convolutional neural network to regress country-wide canopy height for Gabon and Switzerland from 13-channel Sentinel-2 Level 2A images (corrected to bottom-of-atmosphere reflectance), using reference values obtained from airborne LiDAR scans and photogrammetric stereo matching as training data. The two countries were selected because in both we have access to reference data for training and quantitative evaluation: in Switzerland from the national forest inventory program; in Gabon via NASA's LVIS project. At the same time, the two countries are very different in terms of their geography and biomes, which supports our belief that the proposed approach can be scaled up to global coverage. Importantly, we also find that no long time series or multi-temporal signatures are required. A few observations per pixel (4 to 12) already achieve low prediction errors – in fact, even predicting from a single image yields fairly decent results. This means that, at the 5-day revisit cycle of Sentinel-2, we are able to obtain almost complete coverage using only the 10 clearest images within the leaf-on season (May–September) for Switzerland or within a period of 12 months in tropical forest regions with frequent cloud cover.
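As a hedged sketch of such a training setup (the L1 objective, optimiser and no-data convention below are our illustrative assumptions, not necessarily the exact choices of the paper), a model like the one sketched above could be fitted to rasterised reference heights as follows:

```python
# Training sketch (PyTorch): regress per-pixel canopy height against
# rasterised LiDAR/photogrammetric reference heights with an L1 loss,
# ignoring pixels that carry a no-data value.
import torch

def masked_l1(pred: torch.Tensor, ref: torch.Tensor,
              nodata: float = -1.0) -> torch.Tensor:
    valid = ref != nodata                        # pixels with a reference height
    return (pred[valid] - ref[valid]).abs().mean()

def train(model: torch.nn.Module, loader, epochs: int = 10,
          lr: float = 1e-4) -> None:
    # `loader` is assumed to yield (B, 13, H, W) reflectance tensors and
    # (B, H, W) reference heights in metres.
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for images, ref in loader:
            optimiser.zero_grad()
            loss = masked_l1(model(images), ref)
            loss.backward()
            optimiser.step()
```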

Our work is, to our knowledge, the first to demonstrate large-scale vegetation height mapping from optical satellites at 10 m GSD. The model is able to retrieve tree heights up to ≈55 m, well beyond the saturation level of existing high-resolution canopy height maps (e.g., Hansen et al., 2016). At the technical level, we are not aware of any other work that employs deep CNNs for canopy height estimation from optical satellite data.

Based on the present work, the next goal is to generate a global, wall-to-wall map of canopy height.

Section snippets

Remote sensing of vegetation height

The most straightforward approach to measuring canopy height over large areas is airborne or spaceborne LiDAR. By directly measuring the range from the sensor both to points near the tree tops and to points on the ground (as well as to further returns in between), LiDAR delivers a direct and very accurate observation of the canopy height over ground, and also makes it possible to derive further information about vegetation structure. That approach was developed as soon as airborne LiDAR systems became available.
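In raster form, the canopy height model (CHM) such surveys deliver is simply the per-pixel difference between a digital surface model and a digital terrain model derived from the point cloud; a minimal sketch (the clipping of negative values is our assumption):

```python
# Canopy height model (CHM) from LiDAR-derived rasters:
# DSM (tree tops) minus DTM (bare ground), per pixel.
import numpy as np

def canopy_height(dsm: np.ndarray, dtm: np.ndarray) -> np.ndarray:
    return np.clip(dsm - dtm, 0.0, None)  # clip small negative artefacts to 0
```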

Sentinel-2

Sentinel-2 is a satellite mission within the European Space Agency's (ESA) Copernicus program, consisting of two identical satellites launched in 2015 and 2017, respectively, with an expected lifetime of 7.25 years. The satellites each carry a multi-spectral instrument, and together reach a revisit time of 5 days. The sensor captures 13 spectral bands with varying spatial resolution (10 m, 20 m, 60 m). Four bands provide 10 m ground sampling distance.
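Stacking all 13 bands into a single 10 m input therefore requires resampling the 20 m and 60 m bands onto the 10 m grid; a minimal sketch with rasterio (the bilinear resampling choice is our assumption, and the band path is a placeholder):

```python
# Read one Sentinel-2 band and resample it to the 10 m grid so that all
# 13 bands can be stacked into a (13, H, W) input array.
import numpy as np
import rasterio
from rasterio.enums import Resampling

def read_band_at_10m(band_path: str, height: int, width: int) -> np.ndarray:
    with rasterio.open(band_path) as src:
        return src.read(1, out_shape=(height, width),
                        resampling=Resampling.bilinear)
```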

Preprocessing

ESA's sen2cor toolbox provides standard algorithms to correct atmospheric effects (Mueller-Wilm, 2018). Following best practice, we use this toolbox for radiometric correction and create the Level 2A product, i.e., bottom-of-atmosphere reflectance. By decreasing variability due to atmospheric effects, the distribution of the image values is homogenised across different sensing dates and geographic regions, which simplifies the regression problem and may lead to improved generalisation.
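For reference, a minimal way to run this correction step from Python (assuming sen2cor is installed so that its L2A_Process command is on the PATH; the SAFE directory below is a placeholder):

```python
# Convert a Level-1C SAFE product to Level-2A (bottom-of-atmosphere
# reflectance) by calling sen2cor's L2A_Process command.
import subprocess

def to_level_2a(safe_dir: str) -> None:
    subprocess.run(["L2A_Process", safe_dir], check=True)

# Example (placeholder product name):
# to_level_2a("/data/S2A_MSIL1C_20180612T103021_N0206_R108_T32TMT.SAFE")
```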

Results and discussion

We quantitatively evaluate our approach on 7 regions in total, 5 in Gabon (GA) and 2 in Switzerland (CH); see Fig. 1. Each region is split into spatially disjoint training, validation and test sets. Depending on the region, four to twelve Sentinel-2 images with overall cloud coverage <70% are available (Table 1). The CNN is trained on images from multiple acquisition dates, under the assumption that vegetation height did not change significantly within the investigated time interval.
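The two error metrics reported throughout are standard; for completeness, a minimal NumPy implementation evaluated over the held-out test pixels of a region:

```python
# Mean absolute error and root mean square error between predicted and
# reference canopy heights (both in metres), over valid test pixels.
import numpy as np

def mae_rmse(pred: np.ndarray, ref: np.ndarray):
    err = pred - ref
    return float(np.abs(err).mean()), float(np.sqrt(np.mean(err ** 2)))
```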

Conclusion

Our proposed data-driven approach allows one to map vegetation height at 10 m resolution. We show that regression from a few Sentinel-2 images achieves low error in the tropics as well as in central Europe, and that our method is suitable for country-scale canopy height mapping in terms of both generalisation and computation time. Our CNN-based learning engine, which is able to exploit spatial context and texture features, can predict a high-resolution vegetation height map from a single Sentinel-2 image.

Acknowledgement

We thank Christian Ginzler from WSL for sharing the reference data for Switzerland. We greatly appreciate the open data policies of the LVIS project and the ESA Copernicus program. The project received funding from Barry Callebaut Sourcing AG, as a part of a Research Project Agreement.

References (58)

  • D. Marmanis et al.

    Classification with an edge: improving semantic image segmentation with boundary detection

    ISPRS J. Photogramm. Remote Sens.

    (2018)
  • E. Naesset

    Determination of mean tree height of forest stands using airborne laser scanner data

    ISPRS J. Photogramm. Remote Sens.

    (1997)
  • O. Abdel-Hamid et al.

    Convolutional neural networks for speech recognition

IEEE/ACM Trans. Audio Speech Lang. Process.

    (2014)
  • J.B. Abshire et al.

    Geoscience laser altimeter system (GLAS) on the ICESat mission: on-orbit measurement performance

    Geophys. Res. Lett.

    (2005)
  • G. Asner et al.

    High-resolution mapping of forest carbon stocks in the Colombian Amazon

    Biogeosciences

    (2012)
  • A. Baccini et al.

    A first map of tropical Africa's above-ground biomass derived from satellite imagery

    Environ. Res. Lett.

    (2008)
  • N.N. Baghdadi et al.

    Viability statistics of GLAS/ICESat data acquired over tropical forests

    IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens.

    (2013)
  • E. Bendersky

    Depthwise separable convolutions for machine learning

  • J.B. Blair et al.

AfriSAR LVIS L2 geolocated surface elevation product, version 1. Boulder, Colorado, USA: NASA National Snow and Ice Data Center Distributed Active Archive Center.

    (2018)
  • Y. Chen et al.

    Deep learning-based classification of hyperspectral data

    IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens.

    (2014)
  • F. Chollet

    Xception: deep learning with depthwise separable convolutions

  • I. Chrysafis et al.

    Assessing the relationships between growing stock volume and Sentinel-2 imagery in a Mediterranean forest ecosystem

    Remote Sens. Lett.

    (2017)
  • S. Clerc et al.

S2 MPC - Data Quality Report. ESA, reference S2-PDGS-MPC-DQR, issue 36.

  • D. Eigen et al.

    Depth map prediction from a single image using a multi-scale deep network

  • G. Foody et al.

    Classification of tropical forest classes from Landsat TM data

    Int. J. Remote Sens.

    (1996)
  • GEDI Team

GEDI ecosystem LiDAR. NASA/University of Maryland

  • C. Ginzler et al.

Countrywide stereo-image matching for updating digital surface models in the framework of the Swiss national forest inventory

    Remote Sens.

    (2015)
  • K. He et al.

    Deep residual learning for image recognition

  • M. Immitzer et al.

    First experience with Sentinel-2 data for crop and tree species classifications in central Europe

    Remote Sens.

    (2016)