Elsevier

Environmental Modelling & Software

Volume 109, November 2018, Pages 420-428

A new rapid watershed delineation algorithm for 2D flow direction grids

https://doi.org/10.1016/j.envsoft.2018.08.017

Abstract

In this paper we propose an algorithm for retrieving an arbitrary watershed boundary from a 2D flow direction grid. The proposed algorithm and associated data model provide geometric speed increases in watershed boundary retrieval over existing techniques while keeping storage requirements linear. The algorithm, called the Watershed Marching Algorithm (WMA), relies on an existing data structure, the modified nested set model, originally described by Celko and applied to hydrodynamic models by Haag and Shokoufandeh in 2017. In contrast to existing algorithms, which scale in proportion to the area of the underlying region, the complexity of the WMA is proportional to the boundary length. Results for a group of tested watersheds (n = 14,718) in the 36,000 km² Delaware River Watershed show a reduction of between 0 and 99% in computational complexity relative to existing techniques when using a 30 m DEM.

Introduction

Environmental models and associated Decision Support Systems (DSS) rely on the retrieval and modelling of environmental objects and analysis to satisfy users' data needs (Matthies et al., 2007). Returning relevant information in these support tools can be problematic due to the complexity of human and natural systems (van Delden et al., 2007). This is a general trend, and “as better data and methods become available … realism and relevance are increasing … allowing direct support for management and policy development” (Matthies et al., 2007). DSS allow resource managers to test a number of scenarios and their potential impact on natural cycles (de Kort and Booij, 2007). Building these systems has been shown to contribute to the long-term stability of economically important natural resources (Carrick and Ostendorf, 2007).

To support the creation of DSS with a focus on hydrological systems, a sustained effort to organize and distribute hydrological data is ongoing. Ames et al. (2012) and Goodall et al. (2008) worked on the development of a standard web-based platform to facilitate hydrological data discovery, visualization and analysis from disparate sources. Andrews et al. (2011) worked on developing a collection of open source software packages for the analysis of hydrological data, while Castronova and Goodall (2010) focused on standard modelling interfaces for hydrologic systems. Lastly, Beran and Piasecki (2009) describe techniques to discover existing hydrologic data resources from a variety of sources. This work demonstrates the need for more interconnected data, models, and standards to support integration of hydrological data.

Watershed delineation and retrieval is the “basic modelling element” (Tesfa et al., 2011) for hydrological problems, and as such watershed boundaries are a key component in hydrologically focused DSS. Watersheds are important for a number of areas of interest, including biological (Daniel et al., 2011; Wang et al., 1997), ecological (Daniel et al., 2011; Malvadkar et al., 2015), infrastructure (e.g., cadastral), flood (Al-Sabhan et al., 2003), and engineering applications. Because of their importance, a number of software packages (e.g., ESRI Hydrological Toolset (ESRI, 2017), TAUDEM (Tarboton, 2015)) have been designed to support the creation of watersheds as polygonal boundaries (e.g., a vector of coordinates that represent a polygon on a two-dimensional planar surface) for a given pour point.

Watershed boundaries are currently represented in DSS in one of two ways. They are either static, pre-calculated or cached (vector) boundaries from national or regional datasets (e.g., National Hydrography Plus Version 2 (NHDPlusv2), USGS Hydrologic Unit Codes (HUC), or HydroSheds), or they are delineated on the fly from Digital Elevation Models based on user input (e.g., wikiwatersheds, USGS Stream Stats, Iowa Flood Information System, ESRI's watershed API, and others). Both approaches exhibit shortcomings. Predefined watershed boundaries limit the user's ability to obtain results for specific geographies of interest. Conversely, retrieval of watershed boundaries ‘on the fly’ is computationally prohibitive using existing published techniques.

In this manuscript we focus on data structures and algorithms to efficiently retrieve watershed boundaries from digital flow direction grids. Baker et al. (2006) discussed standard practices for creating polygonal watershed boundaries using a flow direction grid approach as 1) creating a digital elevation model (DEM), 2) filling spurious sinks, 3) creating a flow direction grid, 4) burning in fluvial hydrological features (e.g., streams and rivers), and 5) using an accumulation of cell contributions across the landscape. Our proposed algorithm replaces the last step with a general data structure and its corresponding vertex labelling, called the Modified Nested Set Model (MNSM), as described in Haag and Shokoufandeh (2017). A novel algorithm described in this manuscript, the Watershed Marching Algorithm (WMA), is then applied to retrieve watershed boundaries from the MNSM.

The WMA belongs to a family of methods known as marching algorithms. Sethian (1999) describes the application of marching algorithms to problems “in … physics, chemistry, fluid mechanics, combustion, image processing, [and] material science”. Fundamentally, marching algorithms separate surfaces into inside and outside components. In biomedical image processing, Lorensen and Cline (1987) describe a three-dimensional marching algorithm, also known as “Marching Cubes”, that divides medical images into triangles and associated 3-D objects to identify and retrieve regions of interest (e.g., organs or abnormalities). Efficient marching algorithms make local identity decisions as Boolean conditions (on the boundary or not on the boundary) (Sethian, 1999) to make a forward march. Therefore, the application of marching algorithms to the watershed delineation problem requires a solution to the global problem of watershed inclusion given any input pour point. Local FDG and DEM values are indeterminate for the global watershed identity problem. The modified nested set labelling allows local determination of the global watershed identity problem for any grid cell, with an associated linear increase in storage costs. To the best of our knowledge, the only other data model that allows local determination of watershed identity is the identity matrix as used in the SWAT model and described by Olivera et al. (2006). The identity matrix stores every cell-to-cell relationship, and therefore its storage costs scale quadratically in comparison to our linear solution. Storage requirements for larger watersheds become too large to use effectively. For example, with a 30 m grid cell size the Amazon River basin requires 2 × 6.7 billion stored values for the nested set index versus (6.7 billion)² for the identity matrix.
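The local membership test and the storage comparison described above can be sketched as follows. This is a minimal illustration, assuming each grid cell carries a (left, right) interval in the style of Celko's nested set model; the names `Label`, `in_watershed` and the four-cell example tree are hypothetical, and the exact labelling used by the MNSM in Haag and Shokoufandeh (2017) may differ.

```python
from typing import NamedTuple


class Label(NamedTuple):
    """Hypothetical nested set label: an interval assigned during a tree traversal."""
    left: int
    right: int


def in_watershed(cell: Label, pour_point: Label) -> bool:
    """A cell drains to the pour point iff its interval lies inside the pour point's."""
    return pour_point.left <= cell.left and cell.right <= pour_point.right


# Toy flow tree (assumed for illustration): c -> b -> a (root), d -> a
labels = {
    "a": Label(1, 8),
    "b": Label(2, 5),
    "c": Label(3, 4),
    "d": Label(6, 7),
}

# Storage comparison for the Amazon example quoted in the text (30 m cells).
n_cells = 6_700_000_000
nested_set_values = 2 * n_cells        # two labels per cell: linear
identity_matrix_values = n_cells ** 2  # one entry per cell pair: quadratic
```

With these labels, `in_watershed(labels["c"], labels["b"])` is true while `in_watershed(labels["d"], labels["b"])` is false, and each decision reads only two pairs of integers regardless of how large the watershed is.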

Notation

Before describing the WMA algorithm in detail, we present the notation and structures used throughout the rest of the manuscript. We use the set of natural numbers {0,1,2,3} for encoding cardinal directions North, East, South, and West, respectively. We apply the WMA algorithm to delineate watershed boundaries for any surface using a regularly spaced flow direction D8 grid (Jenson and Domingue, 1988) as input. Within a D8 structure every grid cell contains a value in {2⁰, 2¹, …, 2⁷} that denotes

Method

Our algorithm relies on certain structural properties of a large flow direction raster grid. Specifically, we distinguish between root and internal cells of a flow grid. The root cells are identified as grid cells that flow into cells with null values (e.g., large bodies of water, oceans, and lakes), or off the grid. All other grid cells that flow to cells with valid flow directions are internal cells. As we discussed earlier, our marching algorithm utilizes an auxiliary data structure known as

Data model pre-processing cost

The computational cost of converting the original flow direction grid into the modified nested set model is linear in the size of the grid (Haag and Shokoufandeh, 2017). The algorithm crosses every edge between grid cells exactly two times during the graph traversal. The D8 flow direction grid allows only one edge (the downstream connection) for every grid cell (except for the root, which has no edges); therefore the application runs in exactly (2n − 1) where n is the number of grid cells. The
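A linear-time labelling pass of this kind can be sketched with an iterative depth-first traversal. The sketch assumes the flow grid has already been reduced to a mapping from each cell to the cell it drains into (`None` for a root); the function name `nested_set_labels` and this input format are hypothetical, and the published MNSM construction may differ in its details.

```python
from collections import defaultdict


def nested_set_labels(downstream):
    """Assign a (left, right) interval to every cell of a flow forest.

    `downstream` maps each cell id to the id it drains into, or None for a root.
    Every cell is pushed onto the stack twice (entering and leaving), so the
    work grows linearly with the number of cells.
    """
    children = defaultdict(list)
    roots = []
    for cell, target in downstream.items():
        if target is None:
            roots.append(cell)
        else:
            children[target].append(cell)

    left, right = {}, {}
    counter = 0
    for root in roots:
        stack = [(root, False)]
        while stack:
            cell, leaving = stack.pop()
            counter += 1
            if leaving:
                right[cell] = counter  # all upstream cells are already labelled
                continue
            left[cell] = counter
            stack.append((cell, True))  # revisit after the upstream cells
            for child in children[cell]:
                stack.append((child, False))
    return {cell: (left[cell], right[cell]) for cell in downstream}


# Toy flow tree (assumed for illustration): c -> b -> a (root), d -> a
flow = {"a": None, "b": "a", "c": "b", "d": "a"}
labels = nested_set_labels(flow)
```

An explicit stack is used instead of recursion because flow paths in a large raster can be far longer than Python's default recursion limit.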

Complexity reduction

The observed reduction in computational complexity is consistent with the observation that the complexity of the proposed algorithm is proportional to the size of the watershed boundary or, to put it another way, to the length of the polygonal chain Ω(v*). In contrast, existing algorithms scale in proportion to the area of the watershed. We expect that, depending on the watershed shape, the retrieval complexity of the proposed algorithm would be geometrically smaller than that of existing techniques. Empirical
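To give a feel for the gap between area and boundary scaling, the sketch below compares the two for an idealised square watershed. The 36,000 km² area and 30 m cell size come from the abstract; the square shape and the function name `square_watershed_costs` are simplifying assumptions, and real watershed boundaries are irregular and typically longer than this idealised figure.

```python
import math


def square_watershed_costs(area_km2, cell_size_m):
    """Return (cells in the area, cells along the perimeter) for a square watershed."""
    cell_area_m2 = cell_size_m ** 2
    n_cells = area_km2 * 1_000_000 / cell_area_m2  # area-based methods visit these
    side_cells = math.sqrt(n_cells)
    boundary_cells = 4 * side_cells                # a boundary march visits these
    return n_cells, boundary_cells


area_cells, boundary_cells = square_watershed_costs(36_000, 30)
ratio = area_cells / boundary_cells
```

Under these assumptions the area contains 40 million cells while the perimeter is on the order of tens of thousands of cells, so the ratio between the two grows with the square root of the cell count.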

Conclusions

  1. This manuscript describes an algorithm named Watershed Marching Algorithm (WMA) to retrieve a watershed boundary from a D8 flow direction grid.

  2. We show that the WMA algorithm returns the same watershed boundary as existing grid searching techniques.

  3. The WMA algorithm marches around the exterior boundary never entering the interior or exterior of the watershed.

  4. Results for the DE River Watershed show a reduction from 35 million to 45 thousand read operations.

  5. Testing on an ordinary desktop machine

References (24)

  • H. van Delden et al. Integration of multi-scale dynamic spatial models of socio-economic and physical processes for river basin management. Environ. Model. Software (2007)

  • M.E. Baker et al. Comparison of automated watershed delineations. Photogramm. Eng. Rem. Sens. (2006)