Elsevier

Computers & Geosciences

Volume 56, July 2013, Pages 104-118

Adaptive spatial clustering in the presence of obstacles and facilitators

https://doi.org/10.1016/j.cageo.2013.03.002

Highlights

  • The ASCDT+ algorithm can consider both obstacles (e.g. mountains) and facilitators (e.g. highways).

  • The ASCDT+ algorithm can detect clusters with different shapes and densities at both global and local levels.

  • The ASCDT+ algorithm is easy to implement and requires no user-specified parameters.

Abstract

This paper proposes an intersection-and-combination strategy for clustering spatial point data in the presence of obstacles (e.g. mountains) and facilitators (e.g. highways), and develops an adaptive spatial clustering algorithm called ASCDT+. The ASCDT+ algorithm takes both obstacles and facilitators into account without additional preprocessing, and automatically detects adjacent spatial clusters with arbitrary shapes and/or different densities. In addition, the ASCDT+ algorithm can find clustering patterns at both global and local levels, so that users can interpret the clustering results more completely. Several simulated and real-world datasets are used to evaluate the effectiveness of the ASCDT+ algorithm. Comparison with two related algorithms, AUTOCLUST+ and DBRS+, demonstrates the advantages of the ASCDT+ algorithm.

Introduction

As one of the main tasks of spatial data mining, spatial clustering aims to separate a spatial dataset into a series of meaningful groups (also called clusters) without prior labeling (Liu et al., 2012). Clustering of spatial points, a powerful technique for exploratory data analysis, has been widely applied to epidemic monitoring, geographic customer segmentation, crime hotspot analysis, land use detection, seismicity research, and so on (Miller and Han, 2009). Most spatial clustering algorithms use Euclidean distance to measure the proximity of spatial points, and a spatial cluster is usually defined as a set of spatial points among which the Euclidean distances are relatively small. However, real applications often involve obstacles and facilitators that make the commonly used Euclidean distance measure ineffective. Obstacles (e.g. mountains, rivers and lakes) are physical objects that hinder straight-line reachability among points, while facilitators (e.g. bridges, highways and high-speed railways) are physical objects that enhance reachability among points.
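To make the effect of an obstacle on straight-line reachability concrete, the following is a minimal sketch and not part of the paper's method. It assumes obstacles are represented as line segments, and the names `segments_cross` and `directly_reachable` are hypothetical. Two points can be close in Euclidean distance and still not be directly reachable when the segment joining them crosses an obstacle.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Twice the signed area of triangle abc (>0 counter-clockwise, <0 clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p: Point, q: Point, a: Point, b: Point) -> bool:
    """True if segment pq properly crosses segment ab.

    Touching endpoints and collinear overlaps are deliberately ignored
    to keep the sketch short.
    """
    return (_orient(a, b, p) * _orient(a, b, q) < 0
            and _orient(p, q, a) * _orient(p, q, b) < 0)


def directly_reachable(p: Point, q: Point, obstacles: Sequence[Segment]) -> bool:
    """Assumed model: p and q are reachable if no obstacle segment crosses pq."""
    return not any(segments_cross(p, q, a, b) for a, b in obstacles)


# A hypothetical river running along x = 0 between two houses.
river = [((0.0, -1.0), (0.0, 1.0))]
house_a, house_b, house_c = (-1.0, 0.0), (1.0, 0.0), (-1.0, 2.0)

print(math.dist(house_a, house_b))                   # 2.0
print(directly_reachable(house_a, house_b, river))   # False
print(directly_reachable(house_a, house_c, river))   # True
```

Here `house_a` is equally far (2.0) from `house_b` and `house_c`, yet only `house_c` is directly reachable, which is why Euclidean proximity alone can mislead a clustering algorithm.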

Take the synthetic dataset in Fig. 1(a) as an example, where the points can be assumed to be the locations of houses, the rivers are obstacles, and the highway is a facilitator. Most existing spatial clustering algorithms would produce the clustering result shown in Fig. 1(b). When only the obstacles are considered, the result in Fig. 1(c) is obtained. If both the obstacles and the facilitator are taken into account, the result in Fig. 1(d) is obtained. In Fig. 1(b), three clusters are detected, though the Euclidean distance among all clusters is uniform and entities in the same cluster have different levels of reachability. In Fig. 1(c), the obstacles are considered; however, the facilitator (highway) between clusters C2 and C6 is ignored. Fig. 1(d) is indeed a good interpretation of the clustering patterns, since it considers both the obstacles and the facilitator.

Spatial clustering in the presence of obstacles and facilitators belongs to the field of constraint-based spatial clustering. Considering obstacles and facilitators can increase the effectiveness of spatial clustering and capture application semantics. In facility location research, when planning the location of ATMs or supermarkets, the reachability between residential areas and these facilities is strongly influenced by obstacles and facilitators (Tung et al., 2001). In crime hot spot analysis, the clustering results may be useless or distorted if obstacles and facilitators are ignored (Wang et al., 2011). In image processing, clustering with certain kinds of pixels treated as obstacles can improve the effectiveness of image segmentation (Estivill-Castro and Lee, 2004). In addition, the result of spatial clustering that considers obstacles and facilitators may be further used in spatial association rule mining, cartographic generalization and geographic customer segmentation (Li, 2007; Miller and Han, 2009). Thus, although a constraint-based spatial clustering algorithm is more complex than a traditional spatial clustering algorithm, it is indeed helpful for exploratory spatial analysis. Moreover, a good spatial clustering algorithm is also expected to have the following characteristics (Ester et al., 1996; Estivill-Castro and Lee, 2002a; Deng et al., 2011):

  • Adaptiveness (fewer input parameters, local rather than global parameters);

  • Multi-level clustering (discovering clusters at both global and local levels);

  • Identification of clusters with different densities and arbitrary shapes;

  • Clustering of the whole dataset rather than a sample;

  • High efficiency and effectiveness.

Based on the above, a new strategy for considering obstacles and facilitators is proposed in this study, and a novel spatial clustering algorithm, named ASCDT+, is developed on the basis of the ASCDT algorithm (Deng et al., 2011). The rest of this paper is organized as follows. In Section 2, an overview of related work is first provided, and then the strategy for spatial clustering with obstacles and facilitators is illustrated. In Section 3, the principle of the ASCDT algorithm is briefly described. The ASCDT+ algorithm is fully elaborated in Section 4. In Section 5, the ASCDT+ algorithm and two related algorithms, AUTOCLUST+ and DBRS+, are evaluated using four simulated datasets, and two real-world datasets are used to test the practicability of the ASCDT+ algorithm. Conclusions and main findings are presented in Section 6.

Section snippets

Related work

Currently, only a few algorithms consider obstacles and/or facilitators in the spatial clustering process. In what follows, a brief overview of some previous algorithms will be provided.

COD-CLARANS (Tung et al., 2001) was the first spatial clustering algorithm designed to consider obstacles in a spatial database. COD-CLARANS is an extension of the classic partitioning-based spatial clustering algorithm called CLARANS (Ng and Han, 1994). The COD-CLARANS algorithm involves three main procedures.

ASCDT algorithm

The Delaunay triangulation network has been proven to be a powerful tool for spatial clustering (Eldershaw and Hegland, 1997; Kang et al., 1997; Estivill-Castro and Lee, 2002a; Estivill-Castro and Lee, 2002b; Liu, 2008; Yang and Cui, 2010). The ASCDT algorithm is a novel graph-based spatial clustering algorithm based on Delaunay triangulation for two-dimensional spatial points. Compared with existing spatial clustering algorithms that use Delaunay triangulation, hierarchical constraints are

ASCDT+ algorithm

ASCDT+ is a generalization of the ASCDT algorithm. The generalization is embodied in three aspects, as shown in Fig. 2. The first aspect considers obstacles. The second considers facilitators. The third considers obstacles and facilitators to obtain multi-level clustering results. In what follows, the methods for extending ASCDT to ASCDT+ will be introduced.
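As a rough illustration of how obstacles and facilitators might act on a graph of neighbouring points, the sketch below works on a precomputed edge list (for example, the edges of a Delaunay triangulation). It is not the ASCDT+ procedure, whose details are not given in this snippet. The modelling choices are assumptions made for illustration: obstacles are line segments that delete any edge they cross, and each facilitator is a segment that links the points nearest to its two endpoints. The name `cluster_with_constraints` is hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Twice the signed area of triangle abc."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _cross(p: Point, q: Point, a: Point, b: Point) -> bool:
    """Proper crossing test for segments pq and ab (degenerate cases ignored)."""
    return (_orient(a, b, p) * _orient(a, b, q) < 0
            and _orient(p, q, a) * _orient(p, q, b) < 0)


def cluster_with_constraints(
    points: Sequence[Point],
    edges: Sequence[Tuple[int, int]],
    obstacles: Sequence[Segment],
    facilitators: Sequence[Segment],
) -> List[List[int]]:
    """Return clusters as sorted lists of point indices (illustrative only)."""
    parent = list(range(len(points)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Assumed obstacle rule: skip every edge that crosses an obstacle.
    for i, j in edges:
        if any(_cross(points[i], points[j], a, b) for a, b in obstacles):
            continue
        parent[find(i)] = find(j)

    # Assumed facilitator rule: join the points nearest to each endpoint.
    for end_a, end_b in facilitators:
        i = min(range(len(points)), key=lambda k: math.dist(points[k], end_a))
        j = min(range(len(points)), key=lambda k: math.dist(points[k], end_b))
        parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Four houses in a row, a river along x = 0, and a bridge across it.
points = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
edges = [(0, 1), (1, 2), (2, 3)]
river = ((0.0, -5.0), (0.0, 5.0))
bridge = ((-0.5, 0.0), (0.5, 0.0))

print(cluster_with_constraints(points, edges, [river], []))        # [[0, 1], [2, 3]]
print(cluster_with_constraints(points, edges, [river], [bridge]))  # [[0, 1, 2, 3]]
```

Running the function once with obstacles only and once with both constraint types gives two views of the same data, loosely mirroring the multi-level results mentioned above.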

Experiments and comparisons

Four simulated datasets S1 to S4 (shown in Fig. 6) and two real-world datasets are used to evaluate the ASCDT+ algorithm. The simulated datasets used in this study are designed according to several previous studies (Liu et al., 2008; Deng et al., 2011), and these datasets are used as the benchmark datasets to test the proposed algorithm. Two previous algorithms, AUTOCLUST+ and DBRS+, are used for comparison. The DBRS+ algorithm requires two input parameters, i.e. Eps and MinPts, and they

Conclusion and future work

An adaptive spatial clustering algorithm (i.e. ASCDT+) that considers obstacles and facilitators has been proposed in this study, based on the intersection-and-combination strategy. Through experiments on both synthetic and real-world datasets, three good characteristics of the ASCDT+ algorithm have been demonstrated. First, the ASCDT+ algorithm has the ability to effectively consider obstacles and facilitators simultaneously with less prior knowledge compared to the related

Acknowledgments

The work was supported by the Major State Basic Research Development Program of China (973 Program), No. 2012CB719906, the Program for New Century Excellent Talents in University (NCET), No. NCET-10-0831, and the National Natural Science Foundation of China (NSFC), No. 40871180.

