Expert Systems with Applications

Volume 46, 15 March 2016, Pages 324-335
Spatial co-location pattern mining for location-based services in road networks

https://doi.org/10.1016/j.eswa.2015.10.010

Highlights

  • A new approach for mining spatial co-location patterns was presented.

  • The neighborhood was refined using network distances rather than Euclidean ones.

  • An efficient algorithm to build the neighborhood relationship graph was proposed.

  • The performance of the proposed algorithm was explored.

Abstract

With the evolution of geographic information capture and the emergence of volunteered geographic information, it is becoming increasingly important to extract spatial knowledge automatically from large spatial datasets. Spatial co-location patterns represent subsets of spatial features whose objects are often located in close geographic proximity. Such patterns are among the most important concepts for geographic context awareness in location-based services (LBS). In the literature, most existing co-location mining methods are designed for events taking place in a homogeneous and isotropic space with distance expressed as Euclidean, whereas physical movement in LBS is usually constrained by a road network. As a result, the interestingness value of co-location patterns involving network-constrained events cannot be accurately computed. In this paper, we propose a different method for co-location mining that takes the network configuration of the geographical space into account. First, we define the network model with linear referencing and refine the neighborhood of traditional methods using network distances rather than Euclidean ones. Then, because co-location mining in networks suffers from expensive spatial-join operations, we propose an efficient way to find all neighboring object pairs for generating clique instances. Compared with previous approaches based on Euclidean distance, this approach can accurately calculate the probability of occurrence of a spatial co-location on a network. Our experimental results on real and synthetic data sets show that the proposed approach is efficient and effective in identifying co-location patterns that actually depend on a network.

Introduction

With the ubiquity of wireless Internet access and GPS-enabled mobile terminals, and with advances in spatial database management systems, a new generation of mobile services, known as location-based services (LBS), has been developed. These services are capable of delivering geographic information and geoprocessing power to mobile users according to their current location (Beatty, 2002). Understanding the geographic context is an essential question in the LBS area. Geographic context, that is, information about the surroundings or circumstances, has been an important topic in many applications. By introducing spatial data analysis and geo-sensor data collection into geographic context awareness, service providers can establish reliable, high-quality services that help users with trip planning, activity rescheduling, and decision making.

A spatial co-location pattern is a set of spatial features that are frequently located together in spatial proximity (Shekhar & Huang, 2001). As an important concept for spatial analysis and geographic context awareness, spatial co-location pattern mining has been widely applied to discover the spatial dependency of objects. Spatial dependency is the tendency of observed objects located close to one another in geographic space to show a higher degree of similarity or dissimilarity (Miller & Han, 2009). All computed dependency patterns concerning objects in space depend on the definition of spatial neighborhood. In the literature, closeness is defined using different types of distance metric, such as Euclidean distance or path distance. To provide users with realistic geographic information, providers should choose the distance measure that best fits the geographic context of the study region.

So far, many different studies on spatial co-location mining have been carried out. However, all of the previous studies are based on the Euclidean (or planar) space assumption. The researchers (Huang et al., 2004; Shekhar and Huang, 2001; Yoo and Shekhar, 2004) assume that the patterns of interest occur in an infinitely homogeneous and isotropic space, and that spatial proximity between two objects is measured by the straight-line Euclidean distance. Miller (1994) pointed out that this assumption can be ill-suited because many human activities are constrained to the network portion of the planar space. Network distance is indeed a more meaningful and reliable distance measure for analyses related to social and economic processes (Okabe et al., 2006; Okabe et al., 2009; Yamada and Thill, 2007). For simplicity, we refer to spatial co-location pattern mining based on network distance as NM-Coloc, and to spatial co-location pattern mining based on Euclidean distance as PM-Coloc.
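The gap between the two distance measures can be illustrated with a small sketch. The toy road network, node names, and the helper `network_distance` below are hypothetical and chosen only for illustration; the shortest path is computed with a textbook Dijkstra search (Dijkstra, 1959), not with any procedure specific to this paper.

```python
import heapq
import math

def network_distance(graph, source, target):
    """Shortest-path length between two nodes (textbook Dijkstra)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale queue entry, a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf  # target not reachable

# Hypothetical toy network: a U-shaped road A-B-C-D with no direct A-D link.
coords = {"A": (0.0, 0.0), "B": (0.0, 3.0), "C": (4.0, 3.0), "D": (4.0, 0.0)}
edges = [("A", "B"), ("B", "C"), ("C", "D")]

def euclidean(p, q):
    """Straight-line distance between two named nodes."""
    return math.dist(coords[p], coords[q])

# Undirected adjacency list; each edge weight is the segment length.
graph = {name: [] for name in coords}
for u, v in edges:
    w = euclidean(u, v)
    graph[u].append((v, w))
    graph[v].append((u, w))

euclid_ad = euclidean("A", "D")                  # straight line across the gap
network_ad = network_distance(graph, "A", "D")   # along A-B-C-D
print(euclid_ad, network_ad)
```

With a neighborhood threshold of 5, for example, A and D would count as neighbors under the Euclidean measure but not under the network measure, since the only route between them is the detour through B and C.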

Road transport networks are emerging nowadays at both urban and extra-urban levels, and there are many cases in which co-location patterns need to be captured with network distance. For example, in mobile service applications, clients usually request services with regard to the co-location patterns of urban facilities. These patterns need to be measured using the distance along the road network instead of the Euclidean distance, because vehicles are restricted to moving along predefined roads under certain transformation conditions. Patterns of facilities that take the limits on people's movement into account can help service providers offer more attractive location-sensitive advertisements, recommendations, and similar services. As another example, these co-location patterns would also benefit site selection for businesses and advertisements. By extracting the network-constrained patterns of facilities of different types, decision makers in a company can plan a better location for a new store, considering how the profitability of similar retail stores depends on the surrounding objects.

Motivated by the great potential demand for, and the lack of research on, network-constrained co-locations, we develop an efficient spatial data mining technique to help domain experts discover useful knowledge from a given network data set. In contrast to the Euclidean distance measure used in Huang et al. (2004) and Shekhar and Huang (2001), we look for dependencies among features using the network distance. This study also takes the particular characteristics of network space into consideration, and we draw two major conclusions: (1) in many LBS applications, the shortest-path distance is more suitable for measuring the spatial connection between two locations in the urban environment; (2) searching for spatial neighbors in networks is generally a time-consuming task, so an efficient method is needed for building the neighborhood relationship graph.
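To make the cost concern in point (2) concrete: one general way to avoid a separate shortest-path computation for every object pair is to run a single distance-bounded expansion from each object. The sketch below illustrates that general idea only and is not the algorithm proposed in this paper. It assumes, as a simplification, that objects sit exactly on network nodes; the names `neighbors_within`, `neighbor_pairs`, and the toy data are hypothetical.

```python
import heapq
import math

def neighbors_within(graph, source, max_dist):
    """Dijkstra expansion from source, pruned at max_dist.

    Returns {node: network distance} for every node within max_dist,
    including the source itself at distance 0.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in graph[u]:
            nd = d + w
            # Never expand past the neighborhood threshold.
            if nd <= max_dist and nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def neighbor_pairs(graph, objects, max_dist):
    """Pairs of objects with different features within max_dist of each other.

    objects maps object id -> (feature, node). Assumption for this sketch:
    every object lies exactly on a node of the network.
    """
    at_node = {}
    for oid, (_, node) in objects.items():
        at_node.setdefault(node, []).append(oid)
    pairs = set()
    for oid, (feature, node) in objects.items():
        for reached in neighbors_within(graph, node, max_dist):
            for other in at_node.get(reached, []):
                if other != oid and objects[other][0] != feature:
                    pairs.add(tuple(sorted((oid, other))))
    return pairs

# Hypothetical toy network: path A-B-C-D with segment lengths 3, 4, 3.
graph = {
    "A": [("B", 3.0)],
    "B": [("A", 3.0), ("C", 4.0)],
    "C": [("B", 4.0), ("D", 3.0)],
    "D": [("C", 3.0)],
}
objects = {
    "cafe1": ("cafe", "A"),
    "atm1": ("atm", "B"),
    "cafe2": ("cafe", "C"),
    "atm2": ("atm", "D"),
}
pairs = neighbor_pairs(graph, objects, max_dist=5.0)
print(sorted(pairs))
```

Each object triggers one pruned expansion, so the work is bounded by the part of the network that lies within the threshold rather than by the number of candidate pairs.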

We conduct a series of comparisons and analyses using our experimental knowledge extraction system. The results show that our method is effective, efficient and scalable for mining large spatial data sets without repeated computation of the shortest path. Notably, to generate higher-order rules we also use a classical Apriori-like algorithm, but it is not the focus of this work.

The remainder of the paper is organized as follows. Section 2 reviews related work. Section 3 describes the data model and a new method for NM-Coloc. Experimental results are analyzed in Section 4. Finally, Section 5 concludes the paper.

Section snippets

Related work

Co-location pattern mining is an important offshoot of spatial association rule mining. The notion of a spatial association rule was first defined in Koperski and Han (1995), where a reference feature centric model is proposed that focuses on a particular type of spatial object. Each set of spatial instances that have neighbor relationships with an instance of the reference feature is considered a transaction, and traditional rule mining methods (e.g. Apriori

Spatial co-location pattern mining in network spaces

Our proposed solution is based on the framework for mining co-location patterns presented in Shekhar and Huang (2001). In this section, we review this model and adopt its prevalence measure, the participation index, in our method. By comparing the definition of neighborhood in planar space with that in network space, we point out why such a framework may overestimate the prevalence of patterns in a network. In addition, we propose an efficient algorithm to find network-constrained
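For reference, the participation index of Shekhar and Huang (2001) is the minimum, over the features in a pattern, of the fraction of that feature's objects that take part in at least one instance of the pattern. The following is a minimal sketch of that definition; the function name `participation_index`, the data layout, and the toy values are hypothetical and are not taken from the paper.

```python
def participation_index(pattern, row_instances, feature_counts):
    """Participation index of a co-location pattern.

    pattern:        tuple of feature names, e.g. ("atm", "cafe")
    row_instances:  list of tuples of object ids, aligned with pattern
    feature_counts: total number of objects of each feature in the data set
    """
    ratios = []
    for i, feature in enumerate(pattern):
        # Distinct objects of this feature that appear in any instance.
        participating = {row[i] for row in row_instances}
        ratios.append(len(participating) / feature_counts[feature])
    return min(ratios)

# Hypothetical toy data: 2 ATMs and 4 cafes in total, 3 pattern instances.
pattern = ("atm", "cafe")
rows = [("atm1", "cafe1"), ("atm1", "cafe2"), ("atm2", "cafe2")]
counts = {"atm": 2, "cafe": 4}
pi = participation_index(pattern, rows, counts)
print(pi)
```

Because the set of instances depends on which object pairs count as neighbors, a neighborhood that admits pairs that are close in the plane but far apart along the roads adds instances and can only raise the participation ratios.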

Experimental evaluation

We first compared our approach to finding network co-location patterns with a common co-location mining approach based on Euclidean distance (see Yoo & Shekhar, 2006). Then we conducted extensive experiments to show the efficiency of ENS by comparing its performance with that of NNS. Finally, we compared the computational performance of the subtasks in PM-Coloc and NM-Coloc. Real and synthetic data sets were used for our experiments, and all the experiments were performed on a DELL

Conclusions and future works

This paper formalized the problem of spatial co-location pattern mining over a study region characterized by the presence of a street network that constrains and limits the movement of people and means of transport. The urban study environment therefore differs from the homogeneous plane usually assumed by many spatial data mining techniques and tools.

The proposed method is based on a network model using the linear referencing, and then

Acknowledgments

We are grateful to the editor and the anonymous referees for their valuable comments and suggestions.

References (36)

  • Agrawal, R. et al. Fast algorithms for mining association rules.

  • Ai, T. et al. Generation of constrained network Voronoi diagram using linear tessellation and expansion method. Computers, Environment and Urban Systems (2015).

  • Andrzejewski, W. et al. GPU-accelerated collocation pattern discovery. Advances in Databases and Information Systems (2013).

  • Appice, A. et al. Discovery of spatial association rules in geo-referenced census data: a relational mining approach.

  • Beatty, C. Location-based services: navigation for the masses, at last! Journal of Navigation (2002).

  • Bembenik, R. et al. FARICS: a method of mining spatial association rules and collocations using clustering and Delaunay diagrams. Journal of Intelligent Information Systems (2009).

  • Boinski, P. et al. Algorithms for spatial collocation pattern mining in a limited memory environment: a summary of results. Journal of Intelligent Information Systems (2014).

  • Clementini, E. et al. Mining multiple-level spatial association rules for objects with a broad boundary. Data & Knowledge Engineering (2000).

  • Dijkstra, E.W. A note on two problems in connexion with graphs. Numerische Mathematik (1959).

  • Flouvat, F. et al. Domain-driven co-location mining. Geoinformatica (2015).

  • Huang, Y. et al. Discovering colocation patterns from spatial data sets: a general approach. IEEE Transactions on Knowledge & Data Engineering (2004).

  • Huang, Y. et al. Mining co-location patterns with rare events from spatial data sets. Geoinformatica (2006).

  • Kolahdouzan, M.R. et al. Alternative solutions for continuous k nearest neighbor queries in spatial network databases. Geoinformatica (2005).

  • Koperski, K. et al. Discovery of spatial association rules in geographic information databases.

  • Lee, I. et al. Geographic knowledge discovery from web map segmentation through generalized Voronoi diagrams. Expert Systems with Applications (2012).

  • Mennis, J. et al. Mining association rules in spatio-temporal data: an analysis of urban socioeconomic and land cover change. Transactions in GIS (2005).

  • Miller, H.J. Market area delimitation within networks using geographic information systems. Geographical Systems (1994).

  • Miller, H.J. et al. Geographic data mining and knowledge discovery (2009).