Spatial co-location pattern mining for location-based services in road networks

doi:10.1016/j.eswa.2015.10.010

Expert Systems with Applications

Volume 46, 15 March 2016, Pages 324-335

https://doi.org/10.1016/j.eswa.2015.10.010

Highlights

• A new approach for mining spatial co-location patterns was presented.
• The neighborhood was refined using network distances rather than Euclidean ones.
• An efficient algorithm to build the neighborhood relationship graph was proposed.
• The performance of the proposed algorithm was explored.

Abstract

With the evolution of geographic information capture and the emergence of volunteered geographic information, it is becoming increasingly important to extract spatial knowledge automatically from large spatial datasets. Spatial co-location patterns represent subsets of spatial features whose objects are often located in close geographic proximity. Such patterns are among the most important concepts for geographic context awareness in location-based services (LBS). In the literature, most existing co-location mining methods are designed for events taking place in a homogeneous and isotropic space with distance expressed as Euclidean, while physical movement in LBS is usually constrained by a road network. As a result, the interestingness value of co-location patterns involving network-constrained events cannot be accurately computed. In this paper, we propose a different method for co-location mining that takes the network configuration of the geographical space into account. First, we define the network model with linear referencing and refine the neighborhood of traditional methods using network distances rather than Euclidean ones. Then, because co-location mining in networks suffers from expensive spatial-join operations, we propose an efficient way to find all neighboring object pairs for generating clique instances. Compared with previous approaches based on Euclidean distance, this approach can accurately calculate the probability of occurrence of a spatial co-location on a network. Our experimental results on real and synthetic data sets show that the proposed approach is efficient and effective in identifying co-location patterns that actually depend on a network.

Introduction

With the ubiquity of wireless Internet access and GPS-enabled mobile terminals, and with advances in spatial database management systems, a new generation of mobile services, known as location-based services (LBS), has been developed. These services are capable of delivering geographic information and geoprocessing power to mobile users according to their current location (Beatty, 2002). Understanding the geographic context is the essential question in the LBS area. Geographic context, meaning information about the surroundings or circumstances, has been an important topic in many applications. By introducing spatial data analysis and geo-sensor data collection into geographic context awareness, service providers can establish reliable, high-quality services that help users with trip planning, activity re-scheduling, and decision-making.

A spatial co-location pattern is a set of spatial features that are frequently located together in spatial proximity (Shekhar & Huang, 2001). As an important concept for spatial analysis and geographic context awareness, spatial co-location pattern mining has been widely applied to discover the spatial dependency of objects. Spatial dependency is the tendency of observed objects located close to one another in geographic space to show a higher degree of similarity or dissimilarity (Miller & Han, 2009). All computed dependency patterns concerning objects in space depend on the definition of spatial neighborhood. In the literature, closeness can be defined using different types of distance metric, such as Euclidean distance or path distance. To provide users with realistic geographic information, providers should choose the distance measure that best fits the geographic context of the study region.

So far, many different studies on spatial co-location mining have been carried out. However, all of the previous studies are based on the Euclidean (or planar) space assumption. The researchers (Huang et al., 2004, Shekhar and Huang, 2001, Yoo and Shekhar, 2004) assume that the patterns of interest occur in an infinitely homogeneous and isotropic space, and that spatial proximity between two objects is measured by the straight-line Euclidean distance. Miller (1994) pointed out that this assumption can be ill-suited, as many human activities are constrained to the network portion of the planar space. Network distance is indeed a more meaningful and reliable distance measure for analyses related to social and economic processes (Okabe et al., 2009, Okabe et al., 2006, Yamada and Thill, 2007). For simplicity, we refer to spatial co-location pattern mining based on network distance as NM-Coloc, and to spatial co-location pattern mining based on Euclidean distance as PM-Coloc.
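The gap between the two distance measures can be illustrated with a minimal sketch. The toy network, the node names, and the neighborhood threshold of 1.5 below are hypothetical and are not taken from the paper. Objects are assumed to sit exactly on network nodes, and the network distance is computed with Dijkstra's algorithm (Dijkstra, 1959).

```python
import heapq
import math

def dijkstra(graph, source):
    """Shortest-path distances from source over a weighted adjacency dict."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy network: four junctions on the corners of a unit square.
coords = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (1.0, 1.0), "D": (1.0, 0.0)}
# Roads run A-B, B-C and C-D; there is no direct road between A and D.
roads = [("A", "B"), ("B", "C"), ("C", "D")]

def euclidean(p, q):
    return math.dist(coords[p], coords[q])

graph = {node: [] for node in coords}
for u, v in roads:
    length = euclidean(u, v)  # each road segment is a straight line here
    graph[u].append((v, length))
    graph[v].append((u, length))

euclidean_ad = euclidean("A", "D")        # straight-line distance
network_ad = dijkstra(graph, "A")["D"]    # distance along the roads

threshold = 1.5  # hypothetical neighborhood distance
planar_neighbors = euclidean_ad <= threshold
network_neighbors = network_ad <= threshold
```

In this example A and D are 1.0 apart in the plane but 3.0 apart along the roads, so they count as neighbors under the planar assumption and not under the network one.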

Road transport networks are emerging nowadays at an urban or extra-urban level. There are many cases in which co-location patterns need to be captured with network distance. For example, in mobile service applications, clients usually request services with regard to the co-location patterns of urban facilities. These patterns need to be measured using the distance along the road network instead of the Euclidean distance, because vehicles are restricted to moving on pre-defined roads under certain transformation conditions. In fact, patterns of facilities that are computed with the limits on people's movements taken into account can help service providers offer more attractive location-sensitive advertisements, recommendations, etc. As another example, these co-location patterns would also benefit location selection for businesses and advertisements. By extracting the network-constrained patterns of facilities of different types, decision makers in a company can plan a better location for a new store, considering how the profitability of similar retail stores depends on the surrounding objects.

Motivated by the great potential demand for network-constrained co-locations and the lack of research on them, we develop an efficient spatial data mining technique to help domain experts discover useful knowledge from a given network data set. As opposed to the Euclidean distance measure used in Huang et al. (2004) and Shekhar and Huang (2001), we try to find dependencies among features using the network distance. This study also takes the particular characteristics of network space into consideration, and we reach two major conclusions: (1) in many LBS applications, the shortest-path distance is more suitable for measuring the spatial connection between two locations in the urban environment; (2) searching for spatial neighbors in networks is generally a time-consuming task, and an efficient method is needed for building the neighborhood relationship graph.
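One common way to avoid a full shortest-path computation for every object pair is to expand the network outward from each object and stop at the neighborhood distance. The sketch below illustrates that general idea only; it is not the algorithm proposed in this paper. The function names, the toy network, and the assumption that every object sits exactly on a node (rather than at a linearly referenced position along an edge) are all hypothetical.

```python
import heapq

def bounded_dijkstra(graph, source, max_dist):
    """Network distances from source to every node within max_dist."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph.get(u, ()):
            nd = d + w
            # Prune: never expand beyond the neighborhood distance.
            if nd <= max_dist and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def neighbor_pairs(graph, objects, max_dist):
    """Pairs of objects with different features within max_dist of each other.

    objects maps object id -> (feature, node); this layout is an assumption.
    """
    by_node = {}
    for oid, (_, node) in objects.items():
        by_node.setdefault(node, []).append(oid)
    pairs = set()
    for oid, (feature, node) in objects.items():
        for reached in bounded_dijkstra(graph, node, max_dist):
            for other in by_node.get(reached, ()):
                if other != oid and objects[other][0] != feature:
                    pairs.add(tuple(sorted((oid, other))))
    return pairs

# Hypothetical undirected path network n1 - n2 - n3 - n4 (lengths 1, 1, 5).
graph = {
    "n1": [("n2", 1.0)],
    "n2": [("n1", 1.0), ("n3", 1.0)],
    "n3": [("n2", 1.0), ("n4", 5.0)],
    "n4": [("n3", 5.0)],
}
objects = {
    "a1": ("A", "n1"),
    "b1": ("B", "n2"),
    "b2": ("B", "n3"),
    "c1": ("C", "n4"),
}
pairs = neighbor_pairs(graph, objects, max_dist=2.0)
```

With a neighborhood distance of 2.0, only the pairs (a1, b1) and (a1, b2) are reported; c1 lies 5.0 or more from every other object and is never reached by any bounded expansion.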

We conduct a series of comparisons and analyses using our experimental knowledge extraction system. The results show that our method is effective, efficient and scalable for mining large spatial data sets without repeated computation of the shortest path. Note that the classical Apriori-like algorithm is also used in our research to generate high-order rules, but it is not the focus of this work.

The remainder of the paper is organized as follows. Section 2 highlights related works. Section 3 describes the data model and a new method of NM-Coloc. Experimental results are analyzed in Section 4. Finally, Section 5 concludes this paper.

Section snippets

Related work

Co-location pattern mining is an important offshoot of spatial association rule mining. The notion of a spatial association rule was first defined in Koperski and Han (1995), where a reference feature centric model is proposed that focuses on a particular type of spatial object. Each set of spatial instances that have neighbor relationships with an instance of the reference feature is considered a transaction, and traditional rules mining methods (e.g. Apriori

Spatial co-location pattern mining in network spaces

Our proposed solution is based on the framework for mining co-location patterns presented in Shekhar and Huang (2001). In this section, we review this model and adopt its prevalence measure, the participation index, in our method. By comparing the definition of neighborhood in planar space with that in network space, we will point out why such a framework may overestimate the prevalence of patterns in a network. Besides, we will propose an efficient algorithm to find network-constrained
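As a brief illustration of the participation index, the sketch below computes it for a two-feature pattern under two hypothetical neighbor sets. It assumes the usual definition: the participation ratio of a feature is the fraction of its objects that appear in at least one instance of the pattern, and the participation index is the minimum ratio over the pattern's features. The object identifiers and counts are invented for the example.

```python
def participation_index(row_instances, feature_counts):
    """Participation index of one pattern.

    row_instances: list of dicts mapping feature -> object id, one dict per
        instance (clique) of the pattern.
    feature_counts: total number of objects of each feature in the pattern.
    Returns (index, per-feature participation ratios).
    """
    participants = {}
    for instance in row_instances:
        for feature, oid in instance.items():
            participants.setdefault(feature, set()).add(oid)
    ratios = {
        feature: len(participants.get(feature, ())) / total
        for feature, total in feature_counts.items()
    }
    return min(ratios.values()), ratios

# Hypothetical data set: 4 objects of feature A and 2 objects of feature B.
counts = {"A": 4, "B": 2}

# Instances of pattern {A, B} found with a Euclidean neighborhood.
euclidean_instances = [
    {"A": "a1", "B": "b1"},
    {"A": "a2", "B": "b1"},
    {"A": "a3", "B": "b2"},
]
# Suppose a3 and b2 are close in the plane but far apart along the roads,
# so that instance disappears under the network neighborhood.
network_instances = euclidean_instances[:2]

pi_euclidean, _ = participation_index(euclidean_instances, counts)
pi_network, _ = participation_index(network_instances, counts)
```

Here the index drops from 0.75 under the Euclidean neighborhood to 0.5 under the network neighborhood. Whenever every road segment is at least as long as the straight line between its endpoints, the network distance is never shorter than the Euclidean distance, so the planar neighbor set can only be larger and the planar index can only be higher.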

Experimental evaluation

We first compared our approach to finding network co-location patterns with a common co-location mining approach based on Euclidean distance (see Yoo & Shekhar, 2006). Then we conducted extensive experiments to show the efficiency of ENS by comparing its results with the performance of NNS. Finally, we compared the computational performance of the subtasks in PM-Coloc and NM-Coloc. Real and synthetic data sets were used for our experiments and all the experiments were performed on a DELL

Conclusions and future works

This paper formalized the problem of spatial co-location pattern mining over a study region characterized by the presence of a street network that constrains and limits the movements of people and means of transport. The urban study environment therefore differs from the homogeneous plane usually assumed by many spatial data mining techniques and tools.

The proposed method is based on a network model using the linear referencing, and then

Acknowledgments

We are grateful to the editor and the anonymous referees for their valuable comments and suggestions.

References (36)

Agrawal, R., et al. Fast algorithms for mining association rules.
Ai, T., et al. Generation of constrained network Voronoi diagram using linear tessellation and expansion method. Computers, Environment and Urban Systems (2015).
Andrzejewski, W., et al. GPU-accelerated collocation pattern discovery. Advances in Databases and Information Systems (2013).
Appice, A., et al. Discovery of spatial association rules in geo-referenced census data: a relational mining approach.
Beatty, C. Location-based services: navigation for the masses, at last! Journal of Navigation (2002).
Bembenik, R., et al. FARICS: a method of mining spatial association rules and collocations using clustering and Delaunay diagrams. Journal of Intelligent Information Systems (2009).
Boinski, P., et al. Algorithms for spatial collocation pattern mining in a limited memory environment: a summary of results. Journal of Intelligent Information Systems (2014).
Clementini, E., et al. Mining multiple-level spatial association rules for objects with a broad boundary. Data & Knowledge Engineering (2000).
Dijkstra, E.W. A note on two problems in connexion with graphs. Numerische Mathematik (1959).
Flouvat, F., et al. Domain-driven co-location mining. Geoinformatica (2015).
Huang, Y., et al. Discovering colocation patterns from spatial data sets: a general approach. IEEE Transactions on Knowledge & Data Engineering (2004).
Huang, Y., et al. Mining co-location patterns with rare events from spatial data sets. Geoinformatica (2006).
Kolahdouzan, M.R., et al. Alternative solutions for continuous k nearest neighbor queries in spatial network databases. Geoinformatica (2005).
Koperski, K., et al. Discovery of spatial association rules in geographic information databases.
Lee, I., et al. Geographic knowledge discovery from web map segmentation through generalized Voronoi diagrams. Expert Systems with Applications (2012).
Mennis, J., et al. Mining association rules in spatio-temporal data: an analysis of urban socioeconomic and land cover change. Transactions in GIS (2005).
Miller, H.J. Market area delimitation within networks using geographic information systems. Geographical Systems (1994).
Miller, H.J., et al. Geographic data mining and knowledge discovery (2009).
