PLDP-TD: Personalized-location differentially private data analysis on trajectory databases
Introduction
A moving objects trajectory database (or just trajectory database in short) is a multiset of trajectories, each of which represents the movement history (geographical positions) of a moving object during a period of time. In recent years, the popularity of pervasive and mobile computing applications has led to a significant increase in the number of trajectory databases [1]. However, these databases often contain personal information about moving objects, and disclosing such information to the public could pose serious privacy concerns [2], [3].
Differential privacy (DP) [4], [5], the de facto standard for private data analysis, aims to prevent the disclosure of sensitive personal information when releasing statistical data. To this end, it ensures that the output of a data analysis mechanism remains approximately the same even if any single data record in the input database is arbitrarily added or removed. Since its introduction, DP has been widely applied to various databases, including spatial databases [6], [7], [8], trajectory databases [9], [10], [11], [12], and sequential databases [13], [14], [15].
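As a concrete illustration of this guarantee, the Laplace mechanism, the canonical DP building block, answers a count query by adding noise scaled to the query's sensitivity. The following is a minimal sketch; `private_count` and its parameters are illustrative names, not the paper's code.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # Inverse-CDF sampling from the Laplace(0, scale) distribution.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(records, predicate, epsilon: float) -> float:
    # A count query has sensitivity 1: adding or removing one record
    # changes the true count by at most 1, so Laplace noise with
    # scale 1/epsilon suffices for epsilon-DP.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
```

Smaller values of `epsilon` yield stronger privacy and noisier answers, which is exactly the trade-off the rest of the paper exploits per location.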
For a trajectory database, DP assures moving objects that no privacy breach will result from their participation in the database. In the last few years, several mechanisms have been proposed to answer statistical queries about trajectory databases while preserving DP [9], [10], [11], [12]. However, all of these mechanisms implicitly assume that every location on the underlying geographical map has the same privacy protection requirement and, based on this assumption, provide a uniform privacy guarantee for all locations. In practice, some locations are less sensitive than others, so a uniform privacy guarantee may needlessly degrade the utility of query answers for data analysis. To address this problem, we build on a recently introduced notion of DP, known as personalized differential privacy (PDP) [7], [16], which, in contrast to the traditional notion, takes into account that different entities may have different privacy protection requirements.

In this paper, we continue this line of research by introducing a specialization of PDP for trajectory databases, called personalized-location differential privacy (PLDP). Based on this new concept, we present PLDP-TD, a personalized-location differentially private algorithm that provides non-uniform privacy guarantees for trajectory databases: it allows a data owner to assign a privacy descriptor to each location of the underlying geographical map, proportional to the privacy protection requirement of that location. PLDP-TD answers queries using a tree structure called a personalized noisy trajectory tree. Each node of the tree represents a particular subtrajectory, and we assign a privacy level to each subtrajectory, and thus to its corresponding node, based on the privacy descriptor of the most sensitive location it contains. We then allocate to each node a personal privacy budget inversely proportional to the node's privacy level.
Noisy statistics about a subtrajectory are then obtained using the personal privacy budget allocated to its corresponding node. Finally, we enforce consistency constraints on the personalized noisy trajectory tree such that the total distance of the resulting consistent noisy statistics from the original noisy statistics is minimized.
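The level assignment, budget allocation, and consistency step described above can be sketched as follows. This is a minimal sketch under stated assumptions: the descriptor names other than "Critical", the inverse-proportional budget mapping, and the equal-weight least-squares projection are illustrative choices, not necessarily the paper's exact method.

```python
# Assumed descriptor scale; only "Critical" is named in the paper's snippet.
LEVELS = {"Low": 1, "Medium": 2, "High": 3, "Critical": 4}

def node_level(subtrajectory, descriptors):
    # A node's privacy level is that of the most sensitive
    # location in its associated subtrajectory.
    return max(LEVELS[descriptors[loc]] for loc in subtrajectory)

def personal_budget(total_budget, level):
    # Inversely proportional allocation: more sensitive nodes
    # receive a smaller budget, hence more noise.
    return total_budget / level

def make_consistent(parent_count, child_counts):
    # Least-squares projection onto the constraint
    # parent = sum(children), minimizing the total squared
    # distance from the original noisy counts.
    k = len(child_counts)
    lam = (sum(child_counts) - parent_count) / (k + 1)
    return parent_count + lam, [c - lam for c in child_counts]
```

For example, a noisy parent count of 10 with noisy child counts 3 and 4 is projected to a consistent parent of 9 and children of 4 and 5, which sum exactly to the parent.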
Note that, in this paper, we concentrate on moving objects trajectory databases, but with slight modifications our work can be applied to other types of trajectory databases as well. The main contributions of this paper are as follows:
- For the first time, we introduce the concept of personalized-location differential privacy for trajectory databases, in which a differentially private algorithm guarantees the required level of privacy protection for each location of a geographical map. In addition, we devise a personalized-location differentially private algorithm for trajectory databases that provides non-uniform privacy guarantees and, thus, achieves better utility than traditional differentially private algorithms.
- We present a tree structure, called a personalized noisy trajectory tree, for storing the subtrajectories of a trajectory database along with their privacy levels, in which each node is uniquely associated with a subtrajectory. We assign to each node of the tree a privacy level equal to that of its associated subtrajectory. We also present two strategies, one adaptive and one non-adaptive, for allocating to each node a personal privacy budget inversely proportional to the node's privacy level. These strategies differentiate the allocated budgets according to the privacy levels of the receiving nodes, so that nodes with higher privacy levels receive smaller budgets and nodes with lower privacy levels receive larger ones.
- We enforce consistency constraints on the personalized noisy trajectory tree by considering the personal privacy budgets of the nodes.
- We show, through extensive experiments on synthetic and real datasets, that our personalized-location differentially private algorithm increases the utility of query answers in comparison to non-personalized algorithms, while guaranteeing the required level of privacy protection for each location.
The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 provides some preliminaries and basic definitions. Section 4 describes the main concepts of trajectory databases. Sections 5 and 6 present our privacy model and our personalized-location differentially private algorithm for trajectory databases, respectively. The privacy guarantee of our algorithm is analyzed in Section 7. The experimental results are reported in Section 8 and, finally, a summary and discussion are given in Section 9.
Related work
Since its introduction in 2006 [4], differential privacy (DP) has been successfully applied to a wide range of data analysis tasks and applications [17], [18], [19], [20], [21], [22]. In this section, we review the state of the art of mechanisms and notions in the literature that are closely related to ours. However, none of the existing work addresses the issue of personalized-location differential privacy (PLDP) for trajectory databases. Therefore, we classify the related work into two
Preliminaries
In this section, we give some definitions and preliminaries which are used throughout the paper.
Trajectory database
Let L be the set of possible locations in a geographical map, where |L| is the cardinality of L. Without loss of generality, we consider locations as discrete regions in a geographical map. Each trajectory T is a sequence of locations drawn from L at regular time intervals, representing the movement history of a moving object. The length of T, denoted by |T|, is defined as the number of locations it contains. For example, T = (l1, l3, l2) is a trajectory of length 3. A location may
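The definitions above can be sketched directly in code. This is a minimal sketch assuming discrete location identifiers; the names are illustrative.

```python
from collections import Counter

# Locations are discrete regions of the map, identified here by string IDs.
LOCATIONS = {"l1", "l2", "l3", "l4"}

# A trajectory is a sequence of locations sampled at regular time intervals;
# its length is the number of locations it contains.
trajectory = ("l1", "l3", "l3", "l2")
assert all(loc in LOCATIONS for loc in trajectory)
assert len(trajectory) == 4

# A trajectory database is a multiset of trajectories: the same trajectory
# may occur more than once, so multiplicities are preserved.
database = Counter([trajectory, ("l2", "l4"), trajectory])
```

Using a multiset rather than a set matters for DP, since the counting queries the paper answers depend on how many moving objects share a given (sub)trajectory.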
Privacy model
In this section, we introduce a special concept of PDP, called personalized-location differential privacy (PLDP), for trajectory databases that allows data owners to take the privacy protection requirements of different locations into consideration. In the following, we give some definitions which are used in our privacy model.
In practice, it may be difficult to directly quantify the privacy protection requirements of locations; therefore, we assume that there is a limited set of user-friendly
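The descriptor idea can be sketched as follows. Only the "Critical" descriptor is named in this paper's snippet; the other descriptor names and the mapping from descriptors to per-location privacy budgets below are illustrative assumptions.

```python
# Assumed user-friendly descriptors, ordered from least to most sensitive.
DESCRIPTORS = ["Low", "Medium", "High", "Critical"]

def location_epsilons(descriptor_of, base_epsilon):
    # Map each location's descriptor to a privacy budget: a more
    # sensitive descriptor yields a smaller epsilon (stronger protection).
    rank = {name: i + 1 for i, name in enumerate(DESCRIPTORS)}
    return {loc: base_epsilon / rank[d] for loc, d in descriptor_of.items()}
```

A data owner thus only labels each map region with a descriptor, and the numeric per-location requirements are derived mechanically.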
Personalized-location differential privacy for trajectory databases
In this section, we introduce PLDP-TD, a personalized-location differentially private algorithm for trajectory databases. The main idea of PLDP-TD is to design a personalized differentially private algorithm that takes the privacy protection requirements of different locations into account, in order to achieve better utility than could be achieved by traditional differentially private algorithms. In general, PLDP-TD consists of two main steps: personalized noisy trajectory tree construction and
Privacy analysis
In this section, we analyze the privacy guarantee of PLDP-TD. To do so, we first prove that PLDP-TD satisfies PLDP. Then, we show that, in particular, if all locations of a trajectory database have the highest privacy descriptor, denoted by "Critical" in this paper, PLDP-TD reduces to a traditional ε-differentially private mechanism.
Theorem 4. PLDP-TD satisfies PLDP.
Proof. Let D and D′ be two neighboring trajectory databases that differ in a trajectory data
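For reference, the standard guarantee that the proof adapts can be written as follows. This is a sketch of the textbook ε-DP inequality only; under PLDP the budget varies with the privacy levels of the locations involved, which this inequality does not show.

```latex
% For any two neighboring trajectory databases D and D', any mechanism M
% satisfying epsilon-DP, and any set S of possible outputs:
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[\mathcal{M}(D') \in S]
```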
Experiments
In this section, we empirically study the effectiveness of PLDP-TD for differentially private data analysis on trajectory databases. Our main goal is to demonstrate that by taking the privacy protection requirements of different locations into account, PLDP-TD can attain better utility compared to traditional differentially private algorithms that only provide a uniform privacy guarantee. The experimental results are presented in two parts. First, we investigate the impact of personalization on
Conclusion and discussion
In this paper, we have studied the application of differential privacy to trajectory databases and devised a personalized-location differentially private algorithm, called PLDP-TD, to provide non-uniform privacy guarantees for them. PLDP-TD addresses a major limitation of traditional differentially private algorithms for trajectory databases, which is their uniform privacy guarantees. More specifically, existing differentially private algorithms assume that all locations of a trajectory
References (40)
- et al., Privacy protection in pervasive systems: State of the art and technical challenges, Pervasive Mob. Comput. (2015)
- et al., Achieving differential privacy of trajectory data publishing in participatory sensing, Inf. Sci. (2017)
- et al., Incremental release of differentially-private check-in data, Pervasive Mob. Comput. (2015)
- et al., Differentially private random decision forests using smooth sensitivity, Expert Syst. Appl. (2017)
- et al., Enhancing social network privacy with accumulated non-zero prior knowledge, Inf. Sci. (2018)
- et al., Outsourcing high-dimensional healthcare data to cloud with personalized privacy preservation, Comput. Netw. (2015)
- et al., PPTD: Preserving personalized privacy in trajectory data publishing by sensitive attribute generalization and trajectory local suppression, Knowl.-Based Syst. (2016)
- et al., Privacy-preserving trajectory data publishing by local suppression, Inf. Sci. (2013)
- et al., Privacy-preserving data publishing: A survey of recent developments, ACM Comput. Surv. (2010)
- et al., Location privacy in cognitive radio networks: A survey, IEEE Commun. Surv. Tutor. (2017)