Elsevier

Pervasive and Mobile Computing

Volume 49, September 2018, Pages 1-22

PLDP-TD: Personalized-location differentially private data analysis on trajectory databases

https://doi.org/10.1016/j.pmcj.2018.06.005

Highlights

  • The concept of personalized-location differential privacy for trajectory databases is introduced.

  • A personalized-location differentially private algorithm for data analysis on trajectory databases is devised.

  • Two different strategies are presented for personal privacy budget allocation.

  • Non-uniform privacy is guaranteed for locations with different privacy protection requirements.

  • Some consistency constraints are enforced to make a personalized noisy trajectory tree consistent in an optimal way.

Abstract

The ubiquity of location-aware mobile devices and information systems has made it possible to collect large amounts of movement data, such as trajectories of moving objects. However, such data must be carefully managed to ensure that the privacy of each moving object or sensitive location is guaranteed. In this paper, we investigate how different locations of a geographical map can meet their individual privacy protection requirements using differential privacy (DP). More specifically, we aim to guarantee that the inclusion of any trajectory data record in a trajectory database does not substantially increase the risk to its privacy, while ensuring the required level of privacy protection for each location. To achieve this, we introduce the concept of personalized-location differential privacy (PLDP) for trajectory databases, and devise a differentially private algorithm, called PLDP-TD, that implements this new concept. PLDP-TD makes use of a so-called personalized noisy trajectory tree, which is constructed from the underlying trajectory database to answer statistical queries in a differentially private way. We propose novel strategies for privacy level assignment and personal privacy budget allocation to nodes of the personalized noisy trajectory tree. In addition, we enforce consistency constraints on the personalized noisy trajectory tree so that the noisy count of each non-leaf node equals the sum of its children’s noisy counts, while minimizing the total distance of the consistent noisy counts from their original noisy counts. Extensive experiments demonstrate that PLDP-TD substantially decreases the average relative error of query answers (by up to 52 percent) in comparison to traditional differentially private algorithms.

Introduction

A moving objects trajectory database (or simply a trajectory database) is a multiset of trajectories, each of which represents the movement history (geographical positions) of a moving object during a period of time. In recent years, the popularity of pervasive and mobile computing applications has led to a significant increase in the number of trajectory databases [1]. However, these databases often contain personal information about moving objects, and disclosing such information to the public could raise serious privacy concerns [2], [3].

Differential privacy (DP) [4], [5], a de facto standard for private data analysis, aims to prevent the disclosure of sensitive personal information when statistical data are released. To this end, it ensures that the output of a data analysis mechanism remains approximately the same even if any single data record is added to or removed from the input database. Since its introduction, DP has been widely applied to various databases, including spatial databases [6], [7], [8], trajectory databases [9], [10], [11], [12], and sequential databases [13], [14], [15].
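As a minimal illustration of this guarantee, the following Python sketch answers a counting query with the Laplace mechanism, a standard way to achieve ϵ-DP for queries whose answer changes by at most 1 when one record is added or removed. It is not taken from the paper: the function names `laplace_noise` and `noisy_count` are hypothetical, and the sensitivity of 1 is an assumption.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise of scale 1/epsilon.

    Assumption: adding or removing one record changes the count
    by at most 1 (sensitivity 1).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(7)
    # A smaller epsilon means stronger privacy and, on average, more noise.
    for eps in (0.1, 1.0, 10.0):
        print(eps, noisy_count(100, eps, rng))
```

A smaller ϵ yields a larger noise scale, which is why assigning the same small ϵ to every location can cost utility where weaker protection would suffice.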

For a trajectory database, DP assures moving objects that any privacy breach will not be a result of their participation in the database. In the last few years, several mechanisms have been proposed to answer statistical queries about trajectory databases while preserving DP [9], [10], [11], [12]. However, all of these mechanisms implicitly assume that all locations of the underlying geographical map have the same privacy protection requirement and, based on this assumption, provide a uniform privacy guarantee for every location. Yet some locations may be less sensitive than others, so providing a uniform privacy guarantee for all locations may needlessly degrade the utility of query answers. To address this problem, we consider a recently introduced notion of DP, known as personalized differential privacy (PDP) [7], [16]. In contrast to the traditional notion, PDP takes into account that different entities may have different privacy protection requirements. In this paper, we continue this line of research by introducing a special concept of PDP for trajectory databases, called personalized-location differential privacy (PLDP). Based on this new concept, we present PLDP-TD, a personalized-location differentially private algorithm that provides non-uniform privacy guarantees for trajectory databases. It allows a data owner to assign a privacy descriptor to each location of the underlying geographical map in proportion to the privacy protection requirement of that location. PLDP-TD makes use of a tree structure, called a personalized noisy trajectory tree, to answer queries. Each node of the tree represents a particular subtrajectory. We assign a privacy level to each subtrajectory, and thus to its corresponding node, based on the privacy descriptor of the most sensitive location it contains. Afterward, we allocate to each node a personal privacy budget that is inversely proportional to the node’s privacy level. Noisy statistics about a subtrajectory are then obtained using the personal privacy budget allocated to its corresponding node. Finally, we enforce consistency constraints on the personalized noisy trajectory tree such that the total distance of the consistent noisy statistics from their original noisy statistics is minimized.
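The level assignment and budget allocation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's allocation strategies: the descriptor names other than “Critical”, the numeric levels, and the rule `base_epsilon * MAX_LEVEL / level` are hypothetical, and the sketch ignores how a budget is divided across the levels of the tree.

```python
# Hypothetical descriptor-to-level map. Only "Critical" is named in the
# paper; "Low", "Medium", and the numeric values are assumptions.
PRIVACY_LEVEL = {"Low": 1, "Medium": 2, "Critical": 3}
MAX_LEVEL = max(PRIVACY_LEVEL.values())


def subtrajectory_level(subtrajectory, descriptor_of):
    """Return the level of the most sensitive location in the subtrajectory."""
    return max(PRIVACY_LEVEL[descriptor_of[loc]] for loc in subtrajectory)


def personal_budget(base_epsilon, level):
    """Illustrative rule: the budget is inversely proportional to the level.

    The most sensitive level receives exactly base_epsilon, and less
    sensitive levels receive proportionally larger budgets.
    """
    return base_epsilon * MAX_LEVEL / level


if __name__ == "__main__":
    descriptor_of = {"L1": "Low", "L2": "Critical", "L3": "Medium"}
    for sub in (("L1",), ("L1", "L3"), ("L1", "L3", "L2")):
        level = subtrajectory_level(sub, descriptor_of)
        print(sub, level, personal_budget(1.0, level))
```

Under this rule, a subtrajectory that passes through a “Critical” location receives the base budget, so a map in which every location is “Critical” degenerates to a uniform budget.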

Note that, in this paper, we concentrate on moving objects trajectory databases, but, with slight modification, our work can also be applied to other types of trajectory databases. In the following, we list the main contributions of this paper:

  • For the first time, we introduce the concept of personalized-location differential privacy for trajectory databases, in which a differentially private algorithm tries to guarantee the required level of privacy protection for locations of a geographical map. In addition, we devise a personalized-location differentially private algorithm for trajectory databases that provides non-uniform privacy guarantees and, thus, achieves better utility than traditional differentially private algorithms.

  • We present a tree structure, called a personalized noisy trajectory tree, for storing the subtrajectories of a trajectory database along with their privacy levels, in which each node is uniquely associated with a subtrajectory. We assign to each node a privacy level equal to that of its associated subtrajectory. We also present two strategies, one adaptive and one non-adaptive, for allocating to each node a personal privacy budget that is inversely proportional to the node’s privacy level. These strategies differentiate the allocated privacy budgets according to the privacy levels of the nodes that receive them, so that nodes with higher privacy levels receive smaller budgets and nodes with lower privacy levels receive larger ones.

  • We enforce some consistency constraints on the personalized noisy trajectory tree by considering the personal privacy budgets of nodes.

  • We show, through extensive experiments on synthetic and real datasets, how our personalized-location differentially private algorithm increases the utility of query answers in comparison to non-personalized algorithms, while guaranteeing the required level of privacy protection for each location.
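The consistency constraint named in the contributions can be illustrated on a single parent node and its children. The Python sketch below adjusts the noisy counts so that the parent equals the sum of its children while minimizing a variance-weighted squared distance from the original noisy counts. The choice of squared distance, the weighting by Laplace noise variance, and the names `laplace_variance` and `make_family_consistent` are assumptions. The paper's procedure for a whole tree is not reproduced here.

```python
def laplace_variance(epsilon):
    """Variance of Laplace noise with scale 1/epsilon."""
    return 2.0 / epsilon ** 2


def make_family_consistent(parent, children, parent_var, child_vars):
    """Adjust one parent and its children so that parent == sum(children).

    Solves the weighted least-squares problem in closed form: the gap
    between the parent and the sum of its children is shared among all
    counts in proportion to their noise variances, so noisier counts
    move more.
    """
    gap = parent - sum(children)
    total_var = parent_var + sum(child_vars)
    new_parent = parent - parent_var * gap / total_var
    new_children = [
        count + var * gap / total_var
        for count, var in zip(children, child_vars)
    ]
    return new_parent, new_children


if __name__ == "__main__":
    # Parent released with budget 1.0, both children with budget 2.0.
    parent_var = laplace_variance(1.0)
    child_vars = [laplace_variance(2.0), laplace_variance(2.0)]
    print(make_family_consistent(10.0, [4.0, 3.0], parent_var, child_vars))
```

Because the parent in the usage example was released with a smaller budget than its children, its count absorbs most of the correction.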

The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 provides some preliminaries and basic definitions. Section 4 describes the main concepts of trajectory databases. Sections 5 and 6 present our privacy model and our personalized-location differentially private algorithm for trajectory databases, respectively. The privacy guarantee of our algorithm is analyzed in Section 7. The experimental results are reported in Section 8 and, finally, a summary and discussion are given in Section 9.

Section snippets

Related work

Since its introduction in 2006 [4], differential privacy (DP) has been successfully applied to a wide range of data analysis tasks and applications [17], [18], [19], [20], [21], [22]. In this section, we review the state of the art of mechanisms and notions in the literature that are closely related to ours. However, none of the existing work addresses the issue of personalized-location differential privacy (PLDP) for trajectory databases. Therefore, we classify the related work into two

Preliminaries

In this section, we give some definitions and preliminaries which are used throughout the paper.

Trajectory database

Let $L = \{L_1, L_2, \ldots, L_{|L|}\}$ be the set of possible locations in a geographical map, where $|L|$ is the cardinality of $L$. Without loss of generality, we consider locations as discrete regions in a geographical map. Each trajectory $T$ is a sequence of locations drawn from $L$ at regular time intervals, representing the movement history of a moving object. The length of $T$, denoted by $|T|$, is defined as the number of locations it contains. For example, $T = L_1, L_3, L_2$ is a trajectory of length 3. A location may
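As a small illustration of these definitions, the Python sketch below represents a trajectory database as a list of location tuples (a multiset, so duplicates are allowed) and counts how many trajectories begin with each subtrajectory. Treating tree nodes as prefixes is an assumption made for this sketch, and the function name `prefix_counts` is hypothetical.

```python
from collections import Counter


def prefix_counts(database):
    """Count, for every prefix, how many trajectories start with it.

    Each key is a tuple of locations and would correspond to one node of
    a prefix tree built over the database.
    """
    counts = Counter()
    for trajectory in database:
        for end in range(1, len(trajectory) + 1):
            counts[tuple(trajectory[:end])] += 1
    return counts


if __name__ == "__main__":
    # A multiset of trajectories: the first and last entries are identical.
    database = [
        ("L1", "L3", "L2"),
        ("L1", "L3"),
        ("L2",),
        ("L1", "L3", "L2"),
    ]
    for prefix, count in sorted(prefix_counts(database).items()):
        print(prefix, count)
```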

Privacy model

In this section, we introduce a special concept of PDP, called personalized-location differential privacy (PLDP), for trajectory databases that allows data owners to take the privacy protection requirements of different locations into consideration. In the following, we give some definitions which are used in our privacy model.

In practice, it may be difficult to directly quantify the privacy protection requirements of locations; therefore, we assume that there is a limited set of user-friendly

Personalized-location differential privacy for trajectory databases

In this section, we introduce PLDP-TD, a personalized-location differentially private algorithm for trajectory databases. The main idea of PLDP-TD is to design a personalized differentially private algorithm that takes the privacy protection requirements of different locations into account, in order to achieve better utility than could be achieved by traditional differentially private algorithms. In general, PLDP-TD consists of two main steps: personalized noisy trajectory tree construction and

Privacy analysis

In this section, we analyze the privacy guarantee of PLDP-TD. To do so, we first prove that PLDP-TD satisfies ϵ-PLDP. Then, we show that, in particular, if all locations of a trajectory database have the highest privacy descriptor, denoted by “Critical” in this paper, PLDP-TD turns into a traditional ϵ-differentially private mechanism.

Theorem 4 (ϵ-PLDP). PLDP-TD satisfies ϵ-PLDP.

Proof

Let T1 and T2 be two neighboring trajectory databases that differ in a trajectory data

Experiments

In this section, we empirically study the effectiveness of PLDP-TD for differentially private data analysis on trajectory databases. Our main goal is to demonstrate that by taking the privacy protection requirements of different locations into account, PLDP-TD can attain better utility compared to traditional differentially private algorithms that only provide a uniform privacy guarantee. The experimental results are presented in two parts. First, we investigate the impact of personalization on
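The utility measure named in the abstract, the average relative error of query answers, can be sketched as follows. The sanity bound that keeps queries with very small true answers from dominating the average is a common convention in the differential privacy literature, but it is an assumption here, as are the function names and the default bound of 1.0.

```python
def relative_error(noisy_answer, true_answer, sanity_bound=1.0):
    """Relative error of one query answer.

    The denominator is floored at sanity_bound (an assumed parameter)
    so that a true answer of zero does not cause a division by zero.
    """
    return abs(noisy_answer - true_answer) / max(true_answer, sanity_bound)


def average_relative_error(noisy_answers, true_answers, sanity_bound=1.0):
    """Mean relative error over a non-empty workload of queries."""
    errors = [
        relative_error(noisy, true, sanity_bound)
        for noisy, true in zip(noisy_answers, true_answers)
    ]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    print(average_relative_error([12.0, 0.5], [10.0, 0.0]))
```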

Conclusion and discussion

In this paper, we have studied the application of differential privacy to trajectory databases and devised a personalized-location differentially private algorithm, called PLDP-TD, to provide non-uniform privacy guarantees for them. PLDP-TD addresses a major limitation of traditional differentially private algorithms for trajectory databases, which is their uniform privacy guarantees. More specifically, existing differentially private algorithms assume that all locations of a trajectory

References (40)

  • Dwork, C. Differential privacy.

  • Dwork, C. Differential privacy: A survey of results.

  • Cormode, G. et al. Differentially private spatial decompositions.

  • Niknami, N. et al. SpatialPDP: A personalized differentially private mechanism for range counting queries over spatial databases.

  • Zhang, J. et al. PrivTree: A differentially private algorithm for hierarchical decompositions.

  • Chen, R. et al. Differentially private transit data publication: A case study on the Montreal transportation system.

  • Ho, S.-S. et al. Preserving privacy for interesting location pattern mining from trajectory data. Trans. Data Priv. (2013).

  • He, X. et al. DPT: Differentially private trajectory synthesis using hierarchical reference systems. Proc. VLDB Endow. (2015).

  • Chen, R. et al. Differentially private sequential data publication via variable-length n-grams.

  • Bonomi, L. et al. A two-phase algorithm for mining sequential patterns with differential privacy.
