Abstract
Destination prediction is an essential task for many emerging location-based applications such as recommending sightseeing places and targeted advertising according to destinations. A common approach to destination prediction is to derive the probability of a location being the destination based on historical trajectories. However, almost all the existing techniques use various kinds of extra information such as road network, proprietary travel planner, statistics requested from government, and personal driving habits. Such extra information, in most circumstances, is unavailable or very costly to obtain. Thereby we approach the task of destination prediction by using only historical trajectory dataset. However, this approach encounters the “data sparsity problem”, i.e., the available historical trajectories are far from enough to cover all possible query trajectories, which considerably limits the number of query trajectories that can obtain predicted destinations. We propose a novel method named Sub-Trajectory Synthesis (SubSyn) to address the data sparsity problem. SubSyn first decomposes historical trajectories into sub-trajectories comprising two adjacent locations, and then connects the sub-trajectories into “synthesised” trajectories. This process effectively expands the historical trajectory dataset to contain much more trajectories. Experiments based on real datasets show that SubSyn can predict destinations for up to ten times more query trajectories than a baseline prediction algorithm. Furthermore, the running time of the SubSyn-training algorithm is almost negligible for a large set of 1.9 million trajectories, and the SubSyn-prediction algorithm runs over two orders of magnitude faster than the baseline prediction algorithm constantly.




















Similar content being viewed by others
Explore related subjects
Discover the latest articles and news from researchers in related subjects, suggested using machine learning.Notes
The demonstration system can be accessed following this link: http://spatialanalytics.cis.unimelb.edu.au/subsyndemo/.
Note the difference between \(p_{i\rightarrow j}\) and \(p_{ij}\). The latter (without the arrow) is the transition probability defined in Markov model, and its definition was given in Eq. (4).
Consecutive cells are two cells next to each other in a trajectory; adjacent cells are two cells next to each other in a grid.
References
Ali, M.E., Zhang, R., Tanin, E., Kulik, L.: A motion-aware approach to continuous retrieval of 3d objects. In: Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on, IEEE, pp. 843–852 (2008)
Ali, M.E., Tanin, E., Zhang, R., Kulik, L.: A motion-aware approach for efficient evaluation of continuous queries on 3d object databases. VLDB J. Int. J. Very Large Data Bases 19(5), 603–632 (2010)
Alvarez-Garcia, J.A., Ortega, J.A., Gonzalez-Abril, L., Velasco, F.: Trip destination prediction based on past GPS log using a hidden markov model. Expert Syst. Appl. Int. J. 37, 8166–8171 (2010)
Alvarez-Lozano, J., García-Macías, J.A., Chávez, E.: Learning and user adaptation in location forecasting. In: Proceedings of the 2013 ACM Conference on Pervasive and Ubiquitous Computing Adjunct Publication, UbiComp ’13 Adjunct, pp. 461–470 (2013)
Ashbrook, D., Starner, T.: Using GPS to learn significant locations and predict movement across multiple users. Pers. Ubiquitous Comput. 7, 275–286 (2003)
Bhattacharya, A., Das, S.K.: Lezi-update: an information-theoretic approach to track mobile users in pcs networks. In: MobiCom, pp. 1–12 (1999)
Chen, L., Lv, M., Chen, G.: A system for destination and future route prediction based on trajectory mining. Pervasive Mob. Comput. 6, 657–676 (2010)
Gogate, V., Dechter, R., Bidyuk, B.: Modeling transportation routines using hybrid dynamic mixed networks. In: UAI, pp. 217–224 (2005)
GPSExchange: GPS track log route exchange forum. http://www.gpsexchange.com/ (2012)
Hashem, T., Kulik, L., Zhang, R.: Privacy preserving group nearest neighbor queries. In: EDBT, pp. 489–500 (2010)
Hashem, T., Kulik, L., Zhang, R.: Countering overlapping rectangle privacy attack for moving knn queries. Inf. Syst. 38(3), 430–453 (2013)
Horvitz, E., Krumm, J.: Some help on the way: opportunistic routing under uncertainty. In: UbiComp, pp. 371–380 (2012)
Jensen, C.S., Lin, D., Ooi, B.C., Zhang, R.: Effective density queries on continuouslymoving objects. In: Data Engineering, 2006. ICDE’06. Proceedings of the 22nd International Conference on, IEEE, pp. 71–71 (2006)
Jensen, J.: Sur les fonctions convexes et les ingalits entre les valeurs moyennes. Acta Mathematica 30(1), pp. 175–193. doi:10.1007/BF02418571, http://dx.doi.org/10.1007/BF02418571 (1906)
Krumm, J., Horvitz, E.: Predestination: inferring destinations from partial trajectories. In: UbiComp, pp. 243–260 (2006)
Krumm, J., Horvitz, E.: Predestination: Where do you want to go today? IEEE Comput. 40(4), 105–107 (2007)
Liao, L., Patterson, D.J., Fox, D., Kautz, H.: Learning and inferring transportation routines. Artif. Intell. 171(5–6), 311–331 (2007)
Marmasse, N., Schmandt, C.: A user-centered location model. Pers. Ubiquitous Comput. 6, 318–321 (2002)
Microsoft Research: T-drive trajectory data sample. http://research.microsoft.com/apps/pubs/?id=152883 (2012)
Nutanong, S., Zhang, R., Tanin, E., Kulik, L.: Analysis and evaluation of V*- knn: an efficient algorithm for moving knn queries. VLDB J. 19(3), 307–332 (2010)
Patterson, D.J., Liao, L., Fox, D., Kautz, H.: Inferring high-level behavior from low-level sensors. In: UbiComp, pp. 73–89 (2003)
Qiu, D., Papotti, P., Blanco, L.: Future locations prediction with uncertain data. Machine Learning and Knowledge Discovery in Databases, vol. 8188, pp. 417–432. Springer, Berlin Heidelberg (2013)
ShareMyRoute: Share my route. http://www.sharemyroutes.com (2012)
Simmons, R., Browning, B., Zhang, Y., Sadekar, V.: Learning to predict driver route and destination intent. In: ITSC, pp. 127–132 (2006)
Strassen, V.: Gaussian elimination is not optimal. Numerische Mathematik 13(4), 354–356 (1969)
Tiesyte, D., Jensen, C.S.: Similarity-based prediction of travel times for vehicles traveling on known routes. GIS, pp. 14:1–14:10 (2008)
Williams, V.V.: Breaking the coppersmith-winograd barrier (2011, unpublished)
Xue, A.Y., Zhang, R., Zheng, Y., Xie, X., Huang, J., Xu, Z.: Destination prediction by sub-trajectory synthesis and privacy protection against such prediction. In: ICDE (2013)
Xue, A.Y., Zhang, R., Zheng, Y., Xie, X., Yu, J., Tang, Y.: Desteller: A system for destination prediction based on trajectories with privacy protection. In: International Conference on Very Large Data Bases (VLDB) (2013)
Yuan, J., Zheng, Y., Zhang, C., Xie, W., Xie, X., Sun, G., Huang, Y.: T-drive: Driving directions based on taxi trajectories. In: GIS, pp. 99–108 (2010)
Yuan, J., Zheng, Y., Xie, X., Sun, G.: Driving with knowledge from the physical world. In: KDD, pp. 316–324 (2011)
Zhang, R., Qi, J., Lin, D., Wang, W., Wong, R.C.W.: A highly optimized algorithm for continuous intersection join queries over moving objects. VLDB J. Int. J. Very Large Data Bases 21(4), 561–586 (2012)
Zheng, Y., Zhou, X. (eds.): Computing with Spatial Trajectories. Springer, Berlin (2011)
Ziebart, B.D., Maas, A.L., Dey, A.K., Bagnell, J.A.: Navigate like a cabbie: Probabilistic reasoning from observed context-aware behavior. In: UbiComp, pp. 322–331 (2008)
Acknowledgments
This work is supported by Australian Research Council (ARC) Discovery Project DP130104587 and Australian Research Council (ARC) Future Fellowships Project FT120100832.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Xue, A.Y., Qi, J., Xie, X. et al. Solving the data sparsity problem in destination prediction. The VLDB Journal 24, 219–243 (2015). https://doi.org/10.1007/s00778-014-0369-7
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00778-014-0369-7