Skip to main content

Advertisement

Log in

A Spatiotemporal Causality Based Governance Framework for Noisy Urban Sensory Data

  • Regular Paper
  • Published:
Journal of Computer Science and Technology Aims and scope Submit manuscript

Abstract

Urban sensing is one of the fundamental building blocks of urban computing. It uses various types of sensors deployed in different geospatial locations to continuously and cooperatively monitor the natural and cultural environment in urban areas. Nevertheless, issues such as uneven distribution, low sampling rate and high failure ratio of sensors often make their readings less reliable. This paper provides an innovative framework to detect the noise data as well as to repair them from a spatial-temporal causality perspective rather than to deal with them individually. This can be achieved by connecting data through monitored objects, using the Skip-gram model to estimate spatial correlation and long short-term memory to estimate temporal correlation. The framework consists of three major modules: 1) a space embedded Bidirectional Long Short-Term Memory (BiLSTM)-based sequence labeling module to detect the noise data and the latent missing data; 2) a space embedded BiLSTM-based sequence predicting module calculating the value of the missing data; 3) an object characteristics fusion repairing module to correct the spatial and temporal dislocation sensory data. The approach is evaluated with real-world data collected by over 3 000 electronic traffic bayonet devices in a citywide scale of a medium-sized city in China, and the result is superior to those of several referenced approaches. With a 12.9% improvement in data accuracy over the raw data, the proposed framework plays a significant role in various real-world use cases in urban governance, such as criminal investigation, traffic violation monitoring, and equipment maintenance.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

References

  1. Gajda J, Burnos P. Identification of the spatial impulse response of inductive loop detectors. In Proc. the 2015 IEEE International Instrumentation and Measurement Technology Conference, May 2015, pp.1997-2002.

  2. Jin Y, Tan E, Li L, Wang G, Wang J, Wang P. Hybrid traffic forecasting model with fusion of multiple spatial toll collection data and remote microwave sensor data. IEEE Access, 2018, 6(79): 211-221.

    Google Scholar 

  3. Cai Y, Tong H, Fan W, Ji P. Fast mining of a network of coevolving time series. In Proc. the 2015 SIAM International Conference on Data Mining, April 2015, pp.298-306.

  4. Nguyen H V, Vreeken J. Linear-time detection of non-linear changes in massively high dimensional time series. In Proc. the 2016 SIAM International Conference on Data Mining, May 2016, pp.828-836.

  5. Méger N, Rigotti C, Pothier C. Swap randomization of bases of sequences for mining satellite image times series. In Proc. the 2015 European Conference on Machine Learning and Knowledge Discovery in Databases, September 2015, pp.190-205.

  6. Shi W, Zhu Y, Yu P S, Zhang J, Huang T, Wang C, Chen Y. Effective prediction of missing data on Apache Spark over multivariable time series. IEEE Transactions on Big Data, 2018, 4(4): 473-486.

    Article  Google Scholar 

  7. Shi N. Intelligent detection of license plate-related illegal vehicles using bayonet images [Ph.D. Thesis]. Nanjing Normal University, 2013. (in Chinese)

  8. Yuan W, Deng P, Taleb T, Wan J, Bi C. An unlicensed taxi identification model based on big data analysis. IEEE Trans. Intelligent Transportation Systems, 2016, 17(6): 1703-1713

    Article  Google Scholar 

  9. Hsueh Y, Chen H, Huang W. A hidden Markov model-based map-matching approach for low-sampling-rate GPS trajectories. In Proc. the 7th IEEE International Symposium on Cloud and Service Computing, November 2017, pp.271-274.

  10. Zheng K, Zheng Y, Xie X et al. Reducing uncertainty of low-sampling-rate trajectories. In Proc. the 28th International Conference on Data Engineering, April 2012, pp.1144-1155.

  11. Yuan J, Zheng Y, Zhang C, Xie W, Xie X, Sun G, Huang Y. T-drive: Driving directions based on taxi trajectories. In Proc. the 18th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, November 2010, pp.99-108.

  12. Chen M, Liu Y, Yu X. Predicting next locations with object clustering and trajectory clustering. In Proc. the 19th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, May 2015, pp.344-356.

  13. Rendle S, Freudenthaler C, Schemidt-Thieme L. Factoring personalized Markov chains for next-basket recommendation. In Proc. the 19th International Conference on World Wide Web, April 2010, pp.811-820.

  14. Liu Q, Wu S, Wang L et al. Predicting the next location: A recurrent model with spatial and temporal contexts. In Proc. the 13th Conference on Artificial Intelligence, February 2016, pp.12-17.

  15. Al-Molegi A, Jabreel M, Ghaleb B. STF-RNN: Space time features-based recurrent neural network for predicting people next location. In Proc. the 2016 IEEE Symposium Series on Computational Intelligence, December 2016.

  16. Rahm E, Do H H. Data cleaning: Problems and current approaches. IEEE Data Eng. Bull., 2000, 23(4): 3-13.

    Google Scholar 

  17. Chomicki J, Marcinkowski J. Minimal-change integrity maintenance using tuple deletions. Information Computation, 2005, 197(1/2): 90-121.

  18. Wijsen J. Condensed representation of database repairs for consistent query answering. In Proc. the 9th International Conference on Database Theory, January 2003, pp.375-390.

  19. Bohannon P, Fan W, Geerts F, Jia X, Kementsietsidis A. Conditional functional dependencies for data cleaning. In Proc. the 3rd International Conference on Data Engineering, April 2007, pp.746-755.

  20. Zheng Y. Trajectory data mining: An overview. ACM Transactions on Intelligent Systems and Technology, 2015, 6(3): Article No. 29.

  21. Kong X, Li M, Ma K et al. Big trajectory data: A survey of applications and services. IEEE Access, 2018, 6: 58295-58306.

    Article  Google Scholar 

  22. Bian J, Tian D, Tang Y et al. Trajectory data classification: A review. ACM Transactions on Intelligent Systems and Technology, 2019, 10(4): Article No. 33

  23. Gambs S, Killijian M, Cortez M. Next place prediction using mobility Markov chains. In Proc. the 1st Workshop on Measurement Privacy and Mobility, April 2012, Article No. 3.

  24. Dong B, Ju H, Lu Y, Shi Z. CURE: Curvature regularization for missing data recovery. arXiv:1901.09548, 2019. http://arxiv.org/abs/1901.09548, Sept. 2019.

  25. Kong X, Li M, Li J, Tian K, Hu X, Xia F. CoPFun: An urban co-occurrence pattern mining scheme based on regional function discovery. World Wide Web: Internet and Web Information Systems, 2018, 22(3): 1029-1054.

    Article  Google Scholar 

  26. Morzy M. Mining frequent trajectories of moving objects for location prediction. In Proc. the 5th International Conference on Machine Learning and Data Mining in Pattern Recognition, July 2007, pp.667-680.

  27. Agrawal R, Srikant R. Mining sequential patterns. In Proc. the 11th Int. Conf. Data Engineering, March 1995, pp.3-14.

  28. Gomariz A, Campos M, Marín R, Goethals B. ClaSP: An efficient algorithm for mining frequent closed sequences. In Proc. the 17th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, April 2013, pp.50-61.

  29. Pinto H, Han J, Dayal U, Wang J, Chen Q, Pei J, Mortazavi-Asl B, Hsu M. Mining sequential patterns by pattern-growth: The prefixspan approach. IEEE Transactions on Knowledge and Data Engineering, 2004, 16(11): 1424-1440.

    Article  Google Scholar 

  30. Zaki M J. SPADE: An efficient algorithm for mining frequent sequences. Machine Learning, 2001, 42(1/2): 31-60

    Article  Google Scholar 

  31. Ayres J, Flannick J, Gehrke J, Yiu T. Sequential pattern mining using a bitmap representation. In Proc. the 8th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, July 2002, pp.429-435.

  32. Ratinov L, Roth D. Design challenges and misconceptions in named entity recognition. In Proc. the 13th Conference on Computational Natural, June 2009, pp.147-155.

  33. Passos A, Kumar V, McCallum A. Lexicon infused phrase embeddings for named entity resolution. In Proc. the 18th Computational Natural Language Learning, June 2014, pp.78-86.

  34. Lample G, Ballesteros M, Subramanian S, Kawakami K, Dyer C. Neural architectures for named entity recognition. In Proc. the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, June 2016, pp.260-270.

  35. Ma X, Hovy E H. End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. arXiv:1603.01354, 2016. https://arxiv.org/abs/1603.01354, Sept. 2019.

  36. Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space. In Proc. the 1st Int. Conf. Learning Representations, May 2013, Article No. 22.

  37. Peters M E, Ammar W, Bhagavatula C, Power R. Semi-supervised sequence tagging with bidirectional language models. In Proc. the 55th Annual Meeting of the Association for Computational Linguistics, July 2017, pp.1756-1765.

  38. Huang Z, Xu W, Yu K. Bidirectional LSTM-CRF models for sequence tagging. arXiv:1508.01991, 2015. http://arxiv.org/abs/1508.01991, Sept. 2019.

Download references

Acknowledgements

The authors would like to thank anonymous reviewers for their constructive comments. The authors also thank contributions of other participants of this work.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Pan Deng.

Electronic supplementary material

ESM 1

(PDF 584 kb)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Yan, BY., Yang, C., Deng, P. et al. A Spatiotemporal Causality Based Governance Framework for Noisy Urban Sensory Data. J. Comput. Sci. Technol. 35, 1084–1098 (2020). https://doi.org/10.1007/s11390-020-9724-x

Download citation

  • Received:

  • Revised:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11390-020-9724-x

Keywords

Navigation