TSLOD: a coupled generalized subsequence local outlier detection model for multivariate time series

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

Unsupervised subsequence outlier detection on multivariate time series (MTS) is a valuable problem in practice: it can noticeably reduce the cost of labeling and provide interpretability in real applications. Most classic methods for this task rest on two strong assumptions: (i) the MTS is stationary, so they may have difficulty coping with time drift; and (ii) attribute-level IIDness (attributes are independent and identically distributed), so they may ignore the relationships between attributes when measuring the similarity between multivariate subsequences. These assumptions limit the applicability of existing methods in real scenarios. To address this issue, this paper introduces a novel coupled generalized local outlier detection model for MTS, which extends the traditional generalized local outlier detection model to subsequence outlier detection tasks by incorporating a novel non-IID similarity metric. Specifically, the proposed method has three main aspects: (i) it represents the relationships in the MTS in symbolic space, which provides lower complexity and satisfactory sensitivity; (ii) it proposes a non-IID coupled similarity metric (TSDis) that considers the intrinsic intra-attribute and inter-attribute coupling between segments; and (iii) it extends the traditional generalized local outlier detection model to handle subsequence outlier detection tasks by embedding the non-IID coupled similarity metric. Experimental results show that the proposed method exploits the latent characteristics of MTS effectively and stably, and that it detects outliers more accurately than baseline approaches on 12 time-series datasets.
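The pipeline outlined in the abstract (symbolic representation, a similarity metric between multivariate segments, and a local outlier score) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the symbolic step is a plain SAX-style discretization with a 4-symbol alphabet, `window_distance` is a simple symbol-difference placeholder that does not model intra- or inter-attribute coupling and is not TSDis, and `local_outlier_scores` is a simplified ratio of k-nearest-neighbour distances rather than the full generalized local outlier model. The toy data, function names, and parameters (`k`, `smooth`) are hypothetical.

```python
import math
from statistics import mean, pstdev

# Quartiles of N(0, 1): standard SAX breakpoints for a 4-symbol alphabet.
BREAKPOINTS = (-0.6745, 0.0, 0.6745)


def sax_word(values, n_segments=4):
    """Z-normalise a window, average it piecewise, and map each mean to a symbol 0..3."""
    mu, sigma = mean(values), pstdev(values)
    z = [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]
    size = len(z) // n_segments
    return tuple(
        sum(mean(z[i * size:(i + 1) * size]) > b for b in BREAKPOINTS)
        for i in range(n_segments)
    )


def window_distance(words_a, words_b):
    """Placeholder metric (NOT TSDis): summed absolute symbol differences per attribute."""
    return sum(abs(x - y) for wa, wb in zip(words_a, words_b) for x, y in zip(wa, wb))


def local_outlier_scores(windows, k=3, smooth=1.0):
    """Simplified local score: own mean kNN distance relative to the neighbours' mean.

    `smooth` (an assumption of this sketch) avoids division by zero when
    several windows map to identical symbolic words.
    """
    n = len(windows)
    dist = [[window_distance(windows[i], windows[j]) for j in range(n)] for i in range(n)]
    neighbours, knn_mean = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        neighbours.append(order)
        knn_mean.append(mean(dist[i][j] for j in order))
    return [
        (knn_mean[i] + smooth) / (mean(knn_mean[j] for j in neighbours[i]) + smooth)
        for i in range(n)
    ]


# Toy two-attribute series: attribute B normally leads attribute A by a quarter period.
PERIOD, N_WINDOWS = 16, 8
attr_a = [math.sin(2 * math.pi * t / PERIOD) for t in range(PERIOD * N_WINDOWS)]
attr_b = [math.cos(2 * math.pi * t / PERIOD) for t in range(PERIOD * N_WINDOWS)]

# Inject an anomaly in window 5: B copies A, breaking the usual inter-attribute relation.
start = 5 * PERIOD
attr_b[start:start + PERIOD] = attr_a[start:start + PERIOD]

# One symbolic word per attribute for each non-overlapping window.
windows = [
    (sax_word(attr_a[s:s + PERIOD]), sax_word(attr_b[s:s + PERIOD]))
    for s in range(0, PERIOD * N_WINDOWS, PERIOD)
]
scores = local_outlier_scores(windows)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print(most_anomalous, scores)
```

In this toy setup each attribute looks normal on its own inside the injected window; only the joint pattern across attributes changes, which is the kind of case where treating attributes independently would be least informative. A coupled metric such as the TSDis described in the abstract would replace `window_distance`.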

Notes

  1. In this paper, "subsequence" is used interchangeably with "segment".

  2. "Attribute" is likewise used interchangeably with "variable" in the context of relational data.

Acknowledgements

This research has been supported in part by the Science and Technology Program of State Grid Corporation of China (SGJS0000DKJS2000952), by the Special Foundation of Jiangsu Provincial Industry and Information Transformation and Upgrading in 2020 (Research on key core technologies of artificial intelligence algorithm frameworks, tools and platforms), and by the National Natural Science Foundation of China under Grant 61806096.

Author information

Corresponding author

Correspondence to Yang Gao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Meng, F., Gao, Y., Wang, H. et al. TSLOD: a coupled generalized subsequence local outlier detection model for multivariate time series. Int. J. Mach. Learn. & Cyber. 13, 1493–1504 (2022). https://doi.org/10.1007/s13042-021-01462-x

  • DOI: https://doi.org/10.1007/s13042-021-01462-x
