ABSTRACT
Anomaly detection aims to identify unexpected patterns or data points, and it underlies many real-world applications in finance, manufacturing, cyber security, and other domains. While anomaly detection has been studied extensively in various fields, detecting future anomalies before they occur remains unexplored territory. In this paper, we present a novel type of anomaly detection, called Precursor-of-Anomaly (PoA) detection. Unlike conventional anomaly detection, which determines whether a given time series observation is an anomaly, PoA detection aims to detect future anomalies before they happen. To solve both problems at the same time, we present a neural network based on neural controlled differential equations, together with its multi-task learning algorithm. We conduct experiments with 17 baselines and 3 datasets, including regular and irregular time series, and demonstrate that our method outperforms the baselines in almost all cases. Our ablation studies also indicate that multi-task training significantly enhances the overall performance for both anomaly and PoA detection.
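The difference between the two tasks can be illustrated with a small labeling sketch. It is not the paper's implementation: it assumes, purely for illustration, that the series is cut into non-overlapping windows, that a window is anomalous if it contains any anomalous point, and that the PoA label of a window is the anomaly label of the window that follows it. The function name `window_labels` and the toy data are hypothetical.

```python
from typing import List, Tuple


def window_labels(point_labels: List[int], window: int) -> Tuple[List[int], List[int]]:
    """Return (anomaly_labels, poa_labels) for non-overlapping windows.

    Assumption (illustrative only): a window is anomalous if any of its
    points is anomalous, and the PoA label of window i is the anomaly
    label of window i + 1. The last window has no successor, so it is
    dropped from both lists to keep them aligned.
    """
    n_windows = len(point_labels) // window
    has_anomaly = [
        int(any(point_labels[i * window:(i + 1) * window]))
        for i in range(n_windows)
    ]
    anomaly_labels = has_anomaly[:-1]  # is the current window anomalous?
    poa_labels = has_anomaly[1:]       # will the next window be anomalous?
    return anomaly_labels, poa_labels


# Toy point-wise labels: 1 marks an anomalous observation.
points = [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1]
anomaly, poa = window_labels(points, window=3)
print(anomaly)  # labels for conventional anomaly detection
print(poa)      # labels for PoA detection
```

Under these assumptions, a multi-task model would be trained on both label sequences at once, with the same input window paired with its own anomaly label and with the label of its successor.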
Index Terms
- Precursor-of-Anomaly Detection for Irregular Time Series