DOI: 10.1145/3468784.3471606
research-article

OutViz: Visualizing the Outliers of Multivariate Time Series

Published: 20 July 2021

ABSTRACT

This paper proposes OutViz, a dual-view framework for representing and filtering multivariate time series data to highlight abnormal patterns in a dataset. The first view is a parallel coordinates chart that lets the user analyze the scores of features extracted from an outlier detection algorithm that combines dimensionality reduction with density-based clustering, in order to determine why a particular time series is predicted to be an outlier. The parallel coordinates chart also includes an outlier score rank axis, on which the user can select a range of time series to be filtered and displayed in the second view. The second view uses a multi-line chart to show how each time series variable changes over time: each time series is drawn as a line, with position on the horizontal axis representing a point in time and the vertical axis encoding the data value. Use cases on real-world multivariate time series data demonstrate the advantages of the proposed framework for data analytics and report findings uncovered while using OutViz on life expectancy data from 236 countries between 1960 and 2018 and carbon dioxide emissions data from 210 countries between 1960 and 2016.
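The data flow behind the two views (extract features per series, score and rank outliers, then filter by a rank range) can be illustrated with a minimal Python sketch. This is not the paper's algorithm. The three features, the mean k-nearest-neighbour distance used as a simple stand-in for a density-based score, the omission of the dimensionality reduction step, all function names, and the toy data are assumptions made only for illustration.

```python
import math
import statistics


def extract_features(series):
    """Summarize one series as (mean, standard deviation, largest absolute step)."""
    steps = [abs(b - a) for a, b in zip(series, series[1:])]
    return (statistics.mean(series), statistics.pstdev(series), max(steps))


def standardize(rows):
    """Z-score each feature column so no single feature dominates the distances."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard constant columns
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]


def knn_outlier_scores(points, k=2):
    """Score each point by its mean distance to its k nearest neighbours.

    Points in sparse regions of feature space get larger scores (assumed
    stand-in for a density-based outlier score).
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def rank_by_score(names, scores):
    """Return {name: rank}, where rank 1 has the highest outlier score."""
    order = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return {name: rank for rank, (name, _) in enumerate(order, start=1)}


def filter_by_rank(data, ranks, lo, hi):
    """Keep the series whose rank lies in [lo, hi], as a brushed rank axis would."""
    return {name: data[name] for name in data if lo <= ranks[name] <= hi}


# Hypothetical toy data: four smooth series and one with a sudden drop.
data = {
    "A": [60, 61, 62, 63, 64, 65],
    "B": [61, 62, 63, 64, 65, 66],
    "C": [59, 60, 61, 62, 63, 64],
    "D": [62, 63, 64, 65, 66, 67],
    "E": [60, 61, 35, 50, 62, 63],
}
names = list(data)
features = standardize([extract_features(data[n]) for n in names])
scores = knn_outlier_scores(features, k=2)
ranks = rank_by_score(names, scores)
selected = filter_by_rank(data, ranks, 1, 1)  # top-ranked outlier only
```

In this sketch the standardized feature rows play the role of the parallel coordinates axes, `ranks` plays the role of the outlier score rank axis, and `selected` is the subset that a second, multi-line view would draw.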


Published in

IAIT '21: Proceedings of the 12th International Conference on Advances in Information Technology
June 2021
281 pages
ISBN: 9781450390125
DOI: 10.1145/3468784

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 20 of 47 submissions, 43%
