research-article

Observation-Level and Parametric Interaction for High-Dimensional Data Analysis

Published: 13 June 2018

Abstract

Exploring high-dimensional data is challenging. Dimension reduction algorithms, such as weighted multidimensional scaling, support data exploration by projecting datasets to two dimensions for visualization. These projections can be explored through parametric interaction, in which the user adjusts the underlying parameterization, and observation-level interaction, in which the user directly manipulates the points within the projection. In this article, we present the results of a controlled usability study that determines the differences, advantages, and drawbacks among parametric interaction, observation-level interaction, and their combination. The study assesses the effects of both interaction techniques on domain-specific high-dimensional data analyses performed by people who are not experts in statistical algorithms. The study uses Andromeda, a tool that enables both parametric and observation-level interaction to provide in-depth data exploration. The results indicate that the two forms of interaction serve different, but complementary, purposes in gaining insight through steerable dimension reduction algorithms.
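To make the idea of parametric interaction concrete, the following is a minimal sketch of a weighted projection, not Andromeda's implementation. It assumes a weighted Euclidean distance with one weight per dimension and, for simplicity, swaps in classical (Torgerson) multidimensional scaling in place of whatever optimizer the tool actually uses. The function names, the random dataset, and the weight values are all hypothetical.

```python
import numpy as np


def weighted_distances(X, w):
    """Pairwise weighted Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]          # shape (n, n, p)
    return np.sqrt((w * diff ** 2).sum(axis=-1))  # shape (n, n)


def classical_mds(D, dims=2):
    """Embed a distance matrix in `dims` dimensions (Torgerson scaling)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                   # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)                # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:dims]           # indices of the largest ones
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))


def project(X, w):
    """Project X to 2D under dimension weights w (normalized to sum to 1)."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    return classical_mds(weighted_distances(X, w))


# Hypothetical data: 6 observations described by 4 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

equal = project(X, [1, 1, 1, 1])     # every dimension counts equally
steered = project(X, [10, 1, 1, 1])  # "parametric interaction": emphasize dim 0
```

In this sketch, changing the weight vector and re-projecting plays the role of parametric interaction. Observation-level interaction runs in the opposite direction: the user moves points in the 2D layout and the system infers weights consistent with the new arrangement, a step that is not shown here.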

Published in

ACM Transactions on Interactive Intelligent Systems, Volume 8, Issue 2
Special Issue on Human-Centered Machine Learning
June 2018
259 pages
ISSN: 2160-6455
EISSN: 2160-6463
DOI: 10.1145/3232718

Copyright © 2018 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 13 June 2018
• Accepted: 1 October 2017
• Revised: 1 August 2017
• Received: 1 December 2016

Qualifiers

• research-article
• Research
• Refereed