DOI: 10.1145/2806416.2806519
research-article

Interactive User Group Analysis

Published: 17 October 2015

ABSTRACT

User data is becoming increasingly available in multiple domains, ranging from phone usage traces to data on the social Web. The analysis of user data is appealing to scientists who work on population studies, recommendations, and large-scale data analytics. We argue that interactive analysis is needed to understand the multiple facets of user data and to address different analytics scenarios. Since user data is often sparse and noisy, we propose to produce labeled groups that describe users with common properties, and we develop IUGA, an interactive framework based on group discovery primitives, to explore the user space. At each step of IUGA, an analyst visualizes group members, may take an action on the group (add or remove members), and chooses an operation (exploit or explore) to discover more groups and hence more users. Each discovery operation returns the k most relevant and diverse groups. We formulate group exploitation and exploration as optimization problems and devise greedy algorithms to enable efficient group discovery. Finally, we design a principled validation methodology and run extensive experiments that validate the effectiveness of IUGA on large datasets for different user space analysis scenarios.
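The abstract does not spell out the optimization objectives or the greedy algorithms, so the following is only an illustrative sketch of the general idea of greedily picking k groups that are both relevant and diverse. It is not IUGA's method. It assumes that groups are sets of user identifiers, that relevance is the Jaccard overlap with a seed group chosen by the analyst, and that diversity is enforced by penalizing overlap with already selected groups, in the spirit of maximal marginal relevance. The names `greedy_select`, `trade_off`, and the example data are hypothetical.

```python
from typing import Dict, FrozenSet, List


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity between two sets of user identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_select(
    groups: Dict[str, FrozenSet[str]],
    seed: FrozenSet[str],
    k: int,
    trade_off: float = 0.5,
) -> List[str]:
    """Greedily pick up to k group labels, balancing relevance and diversity.

    Assumption (not from the paper): the score of a candidate is
    trade_off * overlap with the seed group
    minus (1 - trade_off) * its largest overlap with any selected group.
    """
    selected: List[str] = []
    candidates = set(groups)

    def score(label: str) -> float:
        relevance = jaccard(groups[label], seed)
        redundancy = max(
            (jaccard(groups[label], groups[s]) for s in selected),
            default=0.0,
        )
        return trade_off * relevance - (1 - trade_off) * redundancy

    while candidates and len(selected) < k:
        # Sorting makes tie-breaking deterministic (alphabetical by label).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical labeled groups; each value is the set of member users.
example_groups = {
    "A": frozenset({"u1", "u2", "u3"}),
    "B": frozenset({"u1", "u2"}),
    "C": frozenset({"u3", "u4", "u5"}),
    "D": frozenset({"u6", "u7"}),
}
example_seed = frozenset({"u1", "u2", "u3", "u4"})

balanced = greedy_select(example_groups, example_seed, k=2)
relevance_only = greedy_select(example_groups, example_seed, k=3, trade_off=1.0)
```

With the balanced setting, group B is skipped in the second round because it overlaps heavily with the already selected group A, so `balanced` is `["A", "C"]`. With `trade_off=1.0` the diversity penalty vanishes and the groups are simply ranked by overlap with the seed, giving `["A", "B", "C"]`.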


      Published in

      CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management
      October 2015
      1998 pages
      ISBN: 9781450337946
      DOI: 10.1145/2806416

      Copyright © 2015 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      CIKM '15 paper acceptance rate: 165 of 646 submissions, 26%
      Overall acceptance rate: 1,861 of 8,427 submissions, 22%
