Interactive User Group Analysis

Authors:
Behrooz Omidvar-Tehrani, Université Grenoble Alpes - LIG, CNRS, Grenoble, France
Sihem Amer-Yahia, Université Grenoble Alpes - LIG, CNRS, Grenoble, France
Alexandre Termier, University of Rennes 1, IRISA/INRIA, Rennes, France
CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, October 2015, Pages 403–412
https://doi.org/10.1145/2806416.2806519

Published: 17 October 2015

ABSTRACT

User data is becoming increasingly available in multiple domains ranging from phone usage traces to data on the social Web. The analysis of user data is appealing to scientists who work on population studies, recommendations, and large-scale data analytics. We argue that interactive analysis is needed to understand the multiple facets of user data and to address different analytics scenarios. Since user data is often sparse and noisy, we propose to produce labeled groups that describe users with common properties, and we develop IUGA, an interactive framework based on group discovery primitives for exploring the user space. At each step of IUGA, an analyst visualizes group members, may take an action on the group (add/remove members), and chooses an operation (exploit/explore) to discover more groups and hence more users. Each discovery operation returns the k most relevant and diverse groups. We formulate group exploitation and exploration as optimization problems and devise greedy algorithms to enable efficient group discovery. Finally, we design a principled validation methodology and run extensive experiments that validate the effectiveness of IUGA on large datasets for different user space analysis scenarios.
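
To make the idea of returning k relevant and diverse groups more concrete, the following is a minimal sketch of one possible greedy selection. It is not the algorithm from the paper, whose objective functions are not given in this abstract. It assumes a marginal-relevance-style heuristic in which relevance is the Jaccard overlap with the group the analyst is currently viewing, and diversity is the minimum Jaccard distance to the groups already selected. The function names, the `trade_off` parameter, and the group labels are all hypothetical.

```python
from typing import Dict, FrozenSet, List


def jaccard(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Jaccard similarity of two user sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_top_k(
    seed: FrozenSet[int],
    candidates: Dict[str, FrozenSet[int]],
    k: int,
    trade_off: float = 0.5,
) -> List[str]:
    """Greedily pick up to k group labels.

    Assumption (not from the paper): each step picks the candidate that
    maximizes trade_off * relevance + (1 - trade_off) * novelty, where
    relevance is the overlap with the seed group and novelty is the
    smallest Jaccard distance to any group selected so far.
    """
    selected: List[str] = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:

        def score(label: str) -> float:
            members = remaining[label]
            relevance = jaccard(seed, members)
            novelty = min(
                (1.0 - jaccard(members, candidates[s]) for s in selected),
                default=1.0,  # nothing selected yet: every group is novel
            )
            return trade_off * relevance + (1.0 - trade_off) * novelty

        # Iterate over sorted labels so that ties break deterministically.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        del remaining[best]
    return selected


# Hypothetical labeled groups over user IDs 1..9.
seed_group = frozenset({1, 2, 3, 4})
groups = {
    "young": frozenset({1, 2, 3, 5}),
    "young_paris": frozenset({1, 2, 3}),
    "engineers": frozenset({4, 6, 7}),
    "students": frozenset({8, 9}),
}
balanced = greedy_top_k(seed_group, groups, k=2)            # ['young_paris', 'engineers']
relevance_only = greedy_top_k(seed_group, groups, k=2, trade_off=1.0)  # ['young_paris', 'young']
```

With the balanced setting, the second pick skips "young" because it overlaps heavily with "young_paris"; with `trade_off=1.0` the selection ignores diversity and returns the two groups closest to the seed.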

Index Terms

Information systems → Information systems applications → Data mining

Published in
CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management
October 2015
1998 pages
ISBN: 9781450337946
DOI: 10.1145/2806416
General Chairs: James Bailey (The University of Melbourne), Alistair Moffat (The University of Melbourne)
Program Chairs: Charu C. Aggarwal (IBM), Maarten de Rijke (University of Amsterdam), Ravi Kumar (Google), Vanessa Murdock (Microsoft), Timos Sellis (RMIT University), Jeffrey Xu Yu (Chinese University of Hong Kong)
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
interactive analysis
user data
validation
Qualifiers
- research-article
Acceptance Rates
CIKM '15 Paper Acceptance Rate: 165 of 646 submissions, 26%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%