ABSTRACT
To increase the relevance of local patterns discovered in noisy relations, it makes sense to formalize error tolerance. Our starting point is the limitations of state-of-the-art methods for this purpose. Some extractors perform an exhaustive search with respect to a declarative specification of error tolerance, but their computational complexity prevents the discovery of large relevant patterns. Alpha is a three-step method that (1) computes complete collections of closed patterns, possibly error-tolerant ones, from arbitrary n-ary relations, (2) enlarges them by hierarchical agglomeration, and (3) selects the relevant agglomerated patterns.
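The three steps can be illustrated with a toy sketch. The abstract does not give Alpha's actual metric, linkage, or selection criterion, so everything below is an assumption made for illustration: patterns are tuples of frozensets (one per dimension), two patterns are merged by componentwise union, the pair whose union is densest is merged first, and selection keeps patterns whose density reaches a hypothetical `min_density` threshold. Step (1) is not implemented; the input patterns are simply given by hand.

```python
from itertools import combinations, product


def merge(p, q):
    """Componentwise union of two n-dimensional patterns."""
    return tuple(a | b for a, b in zip(p, q))


def density(pattern, relation):
    """Fraction of the pattern's cells that are tuples of the relation."""
    cells = list(product(*pattern))
    if not cells:  # a pattern with an empty dimension covers nothing
        return 0.0
    return sum(cell in relation for cell in cells) / len(cells)


def agglomerate(patterns, relation):
    """Greedy bottom-up merging; returns every pattern built along the way."""
    active = list(dict.fromkeys(patterns))  # drop duplicates, keep order
    hierarchy = list(active)
    while len(active) > 1:
        # Assumed criterion (not from the paper): merge the densest union.
        p, q = max(
            combinations(active, 2),
            key=lambda pair: density(merge(*pair), relation),
        )
        merged = merge(p, q)
        active = [x for x in active if x not in (p, q)]
        if merged not in active:
            active.append(merged)
        hierarchy.append(merged)
    return hierarchy


def select(hierarchy, relation, min_density):
    """Assumed selection rule: keep patterns that are dense enough."""
    unique = dict.fromkeys(hierarchy)
    return [p for p in unique if density(p, relation) >= min_density]


# A noisy binary relation: a 3x3 block missing ("b", "y"), plus ("d", "w").
relation = {(r, c) for r in "abc" for c in "xyz"} - {("b", "y")}
relation.add(("d", "w"))

# Hand-written input patterns standing in for the output of step (1).
patterns = [
    (frozenset("ac"), frozenset("xyz")),
    (frozenset("abc"), frozenset("xz")),
    (frozenset("d"), frozenset("w")),
]

hierarchy = agglomerate(patterns, relation)
selected = select(hierarchy, relation, min_density=0.8)
```

In this toy input, the first two patterns are fragments of the same 3x3 block split by one missing tuple. Their union covers the whole block at the cost of one absent cell, so it survives the threshold, while the final union with the unrelated third pattern is too sparse and is discarded.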
Index Terms
- Agglomerating local patterns hierarchically with ALPHA