Abstract
Formal concept analysis (FCA) has been used successfully in several computer science fields, such as databases, software engineering, and information retrieval, and in many application domains, including medicine, psychology, linguistics, and ecology. In data warehouses, users exploit data hypercubes (i.e., multi-way tables), mainly through online analytical processing (OLAP) techniques, to extract useful information for decision support.
Many topics have attracted researchers in data warehousing: data warehouse design and multidimensional modeling, efficient cube materialization (pre-computation), physical data organization, query optimization and approximation, discovery-driven data exploration, and cube compression and mining. Recently, there has been increasing interest in applying or adapting data mining approaches and advanced statistical analysis techniques to extract knowledge (e.g., outliers, clusters, rules, closed n-sets) from multidimensional data. Such approaches and techniques include (but are not limited to) FCA, cluster analysis, principal component analysis, log-linear modeling, and non-negative multi-way array factorization. Mining data cubes raises challenging research issues because cubes are generally large and high-dimensional, and because their cells contain consolidated (e.g., mean value), multidimensional, and temporal data. In this presentation, we give an overview of related work and show how FCA theory (with possible extensions) can be used to extract valuable and actionable knowledge from data warehouses.
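To make the link between a data cube and FCA concrete, the following minimal sketch turns a two-dimensional slice of a cube into a binary formal context and enumerates its formal concepts. It is an illustration under stated assumptions, not the method of the presentation: the cube contents, the dimension names, the `THRESHOLD` cut-off, and all function names are hypothetical, and the enumeration is a naive one that is exponential in the number of objects.

```python
from itertools import combinations

# Hypothetical 2-D cube slice: mean sales per (region, product) cell.
cube = {
    ("East", "Books"): 12.0, ("East", "Music"): 3.0, ("East", "Games"): 9.0,
    ("West", "Books"): 10.0, ("West", "Music"): 8.0, ("West", "Games"): 2.0,
    ("North", "Books"): 11.0, ("North", "Music"): 7.5, ("North", "Games"): 8.0,
}
THRESHOLD = 7.0  # assumed cut-off used to binarize the aggregated measure


def binarize(cube, threshold):
    """Build a formal context: object -> set of attributes it 'has'."""
    context = {}
    attributes = set()
    for (obj, attr), value in cube.items():
        attributes.add(attr)
        context.setdefault(obj, set())
        if value >= threshold:
            context[obj].add(attr)
    return context, attributes


def common_attributes(context, objects, attributes):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(attributes)
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(context, attrs):
    """Objects that have every attribute in `attrs`."""
    return {obj for obj, owned in context.items() if attrs <= owned}


def formal_concepts(context, attributes):
    """Naive enumeration: close every subset of objects into (extent, intent)."""
    objects = sorted(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(context, subset, attributes)
            extent = common_objects(context, intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


context, attributes = binarize(cube, THRESHOLD)
concepts = formal_concepts(context, attributes)

for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

With the sample data above, the sketch yields four concepts; for instance, the regions East and North together share exactly the products Books and Games. A single threshold discards most of the information in the aggregated values, which is one reason extensions of the basic theory are of interest for cubes.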
Partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Missaoui, R., Kwuida, L. (2009). What Can Formal Concept Analysis Do for Data Warehouses?. In: Ferré, S., Rudolph, S. (eds) Formal Concept Analysis. ICFCA 2009. Lecture Notes in Computer Science, vol 5548. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01815-2_5
DOI: https://doi.org/10.1007/978-3-642-01815-2_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01814-5
Online ISBN: 978-3-642-01815-2
eBook Packages: Computer Science (R0)