What Can Formal Concept Analysis Do for Data Warehouses?

  • Conference paper
Formal Concept Analysis (ICFCA 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5548)

Abstract

Formal concept analysis (FCA) has been successfully used in several computer science fields, such as databases, software engineering, and information retrieval, and in many application domains, such as medicine, psychology, linguistics, and ecology. In data warehouses, users exploit data hypercubes (i.e., multi-way tables), mainly through online analytical processing (OLAP) techniques, to extract useful information from data for decision support purposes.

Many topics have attracted researchers in the area of data warehousing: data warehouse design and multidimensional modeling, efficient cube materialization (pre-computation), physical data organization, query optimization and approximation, discovery-driven data exploration, and cube compression and mining. Recently, there has been increasing interest in applying or adapting data mining approaches and advanced statistical analysis techniques to extract knowledge (e.g., outliers, clusters, rules, closed n-sets) from multidimensional data. Such approaches and techniques include (but are not limited to) FCA, cluster analysis, principal component analysis, log-linear modeling, and non-negative multi-way array factorization. Because data cubes are generally large and high-dimensional, and because their cells contain consolidated (e.g., mean values), multidimensional, and temporal data, mining them raises challenging research issues. In this presentation, we give an overview of related work and show how FCA theory (with possible extensions) can be used to extract valuable and actionable knowledge from data warehouses.
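To make the idea of applying FCA to cube data concrete, the following is a minimal sketch and not the method of the presentation. It assumes a toy two-dimensional cube slice (the hypothetical `sales` dictionary, with invented products, cities, and values), binarizes the cells with an arbitrary threshold to obtain a formal context, and then enumerates all formal concepts by closing the object intents under intersection. The function names `derive_context` and `formal_concepts` are illustrative only.

```python
# Toy cube slice: (product, store) -> consolidated sales value.
# All names and numbers here are hypothetical illustration data.
sales = {
    ("laptop", "Montreal"): 120, ("laptop", "Ottawa"): 95, ("laptop", "Toronto"): 10,
    ("phone", "Montreal"): 200, ("phone", "Ottawa"): 30, ("phone", "Toronto"): 150,
    ("tablet", "Montreal"): 80, ("tablet", "Ottawa"): 5, ("tablet", "Toronto"): 90,
}


def derive_context(cells, threshold):
    """Binarize cube cells: an object 'has' an attribute if its value >= threshold."""
    context = {}
    attributes = set()
    for (obj, attr), value in cells.items():
        context.setdefault(obj, set())
        attributes.add(attr)
        if value >= threshold:
            context[obj].add(attr)
    return context, attributes


def formal_concepts(context, attributes):
    """Return all (extent, intent) pairs of the formal context.

    Every intent is an intersection of object intents (the empty
    intersection being the full attribute set), so we close the set of
    object intents under intersection, then compute each extent.
    """
    intents = {frozenset(attributes)}
    for obj_intent in context.values():
        # Intersect the new object intent with every intent found so far.
        new_intents = {frozenset(obj_intent) & intent for intent in intents}
        intents |= new_intents
    concepts = []
    for intent in intents:
        # The extent is the set of objects that have every attribute of the intent.
        extent = frozenset(o for o, attrs in context.items() if intent <= attrs)
        concepts.append((extent, intent))
    # Order by extent size, then alphabetically, for stable display.
    return sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0])))


context, attributes = derive_context(sales, threshold=50)
concepts = formal_concepts(context, attributes)
for extent, intent in concepts:
    print(sorted(extent), sorted(intent))
```

Each resulting concept pairs a maximal group of products with the maximal set of stores in which all of them reach the threshold. The enumeration is naive and can grow exponentially with the size of the context, so it is only suitable for small illustrative slices; the choice of threshold is an assumption that strongly affects which concepts appear.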

Partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Missaoui, R., Kwuida, L. (2009). What Can Formal Concept Analysis Do for Data Warehouses? In: Ferré, S., Rudolph, S. (eds) Formal Concept Analysis. ICFCA 2009. Lecture Notes in Computer Science, vol 5548. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01815-2_5

  • DOI: https://doi.org/10.1007/978-3-642-01815-2_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01814-5

  • Online ISBN: 978-3-642-01815-2

  • eBook Packages: Computer Science (R0)
