Introduction

  • Chapter

Part of the book series: Studies in Computational Intelligence ((SCI,volume 333))

Abstract

Government, industrial, commercial, and scientific organizations collect and store large amounts of data. As the complexity and volume of this data continue to increase, classifying new, unseen data and extracting useful knowledge from it is becoming practically impossible for humans. Automatic knowledge acquisition is therefore not merely advantageous over manual knowledge acquisition but a necessity, since the probability that a human observer will detect something new and useful is very low given the overwhelming complexity and volume of the information.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hadzic, F., Tan, H., Dillon, T.S. (2011). Introduction. In: Mining of Data with Complex Structures. Studies in Computational Intelligence, vol 333. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17557-2_1

  • DOI: https://doi.org/10.1007/978-3-642-17557-2_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17556-5

  • Online ISBN: 978-3-642-17557-2

  • eBook Packages: Engineering, Engineering (R0)
