Abstract
Government, industrial, commercial, and scientific organizations collect and store large amounts of data. As the complexity and volume of this data continue to increase, classifying new, unseen data and extracting useful knowledge from it is becoming practically impossible for humans. Automatic knowledge acquisition is therefore not merely advantageous over manual knowledge acquisition but a necessity: given the overwhelming complexity and volume of the information, the probability that a human observer will detect something new and useful is very low.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Hadzic, F., Tan, H., Dillon, T.S. (2011). Introduction. In: Mining of Data with Complex Structures. Studies in Computational Intelligence, vol 333. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17557-2_1
DOI: https://doi.org/10.1007/978-3-642-17557-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17556-5
Online ISBN: 978-3-642-17557-2
eBook Packages: Engineering, Engineering (R0)